[U-Boot-Users] [PATCH] ppc4xx: Refactor ECC POST for AMCC Denali core

Stefan Roese sr at denx.de
Mon Jan 14 17:40:09 CET 2008


Hi Jerry,

On Monday 14 January 2008, Jerry Van Baren wrote:
> Larry Johnson wrote:
> > The ECC POST reported intermittent failures running after power-up on
> > the Korat PPC440EPx board.  Even when the test passed, the debugging
> > output occasionally reported additional unexpected ECC errors.
> >
> > This refactoring had two main objectives: (1) minimize the code executed
> > with ECC enabled during the tests, and (2) add more checking of the
> > results so any unexpected ECC errors would cause the test to fail.
> >
> > So far, the refactored test has not reported any intermittent failures.
> > Further, synchronization instructions appear no longer to be require, so
> > have been removed.  If intermittent failures do occur in the future, the
> > refactoring should make the causes easier to identify.
>
> WHOOP, WHOOP, WHOOP, red alert!  "[S]ynchronization instructions appear
> no longer to be require[d], so have been removed".

Yes, this sounds suspicious. Thanks for jumping in.

> Synchronization instructions either *ARE* required or *ARE NOT*
> required, there is no "appear".  When sync instructions appear to not be
> required, but actually are required, that is when really obscure bugs
> start happening in the dead of winter off the coast of Alaska / Siberia
> and your boss asks you if you have warm clothes.

:)

> I am not familiar with the 4xx family or the PowerPC core that is used
> in it but...
>
> [snip]
>
> > -static int test_ecc(unsigned long ecc_addr)
> > +static int test_ecc(uint32_t ecc_addr)
> >  {
> > -	unsigned long value;
> > -	volatile unsigned *const ecc_mem = (volatile unsigned *) ecc_addr;
> > -	int pret;
> > +	uint32_t value;
> > +	volatile uint32_t *const ecc_mem = (volatile uint32_t *)ecc_addr;
> >  	int ret = 0;
> >
> > -	sync();
> > -	eieio();
> >  	WATCHDOG_RESET();
>
> The combination of "sync" and "eieio" is a strong indication of someone
> sprinkling pixie dust rather than understanding the problem.
>
> "Sync" forces all pending I/O (read/write) operations to be completed
> all the way to memory/hardware register before the instruction
> continues.  Sync guarantees *WHEN* the I/O will complete: NOW.  This is
> a big hammer: it can cause a significant performance hit because it
> stalls the processor BUT it is guaranteed effective (except for the
> places that need an both an isync and a sync combination - thankfully, I
> believe that is only needed in special cases when playing with the
> processor's control registers).
>
> "Eieio" (enforce in-order execution of I/O) is a barrier that says all
> I/O that goes before it must be completed before any I/O that goes after
> it is started.  It *DOES NOT* guarantee *WHEN* the preceding
> reads/writes will be completed.  Theoretically, the bus interface unit
> (BIU) could hold the whole shootin' match for 10 minutes before it does
> the preceding I/O followed by the succeeding I/O.  Eieio is much less
> draconian to the processor than sync (which is why eieios are preferred)
> but an eieio may or may not cause the intended synchronizing result if
> you are relying on a write or read causing the proper effect *NOW*.
> Note that eieios are NOPs to processor cores that don't reorder I/O.
>
> Some PowerPC cores (e.g. the 74xx family) can reorder reads and writes
> in the bus interface unit (some cores, such as the 603e, do *not*
> reorder reads and writes).  This is a performance enhancement... writes
> (generally) are non-blocking to the processor core where a read causes
> the processor to have to wait for the data (which cascades into pipeline
> stalls and performance hits).  The bus is a highly oversubscribed
> resource (core speed / bus speed can be 8x or more).  As a result, you
> want to get reads done ASAP (if possible) and thus it is beneficial to
> move a read ahead of a write.
>
> As you should have picked up by now, a sync (forcing all I/O to
> complete) followed by eieio is silly - the eieio is superfluous.  Seeing
> syncs/isyncs/eieios sprinkled in code is an indication that the author
> didn't understand what was going on and, as a result, kept hitting the
> problem with a bigger and bigger hammer until it appeared to have gone
> away.

Now I'm glad that I'm not the author of this code. ;) But I admit that I did 
use this "hammer" in the past.

> Besides read/write reordering problems, the bus interface unit (BIU) can
> "short circuit" a read that follows a write to the same address.  This
> is very likely to be implemented in a given core - it offers a very good
> speed up traded off against a modest increase in complexity to the BIU.
>   The problem is (for instance), if you configure your EDC to store an
> invalid EDC flag, do a write to a test location (which gets held in the
> BIU because the bus is busy), followed by a read of the test location
> (expecting to see an EDC failure), the BIU could return the queued *but
> unwritten* write value.
>
> OK, enough lecturing...
>
> Repeated disclaimer: What I write here is applicable for more complex
> PowerPC implementations.  It may not be applicable for the particular
> 4xx core you are running on.  I am not familiar with the 4xx core.

From what I see, the ECC test code uses in_be32() and friends to access the 
memory. And these access functions have all necessary barriers already built 
into. So most likely the additional barriers were never necessary at all. Or 
perhaps the code was changed from using pointer access to in_be32() access.

Nevertheless the changes from Larry are looking good to me. But I also 
forwarded them to the original author of the code for review.

Thanks again for your comments.

/me goes to mark jvb's mail as important to easier find it as reference. :)

Best regards,
Stefan

=====================================================================
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: +49-8142-66989-0 Fax: +49-8142-66989-80  Email: office at denx.de
=====================================================================




More information about the U-Boot mailing list