[U-Boot] [PATCH 3/7] 83xx/85xx/86xx: Add ECC support

Ira W. Snyder iws at ovro.caltech.edu
Tue Nov 10 18:53:50 CET 2009


On Tue, Nov 10, 2009 at 11:36:44AM -0600, Peter Tyser wrote:
> > Ok, here are my results, this is on an 8349EMDS-derived board. My
> > 8349EMDS eval board doesn't have ECC memory.
> > 
> > 1) It might be nice to have something to print the current injection
> > registers. It is not a big deal, anyone using this should be an expert
> > anyway.
> 
> Thanks for the feedback.  I can add printing of the current injection
> values when "ecc inject" is run if others would like.
> 
> > 2) ecc inject off didn't seem to work, see the following capture:
> > 
> > => ecc info
> > No ECC errors have occurred
> > => ecc inject low 0x1
> > => ecc info
> > 
> > WARNING: ECC error in DDR Controller 0
> >         Addr:   0x0_0ff7ae40
> >         Data:   0x0fffdf9c_0ff7aed1     ECC:    0x81
> >         Expect: 0x0fffdf9c_0ff7aed0     ECC:    0x81
> >         Net:    DATA0
> >         Syndrome: 0x3b
> >         Single-Bit errors: 0x1e
> >         Attrib: 0x01002001
> >         Detect: 0x80000004 (MME, SBE) 
> > 
> > => ecc inject off
> > 
> > # Ok, now error injection is off, I still expect some errors to be
> > # present in the error registers
> > 
> > => ecc info
> > 
> > WARNING: ECC error in DDR Controller 0
> >         Addr:   0x0_0ff7ae1c
> >         Data:   0x0fffdf9c_0ff7d2a1     ECC:    0xe4
> >         Expect: 0x0fffdf9c_0ff7d2a0     ECC:    0xe4
> >         Net:    DATA0
> >         Syndrome: 0x3b
> >         Single-Bit errors: 0xd1
> >         Attrib: 0x01003001
> >         Detect: 0x80000004 (MME, SBE) 
> > 
> > # And there was the error. Now, I don't expect any more errors to
> > # be present, after all, injection is disabled.
> > #
> > # But there is one! Why?
> 
> I believe what's happening is:
> 1. You turn error injection on
> 2. Every time you perform a DRAM write, the value written has an ECC
> error
> 3. You write to DRAM lots of times, in lots of locations
> 4. You turn error injection off
> 5. There are still lots of ECC errors residing in DRAM that you discover
> later when you read from "corrupted" memory locations
> 
> So in theory, unless you scrub your memory, you might uncover lots more
> ECC errors later.
> 
> As an easily reproducible example try:
> > ecc inject low 1; mw.l 0x100000 0xbeefba11 0x800000; ecc inject off
> > ecc info
> > ecc info
> > md 0x100000
> > ecc info
> > ecc info
> > md 0x200000
> ...
> 
> The majority of the above ecc errors could be cleared by running the
> following command with ecc injection off:
> mw.l 0x100000 0xbeefba11 0x800000
> 
> 
> > => ecc info
> > 
> > WARNING: ECC error in DDR Controller 0
> >         Addr:   0x0_0fff8a0c
> >         Data:   0x0fff8a00_0fff8a01     ECC:    0xff
> >         Expect: 0x0fff8a00_0fff8a00     ECC:    0xff
> >         Net:    DATA0
> >         Syndrome: 0x3b
> >         Single-Bit errors: 0x04
> >         Attrib: 0x01003001
> >         Detect: 0x00000000
> > => 
> > 
> > # Note that I keep seeing ecc errors until I run the command:
> > # ecc inject low 0
> 
> Hmm...  "ecc inject off" should have the same effect as "ecc inject low
> 0".  Is there a chance some of the ECC errors still remaining in DRAM
> are the culprit?
> 
> > # Why did it take two runs of ecc info to clear all of the errors?
> 
> This is probably the same issue as above - lots of errors are injected and
> there's no saying when exactly they'll turn up.
> 
> > Other than the above strangeness, everything is working great on my 83xx
> > board. I think the new output is pretty nice. It serves my purposes
> > as well as the old code did.
> 
> Thanks for trying the changes out,

Ok, this makes perfect sense. I didn't think about the possibility of
latent memory errors. :)
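
To make the latent-error explanation concrete, here is a rough toy model in
Python. It is only a sketch of the idea, not of how the DDR controller
really behaves: the ToyEccMemory class, its fields, and the word-sized
cells are all made up for illustration. Writes made while injection is on
leave bad words behind, the errors are only counted when those words are
read later, and rewriting the region with injection off removes them.

```python
class ToyEccMemory:
    """Hypothetical toy model, not the real DDR controller."""

    def __init__(self):
        self.inject_mask = 0   # bits flipped on every write while non-zero
        self.cells = {}        # addr -> (data as stored, data the ECC covers)
        self.sbe_count = 0     # single-bit errors noticed on reads

    def write(self, addr, value):
        # With injection on, the stored data no longer matches its ECC.
        self.cells[addr] = (value ^ self.inject_mask, value)

    def read(self, addr):
        stored, expected = self.cells[addr]
        if stored != expected:
            # The error is only noticed now, at read time. The cell
            # itself stays corrupted until it is written again.
            self.sbe_count += 1
        return expected


mem = ToyEccMemory()

mem.inject_mask = 0x1                    # roughly "ecc inject low 1"
for addr in range(0x100000, 0x100040, 4):
    mem.write(addr, 0xBEEFBA11)          # roughly "mw.l" with injection on
mem.inject_mask = 0                      # roughly "ecc inject off"

mem.read(0x100000)                       # latent error surfaces here
errors_after_off = mem.sbe_count

for addr in range(0x100000, 0x100040, 4):
    mem.write(addr, 0xBEEFBA11)          # rewrite with injection off
mem.sbe_count = 0
mem.read(0x100000)
errors_after_scrub = mem.sbe_count

print(errors_after_off, errors_after_scrub)
```

In this model the error count is non-zero after injection has been turned
off, and only drops to zero once the same region has been rewritten.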

Here is a run using your instructions above. Keeping the possibility of
latent memory errors in mind, the behavior seems correct to me. You're
free to add my Tested-by if you'd like.

=> ecc inject low 1
=> mw.l 0x100000 0xbeefba11 0x800000
=> ecc inject off
=> ecc info

WARNING: ECC error in DDR Controller 0
        Addr:   0x0_0ff7ae40
        Data:   0x0fffdf9c_0ff7aed1     ECC:    0x81
        Expect: 0x0fffdf9c_0ff7aed0     ECC:    0x81
        Net:    DATA0
        Syndrome: 0x3b
        Single-Bit errors: 0x56
        Attrib: 0x01002001
        Detect: 0x80000004 (MME, SBE) 

=> ecc info

WARNING: ECC error in DDR Controller 0
        Addr:   0x0_0ff7ad08
        Data:   0x0ffd594c_0000087f     ECC:    0x91
        Expect: 0x0ffd594c_0000087e     ECC:    0x91
        Net:    DATA0
        Syndrome: 0x3b
        Single-Bit errors: 0x01
        Attrib: 0x01003001
        Detect: 0x00000000
=> ecc info
No ECC errors have occurred
=> ecc info
No ECC errors have occurred
=> ecc info
No ECC errors have occurred
=> md 0x100000 10
00100000: beefba11 beefba11 beefba11 beefba11    ................
00100010: beefba11 beefba11 beefba11 beefba11    ................
00100020: beefba11 beefba11 beefba11 beefba11    ................
00100030: beefba11 beefba11 beefba11 beefba11    ................
=> ecc info      

WARNING: ECC error in DDR Controller 0
        Addr:   0x0_0010003c
        Data:   0xbeefba11_beefba10     ECC:    0x7b
        Expect: 0xbeefba11_beefba11     ECC:    0x7b
        Net:    DATA0
        Syndrome: 0x3b
        Single-Bit errors: 0x13
        Attrib: 0x01002001
        Detect: 0x00000000
=> ecc info
No ECC errors have occurred
=> md 0x200000 10
00200000: beefba11 beefba11 beefba11 beefba11    ................
00200010: beefba11 beefba11 beefba11 beefba11    ................
00200020: beefba11 beefba11 beefba11 beefba11    ................
00200030: beefba11 beefba11 beefba11 beefba11    ................
=> ecc info

WARNING: ECC error in DDR Controller 0
        Addr:   0x0_0020003c
        Data:   0xbeefba11_beefba10     ECC:    0x7b
        Expect: 0xbeefba11_beefba11     ECC:    0x7b
        Net:    DATA0
        Syndrome: 0x3b
        Single-Bit errors: 0x10
        Attrib: 0x01002001
        Detect: 0x00000000
=> ecc info
No ECC errors have occurred
=> mw.l 0x100000 0xbeefba11 0x800000
=> ecc info

WARNING: ECC error in DDR Controller 0
        Addr:   0x0_001007c8
        Data:   0xbeefba11_beefba10     ECC:    0x7b
        Expect: 0xbeefba11_beefba11     ECC:    0x7b
        Net:    DATA0
        Syndrome: 0x3b
        Single-Bit errors: 0x06
        Attrib: 0x01003001
        Detect: 0x80000004 (MME, SBE) 

=> ecc info
No ECC errors have occurred
=> ecc info
No ECC errors have occurred
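
As a quick sanity check on the reports above, the Data and Expect values
can be XORed to see which bits differ. The short Python sketch below does
that for three of the reports from this run. It assumes the printed value
is "high word_low word", which is my reading of the output format, and the
flipped_bits helper is just a made-up name for illustration.

```python
def flipped_bits(data_hi, data_lo, expect_hi, expect_lo):
    """Return (high, low) masks of the bits that differ."""
    return data_hi ^ expect_hi, data_lo ^ expect_lo


# (Data high, Data low, Expect high, Expect low), copied from the
# "ecc info" reports in this run.
reports = [
    (0x0fffdf9c, 0x0ff7aed1, 0x0fffdf9c, 0x0ff7aed0),
    (0x0ffd594c, 0x0000087f, 0x0ffd594c, 0x0000087e),
    (0xbeefba11, 0xbeefba10, 0xbeefba11, 0xbeefba11),
]

diffs = [flipped_bits(*r) for r in reports]

for hi, lo in diffs:
    print("high: 0x%08x  low: 0x%08x" % (hi, lo))
```

Under that assumption, every report differs only in bit 0 of the low
word, which is consistent with the "ecc inject low 1" mask.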

Ira

