[U-Boot] [PATCH 3/7] 83xx/85xx/86xx: Add ECC support
Ira W. Snyder
iws at ovro.caltech.edu
Tue Nov 10 18:53:50 CET 2009
On Tue, Nov 10, 2009 at 11:36:44AM -0600, Peter Tyser wrote:
> > Ok, here are my results, this is on a 8349EMDS-derived board. My
> > 8349EMDS eval board doesn't have ECC memory.
> >
> > 1) It might be nice to have something to print the current injection
> > registers. It is not a big deal, anyone using this should be an expert
> > anyway.
>
> Thanks for the feedback. I can print the current injection values when
> "ecc inject" is run, if others would like that.
>
> > 2) ecc inject off didn't seem to work, see the following capture:
> >
> > => ecc info
> > No ECC errors have occurred
> > => ecc inject low 0x1
> > => ecc info
> >
> > WARNING: ECC error in DDR Controller 0
> > Addr: 0x0_0ff7ae40
> > Data: 0x0fffdf9c_0ff7aed1 ECC: 0x81
> > Expect: 0x0fffdf9c_0ff7aed0 ECC: 0x81
> > Net: DATA0
> > Syndrome: 0x3b
> > Single-Bit errors: 0x1e
> > Attrib: 0x01002001
> > Detect: 0x80000004 (MME, SBE)
> >
> > => ecc inject off
> >
> > # Ok, now error injection is off, I still expect some errors to be
> > # present in the error registers
> >
> > => ecc info
> >
> > WARNING: ECC error in DDR Controller 0
> > Addr: 0x0_0ff7ae1c
> > Data: 0x0fffdf9c_0ff7d2a1 ECC: 0xe4
> > Expect: 0x0fffdf9c_0ff7d2a0 ECC: 0xe4
> > Net: DATA0
> > Syndrome: 0x3b
> > Single-Bit errors: 0xd1
> > Attrib: 0x01003001
> > Detect: 0x80000004 (MME, SBE)
> >
> > # And there was the error. Now, I don't expect any more errors to
> > # be present, after all, injection is disabled.
> > #
> > # But there is one! Why?
>
> I believe what's happening is:
> 1. You turn error injection on
> 2. Every time you perform a DRAM write, the value written has an ECC
> error
> 3. You write to DRAM lots of times, in lots of locations
> 4. You turn error injection off
> 5. There are still lots of ECC errors residing in DRAM that you discover
> later when you read from "corrupted" memory locations
>
> So in theory, unless you scrub your memory, you might uncover lots more
> ECC errors later.
>
> As an easily reproducible example try:
> > ecc inject low 1; mw.l 0x100000 0xbeefba11 0x800000; ecc inject off
> > ecc info
> > ecc info
> > md 0x100000
> > ecc info
> > ecc info
> > md 0x200000
> ...
>
> The majority of the above ecc errors could be cleared by running the
> following command with ecc injection off:
> mw.l 0x100000 0xbeefba11 0x800000
>
>
> > => ecc info
> >
> > WARNING: ECC error in DDR Controller 0
> > Addr: 0x0_0fff8a0c
> > Data: 0x0fff8a00_0fff8a01 ECC: 0xff
> > Expect: 0x0fff8a00_0fff8a00 ECC: 0xff
> > Net: DATA0
> > Syndrome: 0x3b
> > Single-Bit errors: 0x04
> > Attrib: 0x01003001
> > Detect: 0x00000000
> > =>
> >
> > # Note that I keep seeing ecc errors until I run the command:
> > # ecc inject low 0
>
> Hmm... "ecc inject off" should have the same effect as "ecc inject low
> 0". Is there a chance some of the ECC errors still remaining in DRAM
> are the culprit?
>
> > # Why did it take two runs of ecc info to clear all of the errors?
>
> This is probably the same issue as above - lots of errors are injected and
> there's no saying when exactly they'll turn up.
>
> > Other than the above strangeness, everything is working great on my 83xx
> > board. I think the new output is pretty nice. It serves my purposes
> > as well as the old code did.
>
> Thanks for trying the changes out,
Ok, this makes perfect sense. I didn't think about the possibility of
latent memory errors. :)
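For anyone else who trips over this, here is a rough toy model of the
latent-error explanation quoted above. It is only a sketch and not how the
DDR controller is implemented: the class name ToyEccMemory, its fields, and
the 32-bit word granularity are all made up for illustration. It shows only
that words written while injection was on keep reporting errors on later
reads, until they are rewritten with injection off.

```python
class ToyEccMemory:
    """Hypothetical toy model, not the real DDR controller.

    Each cell remembers the value the CPU intended to write and the value
    that actually landed in DRAM (flipped by the injection mask, if any).
    """

    def __init__(self):
        self.inject_mask = 0   # 0 models "ecc inject off"
        self.cells = {}        # addr -> (intended, stored)
        self.error_log = []    # addresses whose reads tripped an error

    def write(self, addr, value):
        # With injection on, the stored word differs from the intended one.
        self.cells[addr] = (value, value ^ self.inject_mask)

    def read(self, addr):
        intended, stored = self.cells[addr]
        if stored != intended:
            # The error is only noticed when the corrupted word is read.
            self.error_log.append(addr)
        # Model a corrected single-bit error: the CPU sees the good value,
        # but the cell itself stays corrupted until it is rewritten.
        return intended


def demo():
    mem = ToyEccMemory()
    addrs = range(0x100000, 0x100040, 4)

    mem.inject_mask = 0x1              # like "ecc inject low 1"
    for addr in addrs:
        mem.write(addr, 0xBEEFBA11)
    mem.inject_mask = 0                # like "ecc inject off"

    mem.read(0x100000)                 # latent error surfaces here
    latent = len(mem.error_log)

    for addr in addrs:                 # rewrite with injection off
        mem.write(addr, 0xBEEFBA11)
    mem.read(0x100000)
    after_rewrite = len(mem.error_log) - latent

    return latent, after_rewrite


if __name__ == "__main__":
    print(demo())
```

The first number returned by demo() counts errors seen after injection was
already off; the second counts errors seen after the region was rewritten.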
Here is a run using your instructions above. Keeping the possibility of
latent memory errors in mind, the behavior seems correct to me. You're
free to add my Tested-by if you'd like.
=> ecc inject low 1
=> mw.l 0x100000 0xbeefba11 0x800000
=> ecc inject off
=> ecc info
WARNING: ECC error in DDR Controller 0
Addr: 0x0_0ff7ae40
Data: 0x0fffdf9c_0ff7aed1 ECC: 0x81
Expect: 0x0fffdf9c_0ff7aed0 ECC: 0x81
Net: DATA0
Syndrome: 0x3b
Single-Bit errors: 0x56
Attrib: 0x01002001
Detect: 0x80000004 (MME, SBE)
=> ecc info
WARNING: ECC error in DDR Controller 0
Addr: 0x0_0ff7ad08
Data: 0x0ffd594c_0000087f ECC: 0x91
Expect: 0x0ffd594c_0000087e ECC: 0x91
Net: DATA0
Syndrome: 0x3b
Single-Bit errors: 0x01
Attrib: 0x01003001
Detect: 0x00000000
=> ecc info
No ECC errors have occurred
=> ecc info
No ECC errors have occurred
=> ecc info
No ECC errors have occurred
=> md 0x100000 10
00100000: beefba11 beefba11 beefba11 beefba11 ................
00100010: beefba11 beefba11 beefba11 beefba11 ................
00100020: beefba11 beefba11 beefba11 beefba11 ................
00100030: beefba11 beefba11 beefba11 beefba11 ................
=> ecc info
WARNING: ECC error in DDR Controller 0
Addr: 0x0_0010003c
Data: 0xbeefba11_beefba10 ECC: 0x7b
Expect: 0xbeefba11_beefba11 ECC: 0x7b
Net: DATA0
Syndrome: 0x3b
Single-Bit errors: 0x13
Attrib: 0x01002001
Detect: 0x00000000
=> ecc info
No ECC errors have occurred
=> md 0x200000 10
00200000: beefba11 beefba11 beefba11 beefba11 ................
00200010: beefba11 beefba11 beefba11 beefba11 ................
00200020: beefba11 beefba11 beefba11 beefba11 ................
00200030: beefba11 beefba11 beefba11 beefba11 ................
=> ecc info
WARNING: ECC error in DDR Controller 0
Addr: 0x0_0020003c
Data: 0xbeefba11_beefba10 ECC: 0x7b
Expect: 0xbeefba11_beefba11 ECC: 0x7b
Net: DATA0
Syndrome: 0x3b
Single-Bit errors: 0x10
Attrib: 0x01002001
Detect: 0x00000000
=> ecc info
No ECC errors have occurred
=> mw.l 0x100000 0xbeefba11 0x800000
=> ecc info
WARNING: ECC error in DDR Controller 0
Addr: 0x0_001007c8
Data: 0xbeefba11_beefba10 ECC: 0x7b
Expect: 0xbeefba11_beefba11 ECC: 0x7b
Net: DATA0
Syndrome: 0x3b
Single-Bit errors: 0x06
Attrib: 0x01003001
Detect: 0x80000004 (MME, SBE)
=> ecc info
No ECC errors have occurred
=> ecc info
No ECC errors have occurred
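As a sanity check on the captures above, XORing the Data and Expect values
should recover the mask that was injected. The helper below is a small
sketch of that arithmetic; the names parse_ecc_word and flipped_bits are
made up for this example, and bit 0 here means the least significant bit,
which is not necessarily the numbering the reference manual uses.

```python
def parse_ecc_word(text):
    """Parse a value printed like '0x0fffdf9c_0ff7aed1' into an int."""
    return int(text.replace("_", ""), 16)


def flipped_bits(data, expect):
    """Return the bit positions (LSB = 0) where Data and Expect differ."""
    diff = parse_ecc_word(data) ^ parse_ecc_word(expect)
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


if __name__ == "__main__":
    # Values copied from two of the "ecc info" reports above.
    print(flipped_bits("0x0fffdf9c_0ff7aed1", "0x0fffdf9c_0ff7aed0"))
    print(flipped_bits("0xbeefba11_beefba10", "0xbeefba11_beefba11"))
```

In both reports the only differing bit is the lowest one, which is
consistent with "ecc inject low 1".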
Ira