[U-Boot] Strange NAND issue on a P1014

Scott Wood scottwood at freescale.com
Thu Aug 15 22:35:26 CEST 2013


On Thu, 2013-08-15 at 19:35 +0000, ANDY KENNEDY wrote:
> > -----Original Message-----
> > From: Scott Wood [mailto:scottwood at freescale.com]
> > Sent: Thursday, August 15, 2013 2:09 PM
> > 
> > On Thu, 2013-08-15 at 18:34 +0000, ANDY KENNEDY wrote:
> > > All,
> > >
> > > We are attempting to set up a NAND chip on our board through u-Boot.
> > > Strange things are happening.
> > 
> > What sort of "strange things"?
> 
> e.g. NAND dropping out. 

Again, could you be more specific?  Is there a particular error message
you're getting?

Have you double checked your NAND timings?

>  Command lines that are > 1000 bytes getting
> truncated, etc.

Have you increased CONFIG_SYS_CBSIZE?  Though I'm not sure why the
behavior of exceeding CONFIG_SYS_CBSIZE would be unpredictable, unless
there's some bounds checking bug.
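
For reference, a minimal sketch of what raising the console buffer limits
might look like in a board config header.  The macro names are the usual
U-Boot ones; the values (2048 and 32) and the placeholder prompt are
assumptions for illustration only, not settings taken from your board.

```c
/* Sketch of a board config header fragment (values are illustrative). */

/* Placeholder prompt, only defined here if the board has not set one. */
#ifndef CONFIG_SYS_PROMPT
#define CONFIG_SYS_PROMPT	"=> "
#endif

/* Console input buffer: must exceed the longest command line expected. */
#define CONFIG_SYS_CBSIZE	2048

/* Print buffer, sized from the input buffer plus prompt and some slack. */
#define CONFIG_SYS_PBSIZE	(CONFIG_SYS_CBSIZE + \
				 sizeof(CONFIG_SYS_PROMPT) + 16)

/* Buffer for boot arguments handed to the OS. */
#define CONFIG_SYS_BARGSIZE	CONFIG_SYS_CBSIZE

/* Maximum number of arguments a single command may take. */
#define CONFIG_SYS_MAXARGS	32
```

If a line longer than the input buffer still misbehaves unpredictably after
a change like this, that would point away from a simple size limit.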

> > >  During our debugging (of release
> > > 2013.04), we found the issue seemed to be in
> > > drivers/mtd/nand/nand_base.c, around line 2640:
> > >
> > > 	chip->cmdfunc(mtd, NAND_CMD_READID, 0x00, -1);
> > >
> > > An interesting comment is below this line:
> > >
> > > 	/* Try again to make sure, as some systems the bus-hold or other
> > >        * interface concerns can cause random data which looks like a
> > >        * possibly credible NAND flash to appear. If the two results do
> > >        * not match, ignore the device completely.
> > >        */
> > >
> > > Stranger still is that adding in a putc('x') makes the problem go away
> > > (tested via cold boot ~ 20 times, warm boot ~ 10 times).  In fact,
> > > adding in a dummy function and calling it seems to do just as well.
> > >
> > > Other information is that we had issues with long command lines in
> > > u-boot.  To "fix" this (a serious hack), we adjusted config.mk's
> > > optimization level to -O1 from -Os.  It seems that putting this back to
> > > -Os makes the problem *better* but does seem to move it from a cold to a
> > > warm start issue.
> > 
> > Long command lines?  What sort of "issues"?
> 
> Long command lines = 1000 bytes or more.  Issues:  command lines get
> truncated, NAND dropping out from cold start to cold start (nothing else
> changing), doing a "reset" from within u-Boot detects a non-detected flash
> OR doesn't detect a detected flash, etc.

"detects a non-detected flash"?  Do you mean detects a flash that isn't
really there?
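
For context, the check behind the nand_base.c comment you quoted can be
sketched roughly as below.  This is a standalone illustration, not the
driver code: read_id_fn, probe_id_twice, stable_id and flaky_id are
hypothetical names, and the two fake readers merely stand in for a real
READID sequence.  The ID bytes 0x2c/0xf1 are the ones Linux reported for
your chip.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for "issue READID, then read two ID bytes". */
typedef void (*read_id_fn)(void *ctx, uint8_t id[2]);

/*
 * Read the ID twice and accept it only if both reads agree.
 * Returns true and fills maf_id/dev_id on a match, false otherwise.
 */
static bool probe_id_twice(read_id_fn read_id, void *ctx,
			   uint8_t *maf_id, uint8_t *dev_id)
{
	uint8_t first[2], second[2];

	assert(read_id && maf_id && dev_id);

	read_id(ctx, first);
	read_id(ctx, second);

	if (first[0] != second[0] || first[1] != second[1])
		return false;	/* data changed between reads: ignore device */

	*maf_id = first[0];
	*dev_id = first[1];
	return true;
}

/* Fake reader: always returns the same manufacturer/device ID. */
static void stable_id(void *ctx, uint8_t id[2])
{
	(void)ctx;
	id[0] = 0x2c;
	id[1] = 0xf1;
}

/* Fake reader: second byte differs after the first call (ctx = counter). */
static void flaky_id(void *ctx, uint8_t id[2])
{
	unsigned int *calls = ctx;

	id[0] = 0x2c;
	id[1] = (*calls)++ == 0 ? 0xf1 : 0xff;
}
```

Note that a check of this shape only rejects data that changes between the
two reads.  Garbage that happens to be stable across both reads would still
pass, which is why I'm asking exactly what gets detected.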

> > Please double check your DDR setup.  It sounds like you may have general
> > flakiness rather than a specific issue with NAND or command lines.  Have
> > you seen this on more than one board?
> 
> We had one of your guys sit with us and go through all the DDR settings
> (actually, it was one guy here and another on the phone with us -- some
> engineer by the name of ??Leigh/Lee??, I cannot recall).  So, if it is
> wrong, it is your fault ;)!
> 
> BTW, we have hundreds of systems currently running (some even in the field)
> and we haven't seen any other problems.  Several of us run Linux on these
> pretty much all day long and we have yet to see anything tank.  I'm not
> convinced that it is a DDR issue (though, you may be 100% correct).

It may not be the DDR -- it's just the first thing to check when you see
unpredictable weirdness, especially in multiple different functional
areas in code that is well-established.
 
> > > The default configuration file for the P1014 we modified to address our
> > > specific NAND flash (Linux reports this as: NAND device: Manufacturer
> > > ID: 0x2c, Chip ID: 0xf1 (Micron MT29F1G08ABADAWP).  This is a
> > > replacement for an end of life dumb NAND.  We are configuring this chip
> > > to be in the dumb, non-embedded ECC mode.).
> > 
> > What does "dumb, non-embedded ECC mode" mean?
> 
> Nowadays, most of the NAND flash chips have self-managing ECC.  These
> generally have 2048 write pages, 128K erase pages, (8-bit) 64K OOB areas,
> etc.
> 
> This one does not conform to such "general" practices.  The fact of the
> matter is that we had to use the IFC to handle the ECC for us as this
> chip uses a 4-bit OOB area (thus JFFS2 won't read/write to it). So, 
> we disable the internal ECC of the chip and allow EVERYTHING (refresh, etc)
> to be done by the IFC.  This chip has been placed into what Micron called
> "raw mode", IIRC.

IFC doing the ECC is what I'd consider normal usage...  I'm not familiar
with non-raw modes.

What do you mean by a 4-bit OOB area?  Searching for this chip shows it
to have 2048 byte pages, and 64 bytes of OOB per page.  NOP is 4.  This
chip should work with JFFS2.
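
To spell out the arithmetic behind those figures, here is a small sketch.
The struct and function names are made up for illustration, and splitting
the OOB evenly across partial-page programs is an assumption about layout,
not something taken from the datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Per-page geometry; the fields mirror the figures quoted above. */
struct nand_geometry {
	uint32_t page_size;	/* main-area bytes per page */
	uint32_t oob_size;	/* spare (OOB) bytes per page */
	uint32_t nop;		/* partial-page programs allowed per page */
};

/* Main-area bytes covered by each partial-page program. */
static uint32_t subpage_size(const struct nand_geometry *g)
{
	assert(g->nop != 0 && g->page_size % g->nop == 0);
	return g->page_size / g->nop;
}

/* OOB bytes per partial-page program, assuming an even split. */
static uint32_t oob_per_subpage(const struct nand_geometry *g)
{
	assert(g->nop != 0 && g->oob_size % g->nop == 0);
	return g->oob_size / g->nop;
}
```

With page_size = 2048, oob_size = 64 and nop = 4 this gives 512 bytes of
data and 16 bytes of OOB per partial program, so there are 64 *bytes* of
OOB per page, not 4 bits.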

-Scott