[U-Boot] [PATCH v2] nand: Hack to support 4k page in fsl_elbc_nand

Fri Jun 29 04:13:00 CEST 2012

On Thu, Jun 28, 2012 at 10:36 PM, Scott Wood <scottwood at freescale.com> wrote:
> On 06/28/2012 03:47 PM, Rafael Beims wrote:
>> Freescale FCM controller has a 2K size limitation of buffer RAM. In order
>> to support the Nand flash chip with pagesize larger than 2K bytes,
>> we read/write 2k data repeatedly by issuing FIR_OP_RB/FIR_OP_WB and save
>> them to a large buffer.
>> Because of this, the in flash layout of the oob is different from the
>> default for 4096kiB page sizes. Therefore, we need to migrate the
>> factory bad block markers from the original position to the new layout.
>>
>> Signed-off-by: Shengzhou Liu <Shengzhou.Liu at freescale.com>
>> Signed-off-by: Liu Shuo <b35362 at freescale.com>
>> Signed-off-by: Rafael Beims <rafael.beims at datacom.ind.br>
>> ---
>> Changes in v2:
>>  - Added check to disallow the migration code to run in devices with
>>  page size <= 2048
>>
>>  drivers/mtd/nand/fsl_elbc_nand.c |  447 +++++++++++++++++++++++++++++++++++---
>>  1 files changed, 419 insertions(+), 28 deletions(-)
>
> Thanks for working on this!  I've been meaning to for a while, but have
> been busy with other things.
>
>> diff --git a/drivers/mtd/nand/fsl_elbc_nand.c b/drivers/mtd/nand/fsl_elbc_nand.c
>> index 9076ad4..3b6bb1d 100644
>> --- a/drivers/mtd/nand/fsl_elbc_nand.c
>> +++ b/drivers/mtd/nand/fsl_elbc_nand.c
>> @@ -76,15 +76,17 @@ struct fsl_elbc_ctrl {
>>
>>       /* device info */
>>       fsl_lbc_t *regs;
>> -     u8 __iomem *addr;        /* Address of assigned FCM buffer        */
>> -     unsigned int page;       /* Last page written to / read from      */
>> -     unsigned int read_bytes; /* Number of bytes read during command   */
>> -     unsigned int column;     /* Saved column from SEQIN               */
>> -     unsigned int index;      /* Pointer to next byte to 'read'        */
>> -     unsigned int status;     /* status read from LTESR after last op  */
>> -     unsigned int mdr;        /* UPM/FCM Data Register value           */
>> -     unsigned int use_mdr;    /* Non zero if the MDR is to be set      */
>> -     unsigned int oob;        /* Non zero if operating on OOB data     */
>> +     u8 __iomem *addr;           /* Address of assigned FCM buffer       */
>> +     unsigned int page;          /* Last page written to / read from     */
>> +     unsigned int read_bytes;    /* Number of bytes read during command  */
>> +     unsigned int column;        /* Saved column from SEQIN              */
>> +     unsigned int index;         /* Pointer to next byte to 'read'       */
>> +     unsigned int status;        /* status read from LTESR after last op */
>> +     unsigned int mdr;           /* UPM/FCM Data Register value          */
>> +     unsigned int use_mdr;       /* Non zero if the MDR is to be set     */
>> +     unsigned int oob;           /* Non zero if operating on OOB data    */
>> +     char *buffer;               /* Just used when pagesize is greater   */
>> +                                 /* than FCM RAM 2K limitation           */
>>  };
>>
>>  /* These map to the positions used by the FCM hardware ECC generator */
>> @@ -131,6 +133,15 @@ static struct nand_bbt_descr largepage_memorybased = {
>>       .pattern = scan_ff_pattern,
>>  };
>>
>> +static u8 migrated_pattern[] = { 0xde, 0xad, 0xde, 0xad };
>
> Let's generate a proper random number here -- or at least a meaningful
> ASCII string.   Things like the above are too common and invite magic
> number collision.
>

Will do it.

>> +static int fsl_elbc_migrate_badblocks(struct mtd_info *mtd,
>> +                                   struct nand_bbt_descr *bd)
>> +{
>> +     struct nand_chip *this = mtd->priv;
>> +     int len, numblocks, i;
>> +     int startblock;
>> +     loff_t from;
>> +     uint8_t *newoob, *buf;
>> +
>> +     struct fsl_elbc_mtd *priv = this->priv;
>> +     struct fsl_elbc_ctrl *ctrl = priv->ctrl;
>> +
>> +     int num_subpages = mtd->writesize / 2048-1;
>> +     len = mtd->writesize + mtd->oobsize;
>> +     numblocks = this->chipsize >> (this->phys_erase_shift - 1);
>> +     startblock = 0;
>> +     from = 0;
>> +
>> +     newoob = vmalloc(mtd->oobsize);
>
> Even if this were Linux, oobsize should never be big enough to need vmalloc.
>
>> +     memset(newoob, 0xff, mtd->writesize+mtd->oobsize);
>
> You're writing beyond the end of that buffer.

I should have reviewed this code better... Will fix that.

>
>> +     for (i = startblock; i < numblocks; i += 2) {
>> +             int page = (from >> this->page_shift) & this->pagemask;
>> +             fsl_elbc_cmdfunc(mtd, NAND_CMD_READ0, 0, page);
>> +
>> +             /* As we are already using the hack to read the bytes
>> +              * from NAND, the original badblock marker is offset
>> +              * from its original location in the internal buffer.
>> +              * This is because the hack reads 2048 + 64 and already
>> +              * positions the spare in the correct region
>> +              * (after the 4096 offset)
>> +              */
>> +             uint8_t *badblock_pattern = (ctrl->buffer+
>> +                             mtd->writesize-(num_subpages*64))+bd->offs;
>
> Spaces around operators.
>
> I think that should be (num_subpages - 1) * 64.
>
OK.

>> +             if (fsl_elbc_check_pattern(mtd, badblock_pattern, bd)) {
>> +                     printf("Badblock found at %08x, migrating...\n", page);
>> +                     memcpy(newoob, badblock_pattern , bd->len);
>
> No space before ,
>
>> +static int fsl_elbc_write_migration_marker(struct mtd_info *mtd,
>> +                     uint8_t *buf, int len, struct nand_bbt_descr *bd)
>> +{
>> +     struct nand_chip *this = mtd->priv;
>> +     struct nand_bbt_descr *td = this->bbt_td;
>> +     int startblock;
>> +     int dir, i;
>> +     int blocks_to_write = 2;
>> +
>> +     /* Start below maximum bbt */
>> +     startblock = (mtd->size >> this->phys_erase_shift) - td->maxblocks;
>> +     dir = -1;
>
> If you want startblock below the bbt area, I think you're off by one.
>
>> +     for (i = 0; i < bd->maxblocks && blocks_to_write != 0; i++) {
>> +             int actblock = startblock + dir*i;
>> +             loff_t offs = (loff_t)actblock << this->phys_erase_shift;
>> +             int page = (offs >> this->page_shift) & this->pagemask;
>> +
>> +             /* Avoid badblocks writing the marker... */
>> +             if (!fsl_elbc_scan_read_raw_oob(mtd, buf, page, mtd->writesize,
>> +                                     &largepage_memorybased)) {
>> +
>> +                     /* We are reusing this buffer, reset it */
>> +                     memset(buf, 0xff, len);
>> +                     memcpy(buf+bd->offs, bd->pattern, bd->len);
>> +
>> +                     /*  Mark the block as bad the same way that
>> +                      *  it's done for the bbt. This should avoid
>> +                      *  this block being overwritten
>> +                      */
>> +                     memset(buf+this->badblockpos, 0x02, 1);
>> +
>> +                     fsl_elbc_cmdfunc(mtd, NAND_CMD_SEQIN,
>> +                                             mtd->writesize, page);
>> +                     fsl_elbc_write_buf(mtd, buf, mtd->oobsize);
>> +                     fsl_elbc_cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
>> +                     blocks_to_write--;
>> +                     printf("Wrote migration marker to offset: %x\n", page);
>> +             }
>
> Why are we writing the marker twice?
>

Should have gone out of the loop after writing the first block. The
loop is for finding a good block to write...

>> +static int fsl_elbc_scan_bbt(struct mtd_info *mtd)
>> +{
>> +     struct nand_chip *this = mtd->priv;
>> +     struct nand_bbt_descr *bd = &largepage_migrated;
>> +     struct nand_bbt_descr *td = this->bbt_td;
>> +     int len;
>> +     int startblock, block, dir;
>> +     uint8_t *buf;
>> +     int migrate = 0;
>> +
>> +     if (mtd->writesize > 2048) {
>> +             /* Start below maximum bbt */
>> +             startblock = (mtd->size >> this->phys_erase_shift) - td->maxblocks;
>
> Again, off by one.
>
>> +             dir = -1;
>
> dir never seems to get set to anything else, so why a variable?
>
> I think we should start with the last block, and search backwards until
> we find it, or until we exceed some reasonable limit, such as
> td->maxblocks * 2 (Is that really enough?  How many contiguous bad
> blocks can we see?).  Actually writing a joint bbt/migration marker
> (which was requested in earlier discussion to avoid wasting an erase
> block, but I don't want it to be mandatory) can come later, but we want
> to recognize it now.
>

It should be simple enough to change that and avoid starting after the
last bbt block.

>> +             /* Allocate a temporary buffer for one eraseblock incl. oob */
>> +             len = (1 << this->phys_erase_shift);
>> +             len += (len >> this->page_shift) * mtd->oobsize;
>> +             buf = vmalloc(len);
>
> This code isn't going to be shared with Linux, so just malloc().
> Likewise printf(...) instead of printk(KERN_ERR ...).
>
> BTW, should mention at least in the changelog that this is partially
> derived from code in nand_bbt.c.  Also, using this code means we need to
> remove the "or later" from fsl_elbc_nand.c, because nand_bbt.c is GPL v2
> only.  Wolfgang probably won't be pleased.  It might be better to just
> write it from scratch -- I'm not sure how much it really helps to be
> mimicking the bbt code here (without actually being able to share it),
> versus straightforwardly coding this specific task.
>

I'm not shure I follow here. Is the use of the bbt descriptor a
problem? Aside from that, the code indeed is somewhat based on the
code on bbt (I used nand_bbt.c extensively for example code), but much
of it is just plain running through the blocks and searching for
patterns.
I just want to understand what needs to be different, so I can work on it.

>> +             if (!buf) {
>> +                     printk(KERN_ERR "fsl_elbc_nand: Out of memory\n");
>> +                     kfree(this->bbt);
>> +                     this->bbt = NULL;
>> +                     return -ENOMEM;
>> +             }
>
> Why are we freeing the bbt here?  When did we allocate it?

Right. This should not be here...

>
>> +             for (block = 0; block < td->maxblocks; block++) {
>> +                     int actblock = startblock + dir * block;
>> +                     loff_t offs = (loff_t)actblock << this->phys_erase_shift;
>> +                     int page = (offs >> this->page_shift) & this->pagemask;
>> +
>> +                     migrate = fsl_elbc_scan_read_raw_oob(mtd, buf, page,
>> +                                     mtd->writesize, &largepage_migrated);
>> +
>> +                     /* We found the migration marker, get out of here */
>> +                     if (migrate == 0)
>> +                             break;
>> +             }
>> +
>> +             if (migrate) {
>> +                     printf("Moving factory marked badblocks to new oob\n");
>> +                     fsl_elbc_migrate_badblocks(mtd, &largepage_memorybased);
>> +                     fsl_elbc_write_migration_marker(mtd, buf, len,
>> +                                                     &largepage_migrated);
>> +             }
>> +
>> +             vfree(buf);
>> +     }
>> +     /* Now that we checked and possibly migrated badblock
>> +      * markers, continue with default bbt scanning */
>
> /*
>  * U-Boot multiline comment style
>  * is like this.
>  */
>

OK.

> Again, thanks for working on this.  To properly test it I need to get
> raw reads/writes working properly with eLBC, though.  Once I do that,
> I'll fix this patch up (unless you want to do a v3 before then).
>
> -Scott
>

No problem. I'm very happy to be able to contribute somehow.
Aren't raw writes and reads working correctly with this driver? I used
them also to do my tests, as I already had scrubbed my nand chip
before starting this. What's the problem?
I should be able to get a "pristine" nand chip installed in my test
board, but this would take some time also, and being able to simulate
bad blocks using raw writes is very good for development.

-- Rafael