[U-Boot] [PATCH v2] nand: Hack to support 4k page in fsl_elbc_nand

Sat Jun 30 01:06:38 CEST 2012

On 06/28/2012 09:13 PM, Rafael Beims wrote:
> On Thu, Jun 28, 2012 at 10:36 PM, Scott Wood <scottwood at freescale.com> wrote:
>> On 06/28/2012 03:47 PM, Rafael Beims wrote:
>>> Freescale FCM controller has a 2K size limitation of buffer RAM. In order
>>> to support the Nand flash chip with pagesize larger than 2K bytes,
>>> we read/write 2k data repeatedly by issuing FIR_OP_RB/FIR_OP_WB and save
>>> them to a large buffer.
>>> Because of this, the in flash layout of the oob is different from the
>>> default for 4096kiB page sizes. Therefore, we need to migrate the
>>> factory bad block markers from the original position to the new layout.
>>>
>>> Signed-off-by: Shengzhou Liu <Shengzhou.Liu at freescale.com>
>>> Signed-off-by: Liu Shuo <b35362 at freescale.com>
>>> Signed-off-by: Rafael Beims <rafael.beims at datacom.ind.br>
>>> ---
>>> Changes in v2:
>>>  - Added check to disallow the migration code to run in devices with
>>>  page size <= 2048
>>>
>>>  drivers/mtd/nand/fsl_elbc_nand.c |  447 +++++++++++++++++++++++++++++++++++---
>>>  1 files changed, 419 insertions(+), 28 deletions(-)
>>
>> Thanks for working on this!  I've been meaning to for a while, but have
>> been busy with other things.
>>
>>> diff --git a/drivers/mtd/nand/fsl_elbc_nand.c b/drivers/mtd/nand/fsl_elbc_nand.c
>>> index 9076ad4..3b6bb1d 100644
>>> --- a/drivers/mtd/nand/fsl_elbc_nand.c
>>> +++ b/drivers/mtd/nand/fsl_elbc_nand.c
>>> @@ -76,15 +76,17 @@ struct fsl_elbc_ctrl {
>>>
>>>       /* device info */
>>>       fsl_lbc_t *regs;
>>> -     u8 __iomem *addr;        /* Address of assigned FCM buffer        */
>>> -     unsigned int page;       /* Last page written to / read from      */
>>> -     unsigned int read_bytes; /* Number of bytes read during command   */
>>> -     unsigned int column;     /* Saved column from SEQIN               */
>>> -     unsigned int index;      /* Pointer to next byte to 'read'        */
>>> -     unsigned int status;     /* status read from LTESR after last op  */
>>> -     unsigned int mdr;        /* UPM/FCM Data Register value           */
>>> -     unsigned int use_mdr;    /* Non zero if the MDR is to be set      */
>>> -     unsigned int oob;        /* Non zero if operating on OOB data     */
>>> +     u8 __iomem *addr;           /* Address of assigned FCM buffer       */
>>> +     unsigned int page;          /* Last page written to / read from     */
>>> +     unsigned int read_bytes;    /* Number of bytes read during command  */
>>> +     unsigned int column;        /* Saved column from SEQIN              */
>>> +     unsigned int index;         /* Pointer to next byte to 'read'       */
>>> +     unsigned int status;        /* status read from LTESR after last op */
>>> +     unsigned int mdr;           /* UPM/FCM Data Register value          */
>>> +     unsigned int use_mdr;       /* Non zero if the MDR is to be set     */
>>> +     unsigned int oob;           /* Non zero if operating on OOB data    */
>>> +     char *buffer;               /* Just used when pagesize is greater   */
>>> +                                 /* than FCM RAM 2K limitation           */
>>>  };
>>>
>>>  /* These map to the positions used by the FCM hardware ECC generator */
>>> @@ -131,6 +133,15 @@ static struct nand_bbt_descr largepage_memorybased = {
>>>       .pattern = scan_ff_pattern,
>>>  };
>>>
>>> +static u8 migrated_pattern[] = { 0xde, 0xad, 0xde, 0xad };
>>
>> Let's generate a proper random number here -- or at least a meaningful
>> ASCII string.   Things like the above are too common and invite magic
>> number collision.
>>
> 
> Will do it.
> 
>>> +static int fsl_elbc_migrate_badblocks(struct mtd_info *mtd,
>>> +                                   struct nand_bbt_descr *bd)
>>> +{
>>> +     struct nand_chip *this = mtd->priv;
>>> +     int len, numblocks, i;
>>> +     int startblock;
>>> +     loff_t from;
>>> +     uint8_t *newoob, *buf;
>>> +
>>> +     struct fsl_elbc_mtd *priv = this->priv;
>>> +     struct fsl_elbc_ctrl *ctrl = priv->ctrl;
>>> +
>>> +     int num_subpages = mtd->writesize / 2048-1;
>>> +     len = mtd->writesize + mtd->oobsize;
>>> +     numblocks = this->chipsize >> (this->phys_erase_shift - 1);
>>> +     startblock = 0;
>>> +     from = 0;
>>> +
>>> +     newoob = vmalloc(mtd->oobsize);
>>
>> Even if this were Linux, oobsize should never be big enough to need vmalloc.
>>
>>> +     memset(newoob, 0xff, mtd->writesize+mtd->oobsize);
>>
>> You're writing beyond the end of that buffer.
> 
> I should have reviewed this code better... Will fix that.
> 
>>
>>> +     for (i = startblock; i < numblocks; i += 2) {
>>> +             int page = (from >> this->page_shift) & this->pagemask;
>>> +             fsl_elbc_cmdfunc(mtd, NAND_CMD_READ0, 0, page);
>>> +
>>> +             /* As we are already using the hack to read the bytes
>>> +              * from NAND, the original badblock marker is offset
>>> +              * from its original location in the internal buffer.
>>> +              * This is because the hack reads 2048 + 64 and already
>>> +              * positions the spare in the correct region
>>> +              * (after the 4096 offset)
>>> +              */
>>> +             uint8_t *badblock_pattern = (ctrl->buffer+
>>> +                             mtd->writesize-(num_subpages*64))+bd->offs;
>>
>> Spaces around operators.
>>
>> I think that should be (num_subpages - 1) * 64.
>>
> OK.
> 
>>> +             if (fsl_elbc_check_pattern(mtd, badblock_pattern, bd)) {
>>> +                     printf("Badblock found at %08x, migrating...\n", page);
>>> +                     memcpy(newoob, badblock_pattern , bd->len);
>>
>> No space before ,
>>
>>> +static int fsl_elbc_write_migration_marker(struct mtd_info *mtd,
>>> +                     uint8_t *buf, int len, struct nand_bbt_descr *bd)
>>> +{
>>> +     struct nand_chip *this = mtd->priv;
>>> +     struct nand_bbt_descr *td = this->bbt_td;
>>> +     int startblock;
>>> +     int dir, i;
>>> +     int blocks_to_write = 2;
>>> +
>>> +     /* Start below maximum bbt */
>>> +     startblock = (mtd->size >> this->phys_erase_shift) - td->maxblocks;
>>> +     dir = -1;
>>
>> If you want startblock below the bbt area, I think you're off by one.
>>
>>> +     for (i = 0; i < bd->maxblocks && blocks_to_write != 0; i++) {
>>> +             int actblock = startblock + dir*i;
>>> +             loff_t offs = (loff_t)actblock << this->phys_erase_shift;
>>> +             int page = (offs >> this->page_shift) & this->pagemask;
>>> +
>>> +             /* Avoid badblocks writing the marker... */
>>> +             if (!fsl_elbc_scan_read_raw_oob(mtd, buf, page, mtd->writesize,
>>> +                                     &largepage_memorybased)) {
>>> +
>>> +                     /* We are reusing this buffer, reset it */
>>> +                     memset(buf, 0xff, len);
>>> +                     memcpy(buf+bd->offs, bd->pattern, bd->len);
>>> +
>>> +                     /*  Mark the block as bad the same way that
>>> +                      *  it's done for the bbt. This should avoid
>>> +                      *  this block being overwritten
>>> +                      */
>>> +                     memset(buf+this->badblockpos, 0x02, 1);
>>> +
>>> +                     fsl_elbc_cmdfunc(mtd, NAND_CMD_SEQIN,
>>> +                                             mtd->writesize, page);
>>> +                     fsl_elbc_write_buf(mtd, buf, mtd->oobsize);
>>> +                     fsl_elbc_cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
>>> +                     blocks_to_write--;
>>> +                     printf("Wrote migration marker to offset: %x\n", page);
>>> +             }
>>
>> Why are we writing the marker twice?
>>
> 
> Should have gone out of the loop after writing the first block. The
> loop is for finding a good block to write...

This raises the question of whether we should write the marker multiple
times, or otherwise deal with the possibility that the marker gets
bitflipped.  We don't want the migration to run again, finding bogus bad
block markes in what is now being used as a data area.

>>> +             /* Allocate a temporary buffer for one eraseblock incl. oob */
>>> +             len = (1 << this->phys_erase_shift);
>>> +             len += (len >> this->page_shift) * mtd->oobsize;
>>> +             buf = vmalloc(len);
>>
>> This code isn't going to be shared with Linux, so just malloc().
>> Likewise printf(...) instead of printk(KERN_ERR ...).
>>
>> BTW, should mention at least in the changelog that this is partially
>> derived from code in nand_bbt.c.  Also, using this code means we need to
>> remove the "or later" from fsl_elbc_nand.c, because nand_bbt.c is GPL v2
>> only.  Wolfgang probably won't be pleased.  It might be better to just
>> write it from scratch -- I'm not sure how much it really helps to be
>> mimicking the bbt code here (without actually being able to share it),
>> versus straightforwardly coding this specific task.
>>
> 
> I'm not shure I follow here. Is the use of the bbt descriptor a
> problem? Aside from that, the code indeed is somewhat based on the
> code on bbt (I used nand_bbt.c extensively for example code), but much
> of it is just plain running through the blocks and searching for
> patterns.
> I just want to understand what needs to be different, so I can work on it.

The license on nand_bbt.c is GPLv2 only.  The license on fsl_elbc_nand.c
is GPLv2 or later.  So if we do use nand_bbt derived code, we need to
change the license on fsl_elbc_nand.c, or move this code into a
different v2-only file.  The only reason I suggested that it might be
better to just write new code is that the bbt code seems to be an
imperfect fit, and a bit more complicated than it needs to be.

>> Again, thanks for working on this.  To properly test it I need to get
>> raw reads/writes working properly with eLBC, though.  Once I do that,
>> I'll fix this patch up (unless you want to do a v3 before then).
>>
>> -Scott
>>
> 
> No problem. I'm very happy to be able to contribute somehow.
> Aren't raw writes and reads working correctly with this driver? I used
> them also to do my tests, as I already had scrubbed my nand chip
> before starting this. What's the problem?

The problem is ECC will still be performed.  We need to know before we
start that it will be a raw access, but currently the generic layer only
lets us know by calling a different page transfer function, after
issuing the command.

-Scott