[EXTERNAL] Re: [PATCH] mtd: nand: pxa3xx: Incorrect bitflip return on page read

Ravi Minnikanti rminnikanti at marvell.com
Tue May 21 16:30:54 CEST 2024


Hi,

Can you please merge this PR, if there are no more review comments?

Thanks,
Ravi
On 5/6/24 11:28, Michael Nazzareno Trimarchi wrote:

> ----------------------------------------------------------------------
> Hi Ravi
> 
> On Mon, May 6, 2024 at 7:33 PM Ravi Minnikanti <rminnikanti at marvell.com> wrote:
>>
>> On 5/6/24 00:35, Michael Nazzareno Trimarchi wrote:
>>> Hi Ravi
>>>
>>> On Tue, Apr 30, 2024 at 6:25 AM Ravi Minnikanti <rminnikanti at marvell.com> wrote:
>>>>
>>>> On 4/29/24 09:59, Michael Nazzareno Trimarchi wrote:
>>>>
>>>>> ----------------------------------------------------------------------
>>>>> On Mon, Apr 29, 2024 at 6:22 PM Chris Packham <judge.packham at gmail.com> wrote:
>>>>>>
>>>>>> On Sun, Apr 28, 2024 at 4:15 AM Ravi Minnikanti <rminnikanti at marvell.com> wrote:
>>>>>>>
>>>>>>> Once a page is read with higher bitflips all subsequent reads
>>>>>>> are returning the same bitflip value even though they have none.
>>>>>>> max_bitflip variable is not being reset to 0 across page reads.
>>>>>>>
>>>>>>> This is causing problems like incorrectly
>>>>>>> marking erase blocks bad by UBI and causing read failures.
>>>>>>>
>>>>>>> Verified the change with both MTD reads and UBI.
>>>>>>> This change is inline with other NFC drivers.
>>>>>>>
>>>>>>> Sample error log where a block is marked bad incorrectly:
>>>>>>>
>>>>>>> ubi0: fixable bit-flip detected at PEB 125
>>>>>>> ubi0: run torture test for PEB 125
>>>>>>> ubi0: fixable bit-flip detected at PEB 125
>>>>>>> ubi0 error: torture_peb: read problems on freshly erased PEB 125,
>>>>>>> must be bad
>>>>>>> ubi0 error: erase_worker: failed to erase PEB 125, error -5
>>>>>>> ubi0: mark PEB 125 as bad
>>>>>>>
>>>>>>> Signed-off-by: rminnikanti <rminnikanti at marvell.com>
>>>>>>
>>>>>> Looks good to me
>>>>>>
>>>>>> Reviewed-by: Chris Packham <judge.packham at gmail.com>
>>>>>>
>>>>>>> ---
>>>>>>>  drivers/mtd/nand/raw/pxa3xx_nand.c | 5 +++++
>>>>>>>  1 file changed, 5 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/mtd/nand/raw/pxa3xx_nand.c b/drivers/mtd/nand/raw/pxa3xx_nand.c
>>>>>>> index 1d9a6d107b..d2a4faad56 100644
>>>>>>> --- a/drivers/mtd/nand/raw/pxa3xx_nand.c
>>>>>>> +++ b/drivers/mtd/nand/raw/pxa3xx_nand.c
>>>>>>> @@ -800,6 +800,11 @@ static void prepare_start_command(struct pxa3xx_nand_info *info, int command)
>>>>>>>         info->ecc_err_cnt       = 0;
>>>>>>>         info->ndcb3             = 0;
>>>>>>>         info->need_wait         = 0;
>>>>>>> +       /*
>>>>>>> +        * Reset max_bitflips to zero. Once command is complete,
>>>>>>> +        * max_bitflips for this READ is returned in ecc.read_page()
>>>>>>> +        */
>>>>>>> +       info->max_bitflips      = 0;
>>>>>>>
>>>>>
>>>>> Why this should not be put to 0 in read_page instead on prepare_start_command?
>>>>>
>>>>> Michael
>>>>>
>>>>
>>>> ecc.read_page is invoked after the read command execution.
>>>> First chip->cmdfunc is executed with NAND_CMD_READ0 and then ecc.read_page is invoked
>>>> to read the page from buffer. So, by the time read_page is invoked, info->max_bitflips
>>>> must already have the bit flip value.
>>>>
>>>
>>> All the other implementation has a slight different way to handle.
>>> From what you said the reset should
>>> be done on for NAND_CMD_READ0 command and should be sufficient.
>>> Technically should be moved
>>> in switch and not unconditionally.
>>>
>>> Michael
>>>
>>
>> max_bitflip is not being reset to 0 across page reads.
>> Once a page is read with higher bitflips all subsequent reads
>> are returning the same bitflip value even though they have none.
>>
>> This is causing problems like incorrectly
>> marking erase blocks bad by UBI and read failures.
>>
>> Tested it with both MTD reads and UBI attach.
>> This change is inline with other NFC drivers.
>>
>> Sample error log where a block is marked bad incorrectly:
>>
>> ubi0: fixable bit-flip detected at PEB 125
>> ubi0: run torture test for PEB 125
>> ubi0: fixable bit-flip detected at PEB 125
>> ubi0 error: torture_peb: read problems on freshly erased PEB 125,
>> must be bad
>> ubi0 error: erase_worker: failed to erase PEB 125, error -5
>> ubi0: mark PEB 125 as bad
>>
>> Signed-off-by: rminnikanti <rminnikanti at marvell.com>
>> ---
>>  drivers/mtd/nand/raw/pxa3xx_nand.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/mtd/nand/raw/pxa3xx_nand.c b/drivers/mtd/nand/raw/pxa3xx_nand.c
>> index 1d9a6d107b..97f250483f 100644
>> --- a/drivers/mtd/nand/raw/pxa3xx_nand.c
>> +++ b/drivers/mtd/nand/raw/pxa3xx_nand.c
>> @@ -803,6 +803,11 @@ static void prepare_start_command(struct pxa3xx_nand_info *info, int command)
>>
>>         switch (command) {
>>         case NAND_CMD_READ0:
>> +               /*
>> +                * Reset max_bitflips to zero. Once command is complete,
>> +                * max_bitflips for this READ is returned in ecc.read_page()
>> +                */
>> +               info->max_bitflips = 0;
>>         case NAND_CMD_READOOB:
>>         case NAND_CMD_PAGEPROG:
>>                 if (!info->force_raw)
>> --
>> 2.17.1
>>
>> Thanks Michael. Fixed it.
>>
>>>> Thanks, Ravi.
>>>>
>>>>>>>         switch (command) {
>>>>>>>         case NAND_CMD_READ0:
>>>>>>> --
>>>>>>> 2.17.1
>>>>>
>>>>
>>>
>>>
> 
> Acked-by: Michael Trimarchi <michael at amarulasolutions.com>
> 
> 
> 


More information about the U-Boot mailing list