[PATCH] mtd: nand: pxa3xx: Incorrect bitflip return on page read

Ravi Minnikanti rminnikanti at marvell.com
Mon May 6 19:32:58 CEST 2024


On 5/6/24 00:35, Michael Nazzareno Trimarchi wrote:
> Hi Ravi
> 
> On Tue, Apr 30, 2024 at 6:25 AM Ravi Minnikanti <rminnikanti at marvell.com> wrote:
>>
>> On 4/29/24 09:59, Michael Nazzareno Trimarchi wrote:
>>
>>> ----------------------------------------------------------------------
>>> On Mon, Apr 29, 2024 at 6:22 PM Chris Packham <judge.packham at gmail.com> wrote:
>>>>
>>>> On Sun, Apr 28, 2024 at 4:15 AM Ravi Minnikanti <rminnikanti at marvell.com> wrote:
>>>>>
>>>>> Once a page is read with higher bitflips all subsequent reads
>>>>> are returning the same bitflip value even though they have none.
>>>>> max_bitflip variable is not being reset to 0 across page reads.
>>>>>
>>>>> This is causing problems like incorrectly
>>>>> marking erase blocks bad by UBI and causing read failures.
>>>>>
>>>>> Verified the change with both MTD reads and UBI.
>>>>> This change is inline with other NFC drivers.
>>>>>
>>>>> Sample error log where a block is marked bad incorrectly:
>>>>>
>>>>> ubi0: fixable bit-flip detected at PEB 125
>>>>> ubi0: run torture test for PEB 125
>>>>> ubi0: fixable bit-flip detected at PEB 125
>>>>> ubi0 error: torture_peb: read problems on freshly erased PEB 125,
>>>>> must be bad
>>>>> ubi0 error: erase_worker: failed to erase PEB 125, error -5
>>>>> ubi0: mark PEB 125 as bad
>>>>>
>>>>> Signed-off-by: rminnikanti <rminnikanti at marvell.com>
>>>>
>>>> Looks good to me
>>>>
>>>> Reviewed-by: Chris Packham <judge.packham at gmail.com>
>>>>
>>>>> ---
>>>>>  drivers/mtd/nand/raw/pxa3xx_nand.c | 5 +++++
>>>>>  1 file changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/drivers/mtd/nand/raw/pxa3xx_nand.c b/drivers/mtd/nand/raw/pxa3xx_nand.c
>>>>> index 1d9a6d107b..d2a4faad56 100644
>>>>> --- a/drivers/mtd/nand/raw/pxa3xx_nand.c
>>>>> +++ b/drivers/mtd/nand/raw/pxa3xx_nand.c
>>>>> @@ -800,6 +800,11 @@ static void prepare_start_command(struct pxa3xx_nand_info *info, int command)
>>>>>         info->ecc_err_cnt       = 0;
>>>>>         info->ndcb3             = 0;
>>>>>         info->need_wait         = 0;
>>>>> +       /*
>>>>> +        * Reset max_bitflips to zero. Once command is complete,
>>>>> +        * max_bitflips for this READ is returned in ecc.read_page()
>>>>> +        */
>>>>> +       info->max_bitflips      = 0;
>>>>>
>>>
>>> Why this should not be put to 0 in read_page instead on prepare_start_command?
>>>
>>> Michael
>>>
>>
>> ecc.read_page is invoked after the read command execution.
>> First chip->cmdfunc is executed with NAND_CMD_READ0 and then ecc.read_page is invoked
>> to read the page from buffer. So, by the time read_page is invoked, info->max_bitflips
>> must already have the bit flip value.
>>
> 
> All the other implementation has a slight different way to handle.
> From what you said the reset should
> be done on for NAND_CMD_READ0 command and should be sufficient.
> Technically should be moved
> in switch and not unconditionally.
> 
> Michael
> 

max_bitflip is not being reset to 0 across page reads.
Once a page is read with higher bitflips all subsequent reads
are returning the same bitflip value even though they have none.

This is causing problems like incorrectly
marking erase blocks bad by UBI and read failures.

Tested it with both MTD reads and UBI attach.
This change is inline with other NFC drivers.

Sample error log where a block is marked bad incorrectly:

ubi0: fixable bit-flip detected at PEB 125
ubi0: run torture test for PEB 125
ubi0: fixable bit-flip detected at PEB 125
ubi0 error: torture_peb: read problems on freshly erased PEB 125,
must be bad
ubi0 error: erase_worker: failed to erase PEB 125, error -5
ubi0: mark PEB 125 as bad

Signed-off-by: rminnikanti <rminnikanti at marvell.com>
---
 drivers/mtd/nand/raw/pxa3xx_nand.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/mtd/nand/raw/pxa3xx_nand.c b/drivers/mtd/nand/raw/pxa3xx_nand.c
index 1d9a6d107b..97f250483f 100644
--- a/drivers/mtd/nand/raw/pxa3xx_nand.c
+++ b/drivers/mtd/nand/raw/pxa3xx_nand.c
@@ -803,6 +803,11 @@ static void prepare_start_command(struct pxa3xx_nand_info *info, int command)
 
 	switch (command) {
 	case NAND_CMD_READ0:
+		/*
+		 * Reset max_bitflips to zero. Once command is complete,
+		 * max_bitflips for this READ is returned in ecc.read_page()
+		 */
+		info->max_bitflips = 0;
 	case NAND_CMD_READOOB:
 	case NAND_CMD_PAGEPROG:
 		if (!info->force_raw)
-- 
2.17.1

Thanks Michael. Fixed it.

>> Thanks, Ravi.
>>
>>>>>         switch (command) {
>>>>>         case NAND_CMD_READ0:
>>>>> --
>>>>> 2.17.1
>>>
>>
> 
> 


More information about the U-Boot mailing list