u-boot does not detect and correct bitflip when loading kernel from NAND

jackie at liwest.at jackie at liwest.at
Wed Aug 14 14:41:04 CEST 2024


Hi to all, and especially to Albert ARIBAUD, who I think is the maintainer of my board.
I’m new to U-Boot and I’m trying to solve an issue regarding NAND Flash.

Issue:

After a system reboot, U-Boot loads the same kernel from NAND into RAM without any error message; however, booting the kernel then fails with ‘Verifying Checksum ... Bad Data CRC’.
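
To reproduce this check offline on a dump of the kernel partition, here is a minimal Python sketch. It assumes the kernel is stored as a legacy uImage (64-byte big-endian header followed by the payload), which the wording of the message suggests but which I have not confirmed for this board. The function name check_uimage is only illustrative.

```python
import struct
import zlib

UIMAGE_MAGIC = 0x27051956   # magic number of the legacy uImage header
HEADER_FMT = ">7I4B32s"     # big-endian: 7 words, 4 bytes, 32-byte name
HEADER_SIZE = struct.calcsize(HEADER_FMT)   # 64 bytes


def check_uimage(blob: bytes) -> dict:
    """Recompute the header CRC and data CRC of a legacy uImage in memory."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("too short for a uImage header")
    (magic, hcrc, _time, size, _load, _ep, dcrc,
     _os, _arch, _type, _comp, name) = struct.unpack(
        HEADER_FMT, blob[:HEADER_SIZE])
    if magic != UIMAGE_MAGIC:
        raise ValueError("not a legacy uImage (bad magic)")

    # The header CRC is calculated with the header CRC field set to zero.
    header = bytearray(blob[:HEADER_SIZE])
    header[4:8] = b"\x00\x00\x00\x00"

    # The data CRC covers exactly 'size' bytes following the header,
    # so trailing padding in a partition dump is ignored.
    data = blob[HEADER_SIZE:HEADER_SIZE + size]
    return {
        "name": name.rstrip(b"\x00").decode("ascii", "replace"),
        "header_ok": (zlib.crc32(bytes(header)) & 0xFFFFFFFF) == hcrc,
        "data_ok": len(data) == size
                   and (zlib.crc32(data) & 0xFFFFFFFF) == dcrc,
    }
```

A result with header_ok true and data_ok false would match what I see: the header is intact, but the payload read from NAND differs from what was originally written.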

Configuration:

Phytec pcm052 (Vybrid vf610), Micron MT29F4G16 …., U-Boot Timesys 2013.07, Kernel 3.13

The kernel uses mtdparts to create the partitions defined in the device tree. According to the FDT, the NAND uses hardware ECC; UBIFS sits on top.

Observations:
1. Loading both the kernel from flash and the corresponding original copy from SD card into RAM and comparing them revealed a single bitflip.
2. Writing the correct byte into flash lets the kernel start.
3. Timesys U-Boot 2016.09 seems to recognize the bitflip, because after loading, the kernel starts to work. However, the scheduler of the MQX OS then does not work correctly. You introduced a new feature into U-Boot to load and start an image for the M4 core. Maybe this does some initialization that causes problems in MQX. Debugging the OS shows that the scheduler starts before the IDLE task has been queued.
4. After booting from the 2nd kernel and reflashing the partition of the 1st one, the 1st kernel starts without any problems.
5. The issue appears months or years after flashing, on systems running 24/7.
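
The comparison from observation 1 can be sketched in Python as follows. This is only an illustration of the method, not the exact tool I used; the function names are made up, and the two inputs are assumed to be a dump of the kernel as read from NAND and the original image from the SD card.

```python
def find_bitflips(reference: bytes, suspect: bytes) -> list:
    """Return (byte_offset, bit_index, reference_byte, suspect_byte)
    for every bit that differs between the two buffers.

    Only the common length is compared, because a flash dump is
    usually padded up to a page or partition boundary.
    """
    flips = []
    for offset, (ref_byte, sus_byte) in enumerate(zip(reference, suspect)):
        diff = ref_byte ^ sus_byte      # set bits mark flipped positions
        bit = 0
        while diff:
            if diff & 1:
                flips.append((offset, bit, ref_byte, sus_byte))
            diff >>= 1
            bit += 1
    return flips


def compare_files(reference_path: str, suspect_path: str) -> list:
    """Load two image files and list the bits in which they differ."""
    with open(reference_path, "rb") as ref, open(suspect_path, "rb") as sus:
        return find_bitflips(ref.read(), sus.read())
```

For example, compare_files("kernel_from_sd.bin", "kernel_from_nand.bin") (both file names are hypothetical) returns one tuple per flipped bit. Dividing the byte offset by the NAND page size from the data sheet gives the affected page within the partition.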

Assumptions about the root cause:
1. Different ECC calculations in U-Boot and Kernel
2. Different OOB or Bad Block handling

Questions:

Are my assumptions correct, or am I missing something?
Is this a known issue?
How can I compare the NAND/ECC settings of U-Boot and the kernel?
E.g. which ECC does U-Boot use (HW NFC, ‘on die’ or SW; algorithm, e.g. BCH?; how many bits; …)?
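
On the kernel side, part of this can be read from the MTD sysfs attributes. The sketch below collects them in Python. The attribute names follow the Linux MTD sysfs ABI as far as I know it, but which of them exist depends on the kernel version, so on 3.13 some may be missing and are then reported as None. The function name and the default device mtd0 are assumptions for illustration. On the U-Boot side, the `nand info` command prints the device geometry; as far as I understand, the ECC mode there is usually fixed at build time by the board configuration and the NAND driver, so it has to be looked up in the sources.

```python
from pathlib import Path

# Attributes relevant for comparing ECC and OOB layout with U-Boot.
# Not every kernel version provides all of them.
ATTRS = (
    "name", "type", "writesize", "oobsize", "erasesize",
    "ecc_strength", "ecc_step_size", "bitflip_threshold",
    "corrected_bits", "ecc_failures", "bad_blocks",
)


def read_mtd_ecc(mtd: str = "mtd0", root: str = "/sys/class/mtd") -> dict:
    """Return the ECC-related sysfs attributes of one MTD device.

    Missing or unreadable attributes are returned as None instead of
    raising, so the result can be printed on any kernel version.
    """
    base = Path(root) / mtd
    result = {}
    for attr in ATTRS:
        try:
            result[attr] = (base / attr).read_text().strip()
        except OSError:
            result[attr] = None
    return result


if __name__ == "__main__":
    for key, value in read_mtd_ecc().items():
        print(f"{key:18} {value}")
```

Here ecc_strength is the number of correctable bits per ECC step and ecc_step_size is the step size in bytes. If these two values or oobsize do not match what the U-Boot NAND driver assumes, that would support assumption 1 or 2 above.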

Many thanks in advance

Tanja

