[U-Boot] [PATCH v6 00/18] GPT over MTD

Tue May 16 09:28:37 UTC 2017

On Fri, May 12, 2017 at 04:09:08PM +0000, Patrick DELAUNAY wrote:
> Hi Maxime 
> 
> > From: Maxime Ripard [mailto:maxime.ripard at free-electrons.com]
> > Sent: jeudi 11 mai 2017 16:46
> > 
> > On Thu, May 11, 2017 at 09:19:16AM +0000, Patrick DELAUNAY wrote:
> > > Hi Maxime,
> > >
> > > > From: Maxime Ripard [mailto:maxime.ripard at free-electrons.com]
> > > > Sent: jeudi 11 mai 2017 10:20
> > > >
> > > > Hi,
> > > >
> > > > On Thu, May 11, 2017 at 09:51:50AM +0200, Patrick Delaunay wrote:
> > > > > I have a request to support GPT over MTD to have the MTD
> > > > > information without U-Boot environment (CONFIG_ENV_IS_NOWHERE is
> > > > > another requirement of my project to manage several board
> > > > > configurations with the same defconfig; boot from NAND or NOR or
> > > > > SDCARD).
> > > >
> > > > What would happen if you have a bad block in the middle of the
> > > > primary or secondary GPT headers (or both)?
> > > >
> > > > Maxime
> > > >
> > >
> > > All bad blocks are skipped:
> > > => primary GPT header is located at the beginning of the first good block
> > > => backup GPT header is located at the end of the last good block
> > >
> > > And gpt create will fail if the erase block command (for the primary or
> > > backup GPT) produces a new bad block.
> > 
> > Right, but what happens if your block becomes bad or too corrupted after it's
> > been written?
> > 
> > You mention in your Drawbacks section that if you erase the block and it is now
> > detected to be bad, U-Boot will have to act upon it. But that can happen
> > outside of U-Boot as well, or not directly to this block, through read or write
> > disturbs... In this case, your GPT header is gone, and you have no way to
> > recover from it.
> 
> Yes, I know that this is the main issue with my proposal: the
> management of NAND bad blocks.  But it is the same for all the binaries
> in the boot stage (SPL / U-Boot / U-Boot env) in NAND.

U-Boot environment can be stored in UBI, and iirc U-Boot too. And
usually the SPLs can be stored at multiple offsets to reduce the
risks.

> From my point of view, erase should be done only when the GPT header
> needs to be updated
>
> => for first flashing / complete update of NAND
> => for the refresh of the GPT header
> So the NAND block becomes BAD only in these 2 cases.
> 
> If a read or write disturb occurs on a GPT block, it should be
> detected by NAND ECC => when an unrecoverable error occurs during a
> boot, the backup GPT header should be used.
> PS: Perhaps something needs to be done in U-Boot to refresh the
> primary GPT header from the backup header information.

Like I was saying, the issue really isn't in U-Boot itself, but when
U-Boot isn't there anymore to deal with those issues.
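
To make the fallback quoted above concrete, here is a minimal sketch in
Python (a model of the idea, not U-Boot code). It assumes the primary GPT
header sits in the first good block and the backup in the last good block,
as described earlier in the thread. The `Block` class, its `uncorrectable`
flag and the function names are hypothetical, introduced only for this
illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Block:
    """Hypothetical model of one NAND erase block."""
    data: bytes = b""
    bad: bool = False            # marked bad in the bad block table
    uncorrectable: bool = False  # ECC could not repair the last read


def first_good(blocks: List[Block]) -> Optional[int]:
    """Index of the first block not marked bad, or None."""
    return next((i for i, b in enumerate(blocks) if not b.bad), None)


def last_good(blocks: List[Block]) -> Optional[int]:
    """Index of the last block not marked bad, or None."""
    return next((i for i in range(len(blocks) - 1, -1, -1)
                 if not blocks[i].bad), None)


def read_gpt(blocks: List[Block]) -> Optional[Tuple[str, bytes]]:
    """Try the primary header, then fall back to the backup header.

    Returns (source, data), or None when both copies are unreadable,
    which is the unrecoverable case raised in this thread.
    """
    candidates = (("primary", first_good(blocks)),
                  ("backup", last_good(blocks)))
    for source, index in candidates:
        if index is None:
            continue
        block = blocks[index]
        if not block.uncorrectable:
            return source, block.data
    return None
```

The sketch only covers what happens at boot time. It does nothing about a
header that degrades while U-Boot is not running, which is the concern
raised in the reply above.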

> The expected strategy is to read the boot partitions
> (all of them: GPT header, SPL and U-Boot) periodically outside of U-Boot
> and, if the ECC errors on read become too numerous, refresh them:
> read the partition and write it again in RAW mode (skipping bad blocks).
> 
> So if the partitioning is correctly managed
> (by reserving a tank of good blocks for partition refresh),
> the GPT can be refreshed in raw mode without breaking the other partitions.
> 
> My idea is to prepare a partitioning with a tank of good blocks
> (3 in this example):
> 
> 0  => MBR + primary GPT
> 1 => tank block
> 2 => tank block
> 3 => tank block
> --------------------- MTD1
> 4 => SPL 
> 5 => tank block
> 6 => tank block
> 7 => tank block
> ----------------------MTD2
> 8 => U-Boot  1/3
> 9 => U-Boot  2/3
> 10 => U-Boot  3/3
> 11 => tank block
> 12 => tank block
> 13 => tank block
> ----------------------MTD3
> 14 => UBI for kernel
> ......
> N-8 => last usable LBA
> ---------------------- 
> N-7 => tank block
> N-6 => tank block
> N-5 => tank block
> N-4 => backup GPT
> N-3 => BBT (marked as bad)
> N-2 => BBT (marked as bad)
> N-1 => BBT (marked as bad)
> N => BBT (marked as bad)
> 
> Blocks 0 and N-4 will be refreshed when ECC errors reach the threshold for any partition
> => refresh = read + erase + write (skipping bad blocks)
> => if the erase command detects a bad block, the next tank block is used.
> 
> 0  => Bad block (NEW)
> 1 => MBR + primary GPT
> 2 => tank block
> 3 => tank block
> 
> Same for the backup GPT, but in the reverse direction
> 
> N-7 => tank block
> N-6 => tank block
> N-5 => backup GPT
> N-4 => bad block (NEW)
> 
> The number of tank blocks needs to be chosen correctly (to be defined
> from the NAND information)

This really feels like you're trying to reinvent the UBI-partitions
wheel, but yeah, that would make it a bit more robust.
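
For reference, the refresh scheme quoted above (read + erase + write,
moving to the next tank block when an erase fails) can be sketched as the
small Python model below. It is not U-Boot or MTD code. The function name
`refresh`, the `erase_ok` callback and the `bad` set are hypothetical names
used only for this illustration, and the payload is assumed to have been
read beforehand.

```python
from typing import Callable, Dict, List, Sequence, Set


def refresh(region: Sequence[int],
            payload: List[bytes],
            bad: Set[int],
            erase_ok: Callable[[int], bool],
            reverse: bool = False) -> Dict[int, bytes]:
    """Rewrite `payload` (one chunk per block) into `region`.

    Blocks already in `bad` are skipped. A block whose erase fails is
    added to `bad` and the next block of the region (a tank block) is
    used instead. With reverse=True the region is walked from its end,
    as proposed for the backup GPT (a single-block payload there).
    Raises RuntimeError when the tank is exhausted.
    """
    order = list(reversed(region)) if reverse else list(region)
    placed: Dict[int, bytes] = {}
    chunks = iter(payload)
    chunk = next(chunks, None)
    for blk in order:
        if chunk is None:
            break                # everything has been written
        if blk in bad:
            continue             # known bad block: skip it
        if not erase_ok(blk):
            bad.add(blk)         # erase failure creates a new bad block
            continue
        placed[blk] = chunk      # "write" the chunk into the good block
        chunk = next(chunks, None)
    if chunk is not None:
        raise RuntimeError("not enough good blocks left in the region")
    return placed
```

With the region 0..3 and an erase failure on block 0, this places the
primary GPT in block 1, matching the first example above. With the region
N-7..N-4 walked in reverse and an erase failure on N-4, the backup GPT
lands in N-5. How many tank blocks are enough remains the open question.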

> PS: it is the same for the SPL and U-Boot partitions
> 
> Do you know a better solution to handle read disturb issues on boot
> partitions than refresh?

Unfortunately, I don't, but the biggest concern I have is that you
have no way to tell that your system is basically in self-destruct
mode outside of U-Boot. And that's an issue because:
  A) U-Boot will be there for (hopefully) less than a second
  B) A user getting a brand new system will have no idea that they
     need to install something else so that their entire system does
     not fall apart.

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com