The contradictory nature of spl_nand_fit_read
Sean Anderson
sean.anderson at seco.com
Mon Apr 4 18:26:51 CEST 2022
On 4/3/22 11:56 AM, Dario Binacchi wrote:
> Hi Sean,
>
>> Il 01/04/2022 23:43 Michael Nazzareno Trimarchi <michael at amarulasolutions.com> ha scritto:
>>
>>
>> Hi Sean
>>
>> On Fri, Apr 1, 2022 at 8:53 PM Sean Anderson <sean.anderson at seco.com> wrote:
>> >
>> >
>> >
>> > On 4/1/22 2:46 PM, Sean Anderson wrote:
>> > > Hi all,
>> > >
>> > > I don't understand how spl_nand_fit_read is supposed to work. This
>> > > function has been seemingly hacked together by multiple people over the
>> > > years, and it (presumably) (somehow) works despite mass confusion of
>> > > units. I tried to do some refactoring, but the contradictions make it
>> > > very difficult to determine what is a bug and what is intentional.
>> > >
>> > > Lets start with the basics. spl_nand_fit_read is used to implement
>> > > spl_load_info.load. I've reproduced the documentation for this function
>> > > below:
>> > >
>> > >> read() - Read from device
>> > >>
>> > >> @load: Information about the load state
>> > >> @sector: Sector number to read from (each @load->bl_len bytes)
>> > >> @count: Number of sectors to read
>> > >> @buf: Buffer to read into
>> > >> @return number of sectors read, 0 on error
>> > >
>> > > In particular, both @sector and @count are in units of load.bl_len. In
>> > > addition, the return value should be @count on success.
>> > >
>> > >> static ulong spl_nand_fit_read(struct spl_load_info *load, ulong offs,
>> > >> ulong size, void *dst)
>> > >> {
>> > >> int err;
>> > >> #ifdef CONFIG_SYS_NAND_BLOCK_SIZE
>> > >
>> > > The following logic was introduced in commit 9f6a14c47f ("spl: fit:
>> > > nand: fix fit loading in case of bad blocks"). However, at the time it
>> > > was not guarded by the above ifdef. nand_spl_adjust_offset is not
>> > > defined for all nand drivers. It is defined for at least mxs, am335x,
>> > > atmel, and simple. Additionally, some drivers (such as rockchip)
>> > > implement bad block skipping in nand_spl_load_image.
>> > >
>> > > It's not at all clear to me that there is common config which determines
>> > > whether nand_spl_adjust_offset is implemented. Nevertheless, in commit
>> > > aa0032f672 ("spl: fit: nand: allow for non-page-aligned elements"), the
>> > > above directive selects different code if CONFIG_SYS_NAND_BLOCK_SIZE is
>> > > defined.
>> > >
>> > >> ulong sector;
>> > >>
>> > >> sector = *(int *)load->priv;
>> > >
>> > > This is the offset passed to nand_spl_load_image. This
>> > > offset is in units of bytes, and is aligned to the block size. However,
>> > > it is *not* set if CONFIG_SPL_LOAD_IMX_CONTAINER is enabled. Oops.
>> > >
>> > >> offs = sector + nand_spl_adjust_offset(sector, offs - sector);
>> >
>> > I forgot to mention this, but this also adds sector to offs, but offs already includes the sector:
>> >
>> > > return spl_load_simple_fit(spl_image, &load, offset / bl_len, header);
>>
>> The spl_nand_fit_read was called from info->read that was in units of sectors
>> count = info->read(info, sector, sectors, fit);
>> debug("fit read sector %lx, sectors=%d, dst=%p, count=%lu, size=0x%lx\n",
>> sector, sectors, fit, count, size);
>>
>>
>> load.dev = NULL;
>> load.priv = &offset;
>> load.filename = NULL;
>> load.bl_len = 1;
Ah, here is the problem. If bl_len is always 1, then there is no problem
going back and forth between units. But aa0032f672 ("spl: fit: nand: allow
for non-page-aligned elements") means that bl_len is not always 1, so
now the conversion matters.
So I suppose my question is now for Tim: how did you test your patch?
>> load.read = spl_nand_fit_read;
>>
>> static ulong spl_nand_fit_read(struct spl_load_info *load, ulong offs,
>> ulong size, void *dst)
>> {
>> ulong sector;
>> int ret;
>>
>> sector = *(int *)load->priv;
>> offs = sector + nand_spl_adjust_offset(sector, offs - sector);
This line is ONLY ok if bl_len = 1
>> ret = nand_spl_load_image(offs, size, dst);
>> if (!ret)
>> return size;
>> else
>> return 0;
>> }
>>
>> Sorry it's a bit of time and I think that we are discussing about this
>> commit at the time was introduced
>> Michael
>>
>> >
>> > Which means that we add in sector twice (but with different units!)
>> >
>> > > The documentation for this this function is as follows
>> > >
>> > >> nand_spl_adjust_offset - Adjust offset from a starting sector
>> > >> @sector: Address of the sector
>> > >> @offs: Offset starting from @sector
>> > >>
>> > >> If one or more bad blocks are in the address space between @sector
>> > >> and @sector + @offs, @offs is increased by the NAND block size for
>> > >> each bad block found.
>> > >
>> > > In particular, note that offs is in units of bytes. However, we know
>> > > from above that at this point in the function, offs is in units of
>> > > bl_len. Oops.
>> > >
>> > >> #else
>> > >> offs *= load->bl_len;
>> > >> size *= load->bl_len;
>> > >
>> > > This code path adjusts both offs and size to be in units of
>> >
>> > *units of bytes
>> >
>> > >> #endif
>> > >> err = nand_spl_load_image(offs, size, dst);
>> > >
>> > > So by the time we get here, one code path is in units of bytes, and the
>> > > other is in units of bl_len. nand_spl_load_image seems to expect bytes
>> > > (though there is no documentation, so I can't be sure). Which of course
>> > > begs the question: how did the code introduced in 9f6a14c47f ever work?
>
> The commit 9f6a14c47f only added the code strictly necessary to skip a bad block
> that was in that memory area. I heavily tested it on a custom board that presented
> this problem and that without it it would not have been usable. The patch itself
> was also tested on SOCs that did not have bad block, therefore backward compatible.
> I think that your doubts arise from the patches following mine and that, however,
> I have not had the opportunity to test.
Have you tested v2021.07 or later?
--Sean
>> > >
>> > >> if (err)
>> > >> return 0;
>> > >>
>> > >> return size / load->bl_len;
>> > >
>> > > And of course, this should only be done if size is in units of bytes.
>> > >
>> > >> }
>> > >
>> > > I have some patches for what I think are bugs, but the logic of this
>> > > function is so confused that I am unable to determine if they actually
>> > > are bugs. I would greatly appreciate if some of the authors of the
>> > > mentioned commits could comment on their intent, and what they thought
>> > > the units of offs and size were.
>> > >
>> > > --Sean
>> > >
More information about the U-Boot
mailing list