The contradictory nature of spl_nand_fit_read

Michael Nazzareno Trimarchi michael at amarulasolutions.com
Fri Apr 1 23:43:16 CEST 2022


Hi Sean

On Fri, Apr 1, 2022 at 8:53 PM Sean Anderson <sean.anderson at seco.com> wrote:
>
>
>
> On 4/1/22 2:46 PM, Sean Anderson wrote:
> > Hi all,
> >
> > I don't understand how spl_nand_fit_read is supposed to work. This
> > function has been seemingly hacked together by multiple people over the
> > years, and it (presumably) (somehow) works despite mass confusion of
> > units. I tried to do some refactoring, but the contradictions make it
> > very difficult to determine what is a bug and what is intentional.
> >
> > Lets start with the basics. spl_nand_fit_read is used to implement
> > spl_load_info.load. I've reproduced the documentation for this function
> > below:
> >
> >>  read() - Read from device
> >>
> >>  @load: Information about the load state
> >>  @sector: Sector number to read from (each @load->bl_len bytes)
> >>  @count: Number of sectors to read
> >>  @buf: Buffer to read into
> >>  @return number of sectors read, 0 on error
> >
> > In particular, both @sector and @count are in units of load.bl_len. In
> > addition, the return value should be @count on success.
> >
> >> static ulong spl_nand_fit_read(struct spl_load_info *load, ulong offs,
> >>                             ulong size, void *dst)
> >> {
> >>      int err;
> >> #ifdef CONFIG_SYS_NAND_BLOCK_SIZE
> >
> > The following logic was introduced in commit 9f6a14c47f ("spl: fit:
> > nand: fix fit loading in case of bad blocks"). However, at the time it
> > was not guarded by the above ifdef. nand_spl_adjust_offset is not
> > defined for all nand drivers. It is defined for at least mxs, am335x,
> > atmel, and simple. Additionally, some drivers (such as rockchip)
> > implement bad block skipping in nand_spl_load_image.
> >
> > It's not at all clear to me that there is common config which determines
> > whether nand_spl_adjust_offset is implemented. Nevertheless, in commit
> > aa0032f672 ("spl: fit: nand: allow for non-page-aligned elements"), the
> > above directive selects different code if CONFIG_SYS_NAND_BLOCK_SIZE is
> > defined.
> >
> >>      ulong sector;
> >>
> >>      sector = *(int *)load->priv;
> >
> > This is the offset passed to nand_spl_load_image. This
> > offset is in units of bytes, and is aligned to the block size. However,
> > it is *not* set if CONFIG_SPL_LOAD_IMX_CONTAINER is enabled. Oops.
> >
> >>      offs = sector + nand_spl_adjust_offset(sector, offs - sector);
>
> I forgot to mention this, but this also adds sector to offs, but offs already includes the sector:
>
> > return spl_load_simple_fit(spl_image, &load, offset / bl_len, header);

The spl_nand_fit_read was called from info->read that was in units of sectors
count = info->read(info, sector, sectors, fit);
debug("fit read sector %lx, sectors=%d, dst=%p, count=%lu, size=0x%lx\n",
              sector, sectors, fit, count, size);


load.dev = NULL;
load.priv = &offset;
load.filename = NULL;
load.bl_len = 1;
load.read = spl_nand_fit_read;

static ulong spl_nand_fit_read(struct spl_load_info *load, ulong offs,
                               ulong size, void *dst)
{
        ulong sector;
        int ret;

        sector = *(int *)load->priv;
        offs = sector + nand_spl_adjust_offset(sector, offs - sector);
        ret = nand_spl_load_image(offs, size, dst);
        if (!ret)
                return size;
        else
                return 0;
}

Sorry it's a bit of time and I think that we are discussing about this
commit at the time was introduced

Michael

>
> Which means that we add in sector twice (but with different units!)
>
> > The documentation for this this function is as follows
> >
> >> nand_spl_adjust_offset - Adjust offset from a starting sector
> >> @sector:     Address of the sector
> >> @offs:       Offset starting from @sector
> >>
> >> If one or more bad blocks are in the address space between @sector
> >> and @sector + @offs, @offs is increased by the NAND block size for
> >> each bad block found.
> >
> > In particular, note that offs is in units of bytes. However, we know
> > from above that at this point in the function, offs is in units of
> > bl_len. Oops.
> >
> >> #else
> >>      offs *= load->bl_len;
> >>      size *= load->bl_len;
> >
> > This code path adjusts both offs and size to be in units of
>
> *units of bytes
>
> >> #endif
> >>      err = nand_spl_load_image(offs, size, dst);
> >
> > So by the time we get here, one code path is in units of bytes, and the
> > other is in units of bl_len. nand_spl_load_image seems to expect bytes
> > (though there is no documentation, so I can't be sure).  Which of course
> > begs the question: how did the code introduced in 9f6a14c47f ever work?
> >
> >>      if (err)
> >>              return 0;
> >>
> >>      return size / load->bl_len;
> >
> > And of course, this should only be done if size is in units of bytes.
> >
> >> }
> >
> > I have some patches for what I think are bugs, but the logic of this
> > function is so confused that I am unable to determine if they actually
> > are bugs. I would greatly appreciate if some of the authors of the
> > mentioned commits could comment on their intent, and what they thought
> > the units of offs and size were.
> >
> > --Sean
> >



-- 
Michael Nazzareno Trimarchi
Co-Founder & Chief Executive Officer
M. +39 347 913 2170
michael at amarulasolutions.com
__________________________________

Amarula Solutions BV
Joop Geesinkweg 125, 1114 AB, Amsterdam, NL
T. +31 (0)85 111 9172
info at amarulasolutions.com
www.amarulasolutions.com


More information about the U-Boot mailing list