[PATCH v3 4/5] blkmap: store type of blkmap device in corresponding structure

Tue Jan 21 07:43:18 CET 2025

On Tue, 21 Jan 2025 at 03:06, Tobias Waldekranz <tobias at waldekranz.com> wrote:
>
> On tis, jan 21, 2025 at 00:55, Sughosh Ganu <sughosh.ganu at linaro.org> wrote:
> > On Mon, 20 Jan 2025 at 21:50, Tobias Waldekranz <tobias at waldekranz.com> wrote:
> >>
> >> On mån, jan 20, 2025 at 21:10, Sughosh Ganu <sughosh.ganu at linaro.org> wrote:
> >> > On Mon, 20 Jan 2025 at 20:06, Tobias Waldekranz <tobias at waldekranz.com> wrote:
> >> >>
> >> >> On mån, jan 20, 2025 at 19:30, Sughosh Ganu <sughosh.ganu at linaro.org> wrote:
> >> >> > On Mon, 20 Jan 2025 at 17:55, Tobias Waldekranz <tobias at waldekranz.com> wrote:
> >> >> >>
> >> >> >> On mån, jan 20, 2025 at 16:20, Sughosh Ganu <sughosh.ganu at linaro.org> wrote:
> >> >> >> > Add information about the type of blkmap device in the blkmap
> >> >> >> > structure. Currently, the blkmap device is used for mapping to either
> >> >> >> > a memory based block device, or another block device (linear
> >> >> >> > mapping). Put information in the blkmap structure to identify if it is
> >> >> >> > associated with a memory or linear mapped device. This can then be
> >> >> >> > used to take specific action based on the type of blkmap device.
> >> >> >>
> >> >> >> Is this restriction really necessary? Why should it not be possible to
> >> >> >> set up a block map like this:
> >> >> >>
> >> >> >>  myblkmap:
> >> >> >> .--------.      .-----.
> >> >> >> | slice0 +------> RAM |
> >> >> >> :--------:      '-----'     .-------.
> >> >> >> | slice1 +------------------> eMMC0 |
> >> >> >> :--------:      .-------.   '-------'
> >> >> >> | slice2 +------> eMMC1 |
> >> >> >> '........'      '-------'
> >> >> >>
> >> >> >> Linux's "device mapper", after which blkmaps are modeled, works in this
> >> >> >> way.  I.e. a blkmap is just a collection of slices, and it is up to each
> >> >> >> slice how its data is provided, meaning that the user is free to compose
> >> >> >> their virtual block device in whatever way they need.
> >> >> >
> >> >> > The blkmap structure, the way it is designed, is pointing to the
> >> >> > underlying block device. How can a single blkmap then be associated
> >> >>
> >> >> The `struct udevice *blk` from `struct blkmap` is a reference to the
> >> >> block device which represents the block map itself ("myblkmap" in the
> >> >> picture above), not any lower device.
> >> >
> >> > Okay. I got confused with the comment associated with that member,
> >> > which says, "Underlying block device". This I interpreted to be the
> >> > block device that is associated with the blkmap structure.
> >>
> >> Yeah I agree that it could be made clearer :)
> >>
> >> >>
> >> >> > with slices of different types? Would that not contravene the
> >> >> > idea of a block device being associated with a blkmap?
> >> >>
> >> >> For slices which are linear mappings (and are thus backed by some other
> >> >> underlying block device), their reference to that lower device ("eMMC0"
> >> >> and "eMMC1" above) is stored in the `struct udevice *blk` member of
> >> >> `struct blkmap_linear`.
> >> >
> >> > Okay. But then, the computation of the blocksize seems to be happening
> >> > at the blkmap device level, which again implies having the same set of
> >> > slices associated with the blkmap. Any reason why the blksize is not
> >> > taken from the block device associated with that slice? That would
> >> > make it clear that the slice mapping type is independent from the
> >> > parent blkmap device.
> >>
> >> In the original series, only linear mappings to devices which used block
> >> sizes of 512 were supported, precisely because otherwise you need to do
> >> proper translation to work in all cases.
> >>
> >> I tried to argue this point on the list back then:
> >> https://lore.kernel.org/u-boot/875y3wohrt.fsf@waldekranz.com/
> >> but I did not get my point across and the restriction was lifted anyway.
> >>
> >> >>
> >> >> Slices which are backed by memory do not have any reference to a lower
> >> >> device, but merely a pointer to the start of the mapping - `void *addr`
> >> >> in `struct blkmap_mem`.
> >> >>
> >> >> The overarching idea is that the block map does not have to know
> >> >> anything about the implementation of how any individual slice chooses to
> >> >> provide its data.  It only knows about their sizes and offsets.  Based
> >> >> on that information, it simply routes incoming read/write requests to
> >> >> the correct slice.
> >> >
> >> > Okay. I think, for my solution, I will just need to move type
> >> > identification to the slice, instead of the blkmap device.
> >> >
> >> >>
> >> >> >>
> >> >> >> Looking at the pmem patch that follows this one, I am not able to find
> >> >> >> anything that would motivate restricting the functionality either.
> >> >> >
> >> >> > The subsequent patch is adding the persistent memory node to the
> >> >> > device-tree. The pmem node that is to be added is the memory mapped
> >> >> > blkmap device. The logic does check for the type of the blkmap device
> >> >> > and then proceeds to add the pmem node only for the memory mapped
> >> >> > blkmaps.
> >> >>
> >> >> Sorry I am confused.  Why do you need a block map device to add the pmem
> >> >> node to the device tree?
> >> >
> >> > This is needed to include the RAM based block device information in
> >> > the device-tree as pmem node. The OS installer then uses this pmem
> >> > device as the block device which contains the installation packages,
> >> > and proceeds with the OS installation.
> >>
> >> But even if the user has not set up a blkmap, don't you want to inject
> >> the pmem node in the DT anyway?  All you need is the size and offset of
> >> the blob right?  Is that not available from `image_setup_libfdt()`?
> >
> > Not sure if I am getting your point. The image_setup_libfdt() function
> > is fixing up the devicetree before it gets passed on to the OS. In my
> > subsequent patch, the image_setup_libfdt() is calling the blkmap
> > helper function (added in that patch) to check for any memory mapped
> > blkmap devices, and add corresponding pmem nodes for those. The
> > current case where this is happening in U-Boot is as part of
> > EFI_HTTP_BOOT, where if an ISO or an img file has been obtained over
> > the network, it gets mounted as a blkmap device, and that gets
> > notified to the OS. So, if this happens to be an install image, the
> > kernel knows about it through the pmem node in the DT.
>
> Alright, but then it seems like the implementation of EFI_HTTP_BOOT is
> the one with enough context to know when to add the pmem node, no?
> Could it not use the event subsystem to register a fixup?  I.e. when it
> creates the block map, it also ought to know that the image must be
> protected.  I imagine something like (pseudo code, but you get the
> idea):
>
> +struct efi_pmem {
> +    void *base;
> +    size_t size;
> +};
> +
> +static int efi_add_pmem(void *_pmem, struct event *event)
> +{
> +    const struct event_ft_fixup *fixup = &event->data.ft_fixup;
> +    struct efi_pmem *pmem = _pmem;
> +
> +    fdt_fixup_pmem_region(fixup->tree, pmem->base, pmem->size);
> +    free(pmem);
> +    return 0;
> +}
>
>  efi_http_boot()
>  {
>      ...
>      bm = blkmap_ramdisk_create(imgbase, imgsize);
> +    if (needs_protection) {
> +        struct efi_pmem *pmem = xmalloc(sizeof(*pmem));
> +        pmem->base = imgbase;
> +        pmem->size = imgsize;
> +        event_register("efi_add_pmem", EVT_FT_FIXUP, efi_add_pmem, pmem);
> +    }
>     ...
>  }

The earlier version was written along similar lines, where the helper
function was getting the image address and size from the efi_http_boot
context structure. There was a review comment asking me to decouple
the solution from EFI_HTTP_BOOT, and add the pmem nodes after scanning
the blkmap devices. This would make it a little more generic, and not
tie it closely to EFI_HTTP_BOOT.
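
To make the scanning approach concrete, here is a simplified,
self-contained model of it. This is not the code from the series: the
types (slice, blkmap_model, pmem_region) and the function
collect_pmem_regions() are made up for illustration and are not the
U-Boot definitions. It also assumes that the type is kept per slice
rather than per blkmap device, as discussed earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types, not the U-Boot blkmap structures. */
enum slice_type { SLICE_MEM, SLICE_LINEAR };

struct slice {
	enum slice_type type;
	uint64_t base;	/* RAM address for SLICE_MEM, unused otherwise */
	uint64_t size;	/* size in bytes */
};

struct blkmap_model {
	const char *label;
	const struct slice *slices;
	size_t nslices;
};

struct pmem_region {
	uint64_t base;
	uint64_t size;
};

/*
 * Walk every slice of every block map and record the memory backed
 * ones. Returns the number of regions written to out (at most max_out).
 */
size_t collect_pmem_regions(const struct blkmap_model *maps, size_t nmaps,
			    struct pmem_region *out, size_t max_out)
{
	size_t n = 0;

	assert(out != NULL || max_out == 0);

	for (size_t i = 0; i < nmaps; i++) {
		for (size_t j = 0; j < maps[i].nslices; j++) {
			const struct slice *s = &maps[i].slices[j];

			if (s->type != SLICE_MEM)
				continue;	/* linear slices are skipped */
			if (n == max_out)
				return n;	/* output array is full */
			out[n].base = s->base;
			out[n].size = s->size;
			n++;
		}
	}

	return n;
}
```

Each collected region would then become one pmem node, independent of
who created the block map.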

Also, one issue with using the event-based framework is that the user
might choose to pass a different devicetree to the OS from the one on
which the fixup was done. Calling the helper function from
image_setup_libfdt() ensures that the fixup happens on the DT that
gets passed on to the OS.
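
The concern can be shown with a toy model. Nothing here is U-Boot or
libfdt code: dt_model, add_pmem_node() and prepare_for_os() are
hypothetical stand-ins, and the model assumes that an early fixup
targets one specific tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a devicetree blob, not a libfdt type. */
struct dt_model {
	const char *name;
	bool has_pmem_node;
};

/* Stand-in for adding a pmem node to one particular tree. */
void add_pmem_node(struct dt_model *dt)
{
	assert(dt != NULL);
	dt->has_pmem_node = true;
}

/*
 * Stand-in for the final hand-off step: fix up whichever tree is
 * actually being passed to the OS, then return it.
 */
struct dt_model *prepare_for_os(struct dt_model *chosen)
{
	add_pmem_node(chosen);
	return chosen;
}
```

If add_pmem_node() is called early on one tree and the user later
selects another, the second tree has no pmem node unless the fixup is
repeated at hand-off, which is what prepare_for_os() models.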

>
>
> > If you are referring to a scenario where the memory based block device
> > does not get set up as a blkmap device, that will anyway require
> > explicit intervention of the user to add the pmem node, because
> > otherwise there is no way to find out about the existence of such a
> > device. This can then be done through an explicit command. But the
> > EFI_HTTP_BOOT use-case does require scanning for the memory based
> > blkmap devices.
>
> With the approach above, I think we could get out of marking every
> configured blkmap as reserved (which might not be what the user wants),
> and instead let the creator of the device decide on whether this is
> needed or not.

I am not sure what you mean by marking the blkmap as reserved, but
these changes simply describe the memory mapped blkmap devices as pmem
nodes in the DT before booting the OS. The kernel can then parse these
pmem regions to check for the availability of an installation image.

-sughosh