[RFC 07/22] dm: blk: add UCLASS_PARTITION

AKASHI Takahiro takahiro.akashi at linaro.org
Thu Oct 28 10:52:17 CEST 2021


Hi Simon,

I'd like to resume this discussion.

On Thu, Oct 14, 2021 at 02:55:36PM -0600, Simon Glass wrote:
> Hi Takahiro,
> 
> On Thu, 14 Oct 2021 at 02:03, AKASHI Takahiro
> <takahiro.akashi at linaro.org> wrote:
> >
> > Simon,
> >
> > On Wed, Oct 13, 2021 at 12:05:58PM -0600, Simon Glass wrote:
> > > Hi Takahiro,
> > >
> > > On Tue, 12 Oct 2021 at 19:32, AKASHI Takahiro
> > > <takahiro.akashi at linaro.org> wrote:
> > > >
> > > > On Tue, Oct 12, 2021 at 11:14:17AM -0400, Tom Rini wrote:
> > > > > On Mon, Oct 11, 2021 at 10:14:00AM -0600, Simon Glass wrote:
> > > > > > Hi Heinrich,
> > > > > >
> > > > > > On Mon, 11 Oct 2021 at 09:02, Heinrich Schuchardt <xypron.glpk at gmx.de> wrote:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On 10/11/21 16:54, Simon Glass wrote:
> > > > > > > > Hi Takahiro,
> > > > > > > >
> > > > > > > > On Sun, 10 Oct 2021 at 20:29, AKASHI Takahiro
> > > > > > > > <takahiro.akashi at linaro.org> wrote:
> > > > > > > >>
> > > > > > > >> Heinrich,
> > > > > > > >>
> > > > > > > >> On Fri, Oct 08, 2021 at 10:23:52AM +0200, Heinrich Schuchardt wrote:
> > > > > > > >>>
> > > > > > > >>>
> > > > > > > >>> On 10/8/21 02:51, AKASHI Takahiro wrote:
> > > > > > > >>>> On Mon, Oct 04, 2021 at 12:27:59PM +0900, AKASHI Takahiro wrote:
> > > > > > > >>>>> On Fri, Oct 01, 2021 at 11:30:37AM +0200, Heinrich Schuchardt wrote:
> > > > > > > >>>>>>
> > > > > > > >>>>>>
> > > > > > > >>>>>> On 10/1/21 07:01, AKASHI Takahiro wrote:
> > > > > > > >>>>>>> UCLASS_PARTITION device will be created as a child node of
> > > > > > > >>>>>>> UCLASS_BLK device.
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi at linaro.org>
> > > > > > > >>>>>>> ---
> > > > > > > >>>>>>>     drivers/block/blk-uclass.c | 111 +++++++++++++++++++++++++++++++++++++
> > > > > > > >>>>>>>     include/blk.h              |   9 +++
> > > > > > > >>>>>>>     include/dm/uclass-id.h     |   1 +
> > > > > > > >>>>>>>     3 files changed, 121 insertions(+)
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> diff --git a/drivers/block/blk-uclass.c b/drivers/block/blk-uclass.c
> > > > > > > >>>>>>> index 83682dcc181a..dd7f3c0fe31e 100644
> > > > > > > >>>>>>> --- a/drivers/block/blk-uclass.c
> > > > > > > >>>>>>> +++ b/drivers/block/blk-uclass.c
> > > > > > > >>>>>>> @@ -12,6 +12,7 @@
> > > > > > > >>>>>>>     #include <log.h>
> > > > > > > >>>>>>>     #include <malloc.h>
> > > > > > > >>>>>>>     #include <part.h>
> > > > > > > >>>>>>> +#include <string.h>
> > > > > > > >>>>>>>     #include <dm/device-internal.h>
> > > > > > > >>>>>>>     #include <dm/lists.h>
> > > > > > > >>>>>>>     #include <dm/uclass-internal.h>
> > > > > > > >>>>>>> @@ -695,6 +696,44 @@ int blk_unbind_all(int if_type)
> > > > > > > >>>>>>>        return 0;
> > > > > > > >>>>>>>     }
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> +int blk_create_partitions(struct udevice *parent)
> > > > > > > >>>>>>> +{
> > > > > > > >>>>>>> +     int part, count;
> > > > > > > >>>>>>> +     struct blk_desc *desc = dev_get_uclass_plat(parent);
> > > > > > > >>>>>>> +     struct disk_partition info;
> > > > > > > >>>>>>> +     struct disk_part *part_data;
> > > > > > > >>>>>>> +     char devname[32];
> > > > > > > >>>>>>> +     struct udevice *dev;
> > > > > > > >>>>>>> +     int ret;
> > > > > > > >>>>>>> +
> > > > > > > >>>>>>> +     if (!CONFIG_IS_ENABLED(PARTITIONS) ||
> > > > > > > >>>>>>> +         !CONFIG_IS_ENABLED(HAVE_BLOCK_DEVICE))
> > > > > > > >>>>>>> +             return 0;
> > > > > > > >>>>>>> +
> > > > > > > >>>>>>> +     /* Add devices for each partition */
> > > > > > > >>>>>>> +     for (count = 0, part = 1; part <= MAX_SEARCH_PARTITIONS; part++) {
> > > > > > > >>>>>>> +             if (part_get_info(desc, part, &info))
> > > > > > > >>>>>>> +                     continue;
> > > > > > > >>>>>>> +             snprintf(devname, sizeof(devname), "%s:%d", parent->name,
> > > > > > > >>>>>>> +                      part);
> > > > > > > >>>>>>> +
> > > > > > > >>>>>>> +             ret = device_bind_driver(parent, "blk_partition",
> > > > > > > >>>>>>> +                                      strdup(devname), &dev);
> > > > > > > >>>>>>> +             if (ret)
> > > > > > > >>>>>>> +                     return ret;
> > > > > > > >>>>>>> +
> > > > > > > >>>>>>> +             part_data = dev_get_uclass_plat(dev);
> > > > > > > >>>>>>> +             part_data->partnum = part;
> > > > > > > >>>>>>> +             part_data->gpt_part_info = info;
> > > > > > > >>>>>>> +             count++;
> > > > > > > >>>>>>> +
> > > > > > > >>>>>>> +             device_probe(dev);
> > > > > > > >>>>>>> +     }
> > > > > > > >>>>>>> +     debug("%s: %d partitions found in %s\n", __func__, count, parent->name);
> > > > > > > >>>>>>> +
> > > > > > > >>>>>>> +     return 0;
> > > > > > > >>>>>>> +}
> > > > > > > >>>>>>> +
> > > > > > > >>>>>>>     static int blk_post_probe(struct udevice *dev)
> > > > > > > >>>>>>>     {
> > > > > > > >>>>>>>        if (IS_ENABLED(CONFIG_PARTITIONS) &&
> > > > > > > >>>>>>> @@ -713,3 +752,75 @@ UCLASS_DRIVER(blk) = {
> > > > > > > >>>>>>>        .post_probe     = blk_post_probe,
> > > > > > > >>>>>>>        .per_device_plat_auto   = sizeof(struct blk_desc),
> > > > > > > >>>>>>>     };
> > > > > > > >>>>>>> +
> > > > > > > >>>>>>> +static ulong blk_part_read(struct udevice *dev, lbaint_t start,
> > > > > > > >>>>>>> +                        lbaint_t blkcnt, void *buffer)
> > > > > > > >>>>>>> +{
> > > > > > > >>>>>>> +     struct udevice *parent;
> > > > > > > >>>>>>> +     struct disk_part *part;
> > > > > > > >>>>>>> +     const struct blk_ops *ops;
> > > > > > > >>>>>>> +
> > > > > > > >>>>>>> +     parent = dev_get_parent(dev);
> > > > > > > >>>>>>
> > > > > > > >>>>>> What device type will the parent have if it is a eMMC hardware partition?
> > > > > > > >>>>>>
> > > > > > > >>>>>>> +     ops = blk_get_ops(parent);
> > > > > > > >>>>>>> +     if (!ops->read)
> > > > > > > >>>>>>> +             return -ENOSYS;
> > > > > > > >>>>>>> +
> > > > > > > >>>>>>> +     part = dev_get_uclass_plat(dev);
> > > > > > > >>>>>>
> > > > > > > >>>>>> You should check that we do not access the block device past the
> > > > > > > >>>>>> partition end:
> > > > > > > >>>>>
> > > > > > > >>>>> Yes, I will fix all of checks.
> > > > > > > >>>>>
> > > > > > > >>>>>> struct blk_desc *desc = dev_get_uclass_plat(parent);
> > > > > > > >>>>>> if ((start + blkcnt) * desc->blksz < part->gpt_part_info.blksz)
> > > > > > > >>>>>>          return -EFAULT.
> > > > > > > >>>>>>
> > > > > > > >>>>>>> +     start += part->gpt_part_info.start;
> > > > > > > >>>>
> > > > > > > >>>> A better solution is:
> > > > > > > >>>>           if (start >= part->gpt_part_info.size)
> > > > > > > >>>>                   return 0;
> > > > > > > >>>>
> > > > > > > >>>>           if ((start + blkcnt) > part->gpt_part_info.size)
> > > > > > > >>>>                   blkcnt = part->gpt_part_info.size - start;
> > > > > > > >>>>           start += part->gpt_part_info.start;
> > > > > > > >>>> instead of returning -EFAULT.
> > > > > > > >>>> (note that start and blkcnt are in "block".)
> > > > > > > >>>
> > > > > > > >>> What is your motivation to support an illegal access?
> > > > > > > >>>
> > > > > > > >>> We will implement the EFI_BLOCK_IO_PROTOCOL based on this function. The
> > > > > > > >>> ReadBlocks() and WriteBlocks() services must return
> > > > > > > >>> EFI_INVALID_PARAMETER if the read request contains LBAs that are not
> > > > > > > >>> valid.
> > > > > > > >>
> > > > > > > >> I interpreted that 'LBA' was the third parameter to ReadBlocks API,
> > > > > > > >> and that if the starting block is out of partition region, we should
> > > > > > > >> return an error (and if not, we still want to trim IO request to fit
> > > > > > > >> into partition size as other OS's API like linux does).
> > > > > > > >> Do you think it's incorrect?
> > > > > > > >
> > > > > > > > [..]
> > > > > > > >
> > > > > > > > Related to this patch I think that the partition type should be really
> > > > > > > > be a child of the media device:
> > > > > > > >
> > > > > > > > - MMC
> > > > > > > >      |- BLK
> > > > > > > >      |- PARTITION
> > > > > > > >         |- BLK
> > > > > > > >      |- PARTITION
> > > > > > > >         |- BLK
> > > > > > > >      |- PARTITION
> > > > > > > >         |- BLK
> > > > > > > >
> > > > > > > > It seems more natural to me that putting the partitions under the
> > > > > > > > top-level BLK device, so that BLK remains a 'terminal' device.
> > > > > > > >
> > > > > > > > The partition uclass is different from BLK, of course. It could
> > > > > > > > contain information about the partition such as its partition number
> > > > > > > > and UUID.
> > > > > > >
> > > > > > > Do you mean hardware partition here? Otherwise I would not know what BLK
> > > > > > > should model.
> > > > > >
> > > > > > I mean that (I think) we should not use BLK to model partitions. A BLK
> > > > > > should just be a block device.
> > > > > >
> > > > > > I don't see any difference between a partition and a hardware
> > > > > > partition. We presumably end up with a hierarchy though. Do we need a
> > > > > > HWPARTITION uclass so we can handle the hardware partitions
> > > > > > differently?
> > > > >
> > > > > Note that for eMMC devices, hardware partitions are different from
> > > > > partition-table partitions.  If you boot a system with an eMMC device up
> > > > > in Linux you typically get mmcblkN, mmcblkNboot0, mmcblkNboot1 and
> > > > > mmcblkNrpmb, each of which are hardware partitions.  It gets tricky in
> > > > > U-Boot in that you can access each of these with 'mmc dev N M' where M
> > > > > defaults to 0 and is the user partition (mmcblkN), 1/2 are boot0/boot1
> > > > > and 3 is the rpmb area.  The 'mmc' command also allows, when possible
> > > > > and implemented, configuring these partitions, again to the extent
> > > > > allowed, documented and implemented.
> > > >
> > > > Thank you. That is exactly what I tried to mention in my reply
> > > > at "part: call part_init() in blk_get_device_by_str() only for MMC"
> > >
> > > OK so it sounds like we agree that hwpartition and partition are
> > > different things.
> >
> > Yes.
> > Please note, IIUC, that
> > * MMC hw partitions on a device are mapped to one udevice, differentiating
> >   them by blk_desc->hwpart.
> > * Each NVME namespace on a device is mapped to a different udevice with
> >   a different blk_desc->devnum (and nvme_dev->ns_id).
> > * Each UFS partition (or which is, I suppose, equivalent to scsi LUN) on
> >   a device is mapped to a different udevice with a different blk_desc->devnum
> >   (and blk_desc->lun).
> >
> > So even though those type of devices have some kind of hardware partitions,
> > they are modelled differently in U-Boot.
> > (Obviously, I might be wrong here as I'm not quite familiar yet.)
> >
> > > >
> > > > ---8<---
> > > > # On the other hand, we have to explicitly switch "hw partitions"
> > > > # with blk_select_hwpart_devnum() on MMC devices even though we use
> > > > # the *same* udevice(blk_desc).
> > > > --->8---
> > > >
> > > > The problem with the current U-Boot driver model is that all of "mmcblkN,
> > > > mmcblkNboot0, mmcblkNboot1 and mmcblkNrpmb" will be linked to the same
> > > > udevice. We have to do "mmc dev N M" or call blk_select_hwpart[_devnum]()
> > > > to distinguish them.
> > >
> > > Here's our chance to rethink this. What should the device hierarchy be
> > > for an MMC device? I made a proposal further up the thread.
> >
> > Well,
> >
> > On Mon, Oct 11, 2021 at 11:41:02AM -0600, Simon Glass wrote:
> > > On Mon, 11 Oct 2021 at 10:53, Heinrich Schuchardt <xypron.glpk at gmx.de> wrote:
> >
> > > > >>> [..]
> >
> > > > >>> Related to this patch I think that the partition type should be really
> > > > >>> be a child of the media device:
> > > > >>>
> > > > >>> - MMC
> > > > >>>       |- BLK
> > > > >>>       |- PARTITION
> > > > >>>          |- BLK
> > > > >>>       |- PARTITION
> > > > >>>          |- BLK
> > > > >>>       |- PARTITION
> > > > >>>          |- BLK
> > > > >>>
> > > > >>> It seems more natural to me that putting the partitions under the
> > > > >>> top-level BLK device, so that BLK remains a 'terminal' device.
> > > > >>>
> > > > >>> The partition uclass is different from BLK, of course. It could
> > > > >>> contain information about the partition such as its partition number
> > > > >>> and UUID.
> >
> > Yeah, but there is always 1-to-1 mapping between a partition and
> > a block (for a partition), so I still wonder whether it makes sense
> > to model partitions in the way above.
> >
> > Alternatively, the following hierarchy also makes some sense.
> > (This is not what I have in my RFC though.)
> > - MMC
> > |- BLK (whole disk with part=0)
> > |- BLK (partition 1)
> > |- BLK (partition 2)
> > |- BLK (partition 3)
> >
> > or
> >
> > - MMC
> > |- DISK (whole disk)
> > ||- BLK (partition 0)
> > ||- BLK (partition 1)
> > ||- BLK (partition 2)
> > ||- BLK (partition 3)
> >
> > Here
> > MMC: provides read/write operations (via blk_ops)
> > DISK: holds a geometry of a whole disk and other info
> > BLK: partition info (+ blk_ops + geo) (part=0 means a whole disk)
> 
> Where does this leave hwpart? Are we giving up on that?

No, not at all :)
I'm thinking of dealing with hw partitions as independent BLK devices.
This is already true for NVME (namespaces) and UFS (LUNs) (not sure, though).
For MMC, struct blk_desc has a 'hwpart' field to indicate a hw partition, and
apparently it will be easy to have different BLK devices with
different hwpart values.
(Then we will have to add a probe function for hw partitions.)
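
A minimal sketch of what such a probe step could look like, as plain C
outside of the driver model. All names here (struct mock_blk_desc,
mock_mmc_bind_hwparts(), the is_block flag) are hypothetical and are not
existing U-Boot APIs. The numbering 0 = user, 1/2 = boot0/boot1, 3 = rpmb
follows the 'mmc dev N M' convention Tom described.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct blk_desc. */
struct mock_blk_desc {
	int devnum;		/* MMC device number ("N" in 'mmc dev N M') */
	int hwpart;		/* hw partition number ("M" in 'mmc dev N M') */
	const char *label;	/* illustrative name only */
	int is_block;		/* 0 for rpmb, which is not a block device */
};

#define MOCK_MMC_HWPARTS 4

/*
 * Fill 'out' with one descriptor per hw partition of MMC device 'devnum',
 * so that each hw partition gets its own BLK-like object.
 * Returns the number of descriptors written (at most 'max').
 */
int mock_mmc_bind_hwparts(int devnum, struct mock_blk_desc *out, size_t max)
{
	static const char *const labels[MOCK_MMC_HWPARTS] = {
		"user", "boot0", "boot1", "rpmb"
	};
	int hwpart;

	assert(out != NULL);

	for (hwpart = 0;
	     hwpart < MOCK_MMC_HWPARTS && (size_t)hwpart < max;
	     hwpart++) {
		out[hwpart].devnum = devnum;
		out[hwpart].hwpart = hwpart;
		out[hwpart].label = labels[hwpart];
		out[hwpart].is_block = (hwpart != 3);
	}

	return hwpart;
}
```

In a real implementation each entry would presumably become a separate
udevice bound under the MMC device, instead of an array element.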

> Both of these make some sense to me, although I'm not sure what the
> second one buys us. Can you explain that? Is it to deal with hwpart?

So,

- MMC (bus controller)
|- BLK (device/hw partition:user data)
||- DISK (partition 0 == a whole device)
||- DISK (partition 1)
||- DISK (partition 2)
||- DISK (partition 3)
|- BLK (device/hw partition:boot0)
||- DISK (partition 0 == a whole device)
|- BLK (device/hw partition:boot1)
||- DISK (partition 0 == a whole device)
|- BLK (device/hw partition:rpmb) -- this is NOT a 'block' device, though.
||- DISK (partition 0 == a whole device)

    MMC: provides access methods (via blk_ops)
    BLK: represents a physical device and holds a geometry of the whole
         device and other info
    DISK: block-access entities with partition info
          (part=0 means a whole disk)

    (MMC and BLK are as in the current implementation.)

To avoid confusion, UCLASS_PARTITION is renamed to UCLASS_DISK with
slightly modified semantics. The name can be seen as aligned with the
'disk/' directory for sw partitions.
Partition 0 is expected to behave in the same way as an existing BLK.

With this scheme, I assume that we should consistently use the new interfaces
    dev_read(struct udevice *dev, lbaint_t start,
                lbaint_t blkcnt, void *buffer);
    dev_write(struct udevice *dev, lbaint_t start,
                lbaint_t blkcnt, void *buffer);
for block-level operations with DISK devices.
                                ^^^^
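
To illustrate the idea, here is a rough, self-contained sketch of how
a read on a DISK device could translate a partition-relative block
number into an access on its parent BLK. The types and the function
(struct mock_blk, struct mock_disk, mock_disk_read()) are hypothetical
stand-ins backed by RAM, not the proposed dev_read() itself. The sketch
rejects out-of-range requests, as Heinrich asked for, rather than
trimming them. That choice is an assumption, since the point is still
under discussion.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t lbaint_t;	/* stand-in for U-Boot's lbaint_t */

#define MOCK_BLKSZ 512

/* Hypothetical BLK: a whole device, backed by RAM for this sketch. */
struct mock_blk {
	uint8_t *data;
	lbaint_t lba;		/* total number of blocks on the device */
};

/* Hypothetical DISK: a window of blocks onto its parent BLK. */
struct mock_disk {
	struct mock_blk *parent;
	lbaint_t start;		/* first block of the partition in the parent */
	lbaint_t size;		/* partition length in blocks */
};

/*
 * Read 'blkcnt' blocks starting at partition-relative block 'start'.
 * Returns the number of blocks read, or -1 if any requested block
 * lies outside the partition.
 */
long mock_disk_read(struct mock_disk *disk, lbaint_t start,
		    lbaint_t blkcnt, void *buffer)
{
	struct mock_blk *blk;

	assert(disk != NULL && disk->parent != NULL && buffer != NULL);
	blk = disk->parent;

	/* The partition itself must lie inside the parent device. */
	if (disk->start > blk->lba || disk->size > blk->lba - disk->start)
		return -1;

	/* Reject requests that are not fully inside the partition. */
	if (start >= disk->size || blkcnt > disk->size - start)
		return -1;

	/* Translate to an absolute block number on the parent. */
	start += disk->start;
	memcpy(buffer, blk->data + start * MOCK_BLKSZ, blkcnt * MOCK_BLKSZ);

	return (long)blkcnt;
}
```

A "partition 0" DISK is then simply one with start = 0 and size equal
to the parent's block count, so it behaves like the whole device.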

The legacy interfaces with blk_desc's in BLK devices:
    blk_dread(struct blk_desc *block_dev, lbaint_t start,
                lbaint_t blkcnt, void *buffer)
    blk_dwrite(struct blk_desc *block_dev, lbaint_t start,
                lbaint_t blkcnt, void *buffer)
are to be retained, at least, during the transition period
(mostly for existing filesystems and commands).

> The name 'disk' is pretty awful though, these days.

Think so?
Honestly, I'd like to rename BLK to DISK (or BLK_MEDIA) and
rename DISK to BLK to reflect their roles :)

> If we want to iterate through all the partition tables across all
> devices, we could do that with a partition uclass. We could support
> different types of partition (s/w and h/w) with the same device
> driver.
> 
> I think conceptually it is cleaner to have a partition uclass but I do
> agree that it corresponds 100% to BLK, so maybe there is little value
> in practice. But which device holds the partition table in its
> dev_get_priv()?

Do you think that some device should have "partition table" info
in its inner data structure of udevice?
BLK-DISK relationship can represent a partition table in some way,
and MMC-BLK can model hw partitioning.

Thanks,
-Takahiro Akashi

> >
> > > > >> Do you mean hardware partition here? Otherwise I would not know what BLK
> > > > >> should model.
> > > > >
> > > > > I mean that (I think) we should not use BLK to model partitions. A BLK
> > > > > should just be a block device.
> > > >
> > > > That is fine. But this implies that a software partition is the child of
> > > > a block partition and not the other way round. So the tree should like:
> > > >
> > > > MMC
> > > > |- BLK (user hardware partition)
> > > > ||- PARTITION 1 (software partition)
> > > > ||- PARTITION 2 (software partition)
> > > > |...
> > > > ||- PARTITION n (software partition)
> > > > |- BLK (rpmb hardware partition)
> > > > |- BLK (boot0 hardware partition)
> > > > |- BLK (boot1 hardware partition)
> > >
> > > I presume you meant to include a BLK device under each PARTITION?
> > >
> > > But anyway, I was more thinking of this:
> > >
> > > MMC
> > > | HWPARTITION rpmb
> > > || BLK whole rpmb
> > > || PARTITION 1
> > > ||| BLK
> > > || PARTITION 2
> > > ||| BLK
> > > || PARTITION 3
> >
> > Do we have any reason to model a RPMB partition as a block device?
> > For linux, at least, mmcblkrpmb looks to be a character device.
> >
> > > ||| BLK
> > > | HWPARTITION boot0
> > > || BLK
> > > (maybe have PARTITION in here too?
> >
> > I don't know how boot partitions are used on a production system.
> > It's unlikely to have partitions on them given the purpose of "boot"
> > partitions?
> 
> That's true. So likely they will not be used.
> 
> >
> > > | HWPARTITION boot1
> > > (maybe have PARTITION in here too?
> > > || BLK
> > >
> > > >
> > > > >
> > > > > I don't see any difference between a partition and a hardware
> > > > > partition. We presumably end up with a hierarchy though. Do we need a
> > > > > HWPARTITION uclass so we can handle the hardware partitions
> > > > > differently?
> > > >
> > > > Software partitions are defined and discovered via partition tables.
> > > > Hardware partitions are defined in a hardware specific way.
> > > >
> > > > All software partitions map to HD() device tree nodes in UEFI.
> > > > An MMC device maps to an eMMC() node
> > > > MMC hardware partitions are mapped to Ctrl() nodes by EDK II. We should
> > > > do the same in U-Boot.
> > > > An SD-card maps to an SD() node.
> > > > An NVMe namespace maps to a NVMe() node.
> > > > An SCSI LUN maps to a Scsi() node.
> > > > SCSI channels of multiple channel controllers are mapped to Ctrl() nodes.
> > >
> > > I'm not quite sure about the terminology here. I'm not even talking
> > > about UEFI, really, just how best to model this stuff in U-Boot.
> >
> > In UEFI world, each efi_disk has its own device path to identify the device.
> > For example, here is a text representation of device path for a scsi disk
> > partition:
> >   /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)/Scsi(0,0)/HD(1,GPT,ce86c5a7-b32a-488f-a346-88fe698e0edc,0x22,0x4c2a)
> >
> > which is set to be created from a corresponding udevice (more strictly
> > blk_desc + part).
> >
> > So the issue Heinrich raised here is a matter of implementation of
> > this conversion (software partitions, and SCSI channels?) as well as
> > a modeling for some device type on U-Boot, i.e. MMC hardware partitions.
> 
> Yes I see that. It's just that we should get our house in order first,
> since these discussions didn't happen when the EFI layer was written 6
> years ago. If we have a good model for partitions (not just block
> devices) in U-Boot then it should be easier to map EFI onto it.
> 
> Regards,
> Simon
> 
> 
> >
> > -Takahiro Akashi
> >
> > > In U-Boot, UCLASS_SCSI should be a SCSI controller, not a device,
> > > right? I'm a little worried it is not modelled correctly. After all,
> > > what is the parent of a SCSI device?
> > >
> > > >
> > > > The simple file protocol is only provided by HD() nodes and not by nodes
> > > > representing hardware partitions. If the whole hardware partition is
> > > > formatted as a file system you would still create a HD() node with
> > > > partition number 0.
> > >
> > > Regards,
> > > Simon
> > ---
> > >
> > > >
> > > > When it comes to UEFI, I hope we can currently support hw partitions
> > > > in this way:
> > > >   => efidebug add boot -b 1 FOO mmc 0.1 /foo.bin ""
> > > > (".1" is a key, I have never tried this syntax though.)
> > > >
> > > > But probably its device path won't be properly formatted
> > > > as expected as Heinrich suggested.
> > > >
> > > > -Takahiro Akashi
> > > >
> > > >
> > > > > In terms of modeling, this is akin to how if you use a USB card reader
> > > > > that supports 4 different form-factor cards, you can end up with 4
> > > > > different devices showing up in Linux (if you have one of the nice card
> > > > > readers that supports multiple cards at once).
> > > > >
> > > > > --
> > > > > Tom
> > > >
> > > >
> > >
> > > Regards,
> > > Simon

