[PATCH 0/4] Fix IOVA allocation in Apple dart iommu after global LMB mem map changes

Janne Grunau j at jannau.net
Wed Nov 6 08:13:37 CET 2024


On Wed, Nov 06, 2024 at 12:43:01AM +0530, Sughosh Ganu wrote:
> On Sun, 3 Nov 2024 at 23:06, Janne Grunau <j at jannau.net> wrote:
> >
> > On Sun, Nov 03, 2024 at 07:53:36PM +0530, Sughosh Ganu wrote:
> > > On Sat, 2 Nov 2024 at 16:00, Mark Kettenis <mark.kettenis at xs4all.nl> wrote:
> > > >
> > > > > From: Sughosh Ganu <sughosh.ganu at linaro.org>
> > > > > Date: Fri, 1 Nov 2024 17:17:22 +0530
> > > > >
> > > > > On Fri, 1 Nov 2024 at 15:47, Janne Grunau <j at jannau.net> wrote:
> > > > > >
> > > > > > The changes in "Make LMB memory map global and persistent" [1] break
> > > > > > mapping DMA memory in the USB xHCI driver when using the apple_dart
> > > > > > iommu present on Apple silicon systems.
> > > > > >
> > > > > > The IOVA space used by the u-boot driver (low 4GB) and physical memory
> > > > > > do not overlap. The physical memory on this systems starts depending on
> > > > > > the SoC either at 0x10_0000_0000 or 0x100_0000_0000. It make no sense to
> > > > > > manage these distinct regions in a single LMB map. In addition every
> > > > > > device has its own iommu and IO address space so sharing a single memory
> > > > > > map between all iommu instances is not necessary.
> > > > > >
> > > > > > To fix this issue restore the used subset (add, alloc and free) of the
> > > > > > previous pointer based LMB interface with "io_" as prefix.
> > > > > >
> > > > > > To ensure that low level lmb functions do not use the global LMB
> > > > > > variable reorder lib/lmb.c so that the variable is not visible.
> > > > > >
> > > > > > Tested with patches from my "Fix device removal order for Apple dart
> > > > > > iommu" series [2] to fix a separate issue.
> > > > > >
> > > > > > The cosmetic commit has two checkpatch warnings in existing code which I
> > > > > > ignored.
> > > > > >
> > > > > > [1] https://lore.kernel.org/u-boot/20240826115940.3233167-1-sughosh.ganu@linaro.org/
> > > > > > [2] https://lore.kernel.org/u-boot/20241031-iommu_apple_dart_ordering-v1-0-8a6877946d6b@jannau.net/
> > > > > >
> > > > > > Signed-off-by: Janne Grunau <j at jannau.net>
> > > > > > ---
> > > > > > Janne Grunau (4):
> > > > > >       lmb: Do not use global LMB variable in _lmb_free()
> > > > > >       lmb: cosmetic: reorder functions and global LMB variable
> > > > > >       lmb: Add basic io_lmb functionality
> > > > > >       iommu: apple: Manage IOVA separately from global LMB mem map
> > > > >
> > > > > Thanks for these changes. I am currently on leave, but will take a
> > > > > look at this in detail next week. But on a cursory glance, I have to
> > > > > say that I am not entirely satisfied with this solution as well. With
> > > > > these patches, we now have a disparity in terms on how we view the LMB
> > > > > memory map -- we have a global map for RAM address range, but a per
> > > > > device one for IOVA's. I understand that IOVA's are supposed to be
> > > > > per-device,
> > > >
> > > > The IVOA's are per-stream.  Multiple devices might share a single
> > > > stream.
> > > >
> > > > > but then I think that perhaps LMB is not the right way to
> > > > > solve this. This was fine earlier, when the entire LMB implementation
> > > > > in U-Boot was local, but that is not the case now. I wonder if there
> > > > > can be a different, non-LMB way to implement this.
> > > >
> > > > Of course there can be a different implementation.  But why add a
> > > > separate implementation for what is essentially the same problem:
> > > > allocate blocks out of an address space.
> > >
> > > My primary concern here is to prevent setting a precedent that it is
> > > okay to have a local, per caller lmb instance.
> >
> > I think the issue is thinking this is per caller. It is not (primarily)
> > by caller. It is one LMB instance per managed independent address space
> > block. Apple silicon SoCs have multiple independent iommus. Each iommu
> > can handle multiple address spaces (streams, only partially implemented
> > in the u-boot driver.
> >
> > > If it is allowed for
> > > one caller, it becomes difficult to say no to someone proposing a
> > > similar approach later. In any case, based on how the allocation logic
> > > works, with all the addresses being allocated from the IOMMU driver's
> > > map function callback, why can there not be a single instance of an
> > > iova lmb structure. That would at least ensure some kind of parity on
> > > how the lmb maps are maintained.
> >
> > There are multiple fully independent IOVA spaces. I doesn't make sense
> > to allocate them from a single pool.
> >
> > > > Before you changed the goal posts, lmb did exactly that.  The only
> > > > problem was that there were multiple lmb instances covering the same
> > > > "main memory" address space.
> > >
> > > I had provided an explanation of the situation that existed prior to
> > > these changes in a writeup [1]. And that writeup had clearly provided
> > > the two possible solutions to fix the issue with EFI allocated memory
> > > working with LMB. And it was decided through discussion on that
> > > writeup as well as on the mailing list that this was the approach to
> > > be taken.
> >
> > That document missed due to unfortunate circumstances one special case
> > in the LMB usage. Consolidating all struct lmb referring main memory
> > into a single opaque object is useful. Please let's move on to fix the
> > single usage which broke due to the change.
> >
> > From my point of view it's clear that "io_lmb*" is separate from the
> > main memory lmb.
> 
> Yes, that is my view as well. Which is why I was proposing a different
> mechanism for the IOVA allocations.

It's not clear to me what you mean with "different mechanism". My
understanding was that it would be completely separate from lmb but that
is at conflict with the statement of code reuse below.

> > It manages the IO virtual address space of devices and
> > it is fine that there as many (io_)lmb instances as there are IO virtual
> > address spaces. These do not have to be managed via struct lmb and
> > looking at the minimal io_lmb* API a simpler implementation would work.
> 
> Okay. Although I wonder, given your explanation of multiple, per
> device IO VA spaces, how would these spaces be identified, if not
> through an instance of the lmb structure.

If the u-boot driver would gain support for that, each IOVA space would
have its own lmb instance. On iommu side the IOVA spaces are called
stream and the driver would have one lmb structure for each stream.

> > An alternative implementation would be duplicated code and reusing the
> > existing tested lmb code is desireable.
> 
> Yes, I agree.
> 
> > If we want a stronger distinction between lmb and io_lmb we could hide
> > struct lmb from io_lmb.
> 
> Okay, but I am not sure how you would be able to achieve that. Are you
> proposing using the region lists directly? If so, I would say that if
> we are using the LMB code for the allocations, we might as well use
> the structure.

The most basic way would be a struct io_lmb which just embeds struct
lmb and the io_lmb_* API would use struct io_lmb pointers as arguments.
This would provide some type safety but I don't think it is necessary as
the lmb API doesn't use struct lmb pointers (with the exception of
lmb_push/lmb_pop);

best regards,
Janne


More information about the U-Boot mailing list