[PATCH 0/4] Fix IOVA allocation in Apple dart iommu after global LMB mem map changes

Sughosh Ganu sughosh.ganu at linaro.org
Tue Nov 5 20:13:01 CET 2024


On Sun, 3 Nov 2024 at 23:06, Janne Grunau <j at jannau.net> wrote:
>
> On Sun, Nov 03, 2024 at 07:53:36PM +0530, Sughosh Ganu wrote:
> > On Sat, 2 Nov 2024 at 16:00, Mark Kettenis <mark.kettenis at xs4all.nl> wrote:
> > >
> > > > From: Sughosh Ganu <sughosh.ganu at linaro.org>
> > > > Date: Fri, 1 Nov 2024 17:17:22 +0530
> > > >
> > > > On Fri, 1 Nov 2024 at 15:47, Janne Grunau <j at jannau.net> wrote:
> > > > >
> > > > > The changes in "Make LMB memory map global and persistent" [1] break
> > > > > mapping DMA memory in the USB xHCI driver when using the apple_dart
> > > > > iommu present on Apple silicon systems.
> > > > >
> > > > > The IOVA space used by the u-boot driver (low 4GB) and physical memory
> > > > > do not overlap. The physical memory on this systems starts depending on
> > > > > the SoC either at 0x10_0000_0000 or 0x100_0000_0000. It make no sense to
> > > > > manage these distinct regions in a single LMB map. In addition every
> > > > > device has its own iommu and IO address space so sharing a single memory
> > > > > map between all iommu instances is not necessary.
> > > > >
> > > > > To fix this issue restore the used subset (add, alloc and free) of the
> > > > > previous pointer based LMB interface with "io_" as prefix.
> > > > >
> > > > > To ensure that low level lmb functions do not use the global LMB
> > > > > variable reorder lib/lmb.c so that the variable is not visible.
> > > > >
> > > > > Tested with patches from my "Fix device removal order for Apple dart
> > > > > iommu" series [2] to fix a separate issue.
> > > > >
> > > > > The cosmetic commit has two checkpatch warnings in existing code which I
> > > > > ignored.
> > > > >
> > > > > [1] https://lore.kernel.org/u-boot/20240826115940.3233167-1-sughosh.ganu@linaro.org/
> > > > > [2] https://lore.kernel.org/u-boot/20241031-iommu_apple_dart_ordering-v1-0-8a6877946d6b@jannau.net/
> > > > >
> > > > > Signed-off-by: Janne Grunau <j at jannau.net>
> > > > > ---
> > > > > Janne Grunau (4):
> > > > >       lmb: Do not use global LMB variable in _lmb_free()
> > > > >       lmb: cosmetic: reorder functions and global LMB variable
> > > > >       lmb: Add basic io_lmb functionality
> > > > >       iommu: apple: Manage IOVA separately from global LMB mem map
> > > >
> > > > Thanks for these changes. I am currently on leave, but will take a
> > > > look at this in detail next week. But on a cursory glance, I have to
> > > > say that I am not entirely satisfied with this solution as well. With
> > > > these patches, we now have a disparity in terms on how we view the LMB
> > > > memory map -- we have a global map for RAM address range, but a per
> > > > device one for IOVA's. I understand that IOVA's are supposed to be
> > > > per-device,
> > >
> > > The IVOA's are per-stream.  Multiple devices might share a single
> > > stream.
> > >
> > > > but then I think that perhaps LMB is not the right way to
> > > > solve this. This was fine earlier, when the entire LMB implementation
> > > > in U-Boot was local, but that is not the case now. I wonder if there
> > > > can be a different, non-LMB way to implement this.
> > >
> > > Of course there can be a different implementation.  But why add a
> > > separate implementation for what is essentially the same problem:
> > > allocate blocks out of an address space.
> >
> > My primary concern here is to prevent setting a precedent that it is
> > okay to have a local, per caller lmb instance.
>
> I think the issue is thinking this is per caller. It is not (primarily)
> by caller. It is one LMB instance per managed independent address space
> block. Apple silicon SoCs have multiple independent iommus. Each iommu
> can handle multiple address spaces (streams, only partially implemented
> in the u-boot driver.
>
> > If it is allowed for
> > one caller, it becomes difficult to say no to someone proposing a
> > similar approach later. In any case, based on how the allocation logic
> > works, with all the addresses being allocated from the IOMMU driver's
> > map function callback, why can there not be a single instance of an
> > iova lmb structure. That would at least ensure some kind of parity on
> > how the lmb maps are maintained.
>
> There are multiple fully independent IOVA spaces. I doesn't make sense
> to allocate them from a single pool.
>
> > > Before you changed the goal posts, lmb did exactly that.  The only
> > > problem was that there were multiple lmb instances covering the same
> > > "main memory" address space.
> >
> > I had provided an explanation of the situation that existed prior to
> > these changes in a writeup [1]. And that writeup had clearly provided
> > the two possible solutions to fix the issue with EFI allocated memory
> > working with LMB. And it was decided through discussion on that
> > writeup as well as on the mailing list that this was the approach to
> > be taken.
>
> That document missed due to unfortunate circumstances one special case
> in the LMB usage. Consolidating all struct lmb referring main memory
> into a single opaque object is useful. Please let's move on to fix the
> single usage which broke due to the change.
>
> From my point of view it's clear that "io_lmb*" is separate from the
> main memory lmb.

Yes, that is my view as well. Which is why I was proposing a different
mechanism for the IOVA allocations.

 It manages the IO virtual address space of devices and
> it is fine that there as many (io_)lmb instances as there are IO virtual
> address spaces. These do not have to be managed via struct lmb and
> looking at the minimal io_lmb* API a simpler implementation would work.

Okay. Although I wonder, given your explanation of multiple, per
device IO VA spaces, how would these spaces be identified, if not
through an instance of the lmb structure.

> An alternative implementation would be duplicated code and reusing the
> existing tested lmb code is desireable.

Yes, I agree.

> If we want a stronger distinction between lmb and io_lmb we could hide
> struct lmb from io_lmb.

Okay, but I am not sure how you would be able to achieve that. Are you
proposing using the region lists directly? If so, I would say that if
we are using the LMB code for the allocations, we might as well use
the structure.

-sughosh

>
> Janne


More information about the U-Boot mailing list