32-bit DMA limit for devices (and drivers)

Mark Kettenis mark.kettenis at xs4all.nl
Fri Apr 30 14:02:52 CEST 2021

> Date: Fri, 30 Apr 2021 12:21:21 +0100
> From: Andre Przywara <andre.przywara at arm.com>
> Hi,
> We now see the first Allwinner devices [1] having DRAM located above
> 4GB in address space (4GB DRAM starting at 1GB). After one fix[2]
> this works somewhat fine, but the sun8i-emac network device is still
> limited to 32-bit DMA addresses. With U-Boot relocating itself (plus
> stack and heap) to the end of DRAM, it now runs completely beyond 4GB
> on those machines, so buffers no longer get pure 32-bit addresses.
> In Linux we handle this easily by just keeping the default DMA
> mask at 32 bits, and letting the DMA framework deal with the nasty
> details.
> I was wondering how this should be handled in U-Boot? The
> straightforward solution would be:
> - Let the driver allocate the RX and TX buffers separately, placing them
>   below 4GB in the address space (using lmb_reserve(), I guess?)
> - Use those RX buffers and hand the addresses back to the upper layers.
> - We already copy TX packets, so this would also be covered in this
>   situation. Other drivers might need to introduce copying.

What you describe here is called a bounce buffer approach.  I believe
Linux developers also refer to this as swiotlb.
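
To make the bounce buffer idea concrete, here is a minimal sketch in
plain C. It is not taken from U-Boot or from the sun8i-emac driver;
struct bounce_buf, dma_reachable(), tx_map() and rx_unmap() are
hypothetical names, and the bus addresses are passed in explicitly so
that the logic does not depend on where the program actually runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the device can only address [0, 4 GiB). */
#define DMA_LIMIT (1ULL << 32)

/* Hypothetical descriptor for a buffer known to sit below 4 GiB. */
struct bounce_buf {
	uint8_t  *cpu_addr;	/* CPU pointer to the low buffer */
	uint64_t  bus_addr;	/* address handed to the device */
	size_t    size;		/* capacity of the low buffer */
};

/* Non-zero if [addr, addr + len) is fully reachable with 32-bit DMA. */
static int dma_reachable(uint64_t addr, size_t len)
{
	return len <= DMA_LIMIT && addr <= DMA_LIMIT - len;
}

/*
 * TX path: return the address to program into the device. If the
 * caller's data is out of reach, copy it into the bounce buffer first.
 */
static uint64_t tx_map(struct bounce_buf *bb, const void *data,
		       uint64_t data_bus, size_t len)
{
	if (dma_reachable(data_bus, len))
		return data_bus;

	assert(len <= bb->size);
	memcpy(bb->cpu_addr, data, len);
	return bb->bus_addr;
}

/*
 * RX path: the device wrote into the bounce buffer, so copy the data
 * back to the caller's buffer if that buffer was out of reach.
 */
static void rx_unmap(const struct bounce_buf *bb, void *dst,
		     uint64_t dst_bus, size_t len)
{
	if (dma_reachable(dst_bus, len))
		return;

	assert(len <= bb->size);
	memcpy(dst, bb->cpu_addr, len);
}
```

A real driver would also need cache maintenance around the copies,
which this sketch leaves out.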

> This sounds like a common problem, so I was wondering if there is a
> more generic solution to this? Maybe there are already platforms or
> devices affected? Or should the whole heap and stack be moved below 4GB
> (if this is easily possible)?
> In our case we make the buffers part of our priv struct, so should
> there be an option to let the priv_auto allocation come from below 4GB?
> Grateful for any input on this!

I looked into this a bit when I was trying to figure out what to do on
Apple M1 systems where I have a somewhat related issue.  These systems
have an IOMMU that can't be bypassed.  Since I don't want to add IOMMU
infrastructure to U-Boot, I set up the IOMMU to map a fixed block of
physical memory and make sure that all allocations of memory come from
that block of memory.  In this case this is fairly easy to achieve.
U-Boot allocates memory from the top of usable memory, so as long as I
let the IOMMU map that high memory, things work.  U-Boot doesn't need
a lot of memory, so a block of 512MB is more than sufficient.

In your case this means that as long as you set the top of usable
memory to an address < 4G, U-Boot itself should be fine and no bounce
buffers are needed.  You have to make sure the addresses in the U-Boot
environment for loading things like the kernel and the FDT are set to
an address < 4G as well.
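
As a rough sketch of that clamping, assuming DRAM starts below 4G:
clamp_ram_top() is a hypothetical helper written for this mail, not an
existing U-Boot function. In U-Boot the natural place for such logic
would be a board-level override of board_get_usable_ram_top(), whose
exact signature depends on the U-Boot version.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: devices can only DMA to addresses below 4 GiB. */
#define DMA_LIMIT (1ULL << 32)

/*
 * Hypothetical helper: return the top of usable memory, i.e. the end
 * of DRAM or the 4 GiB boundary, whichever is lower.
 */
static uint64_t clamp_ram_top(uint64_t ram_base, uint64_t ram_size)
{
	uint64_t ram_end = ram_base + ram_size;

	/* Clamping cannot help if DRAM starts at or above the limit. */
	assert(ram_base < DMA_LIMIT);

	return ram_end < DMA_LIMIT ? ram_end : DMA_LIMIT;
}
```

With 4GB of DRAM starting at 1GB, as in your case, this gives a top of
usable memory of exactly 4G, so everything U-Boot allocates by working
down from there stays reachable by a 32-bit DMA master.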

For EFI things are different though.  You want to expose all physical
memory in the EFI memory map.  This means that an EFI application
(such as an OS loader) may pick memory > 4G and use it to do I/O.  For
this purpose U-Boot already implements bounce buffers.  See the

Hope that helps!
