[U-Boot] [RFC PATCH] usb: dwc2: handle bcm2835 phys->virt address translations
Stephen Warren
swarren at wwwdotorg.org
Tue Mar 17 18:29:16 CET 2015
On 03/17/2015 08:57 AM, popcorn mix wrote:
> On 17/03/15 03:04, Stephen Warren wrote:
>> It would be nice though if someone from the RPi Foundation could comment
>> on the exact effect of the upper bus address bits, and why 0xc would
>> work for RPi2 but 0x4 for the RPi 1. I wonder if the ARM cache status
>> (enabled, disabled) interacts with the GPU cache enable in any way, e.g.
>> burst vs. non-burst transactions on the bus or something? That's about
>> the only reason I can see for the RPi Foundation kernel working with 0x4
>> bus addresses on both chips, but U-Boot needing something different on
>> RPi2...
>>
>> Dom, for reference, see:
>> http://lists.denx.de/pipermail/u-boot/2015-March/207947.html
>> http://lists.denx.de/pipermail/u-boot/2015-March/thread.html#207947
Thanks for the great explanation. I'll have to bookmark/archive it:-)
> First, remember that 2835 is a large GPU with a small ARM attached. On
> some platforms the ARM is not even used.
> The GPU boots first and may wake the arm. The GPU is the centre of the
> universe, and the ARM has to fit in.
>
> Okay, I'll try to explain what goes on. Here are my definitions of some
> terms:
>
> bus address: a VideoCore/GPU address. The lower 30-bits define the 1G of
> addressable memory. The top two bits define the caching alias.
> physical address: An ARM side address given to the VC MMU. This is a 30
> bit address space.
>
> The GPU always uses bus addresses. GPU bus mastering peripherals (like
> DMA) use bus addresses. The ARM uses physical addresses.
>
> VC MMU: A coarse MMU used by the arm for accessing GPU memory. Each page
> is 16M and there are 64 pages. This maps 30-bits of physical address to
> 32-bits of bus address.
>
> The setup of VC MMU is handled by the GPU and by default the mapping is:
> 2835: first 32 pages map physical addresses 0x00000000-0x1fffffff to bus
> addresses 0x40000000-0x5fffffff. The next page maps physical addresses
> 0x20000000 to 0x20ffffff to bus addresses 0x7e000000 to 0x7effffff
>
> 2836: first 63 pages map physical addresses 0x00000000-0x3effffff to bus
> addresses 0xc0000000-0xfeffffff. The next page maps physical addresses
> 0x3f000000 to 0x3fffffff to bus addresses 0x7e000000 to 0x7effffff
OK, this explains why in U-Boot, we need to OR in 0x40000000 on bcm2835
and 0xc0000000 on bcm2836; that matches the VC MMU setup.
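
To make the default mapping concrete, here is a minimal sketch in C of the physical-to-bus translation described above. The function, enum, and macro names are hypothetical and are not taken from U-Boot or the kernel; only the address ranges come from the explanation quoted above. It assumes the GPU has left the VC MMU at its default setup.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical identifiers; the ranges are the defaults quoted above. */
enum soc { SOC_BCM2835, SOC_BCM2836 };

#define PERIPH_BUS_BASE 0x7e000000u
#define PERIPH_SIZE     0x01000000u	/* one 16M VC MMU page */

/* Translate an ARM physical address to a VideoCore bus address. */
static uint32_t vc_mmu_default_phys_to_bus(enum soc soc, uint32_t phys)
{
	uint32_t sdram_alias = (soc == SOC_BCM2835) ? 0x40000000u : 0xc0000000u;
	uint32_t periph_phys = (soc == SOC_BCM2835) ? 0x20000000u : 0x3f000000u;

	/* The ARM side only has a 30-bit physical address space. */
	assert(phys < 0x40000000u);

	/* Peripheral page: same bus window on both SoCs. */
	if (phys >= periph_phys && phys - periph_phys < PERIPH_SIZE)
		return PERIPH_BUS_BASE + (phys - periph_phys);

	/* Anything else must be SDRAM below the peripheral page; other
	 * ranges are not covered by the description above. */
	assert(phys < periph_phys);
	return sdram_alias | phys;
}
```

For example, physical 0x00100000 would become bus 0x40100000 on bcm2835 and 0xc0100000 on bcm2836, while the peripheral page lands at 0x7e000000 on both.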
I guess we need to fix the U-Boot mailbox driver too, and many things in
the upstream RPi kernel.
I have two more questions:
1)
Do the RPi 1 and RPi 2 use different kernel binaries in the RPi
Foundation's images? I'd assumed there was a single unified binary which
supported both. The reason I ask is that I see:
> https://github.com/raspberrypi/linux/blob/rpi-3.18.y/arch/arm/mach-bcm2708/include/mach/memory.h#L38
> #ifdef CONFIG_BCM2708_NOL2CACHE
> #define _REAL_BUS_OFFSET UL(0xC0000000) /* don't use L1 or L2 caches */
> #else
> #define _REAL_BUS_OFFSET UL(0x40000000) /* use L2 cache */
> #endif
That's identical in the mach-bcm2709 version too. However,
arch/arm/mach-bcm270[89]/Kconfig's entry for that config option:
> config BCM2708_NOL2CACHE
> bool "Videocore L2 cache disable"
> depends on MACH_BCM2709
> default y
> help
> Do not allow ARM to use GPU's L2 cache. Requires disable_l2cache in config.txt.
has "default n" for the bcm2708 version and "default y" for the bcm2709
version. If I'd noticed that difference in default value, it would have
been a big clue that what I proposed in the U-Boot patch was correct!
Anyway, this implies that there are separate kernel binaries for the RPi
1 and RPi 2, since otherwise those default values wouldn't work.
2)
I assume the SDHCI controller (RPi SD card, CM eMMC) is affected by this
just as much; we need to use bus addresses not ARM physical addresses
when programming any DMA there?
Perhaps this would explain why I had issues with the eMMC on the CM (I
think only in the kernel though, whereas U-Boot may have been fine; I'll
have to check)
...
> So, on 2835 the ARM has a 16K L1 cache and no L2 cache. The GPU has a
> 128M L2 cache. The GPU's L2 cache is accessible from the ARM but it's
> not particularly close (i.e. not very fast).
> However mapping through the L2 allocating alias (0x4) was shown to be
> beneficial on 2835, so that is the alias we use.
>
> The situation is different on 2836. The ARM has a 32K L1 cache and a
> 512M integrated/fast L2 cache. Additionally going through the
> smaller/slower GPU L2 is bad for performance.
> So, we map through the SDRAM alias (0xc) and avoid the GPU L2 cache.
I assume 128M and 512M there should be 128K and 512K?
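
For reference, the bus address layout defined earlier (top two bits select the caching alias, lower 30 bits address memory) can be sketched as a few C helpers. All names here are hypothetical, and only the two alias values discussed in this thread (0x4 and 0xc) are given names; this is an illustration of the bit layout, not code from any driver.

```c
#include <assert.h>
#include <stdint.h>

#define BUS_ALIAS_SHIFT 30
#define BUS_OFFSET_MASK 0x3fffffffu

/* Top-nibble alias values mentioned in this thread (names are made up). */
#define BUS_ALIAS_GPU_L2 0x4u	/* goes through the GPU L2; used on 2835 */
#define BUS_ALIAS_SDRAM  0xcu	/* avoids the GPU L2; used on 2836 */

/* Return the alias as a top-nibble value: 0x0, 0x4, 0x8 or 0xc. */
static inline uint32_t bus_addr_alias_nibble(uint32_t bus)
{
	return (bus >> BUS_ALIAS_SHIFT) << 2;
}

/* Return the 30-bit offset within the 1G of addressable memory. */
static inline uint32_t bus_addr_offset(uint32_t bus)
{
	return bus & BUS_OFFSET_MASK;
}

/* Rebuild a bus address with a different caching alias. */
static inline uint32_t bus_addr_with_alias(uint32_t bus, uint32_t alias_nibble)
{
	assert((alias_nibble & ~0xcu) == 0);
	return bus_addr_offset(bus) | (alias_nibble << 28);
}
```

With these, bus_addr_with_alias(0x40001000, BUS_ALIAS_SDRAM) gives 0xc0001000, i.e. the same memory location seen through the alias that skips the GPU L2.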