[PATCH] RFC: tegra: xhci: Allocate from non-cached memory

Stephen Warren swarren at wwwdotorg.org
Mon Sep 14 17:50:51 CEST 2020


On 9/12/20 9:24 AM, Marek Vasut wrote:
> On 9/11/20 9:43 PM, twarren at nvidia.com wrote:
>> From: Tom Warren <twarren at nvidia.com>
>>
>> This fixes the XHCI driver on T210 boards (TX1, Nano). I was seeing
>> that Set_Address wasn't completing, returning with a Context Parameter
>> error. Examining the slot context, etc. showed that the correct info was
>> there in RAM. Once I set 'dcache off' globally, it started working.
>> This patch was created to force the TRB, etc. allocation to be in
>> non-cached memory, which resulted in XHCI working on Nano/TX1 w/o the
>> need for a global dcache disable. Thierry Reding pointed to a similar
>> fix he'd done for the rtl8169 driver.
>>
>> Sending this to the list for comment, as this should have affected other
>> XHCI implementations on other SoCs. Note that Tegra X1 (T210) has a
>> 64-byte cache line size (64-bit ARMv8), and I do see the
>> flush_cache/inval_cache ARM code being called via
>> xhci_flush_cache/xhci_inval_cache.
> 
> Is cache management on tegra210 broken ? I've seen the same non-cached
> workaround in the DWMAC ethernet driver.

I believe the issue with DWMAC and r8169 is related to the size/layout
of the descriptors; the Ethernet adapter descriptor size is smaller than
one cache line, and there isn't a way to tell the Ethernet HW to allow
gaps between them to align them with cache lines. Consequently, it's
impossible to perform cache operations that only apply to a single
descriptor, which in turn means that adjacent descriptors are
potentially corrupted when performing cache operations. Disabling the
cache is required in that case, unless the HW supports
linked lists of descriptors so that SW can lay them out at will. I don't
recall whether either HW supports this, and even if one or both do, the
drivers don't currently do this, so disabling the cache is still the
quickest way of making the HW work. I'd expect this issue to apply to
any ARMv8 system, since IIRC ARMv8 specifies the cache line size, doesn't it?
If not, the issue will at least apply to any system that uses a cache
line size at least as large as Tegra210's.
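
To illustrate the descriptor/cache-line arithmetic described above, here is a
minimal sketch in plain C. It is not code from any U-Boot driver: the helper
names (line_start, line_end, descs_touched) are hypothetical, and the 16-byte
and 32-byte descriptor sizes used in the note below are assumptions chosen only
for illustration. The 64-byte line size is the Tegra210 value quoted earlier in
the thread. The sketch assumes a ring that starts on a cache-line boundary and
a descriptor size that divides the line size evenly.

```c
#include <assert.h>
#include <stdint.h>

/* Tegra210 cache line size, as quoted in the thread. */
#define CACHE_LINE 64u

/* Round an offset down to the start of its cache line. */
static uintptr_t line_start(uintptr_t off)
{
	return off & ~(uintptr_t)(CACHE_LINE - 1);
}

/* Round an offset up to the next cache-line boundary. */
static uintptr_t line_end(uintptr_t off)
{
	return (off + CACHE_LINE - 1) & ~(uintptr_t)(CACHE_LINE - 1);
}

/*
 * Hypothetical helper: for a ring of tightly packed descriptors that
 * starts on a cache-line boundary, return how many descriptors are
 * covered by a flush/invalidate of the lines holding descriptor idx.
 * Assumes desc_size divides CACHE_LINE evenly.
 */
static unsigned int descs_touched(unsigned int idx, unsigned int desc_size)
{
	uintptr_t start, end;

	assert(desc_size > 0 && CACHE_LINE % desc_size == 0);

	start = line_start((uintptr_t)idx * desc_size);
	end = line_end((uintptr_t)(idx + 1) * desc_size);

	return (unsigned int)((end - start) / desc_size);
}
```

With an assumed 16-byte descriptor, descs_touched(1, 16) evaluates to 4, so a
cache operation aimed at one descriptor also covers its three neighbours in
the same line. With a 64-byte descriptor, descs_touched(1, 64) evaluates to 1
and each descriptor can be maintained on its own.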

For this XHCI case, there is some other problem, since the cache line
size matches the XHCI descriptor size.

