[U-Boot] [PATCH v5 02/19] usb: dwc2: Use separate input and output buffers

Stefan Bruens stefan.bruens at rwth-aachen.de
Sun Apr 2 21:34:13 UTC 2017


On Sunday, 2 April 2017 17:43:38 CEST Simon Glass wrote:
> Hi Stefan,
> 
> On 2 April 2017 at 07:10, Stefan Bruens <stefan.bruens at rwth-aachen.de> wrote:
> > On Sunday, 2 April 2017 05:01:41 CEST Marek Vasut wrote:
> >> On 04/02/2017 01:40 AM, Simon Glass wrote:
> >> > Hi Marek,
> >> > 
> >> > On 1 April 2017 at 14:15, Marek Vasut <marex at denx.de> wrote:
> >> >> On 04/01/2017 08:05 PM, Simon Glass wrote:
> >> >>> On Raspberry Pi 2 and 3 a problem was noticed when enabling driver model
> >> >>> for USB: the cache invalidate after an incoming transfer does not seem to
> >> >>> work correctly.
> >> >>> 
> >> >>> This may be a problem with the underlying caching implementation on armv7
> >> >>> and armv8 but this seems very unlikely. As a work-around, use separate
> >> >>> buffers for input and output. This ensures that the input buffer will not
> >> >>> hold dirty cache data.
> >> >> 
> >> >> What do you think of this patch:
> >> >> [U-Boot] usb: dwc2: invalidate the dcache before starting the DMA
> >> > 
> >> > Yes that matches what I did as a hack. I didn't realise that the DMA
> >> > would go through the cache. Thanks for the pointer.
> >> 
> >> DMA should not go through the cache. I have yet to review that patch,
> >> but IMO it's relevant to this problem you observe.
> > 
> > DMA transfers not going through the cache is probably the problem here:
> > 
> > Assume we have the aligned_buffer at address 0xdead0000
> > 
> > 1. The cpu writes to address 0xdead0002. This is fine, as it is the current
> > owner of the address. The cacheline is marked dirty.
> > 2. The cpu no longer needs the corresponding address range, and it is
> > reallocated (i.e. freed and then allocated from dwc2) or reused (i.e.
> > formerly out buffer, now in buffer).
> > 3. The CPU starts the DMA transfer
> > 4. The DMA transfer writes to e.g. 0xdead0000-0xdead0200 in memory.
> > 5. The CPU fetches an address aliasing with 0xdead0000. The dirty cache
> > line is evicted, and the 0xdead0000-0xdead0040 memory contents are
> > overwritten.
> 
> This is the part I don't understand. This should be an invalidate, not
> a clean and invalidate, so there should be no memory write.
> 
> Also if the CPU fetches from cached 0xdead0000 without an invalidate,
> it will not cause a cache clean. It will simply read the data from the
> cache and ignore what the DMA wrote.

The CPU does not fetch from 0xdead0000, but from an address *aliasing* with 
0xdead0000. As the 0xdead0000 line is still *dirty* (we have neither flushed 
it, which clears the dirty bit, nor invalidated it, which implicitly clears 
dirty for the address), the cache controller has to write out the 0xdead0000 
cache line to memory before it can reuse that cache slot.
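
The sequence can be reproduced with a toy model. The following is only a
minimal sketch in plain C, not the dwc2 driver or U-Boot's cache code: it
assumes a tiny direct-mapped write-back cache, and all names (cpu_write,
dma_write, invalidate, run_scenario) and sizes are made up for illustration.
Buffer address 0 stands in for 0xdead0000.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 64u   /* 0x40 bytes, as in the example above */
#define NUM_LINES 4u    /* hypothetical tiny direct-mapped cache */
#define MEM_SIZE  1024u

struct cache_line {
	int valid, dirty;
	uint32_t base;              /* memory address this line caches */
	uint8_t data[LINE_SIZE];
};

static uint8_t mem[MEM_SIZE];
static struct cache_line cache[NUM_LINES];

static struct cache_line *slot_for(uint32_t addr)
{
	return &cache[(addr / LINE_SIZE) % NUM_LINES];
}

/* Bring the line for addr into the cache; a dirty victim is written back. */
static struct cache_line *fill(uint32_t addr)
{
	uint32_t base = addr - addr % LINE_SIZE;
	struct cache_line *l = slot_for(addr);

	assert(addr < MEM_SIZE);
	if (l->valid && l->base == base)
		return l;
	if (l->valid && l->dirty)
		memcpy(&mem[l->base], l->data, LINE_SIZE); /* eviction write-back */
	memcpy(l->data, &mem[base], LINE_SIZE);
	l->valid = 1;
	l->dirty = 0;
	l->base = base;
	return l;
}

static void cpu_write(uint32_t addr, uint8_t v)
{
	struct cache_line *l = fill(addr);

	l->data[addr % LINE_SIZE] = v;
	l->dirty = 1;
}

static uint8_t cpu_read(uint32_t addr)
{
	return fill(addr)->data[addr % LINE_SIZE];
}

/* Invalidate: drop the line without writing it back. */
static void invalidate(uint32_t addr)
{
	struct cache_line *l = slot_for(addr);

	if (l->valid && l->base == addr - addr % LINE_SIZE)
		l->valid = l->dirty = 0;
}

/* DMA bypasses the cache and writes straight to memory. */
static void dma_write(uint32_t addr, uint8_t v, uint32_t len)
{
	assert(addr + len <= MEM_SIZE);
	memset(&mem[addr], v, len);
}

/* Returns the byte that ends up in memory at buffer offset 2. */
static uint8_t run_scenario(int invalidate_before_dma)
{
	const uint32_t buf = 0;                             /* "0xdead0000" */
	const uint32_t alias = buf + NUM_LINES * LINE_SIZE; /* same cache slot */

	memset(mem, 0, sizeof(mem));
	memset(cache, 0, sizeof(cache));

	cpu_write(buf + 2, 0xAA);       /* step 1: line becomes dirty */
	if (invalidate_before_dma)
		invalidate(buf);        /* discard the stale dirty line */
	dma_write(buf, 0x55, 0x200);    /* step 4: device writes to memory */
	(void)cpu_read(alias);          /* step 5: aliasing access evicts slot */
	return mem[buf + 2];
}
```

Under these assumptions, run_scenario(0) leaves 0xAA at buffer offset 2, i.e.
the evicted dirty line has overwritten the DMA data, while run_scenario(1)
leaves the 0x55 written by the DMA, because the invalidate before the transfer
discarded the dirty line.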

> On armv8 we appear not to support invalidate in the code, so it makes
> sense for rpi_3.

> But for rpi_2 which seems to do a proper invalidate, I still don't see
> the problem.

Which part of the code is different between rpi2 and rpi3? The dwc2 code is 
identical, is the memory invalidated in some other place?
 
> > Obviously, the dirty cache line from (1.) has to be cleared at the
> > beginning of (3.), as Eddie's patch does.
> 
> But I still don't understand why we have to clean instead of just
> invalidate?

The patch by Eddie Cai just does an invalidate_dcache_range on the transfer 
buffer, nothing else. Where do you see a "clean" (whatever that refers to)?

Kind regards,

Stefan

-- 
Stefan Brüns  /  Bergstraße 21  /  52062 Aachen
home: +49 241 53809034     mobile: +49 151 50412019
work: +49 2405 49936-424

