[U-Boot] [PATCH v5 02/19] usb: dwc2: Use separate input and output buffers
Stefan Bruens
stefan.bruens at rwth-aachen.de
Sun Apr 2 21:34:13 UTC 2017
On Sonntag, 2. April 2017 17:43:38 CEST Simon Glass wrote:
> Hi Stefan,
>
> On 2 April 2017 at 07:10, Stefan Bruens <stefan.bruens at rwth-aachen.de>
wrote:
> > On Sonntag, 2. April 2017 05:01:41 CEST Marek Vasut wrote:
> >> On 04/02/2017 01:40 AM, Simon Glass wrote:
> >> > Hi Marek,
> >> >
> >> > On 1 April 2017 at 14:15, Marek Vasut <marex at denx.de> wrote:
> >> >> On 04/01/2017 08:05 PM, Simon Glass wrote:
> >> >>> On Raspberry Pi 2 and 3 a problem was noticed when enabling driver
> >> >>> model
> >> >>> for USB: the cache invalidate after an incoming transfer does not
> >> >>> seem
> >> >>> to
> >> >>> work correctly.
> >> >>>
> >> >>> This may be a problem with the underlying caching implementation on
> >> >>> armv7
> >> >>> and armv8 but this seems very unlikely. As a work-around, use
> >> >>> separate
> >> >>> buffers for input and output. This ensures that the input buffer will
> >> >>> not
> >> >>> hold dirty cache data.
> >> >>
> >> >> What do you think of this patch:
> >> >> [U-Boot] usb: dwc2: invalidate the dcache before starting the DMA
> >> >
> >> > Yes that matches what I did as a hack. I didn't realise that the DMA
> >> > would go through the cache. Thanks for the pointer.
> >>
> >> DMA should not go through the cache. I have yet to review that patch,
> >> but IMO it's relevant to this problem you observe.
> >
> > DMA transfers not going through the cache is probably the problem here:
> >
> > Assume we have the aligned_buffer at address 0xdead0000
> >
> > 1. The cpu writes to address 0xdead0002. This is fine, as it is the
> > current
> > owner of the address. The cacheline is marked dirty.
> > 2. The cpu no longer needs the corresponding address range, and it is
> > reallocated (i.e. freed and then allocated from dwc2) or reused (i.e.
> > formerly out buffer, now in buffer).
> > 3. The CPU starts the DMA transfer
> > 4. The DMA transfer writes to e.g. 0xdead0000-0xdead0200 in memory.
> > 5. The CPU fetches an address aliasing with 0xdead0000. The dirty cache
> > line is evicted, and the 0xdead0000-0xdead0040 memory contents are
> > overwritten.
> This is the part I don't understand. This should be an invalidate, not
> a clean and invalidate, so there should be not memory write.
>
> Also if the CPU fetches from cached 0xdead0000 without an invalidate,
> it will not cause a cash clean. It will simple read the data from the
> cache and ignore what the DMA wrote.
The CPU does not fetch 0xdead0000, but from an address *aliasing* with
0xdead000. As 0xdead0000 is *dirty* (we have neither flushed (clears dirty
bit) or invalidated (implicitly clears dirty for the address)), the cache
controller has to write out the 0xdead0000 cache line to memory.
> On armv8 we appear not to suppose invalidate in the code, so it makes
> sense for rpi_3.
> But for rpi_2 which seems to do a proper invalidate, I still don't see
> the problem.
Which part of the code is different between rpi2 and rpi3? The dwc2 code is
identical, is the memory invalidated in some other place?
> > Obviously, the dirty cache line from (1.) has to be cleared at the
> > beginning of (3.), as Eddys patch does.
>
> But I still don't understand why we have to clean instead of just
> invalidate?
The patch by Eddie Cai just does an invalidate_dcache_range on the transfer
buffer, nothing else. Where do you see a "clean" (whatever that refers to)?
Kind regards,
Stefan
--
Stefan Brüns / Bergstraße 21 / 52062 Aachen
home: +49 241 53809034 mobile: +49 151 50412019
work: +49 2405 49936-424
More information about the U-Boot
mailing list