[U-Boot] [PATCH v5 02/19] usb: dwc2: Use separate input and output buffers

Simon Glass sjg at chromium.org
Sun Apr 2 15:43:38 UTC 2017


Hi Stefan,

On 2 April 2017 at 07:10, Stefan Bruens <stefan.bruens at rwth-aachen.de> wrote:
> On Sonntag, 2. April 2017 05:01:41 CEST Marek Vasut wrote:
>> On 04/02/2017 01:40 AM, Simon Glass wrote:
>> > Hi Marek,
>> >
>> > On 1 April 2017 at 14:15, Marek Vasut <marex at denx.de> wrote:
>> >> On 04/01/2017 08:05 PM, Simon Glass wrote:
>> >>> On Raspberry Pi 2 and 3 a problem was noticed when enabling driver model
>> >>> for USB: the cache invalidate after an incoming transfer does not seem
>> >>> to work correctly.
>> >>>
>> >>> This may be a problem with the underlying caching implementation on
>> >>> armv7 and armv8 but this seems very unlikely. As a work-around, use
>> >>> separate buffers for input and output. This ensures that the input
>> >>> buffer will not hold dirty cache data.
>> >>
>> >> What do you think of this patch:
>> >> [U-Boot] usb: dwc2: invalidate the dcache before starting the DMA
>> >
>> > Yes that matches what I did as a hack. I didn't realise that the DMA
>> > would go through the cache. Thanks for the pointer.
>>
>> DMA should not go through the cache. I have yet to review that patch,
>> but IMO it's relevant to this problem you observe.
>
> DMA transfers not going through the cache is probably the problem here:
>
> Assume we have the aligned_buffer at address 0xdead0000
>
> 1. The cpu writes to address 0xdead0002. This is fine, as it is the current
> owner of the address. The cacheline is marked dirty.
> 2. The cpu no longer needs the corresponding address range, and it is
> reallocated (i.e. freed and then allocated from dwc2) or reused (i.e. formerly
> out buffer, now in buffer).
> 3. The CPU starts the DMA transfer
> 4. The DMA transfer writes to e.g. 0xdead0000-0xdead0200 in memory.
> 5. The CPU fetches an address aliasing with 0xdead0000. The dirty cache line
> is evicted, and the 0xdead0000-0xdead0040 memory contents are overwritten.

This is the part I don't understand. This should be an invalidate, not
a clean and invalidate, so there should be no memory write.

Also if the CPU fetches from cached 0xdead0000 without an invalidate,
it will not cause a cache clean. It will simply read the data from the
cache and ignore what the DMA wrote.
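
To make the cases concrete, here is a toy model of a single write-back
cache line. It is only a sketch: it is not U-Boot code and not how the
ARM cache hardware is implemented, and every name in it (toy_cache,
run_scenario, the scenario values) is made up for illustration. It
assumes that an eviction writes a dirty line back to memory and that an
invalidate drops the line without writing it back.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 64

/* Toy model: one write-back cache line in front of one line of memory. */
struct toy_cache {
	uint8_t mem[LINE_SIZE];		/* backing memory, as the DMA engine sees it */
	uint8_t line[LINE_SIZE];	/* cached copy, as the CPU sees it */
	int valid;
	int dirty;
};

/* Load the line from memory if it is not already cached. */
static void cpu_fill(struct toy_cache *c)
{
	if (!c->valid) {
		memcpy(c->line, c->mem, LINE_SIZE);
		c->valid = 1;
		c->dirty = 0;
	}
}

static void cpu_write(struct toy_cache *c, unsigned int off, uint8_t val)
{
	assert(off < LINE_SIZE);
	cpu_fill(c);
	c->line[off] = val;
	c->dirty = 1;
}

static uint8_t cpu_read(struct toy_cache *c, unsigned int off)
{
	assert(off < LINE_SIZE);
	cpu_fill(c);
	return c->line[off];
}

/* DMA bypasses the cache and writes memory directly. */
static void dma_write(struct toy_cache *c, uint8_t val)
{
	memset(c->mem, val, LINE_SIZE);
}

/* Invalidate: drop the line without writing it back. */
static void invalidate(struct toy_cache *c)
{
	c->valid = 0;
	c->dirty = 0;
}

/* Eviction: a dirty line is written back to memory, then dropped. */
static void evict(struct toy_cache *c)
{
	if (c->valid && c->dirty)
		memcpy(c->mem, c->line, LINE_SIZE);
	c->valid = 0;
	c->dirty = 0;
}

enum scenario {
	READ_WITHOUT_INVALIDATE,	/* CPU reads the stale cached data */
	INVALIDATE_AFTER_DMA,		/* no eviction happens in between */
	EVICT_THEN_INVALIDATE,		/* dirty line evicted after the DMA */
	INVALIDATE_BEFORE_DMA,		/* dirty line dropped before the DMA */
};

/* Returns what the CPU reads at offset 2 after a DMA that writes 0xAA. */
static uint8_t run_scenario(enum scenario s)
{
	struct toy_cache c;

	memset(&c, 0, sizeof(c));
	cpu_write(&c, 2, 0x11);		/* the line is now dirty */
	if (s == INVALIDATE_BEFORE_DMA)
		invalidate(&c);
	dma_write(&c, 0xAA);
	if (s == EVICT_THEN_INVALIDATE || s == INVALIDATE_BEFORE_DMA)
		evict(&c);		/* e.g. caused by an aliasing fetch */
	if (s != READ_WITHOUT_INVALIDATE)
		invalidate(&c);
	return cpu_read(&c, 2);
}
```

In this model READ_WITHOUT_INVALIDATE and EVICT_THEN_INVALIDATE both
return the old CPU value 0x11, while INVALIDATE_AFTER_DMA and
INVALIDATE_BEFORE_DMA return the DMA data 0xAA. The invalidate itself
never writes to memory. Only an eviction of a line that is still dirty
does, and the invalidate after the DMA comes too late to prevent that.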

On armv8 we appear not to support invalidate in the code, so it makes
sense for rpi_3.

But for rpi_2 which seems to do a proper invalidate, I still don't see
the problem.

>
> Obviously, the dirty cache line from (1.) has to be cleared at the beginning
> of (3.), as Eddy's patch does.

But I still don't understand why we have to clean instead of just invalidate?

Regards,
Simon

