[PATCH] nvme: Invalidate dcache before submitting admin cmd

André Przywara andre.przywara at arm.com
Wed Jun 10 13:27:00 CEST 2020


On 09/06/2020 19:47, Michael Nazzareno Trimarchi wrote:

Hi,

that looks much better now, thanks! Some suggestion below.

> On Tue, Jun 9, 2020 at 6:13 AM Bin Meng <bmeng.cn at gmail.com> wrote:
>>
>> Hi André,
>>
>> On Mon, Jun 8, 2020 at 9:52 PM André Przywara <andre.przywara at arm.com> wrote:
>>>
>>> On 07/06/2020 12:22, Jagan Teki wrote:
>>>
>>> Hi,
>>>
>>> (CC'ing Mark)
>>>
>>> Without looking too deep, I think invalidating the cache might be the
>>> right thing to do, but the rationale, or at least the wording of it,
>>> seems somewhat flawed:
>>>
>>>> Some architecture like ARM Cortex A53, A72 would need
>>>
>>> Please don't mix the term "architecture" with actual implementations.
>>> And the problem is more general, I would say. This *might* be a case
>>> where the issue is due to speculation, and then I would expect it to
>>> only show up on an A72, for instance. I guess this is about NVMe on RK3399?
>>> Does U-Boot run on an A53 or an A72 core?
>>>
>>>> to invalidate dcache to sync the cache with the memory
>>>> contents before flushing the cache to memory.
>>>
>>> That is somewhat confusing. What do "sync" and "flush" mean here?
>>> AFAIK only "clean" and "invalidate" are clearly defined terms.
>>> The command buffer is cleaned to DRAM in nvme_submit_cmd(), so this is
>>> purely about the buffer for the *return* data, in which case I would
>>> expect this to be a pure invalidation problem.
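
To put that in U-Boot API terms (just a sketch; "buffer" and
"buffer_size" are made-up names for the return DMA buffer):

======================
/* command buffer: the CPU writes it, the device reads it -> clean it */
flush_dcache_range((ulong)cmd, (ulong)cmd + sizeof(struct nvme_command));

/* return buffer: the device writes it, the CPU reads it -> invalidate */
invalidate_dcache_range((ulong)buffer, (ulong)buffer + buffer_size);
======================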
>>>
>>> This issue looks like the case described on slide 30 and 31 here:
>>> https://elinux.org/images/6/6d/Rutland2.pdf
>>
>> Thanks for the slides. This is really helpful.
>>
>>>
>>>> The NVME here submitting the admin command using dma_addr
>>>> to the memory without prior cache invalidation. This causing
>>>> dma_addr is pointing to an invalid location in the memory
>>>
>>> This does not sound right: dma_addr is always pointing to the right
>>> location in memory.
>>> The actual reason this fixes something might be that (some) cache lines
>>> of the DMA buffer were dirty and were evicted *before* the existing
>>> invalidation, but *after* the DMA transfer. That sounds unlikely in a
>>> U-Boot context, though, but would match the case described in Mark's slides.
>>>
>>> So there might be more to this. Are we missing a barrier here? Should we
>>> use coherent buffers (memory mapped as un-cached) for the return DMA in
>>> the first place (but I don't think U-Boot provides those)?
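
To spell the race out as code (buf and len are made up; the maintenance
calls are U-Boot's generic cache helpers):

======================
/* current ordering */
buf = memalign(ARCH_DMA_MINALIGN, len);     /* lines may still be dirty */
ret = nvme_submit_admin_cmd(dev, &c, NULL); /* device fills buf in DRAM */
/*
 * If a dirty line of buf gets evicted while (or after) the device
 * writes DRAM, it lands on top of the fresh DMA data ...
 */
invalidate_dcache_range((ulong)buf, (ulong)buf + len);  /* ... too late */
======================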
>>>
>>>> and found the sane nvme_ctrl result.
>>>>
>>>> Below example shows the nvme disk scan result improper result
>>>>
>>>> => nvme scan
>>>> nvme_get_info_from_identify: nn = 544502629, vwc = 100,
>>>> sn = dev_0T, mn = `�\�, fr = t_part, mdts = 105
>>>>
>>>> So, invalidating the cache before submitting the admin command
>>>> makes the dma_addr points to a valid location in the memory.
>>>
>>> If this shows up already, I think we should address all the comments in
>>> the driver where it says we should do cache maintenance. And it makes me
>>> wonder how this actually works today ...
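
One example: the completion queue entries are written by the device and
read by the CPU, so nvme_read_completion_status() would need an
invalidate, too. A sketch (untested; types as in drivers/nvme/nvme.c):

======================
static u16 nvme_read_completion_status(struct nvme_queue *nvmeq, u16 index)
{
	u64 start = (ulong)&nvmeq->cqes[index];
	u64 stop = start + sizeof(struct nvme_completion);

	/* discard any stale or speculatively fetched copy of this CQE */
	invalidate_dcache_range(start, stop);

	return le16_to_cpu(readw(&nvmeq->cqes[index].status));
}
======================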
>>>
>>
>> I believe the driver was only validated on x86 and NXP's PowerPC
>> series, where cache coherence is ensured between the DMA master and
>> system memory. I suspect it was never validated on ARM, hence this
>> patch.
>>
>> I agree we should address all the comments in the driver about cache
>> maintenance.
> 
> What about this description:
> 
> nvme: Invalidate dcache before submitting admin cmd
> 
> This patch tries to avoid eviction of dirty lines during DMA
> transfer. The code right now executes the following steps:
> 
> - allocate the buffer
> - start a DMA operation using the non-coherent DMA buffer
> - invalidate cache lines associated with the buffer
> - read the buffer
> 
> This can lead to reading back not valid information. CPU can still read
> stale data. In order to avoid the eviction of dirty lines, a new
> invalidation is required before starting the DMA operation. The patch
> just adds an invalidation before submitting the DMA command.

As I think this case is not obvious, I would make it clearer what
actually happens here, for instance (replacing your paragraph above):

======================
This can lead to reading back invalid data, because the cache
controller could evict dirty cache lines belonging to the buffer *after*
the DMA operation has started to fill the DRAM.
In order to avoid this, a new invalidation is required *before* starting
the DMA operation. The patch just adds an invalidation before submitting
the DMA command.
======================
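
Concretely, I would expect the change in nvme_identify() to end up
looking roughly like this (sketch only, untested; "length" is a name I
made up for the Identify data size):

======================
	int length = sizeof(struct nvme_id_ctrl);

	/* no dirty lines of the buffer left to be evicted mid-DMA */
	invalidate_dcache_range(dma_addr, dma_addr + length);

	ret = nvme_submit_admin_cmd(dev, &c, NULL);
	if (!ret)
		/* drop lines speculatively fetched while the DMA ran */
		invalidate_dcache_range(dma_addr, dma_addr + length);
======================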

And your point about NXP PowerPC and x86 being cache coherent is a good
one: the ARM SBSA specification actually demands the same, so systems
complying with it (servers, typically) would probably also work
already. But those smartphone-class SoCs are probably not coherent.

Drivers for PCI devices in U-Boot seem to assume certain properties
based on the host controllers and systems they have been tested on, as
I found other issues lately. Some drivers do cache maintenance, others
don't at all.

I will try to test NVMe on some other ARM platforms as soon as I can get
my hands on one.

Many thanks!
Andre


> The example below shows the nvme disk scan result without the following
> patch
> 
> => nvme scan
> nvme_get_info_from_identify: nn = 544502629, vwc = 100,
> sn = dev_0T, mn = `�\�, fr = t_part, mdts = 105
> 
> So, invalidating the cache before submitting the admin command
> fixes the CPU read.
> 
> Michael
>>
>> Regards,
>> Bin
> 
> 
> 
> 
> --
> | Michael Nazzareno Trimarchi                     Amarula Solutions BV |
> | COO  -  Founder                                      Cruquiuskade 47 |
> | +31(0)851119172                                 Amsterdam 1018 AM NL |
> |                  [`as] http://www.amarulasolutions.com               |
> 


