[PATCH] nvme: Fix cache alignment

Marek Vasut marek.vasut at gmail.com
Tue Feb 2 17:09:05 CET 2021


On 2/2/21 10:12 AM, Bin Meng wrote:
[...]
>>>>>> cache aligned in memory, however the cache operations are called on
>>>>>> the structure sizes, which themselves might not be cache aligned. Add
>>>>>> the necessary rounding to fix this, which permits the nvme to work on
>>>>>> arm64.
>>>>>
>>>>> +ARM guys
>>>>>
>>>>> Which ARM64 SoC did you test this with?
>>>>
>>>> RCar3, although that's irrelevant, the problem will happen on any arm or
>>>> arm64, and possibly any other system which needs cache management.
>>>
>>> There was a recent change to nvme.c that fixed a cache issue on ARMv8
>>> so I thought this might be platform related.
>>
>> I used master, so unlikely.
> 
> It's strange this issue was not exposed last time when this driver was
> tested on ARMv8.

Which ARMv8 platform?

>>>>> The round down in this patch should be unnecessary.
>>>>
>>>> Can you explain why ?
>>>
>>> I just took a further look and most of the start addresses should be
>>> cache line aligned (4KiB aligned) except the
>>> nvme_read_completion_status(). It's only 16 bytes aligned which might
>>> not be cache line aligned.
>>
>> Right, there are various arm chips with 32B/64B alignment requirements.
>>
>>>>> But it's better to
>>>>> figure out which call to dcache_xxx() is made with an unaligned end address.
>>>>
>>>> If you look at the code, most of them can (and do) trigger this,
>>>> therefore they need such alignment, as explained in the commit message.
>>>
>>> Now I wonder what's the correct implementation of the
>>> invalidate_dcache_range() and flush_dcache_range() in U-Boot?
>>> Shouldn't the round down/up happen in these APIs instead of doing such
>>> in drivers?
>>
>> Definitely not, because then the rounding might flush/invalidate cache
>> over areas where this could cause a problem (e.g. neighboring DMA
>> descriptors). The driver must do the cache management correctly.
> 
> Well, we can implement it in these APIs and document the expected usage.

That would be harmful, because you cannot really flush/invalidate half
of a cache line. Consider two 16B DMA descriptors next to each other,
which is often the case: rounding inside the API would then also
flush/invalidate the neighboring descriptor, so the driver itself has
to implement proper cache handling.

That this particular driver already has its structures padded well
enough that they can be flushed by just adding rounding is an exception.
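
To illustrate the descriptor case, here is a minimal standalone sketch,
not code from the driver. CACHE_LINE, line_of() and share_cache_line()
are hypothetical names made up for this example, and the 64B line size
is an assumption (32B is equally common on arm). It checks whether two
buffers touch a common cache line, which is the situation where a
rounded flush/invalidate of one buffer also hits the other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size for illustration only; the real value is CPU specific. */
#define CACHE_LINE 64u

/* Index of the cache line that contains addr. */
static uintptr_t line_of(uintptr_t addr)
{
	return addr / CACHE_LINE;
}

/*
 * True if [a, a + a_len) and [b, b + b_len) touch at least one common
 * cache line, i.e. cache maintenance on one buffer also affects the other.
 */
static bool share_cache_line(uintptr_t a, size_t a_len,
			     uintptr_t b, size_t b_len)
{
	assert(a_len > 0 && b_len > 0);

	return line_of(a) <= line_of(b + b_len - 1) &&
	       line_of(b) <= line_of(a + a_len - 1);
}
```

With these assumptions, two 16B descriptors at 0x1000 and 0x1010 share a
line, while two 64B-padded structures at 0x1000 and 0x1040 do not.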

> Either way a driver has to do the cache management correctly. Not
> doing it in the driver eliminates some duplications of rounding
> up/down.

Indeed, and rounding up/down is the correct way to do it in this driver, 
see the commit message.
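
The rounding itself can be sketched as follows. This is a standalone
illustration under stated assumptions, not the patch: struct cache_range
and cache_aligned_range() are hypothetical names, and the line size is
passed in as a parameter instead of coming from the platform. The start
address is rounded down and the end address rounded up to the cache line
size, so the resulting range could be handed to a range-based cache
operation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: a [start, end) range suitable for cache ops. */
struct cache_range {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Expand [addr, addr + len) to cache line boundaries.
 * 'line' is the cache line size and must be a power of two.
 */
static struct cache_range cache_aligned_range(uintptr_t addr, size_t len,
					      uintptr_t line)
{
	struct cache_range r;

	assert(line != 0 && (line & (line - 1)) == 0);

	r.start = addr & ~(line - 1);                  /* round down */
	r.end = (addr + len + line - 1) & ~(line - 1); /* round up */

	return r;
}
```

For example, a 16B structure at 0x1010 with a 64B line becomes the range
0x1000..0x1040. This is only safe when nothing else that is DMA-visible
lives in that range, which is why it has to be decided per driver.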

> For this case, I believe we just need to take care of
> nvme_read_completion_status().

Can you validate this?

