[U-Boot] Unaligned flush_dcache_range in axs101.c

Marek Vasut marex at denx.de
Fri Apr 15 15:49:25 CEST 2016


On 04/15/2016 03:00 PM, Alexey Brodkin wrote:
> Hi Marek,
> 
> On Mon, 2016-04-11 at 20:48 +0200, Marek Vasut wrote:
>> On 04/11/2016 08:13 PM, Alexey Brodkin wrote:
>>>
>>> Hi Marek,
>>>
>>> On Mon, 2016-04-11 at 19:54 +0200, Marek Vasut wrote:
>>>>
>>>> On 04/11/2016 07:48 PM, Alexey Brodkin wrote:
>>>>>
>>>>>
>>>>> Hi Alex,
>>>>>
>>>>> On Mon, 2016-04-04 at 09:38 +0200, Alexander Graf wrote:
>>>>>>
>>>>>>
>>>>>> Hi Alexey,
>>>>>>
>>>>>> Marek just pointed out to me the fact that flush_dcache_range on arm
>>>>>> expects cache line aligned arguments. However, it seems like in axs101.c
>>>>>> we have an unaligned cache flush:
>>>>>>
>>>>>>   flush_dcache_range(RESET_VECTOR_ADDR, RESET_VECTOR_ADDR + sizeof(int));
>>>>>>
>>>>>> Could you please verify whether this is correct and if not just send a
>>>>>> quick patch to fix it?
>>>>> First this code is for support of Synopsys DesignWare AXS10x boards.
>>>>> And AFAIK there's no such board that may sport ARM CPU instead or ARC.
>>>>> So I'm wondering how did you bumped into that [issue?]?
>>>>>
>>>>> Then I'm not really sure if there's a common requirement for arguments of
>>>>> flush_dcache_range(). At least in "include/common.h" there's no comment about
>>>>> that.
>>>> Such comment should then be added. Sub-cacheline flush/invalidate calls
>>>> can corrupt surrounding data.
>>> Well this is not that simple really.
>>>
>>> For example that's what we have on ARC:
>>>
>>> [1] We may deal with each cache line separately. And BTW that's what we have
>>>     now in U-boot, see http://git.denx.de/?p=u-boot.git;a=blob;f=arch/arc/lib/cache.c#l328
>>>     In that case we only mention address of cache line start and regardless of its length
>>>     line will be processed by HW.
>> Can you flush/invalidate on sub-cacheline level? If not, you should not
>> paper over the issue and avoid re-aligning the end address.
>>
>>>
>>> [2] We may although deal with ranges as well (still this is not implemented in u-boot yet).
>>>     In that case we need to set addresses of range beginning and end.
>>>     But if start address falls actually in the middle of cache line it will be processed.
>>>     And the same is valid for end of the region.
>> Same complaint as above, if you cannot flush with sub-cacheline
>> granularity, do not paper over the issue by re-aligning the start address.
>>
>>>
>>> So from above I may conclude that it's more important to place data properly in memory.
>>> I.e. if you put 2 completely independent substances in 1 cache line you won't be able to
>>> deal with cache entries for them separately (at least on ARC).
>>>
>>> I'm not really sure if ARM or any other arch in hardware may invalidate/writeback only part of
>>> one cache line - that might very well be the case.
>> It can not, which is the reason we have a warning on sub-cacheline
>> invalidate/flush on ARM.
>>
>>>
>>> But in the original case my implementation makes perfect sense because what it does
>>> it writes back instructions modified by the active CPU so others may see these changes.
>>> and here I'd like ideally to have an option to write back only 1 CPU word (because that's
>>> what I really modified) but this is not possible due to described above limitations of our HW.
>> Consider a layout where you have first half of the cacheline used as
>> DMAble memory while the second half of the cacheline is used for some
>> variable, say some counter.
>>
>> Consider this sequence of operations:
>>
>> 1. start DMA
>> 2. counter++
>> 3. invalidate_dcache_line
>> 4. read DMAed buffer
>> 5. print the counter
>>
>> What will happen in step 5 ? Will you get the incremented counter or the
>> stale data which were in the DRAM ?
>>
>> I see it this way -- In step 3, the increment of counter performed in
>> step 2 will be discarded. In step 4, the cacheline will be re-populated
>> by stale data from DRAM and in step 5, you will print the old value of
>> the counter.
> 
> Let's not mix data and tools that operate with that data.
> 
> Cache management functions should be implemented per arch or platform and so
> that they match requirements of underlying hardware. If hardware may only work on
> whole cache line then cache routines must do exactly this.

Is ARC designed this way ?

> As an extra options there could be a check for input arguments in those functions
> that will warn a user if he or she passes unaligned start and/or end addresses.

Right, ARMv7 does it already, ARM926 too, others need fixing.

> Now speaking about data layout your example is very correct and we saw a lot of
> such real use-cases - developer of drivers should design data layout such that
> cache management doesn't lead to data corruption.

Right.

> But again the code in question has nothing to do with cache line.
> I only need to writeback 1 word and don't care what really happens with
> all remaining words in the same cache line. And my implementation of
> flush_dcache_range() will do everything perfectly correct - it will
> flush exactly 1 line that contains my word I modified.

And this will lead to silent corruption which is hard to track down and
debug. I would rather have a check-and-refuse behavior than
silently-fix-and-permit.

> -Alexey
> 


-- 
Best regards,
Marek Vasut


More information about the U-Boot mailing list