[U-Boot] [PATCH 1/2] armv8: add hooks for all cache-wide operations

Stephen Warren swarren at wwwdotorg.org
Wed Oct 19 00:47:54 CEST 2016


On 10/18/2016 03:28 PM, york sun wrote:
> On 10/18/2016 02:01 PM, Stephen Warren wrote:
>> On 10/18/2016 12:40 PM, york sun wrote:
>>> On 10/18/2016 11:14 AM, Stephen Warren wrote:
>>>> On 10/18/2016 09:28 AM, york sun wrote:
>>>>> On 10/17/2016 04:35 PM, Stephen Warren wrote:
>>>>>> From: Stephen Warren <swarren at nvidia.com>
>>>>>>
>>>>>> SoC-specific logic may be required for all forms of cache-wide
>>>>>> operations: invalidate and flush of both dcache and icache (note that
>>>>>> only 3 of the 4 possible combinations make sense, since the icache never
>>>>>> contains dirty lines). This patch adds an optional hook for all
>>>>>> implemented cache-wide operations, and renames the one existing hook to
>>>>>> better represent exactly which operation it is implementing. A dummy
>>>>>> no-op implementation of each hook is provided. These dummy
>>>>>> implementations are moved into C code, since there's no need to
>>>>>> implement them in assembly.
>>>>>>
>>>>> Stephen,
>>>>>
>>>>> Moving this function to C may pose an issue. A couple of years ago I
>>>>> debugged a case where calling a C function put the stack into the cache
>>>>> after flushing L1/L2. That's why I used an asm function to flush L3.
>>>>
>>>> Assuming the stack is located in cacheable memory, the CPU is free (per
>>>> the definition of the ARM architecture) to pull it into the cache at any
>>>> time the cache is enabled (and perhaps even when it isn't enabled, at
>>>> the very least for the icache on ARMv8 if not other cases too).
>>>> Implementation in C vs. assembly has absolutely no effect here. I guess
>>>> your statement assumes that C functions will write data to the stack and
>>>> assembly functions never will. There's no strict 1:1 correlation between
>>>> those two things; assembly code can touch the stack just like C code. If
>>>> there's an assumption it won't, it needs to be documented in the header
>>>> defining these hook functions.
>>>>
>>>> I assume you're specifically talking about dirtying the dcache between
>>>> the point when dcache flushing starts and the point when the dcache is
>>>> disabled? If so, flush_dcache_all() itself would have to be manually
>>>> coded in assembly to avoid using the stack, as would dcache_disable()
>>>> and set_sctlr(). I think this is why dcache_disable() currently disables
>>>> the dcache first (thus preventing it acquiring new dirty data) and then
>>>> flushes the dcache afterwards (thus guaranteeing that all dirty data is
>>>> flushed with no race condition). This implies that your change to swap
>>>> the order of those two functions isn't correct. I'm pretty sure I'm
>>>
>>> I wonder if David can shed some light on the original order of calls to
>>> disable dcache.
>>>
>>>> correct in saying that the dcache can hit even if it's disabled, hence
>>>> disabling the dcache while it contains dirty data won't lead to issues?
>>>>
>>>
>>> My earlier debug was based on the original order of calls. I found I had
>>> to avoid using the stack before flushing L3. Now with the changed order,
>>> I haven't tested. But I can imagine the stack will be dirty and flushing
>>> L3 may or may not push the data into main memory (depending on the L3
>>> implementation whether inclusive or not).
>>>
>>> You said you are sure dcache can hit even if it is disabled. Can you
>>> explain more? My test shows as soon as the d-cache is disabled, the core
>>> cannot get the data in dirty cache.
>>
>> By "hit" here, I mean that even with the dcache disabled, when the CPU
>> performs a read access, if the dcache contains a copy of that data, it
>> can return it rather than requiring it to be fetched from DRAM.
>>
>> Yes, with the dcache disabled, I would not expect any writes to allocate
>> new lines in the cache (although presumably writes would update any
>> lines already there, in a write-through sense).
>>
>> At least, I'm pretty sure this is all true. It seems the only way to
>> allow switching from cache-on to cache-off state without losing dirty data.
>
> I believe my test showed otherwise. As soon as the dcache is disabled,
> the core cannot get the dirty cached data if the dcache was flushed by
> way/set for L1 and L2 (L3 wasn't flushed). That's why I proposed to
> change the order to flush first.

It looks like what I was saying is true for ARMv7 but not for ARMv8 (or 
perhaps it's not architectural but implementation-defined). For example, 
the Cortex A15 TRM says:

> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0438i/BABHEJFF.html

When you disable the cache, all Write-Back Cacheable requests still look 
up the L1 cache. If there is a cache hit, the cache is read or updated 
in the same way as if the cache is enabled. This enables Cacheable 
memory to remain fully coherent while the cache is disabled.

However, the Cortex A72 TRM says:

> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0488h/way1382448697827.html

When you disable the cache, all Write-Back Cacheable requests do not 
look up the L1 data cache. L1 cache still services the snoops from the 
L2 cache.

So yes, we should flush the entire cache first, then disable it.

This does indeed imply that we shouldn't write data between starting to 
flush the dcache and disabling the cache. However, that seems too 
restrictive, especially in the face of caches beyond L1/L2 which might 
need quite some code to flush out. I'll try to investigate that aspect 
some more.
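
To make the ordering argument concrete, here is a toy C model of the two
TRM behaviours quoted above. It is only a sketch: one memory word, one
write-back cache line, and a flag selecting whether a disabled cache is
still looked up (A15-like) or not (A72-like). All names (toy_system,
toy_run, and so on) are made up for this illustration; none of them exist
in U-Boot, and the model ignores L2/L3, snooping and speculation.

```c
#include <stdbool.h>

/* Toy model, not real hardware: one DRAM word backed by one cache line. */
struct toy_system {
	int dram;                  /* value currently in main memory */
	int line;                  /* value currently in the cache line */
	bool line_valid;
	bool line_dirty;
	bool cache_enabled;
	bool lookup_when_disabled; /* true: A15-like, false: A72-like */
};

static void toy_write(struct toy_system *s, int v)
{
	if (s->cache_enabled) {
		/* Write-back, write-allocate: only the line is updated. */
		s->line = v;
		s->line_valid = true;
		s->line_dirty = true;
	} else if (s->lookup_when_disabled && s->line_valid) {
		/* Disabled cache still hits: update the existing line. */
		s->line = v;
		s->line_dirty = true;
	} else {
		/* No lookup, no allocation: write goes straight to DRAM. */
		s->dram = v;
	}
}

static int toy_read(const struct toy_system *s)
{
	bool lookup = s->cache_enabled || s->lookup_when_disabled;

	return (lookup && s->line_valid) ? s->line : s->dram;
}

/* Clean and invalidate the single line. */
static void toy_flush(struct toy_system *s)
{
	if (s->line_valid && s->line_dirty)
		s->dram = s->line;
	s->line_valid = false;
	s->line_dirty = false;
}

enum toy_order {
	TOY_DISABLE_ONLY,        /* disable, read before any flush */
	TOY_FLUSH_THEN_DISABLE,  /* flush, then disable */
	TOY_FLUSH_WRITE_DISABLE, /* flush, store (e.g. stack), then disable */
};

/* Returns what the core reads back once the cache has been disabled. */
static int toy_run(bool lookup_when_disabled, enum toy_order order)
{
	struct toy_system s = {
		.dram = 1,
		.cache_enabled = true,
		.lookup_when_disabled = lookup_when_disabled,
	};

	toy_write(&s, 2); /* dirty line holds 2, DRAM still holds 1 */

	switch (order) {
	case TOY_DISABLE_ONLY:
		s.cache_enabled = false;
		break;
	case TOY_FLUSH_THEN_DISABLE:
		toy_flush(&s);
		s.cache_enabled = false;
		break;
	case TOY_FLUSH_WRITE_DISABLE:
		toy_flush(&s);
		toy_write(&s, 3); /* store issued after the flush */
		s.cache_enabled = false;
		break;
	}

	return toy_read(&s);
}
```

In this model toy_run(false, TOY_DISABLE_ONLY) returns 1 (the dirty line
is no longer visible), toy_run(true, TOY_DISABLE_ONLY) returns 2, and
toy_run(false, TOY_FLUSH_THEN_DISABLE) returns 2. The last case,
toy_run(false, TOY_FLUSH_WRITE_DISABLE), returns 2 rather than 3, which is
the window described above: data written after the flush but before the
disable is stranded in the cache.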

