[U-Boot] [Resend RFC PATCH 1/2] armv8: Fix dcache disable function

Stephen Warren swarren at wwwdotorg.org
Wed Oct 26 21:41:16 CEST 2016


On 10/24/2016 04:44 AM, Mark Rutland wrote:
> Hi,
>
> Sorry for joining this a bit late; apologies if the below re-treads
> ground already covered.
>
> On Wed, Oct 19, 2016 at 09:25:02AM -0600, Stephen Warren wrote:
>> On 10/14/2016 02:17 PM, York Sun wrote:
>>> Current code turns off d-cache first, then flush all levels of cache.
>>> This results data loss. As soon as d-cache is off, the dirty cache
>>> is discarded according to the test on LS2080A. This issue was not
>>> seen as long as external L3 cache was flushed to push the data to
>>> main memory. However, external L3 cache is not guaranteed to have
>>> the data. To fix this, flush the d-cache by way/set first to make
>>> sure cache is clean before turning it off.
>>
>>> diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c
>>
>>> @@ -478,9 +478,9 @@ void dcache_disable(void)
>>
>>> +	flush_dcache_all();
>>> 	set_sctlr(sctlr & ~(CR_C|CR_M));
>>>
>>> -	flush_dcache_all();
>>> 	__asm_invalidate_tlb_all();
>>
>> I talked to Mark Rutland at ARM, and I believe the current code is
>> correct.
>
> Well, almost, but not quite. It's a long story... ;)

To be clear, I was referring specifically/only to the relative ordering 
of the call to flush_dcache_all() and the SCTLR register modification, 
rather than wider aspects of the code:-)

> I gave a primer [1,2] on the details at ELC earlier this year, which may
> or may not be useful.
>
> The big details are:
>
> * Generaly "Flush" is ambiguous/meaningless. Here you seem to want
>   clean+invalidate.

Yes. I think the naming of this code is driven by U-Boot's 
cross-architecture compatibility and history, hence why it doesn't use 
the expected ARM terminology.

> * Set/Way operations are for IMPLEMENTATION DEFINED (i.e. SoC-specific)
>   cache maintenance sequences, and are not truly portable (e.g. not
>   affecting system caches).
>
>   I assume that an earlier boot stage initialised the caches prior to
>   U-Boot. Given that, you *only* need to perform maintenance for the
>   memory you have (at any point) mapped with cacheable attrbiutes, which
>   should be a small subset of the PA space. With ARMv8-A, broadcast
>   maintenance to the PoC should affect all relevant caches (assuming you
>   use the correct shareability attributes).

There is another thread (or fork of this thread) where York suggests 
replacing this code with operations by VA instead.

> * You *cannot* write a dcache disable routine in C, as the compiler can
>   perform a number of implicit memory accesses (e.g. stack, globals,
>   GOT). For that alone, I do not believe the code above is correct.
>
>   Note that we have seen this being an issue in practice, before we got
>   rid of Set/Way ops from arm64 Linux (see commit 5e051531447259e5).

I agree. In practice, at least with the compiler I happen to be using, 
we are currently getting lucky and there aren't any RAM accesses emitted 
by the compiler in this part of the code. Admittedly that won't hold 
true for everyone or across all of time though, so this should certainly 
be fixed...

> * Your dcache disable code *must* be clean to the PoC, prior to
>   execution, or instruction fetches could see stale data. You can first
>   *clean* this to the PoC, which is sufficient to avoid the problems
>   above.

I'm not sure if we have any documented rules/assumptions regarding 
cache/... state upon U-Boot entry like the Linux kernel does. It would 
probably be a good idea to do so...

I believe that for this particular piece of code, since it's always 
executed late in U-Boot's operation, the requirement is covered by 
U-Boot's code relocation routine, since that memcpy()s' .text/.data to 
the top of usable RAM, and hence hopefully is already fully cleaning the 
dcache and invalidating the icache afterwards.

> * The SCTLR_ELx.{C,I} bits do not enable/disable caches; they merely
>   activate/deactiveate cacheable attributes on data/instruction fetches.
>
>   Note that cacheable instruction fetches can allocate into unified/data
>   caches.
>
>   Also, note that the I bit is independent of the C bit, and the
>   attributes it provides differ when the M bit is clear. Generally, I
>   would advise that at all times M == C == I, as that leads to the least
>   surprise.

I think M==C==I is generally true in U-Boot, except for a limited time 
right before it jumps to the OS, where icache and dcache "are 
disabled"[1] separately, but one right after the other.

[1] to abuse the terminology that you pointed out was incorrect:-)

> Thanks,
> Mark.
>
> [1] http://events.linuxfoundation.org/sites/events/files/slides/slides_17.pdf
> [2] https://www.youtube.com/watch?v=F0SlIMHRnLk



More information about the U-Boot mailing list