[U-Boot] [Resend RFC PATCH 1/2] armv8: Fix dcache disable function

Stephen Warren swarren at wwwdotorg.org
Wed Oct 19 19:18:33 CEST 2016


On 10/19/2016 09:25 AM, Stephen Warren wrote:
> On 10/14/2016 02:17 PM, York Sun wrote:
>> Current code turns off d-cache first, then flush all levels of cache.
>> This results data loss. As soon as d-cache is off, the dirty cache
>> is discarded according to the test on LS2080A. This issue was not
>> seen as long as external L3 cache was flushed to push the data to
>> main memory. However, external L3 cache is not guaranteed to have
>> the data. To fix this, flush the d-cache by way/set first to make
>> sure cache is clean before turning it off.
>
>> diff --git a/arch/arm/cpu/armv8/cache_v8.c
>> b/arch/arm/cpu/armv8/cache_v8.c
>
>> @@ -478,9 +478,9 @@ void dcache_disable(void)
>
>> +    flush_dcache_all();
>>      set_sctlr(sctlr & ~(CR_C|CR_M));
>>
>> -    flush_dcache_all();
>>      __asm_invalidate_tlb_all();
>
> I talked to Mark Rutland at ARM, and I believe the current code is
> correct. Here's my interpretation of what he said:
>
> The dcache must be disabled first. This prevents allocation of new
> entries in the cache during the flush operation, which prevents the race
> conditions that were mentioned in the other thread.
>
> Then, the flush operation must be invoked. Since the cache is now
> disabled, this can fully flush the cache without worrying about racing
> with things being added to the cache.
>
> This all implies that the implementation of dcache_disable(),
> set_sctlr(), flush_dcache_all(), and any code they call must not access
> data in DRAM at all; since because the dcache is off, any DRAM access
> will[1] read potentially stale data from DRAM, rather than any dirty
> data that might be in the cache.
>
> [1] I'm not sure if that's "will" or "may", i.e. whether this is
> architecturally guaranteed in ARMv8 or is implementation defined. At
> least the Cortex A72 TRM says "will" for that CPU; not sure about others.
>
> Perhaps the most obvious upshot of this is that the stack can't be used.
> This implies to me that we need to recode all those functions purely in
> assembly, or just possibly play some tricks to 100% force gcc not to
> touch memory anywhere inside dcache_disable() or the functions it calls.
> We're just getting lucky here right now since everything happens to be
> inlined, but I don't think we're doing anything to 100% guarantee this.
>
> What worries me here is that at least on Tegra, a "flush the entire
> dcache" operation requires an SMC call to the secure monitor. That will
> certainly access DRAM when the secure monitor runs, but perhaps this
> doesn't matter since that's at a different exception level, and we know
> the secure monitor accesses DRAM regions that are separate from U-Boot's
> DRAM? I suspect life isn't that convenient. I'm wondering if this all
> implies that, like patch 2 in this series, we need to get 100% away from
> flush-by-set/way, even with SoC-specific hooks to make that work
> reliably, and just flush everything by VA, which IIRC is architecturally
> guaranteed to work without SoC-specific logic. That way, we can
> encapsulate everything into an assembly function without worrying about
> calling SMCs or SoC-specific hook functions without using DRAM. Of
> course, how that assembly function knows which VAs to flush without
> decoding the page tables or other data structure is another matter:-(

Re: the last paragraph there:

After reading the ARMv8 ARM, I see that EL1, EL2, and EL3 all have 
separate cache enable bits in SCTLR_ELx. I believe U-Boot only needs to 
flush/... its own RAM out of the dcache, since that's all that's 
relevant at the EL U-Boot is running at, and doesn't care about anything 
EL3 might have mapped cached. So, it's safe to invoke SMCs from the 
cache flush code in U-Boot even if the EL3 code touches its own DRAM. 
There might be a corner case where this isn't true if EL3 has some 
EL1/EL2-owned RAM mapped, but I don't expect that to be the case here.

Re: my other series to add more cache hooks: I'll re-implement the Tegra 
hook in assembly so it's guaranteed not to touch RAM, retest, and resumbit.

If it turns out that dcache_disable() ever starts touching DRAM at the 
wrong time, we can deal with that then; it doesn't now at least.


More information about the U-Boot mailing list