[U-Boot] [PATCH] ARMv8: Bug fix of dcache_disable()
FengHua
fenghua at phytium.com.cn
Thu Feb 26 16:06:10 CET 2015
hi Mark,
You did very detailed analysis of the cache beheaviour. Yes, this patch is not perfect.
But it did fix the actually existed bug. I will try to describe it more clearly in the following.
> -----Original Messages-----
> From: "Mark Rutland" <mark.rutland at arm.com>
> Sent Time: 2015-02-11 23:51:15 (Wednesday)
> To: FengHua <fenghua at phytium.com.cn>
> Cc: "u-boot at lists.denx.de" <u-boot at lists.denx.de>, "albert.u.boot" <albert.u.boot at aribaud.net>
> Subject: Re: Re: [U-Boot] [PATCH] ARMv8: Bug fix of dcache_disable()
>
> On Wed, Feb 11, 2015 at 03:26:06AM +0000, FengHua wrote:
> >
> > hi Mark,
> > Thank you review this patch.
> >
> > > -----Original Messages-----
> > > From: "Mark Rutland" <mark.rutland at arm.com>
> > > Sent Time: 2015-02-09 19:05:54 (Monday)
> > > To: "fenghua at phytium.com.cn" <fenghua at phytium.com.cn>
> > > Cc: "u-boot at lists.denx.de" <u-boot at lists.denx.de>
> > > Subject: Re: [U-Boot] [PATCH] ARMv8: Bug fix of dcache_disable()
> > >
> > > On Mon, Feb 09, 2015 at 08:51:59AM +0000, fenghua at phytium.com.cn wrote:
> > > > From: David Feng <fenghua at phytium.com.cn>
> > > >
> > > > The cache disable operation shoud be performed after flush_dcache_all().
> > > > If cache disable operation is performed before
> > > > flush_dcache_all(), flush_dcache_all() store data directly to memory
> > > > and may be overrided by data copy in cache.
> > >
> > > The reasoning above (and hence this patch) is wrong.
> > >
> > > While the caches are on, they can allocate lines for any portion of the
> > > address space with cacheable attributes, and can acquire dirty cache
> > > lines from other CPUs. Additionally, there is no restriction preventing
> > > lines from migrating between levels of cache while they are active.
> > >
> > > So calling flush_dcache_all (which performs maintenance by Set/Way)
> > > while the caches are enabled is wrong. Per the architecture it provides
> > > no guarantee whatsoever.
> > >
> > > To empty the caches by Set/Way, they must first be disabled. Note that
> > > this only guarantees that the caches are empty; not where the data went.
> > > Other CPUs might acquire dirty lines, or the data might only reach a
> > > system cache rather than memory.
> > >
> > > If you need certain portions of data to be flushed out to memory, then
> > > those must be flushed by VA. If flush_dcache_all performs any memory
> > > accesses before it has completed Set/Way maintenance, it is buggy.
> > >
> > > Thanks,
> > > Mark.
> > You are right. If data acess exist when flushing cache when cache is enabled,
> > the data may be brought to cache again. In normal circumstance we can not do
> > like this.
> > But the problem is flush_dcahe_all is a C routine, it will preserve return
> > address in stack. If disable cache first the return address will be directly
> > store in memory, and if the stack has a copy in cache the data will be covered
> > when flushing cache, then flush_dcache_all will get wrong return address.
> >
> > There should be no data access between disabling cache and flushing cache.
> > U-boot for aarch64 runs at only one processor and the data flush_dcache_all manipulated
> > will not be used by following routines. By simply adjusting the sequence can fix this
> > bug although it's not the best solution.
>
> I don't follow:
>
> * The compiler may generate writes between flush_dcache_all and
> set_sctlr (even in the absence of any explicit writes in source code),
> so the cache might allocate dirty lines that could be written back
> asynchronously later (when the cache id sieabled), clobbering data we
> are using.
Yes, the memory access between flush_dcache_all and set_sctlr is why we talk about this here.
If no memory access between flush_dcache_all and set_sctlr, there will be no bug.
This need the flush_dcache_all is implemented in assembly.
>
> * The cache can allocate clean lines at any point before it is disabled
> (even in the middle of flush_dcache_all), so the cache will almost
> certainly not be empty once disabled. It won't write back clean lines,
> but these could mask data later if not invalidated.
>
> * Set/Way operations aren't guaranteed to flush data to the PoC in the
> presence of a system cache like CCN, so we have no guarantee that
> we've pushed any data to the PoC. Per ARMv8 only maintenance by VA
> guarantees this (but luckily maintenance by VA is mandated to be
> respected by such system caches).
flush_dcache_all should flush both cache existed in architecture defined cache
hierachy and outer cache(such as L3 in CCN), a previous patch did this.
>
> * While the cache is enabled lines could theoretically migrate between
> set/way slots mid-sequence (e.g. with speculative accesses and an
> exclusive L1/L2 configuration). I don't believe this currently happens
> in practice, but the architecture does not prevent this.
>
> So I don't see that moving this maintenance solves any existing problem,
> and it introduces new ones.
The bug actually exist when flush_dcache_all is after of set_sctlr.
I try to describe it more detailed.
flush_dcache_all is a C routine, it will preserve return address in stack. The stack
memory may be in the cache. If we call set_sctlr to disable cache
first, then flush_dcache_all will write the return address directly into memory instead of cache.
But there is another copy of the stack memory in the cache, the correct return address in
memory will be rewritten by wrong value when flush cache, then flush_dcache_all get wrong return address.
The best solution is writing flush_dcache_all totally in assembly and make sure no memory access
between flush_dcache_all and set_sctlr.
Fortunately, there will be another invalidate_dcache_all() when booting OS, so the cache should be clean.
>
> Maintenance by Set/Way was only intended for IMPLEMENTATION DEFINED
> initialisation and for emptying a PE's caches prior to cutting power.
> It doesn't make sense when caches are enabled, and doesn't provide any
> guarantee as to where the data went.
>
> Fundamentally, if flush_dcache_all accesses memory that is not already
> clean to the PoC then it is broken. Likewise for any sequence for
> disabling the caches.
>
> The above isn't theoretical, we were hit by these issues in Linux. See
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=5e051531447259e5df95c44bccb69979537c19e4
flush_cache_all of linux kernel did not flush outer cache. I think the patch mostly deal with this.
right?
Yours,
David.
> > > >
> > > > Signed-off-by: David Feng <fenghua at phytium.com.cn>
> > > > ---
> > > > arch/arm/cpu/armv8/cache_v8.c | 3 ++-
> > > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c
> > > > index 9dbcdf2..dc2fc8c 100644
> > > > --- a/arch/arm/cpu/armv8/cache_v8.c
> > > > +++ b/arch/arm/cpu/armv8/cache_v8.c
> > > > @@ -124,9 +124,10 @@ void dcache_disable(void)
> > > > if (!(sctlr & CR_C))
> > > > return;
> > > >
> > > > + flush_dcache_all();
> > > > +
> > > > set_sctlr(sctlr & ~(CR_C|CR_M));
> > > >
> > > > - flush_dcache_all();
> > > > __asm_invalidate_tlb_all();
> > > > }
> > > >
> > > > --
> > > > 1.7.9.5
> > > >
More information about the U-Boot
mailing list