[U-Boot] Unaligned flush_dcache_range in axs101.c

Alexey Brodkin Alexey.Brodkin at synopsys.com
Fri Apr 15 15:00:53 CEST 2016


Hi Marek,

On Mon, 2016-04-11 at 20:48 +0200, Marek Vasut wrote:
> On 04/11/2016 08:13 PM, Alexey Brodkin wrote:
> > 
> > Hi Marek,
> > 
> > On Mon, 2016-04-11 at 19:54 +0200, Marek Vasut wrote:
> > > 
> > > On 04/11/2016 07:48 PM, Alexey Brodkin wrote:
> > > > 
> > > > 
> > > > Hi Alex,
> > > > 
> > > > On Mon, 2016-04-04 at 09:38 +0200, Alexander Graf wrote:
> > > > > 
> > > > > 
> > > > > Hi Alexey,
> > > > > 
> > > > > Marek just pointed out to me the fact that flush_dcache_range on arm
> > > > > expects cache line aligned arguments. However, it seems like in axs101.c
> > > > > we have an unaligned cache flush:
> > > > > 
> > > > >   flush_dcache_range(RESET_VECTOR_ADDR, RESET_VECTOR_ADDR + sizeof(int));
> > > > > 
> > > > > Could you please verify whether this is correct and if not just send a
> > > > > quick patch to fix it?
> > > > First this code is for support of Synopsys DesignWare AXS10x boards.
> > > > And AFAIK there's no such board that may sport ARM CPU instead or ARC.
> > > > So I'm wondering how did you bumped into that [issue?]?
> > > > 
> > > > Then I'm not really sure if there's a common requirement for arguments of
> > > > flush_dcache_range(). At least in "include/common.h" there's no comment about
> > > > that.
> > > Such comment should then be added. Sub-cacheline flush/invalidate calls
> > > can corrupt surrounding data.
> > Well this is not that simple really.
> > 
> > For example that's what we have on ARC:
> > 
> > [1] We may deal with each cache line separately. And BTW that's what we have
> >     now in U-boot, see http://git.denx.de/?p=u-boot.git;a=blob;f=arch/arc/lib/cache.c#l328
> >     In that case we only mention address of cache line start and regardless of its length
> >     line will be processed by HW.
> Can you flush/invalidate on sub-cacheline level? If not, you should not
> paper over the issue and avoid re-aligning the end address.
> 
> > 
> > [2] We may although deal with ranges as well (still this is not implemented in u-boot yet).
> >     In that case we need to set addresses of range beginning and end.
> >     But if start address falls actually in the middle of cache line it will be processed.
> >     And the same is valid for end of the region.
> Same complaint as above, if you cannot flush with sub-cacheline
> granularity, do not paper over the issue by re-aligning the start address.
> 
> > 
> > So from above I may conclude that it's more important to place data properly in memory.
> > I.e. if you put 2 completely independent substances in 1 cache line you won't be able to
> > deal with cache entries for them separately (at least on ARC).
> > 
> > I'm not really sure if ARM or any other arch in hardware may invalidate/writeback only part of
> > one cache line - that might very well be the case.
> It can not, which is the reason we have a warning on sub-cacheline
> invalidate/flush on ARM.
> 
> > 
> > But in the original case my implementation makes perfect sense because what it does
> > it writes back instructions modified by the active CPU so others may see these changes.
> > and here I'd like ideally to have an option to write back only 1 CPU word (because that's
> > what I really modified) but this is not possible due to described above limitations of our HW.
> Consider a layout where you have first half of the cacheline used as
> DMAble memory while the second half of the cacheline is used for some
> variable, say some counter.
> 
> Consider this sequence of operations:
> 
> 1. start DMA
> 2. counter++
> 3. invalidate_dcache_line
> 4. read DMAed buffer
> 5. print the counter
> 
> What will happen in step 5 ? Will you get the incremented counter or the
> stale data which were in the DRAM ?
> 
> I see it this way -- In step 3, the increment of counter performed in
> step 2 will be discarded. In step 4, the cacheline will be re-populated
> by stale data from DRAM and in step 5, you will print the old value of
> the counter.

Let's not mix data and tools that operate with that data.

Cache management functions should be implemented per arch or platform and so
that they match requirements of underlying hardware. If hardware may only work on
whole cache line then cache routines must do exactly this.

As an extra options there could be a check for input arguments in those functions
that will warn a user if he or she passes unaligned start and/or end addresses.

Now speaking about data layout your example is very correct and we saw a lot of
such real use-cases - developer of drivers should design data layout such that
cache management doesn't lead to data corruption.

But again the code in question has nothing to do with cache line.
I only need to writeback 1 word and don't care what really happens with
all remaining words in the same cache line. And my implementation of
flush_dcache_range() will do everything perfectly correct - it will
flush exactly 1 line that contains my word I modified.

-Alexey


More information about the U-Boot mailing list