[U-Boot] armv7 DMA and cache management functions

Mark Rutland mark.rutland at arm.com
Thu Aug 27 19:41:01 CEST 2015


On Mon, Aug 24, 2015 at 08:54:17AM +0100, Markus Niebel wrote:
> Hello,
> 
> I'm not an expert in the low level details of this area, so please excuse any
> wrong assumptions in this post.
> 
> Hardware: i.MX6 Solo (TQMa6 on custom Mainboard)
> U-Boot: 2014.10
> gcc: 4.8.3 
> 
> We see an error using TFTP on i.MX6 that seems to be triggered when the code / data size goes
> over a limit. The code changes have nothing to do with the network stack, network drivers, or
> memory management. TFTP becomes completely unusable: the device frequently sees erroneous packets
> with a variety of weird errors. If the code stays below this size, all works fine.
> 
> Up to now we have checked a lot of things. The following led us to assume that this
> could be cache related:
> 
> dynamically disable data cache before doing TFTP: 	TFTP works well again
> running with disabled L2 cache (data cache enabled):	TFTP works well again
> 
> Looking at the code in drivers/net/fec_mxc.c, function fec_recv, we see a call to
> invalidate_dcache_range before accessing the received ethernet data. When looking at
> the code for invalidate_dcache_range in arch/arm/cpu/armv7/cache_v7.c and comparing it with
> how things are done in Linux and barebox, we noticed that the order of L2 cache / data cache
> invalidation is just swapped there. After applying that order to the receive code for fec_mxc,
> TFTP works again.
> 
> Question: is the order of cache invalidation important?

The order is important.

Consider the case where both the external and architected caches contain
stale (but clean) cache lines for the region you care about.

If you invalidate the architected caches before the external L2, the
architected caches may speculatively fetch (stale) data from the L2
before the L2 is invalidated, and so in the end you may still see stale
data in the architected caches.

If you invalidate the L2 first, the architected caches could
speculatively fetch from the L2 (stale) or memory (new) while this is in
progress, but they will then be invalidated, and from then on can only
fetch the new data.

That assumes that both levels were clean to begin with. If they are not,
then additional maintenance is required. It's also conceivable that
caches could be implemented such that the above is insufficient, YMMV.
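
To make the ordering argument concrete, here is a toy model in C. It is
only a sketch under simplifying assumptions: one cache line per level,
both levels clean but stale, and a single speculative fetch landing
between the two invalidation steps. All names in it (struct mem_system,
l1_read, read_after_dma, and so on) are made up for illustration and are
not U-Boot, Linux, or barebox functions.

```c
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, not a real U-Boot API. */
enum { STALE = 1, FRESH = 2 };

struct line { bool valid; int data; };

struct mem_system {
	int memory;      /* DRAM contents, updated by the DMA engine */
	struct line l2;  /* external (outer) L2 cache */
	struct line l1;  /* architected (inner) cache */
};

/* An L2 miss refills the line from memory. */
static int l2_read(struct mem_system *s)
{
	if (!s->l2.valid) {
		s->l2.data = s->memory;
		s->l2.valid = true;
	}
	return s->l2.data;
}

/* An L1 miss refills from the L2. Also used to model a speculative fetch. */
static int l1_read(struct mem_system *s)
{
	if (!s->l1.valid) {
		s->l1.data = l2_read(s);
		s->l1.valid = true;
	}
	return s->l1.data;
}

static void l1_invalidate(struct mem_system *s) { s->l1.valid = false; }
static void l2_invalidate(struct mem_system *s) { s->l2.valid = false; }

enum order { INNER_FIRST, OUTER_FIRST };

/*
 * Both levels start out holding clean, stale data. DMA then writes FRESH
 * to memory, the caches are invalidated in the given order with one
 * speculative fetch in between, and the value the CPU reads is returned.
 */
static int read_after_dma(enum order o)
{
	struct mem_system s = {
		.memory = STALE,
		.l2 = { true, STALE },
		.l1 = { true, STALE },
	};

	s.memory = FRESH;               /* DMA completes */

	if (o == INNER_FIRST) {
		l1_invalidate(&s);
		(void)l1_read(&s);      /* speculation refills L1 from the stale L2 */
		l2_invalidate(&s);
	} else {
		l2_invalidate(&s);
		(void)l1_read(&s);      /* speculation hits the old L1 line */
		l1_invalidate(&s);      /* ...which is discarded here */
	}

	return l1_read(&s);             /* the CPU's actual access */
}
```

In this model read_after_dma(INNER_FIRST) returns STALE and
read_after_dma(OUTER_FIRST) returns FRESH. Real hardware has many lines,
may speculate at any time, and may have dirty lines, so treat this purely
as an illustration of the reasoning above.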

Thanks,
Mark.

