[U-Boot] [PATCH v8] usb: align buffers at cacheline

Sun Mar 11 03:35:51 CET 2012

On Wednesday 07 March 2012 02:12:22 puneets wrote:
> On Tuesday 06 March 2012 08:37 AM, Mike Frysinger wrote:
> >> --- a/drivers/usb/host/ehci-hcd.c
> >> +++ b/drivers/usb/host/ehci-hcd.c
> >> 
> >>   static void flush_invalidate(u32 addr, int size, int flush)
> >>   {
> >> +	/*
> >> +	 * Size is the number of bytes actually moved during the transaction,
> >> +	 * which may not be a multiple of the cache line size. As a result, the
> >> +	 * stop address passed for invalidating the cache may not be aligned.
> >> +	 * Therefore round size up to a multiple of the cache line size.
> >> +	 */
> >> +	size = ALIGN(size, ARCH_DMA_MINALIGN);
> >> +
> >>   	if (flush)
> >>   		flush_dcache_range(addr, addr + size);
> >>   	else
> > 
> > i think this is wrong and merely hides the errors from higher up instead
> > of fixing them.  the point of the warning was to tell you that the code
> > was invalidating *too many* bytes.  this code still invalidates too many
> > bytes without any justification as for why it's OK to do here.  further,
> > this code path only matters to the invalidation logic, not the flush
> > logic.
> 
> The sole purpose of this patch is to remove the warnings that appear because
> the start/stop addresses sent for invalidating are unaligned. Without this
> patch the code works fine, but with lots of spew, which we don't want and
> which was discussed in the earlier thread that Simon posted. Please have a
> look at the following link.
> 
> As I understood, you agree that we need to align the start/stop buffer
> addresses, and also agree that to align the stop address we need to align
> the size, as the start address is already aligned.
> Now, "why it's OK to do here"?
> We could have aligned the size in two places, cache_qtd() and cache_qh(),
> but then we would need to place an alignment check at every place where size
> is passed. So I thought it better to align in flush_invalidate(); the "ALIGN"
> macro does not increase the size if it is already aligned.

i think you missed my point.  consider a func which has local vars like so:
	int i;
	char buf[1024];
	int k;

and let's say you're running on a core that has a cache line size of 32 bytes
(which is fairly common).  if you execute a data cache invalidate insn, the
smallest region it can invalidate is 32 bytes.  doesn't matter if you only
want to invalidate a buffer of 8 bytes ... everything else in that cache line
gets invalidated as well.

now, in the aforementioned stack, if it starts off aligned nicely at a 32 byte 
boundary, the integer "i" will share a cache line with the first 28 bytes of 
buffer "buf", and the integer "k" will share a cache line with the last 4 bytes 
of the buffer "buf".  (let's ignore what might or might not happen based on gcc 
since this example can trivially be expanded to structure layout.)
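
the layout arithmetic above can be checked with a small sketch.  it uses the structure form of the example rather than real stack variables, and assumes a 4-byte int, a 32-byte cache line, and a frame that starts on a line boundary.  the names struct frame, LINE_SIZE, line_of(), buf_bytes_in_line() and check_example() are made up for illustration and are not part of U-Boot.

```c
#include <assert.h>
#include <stddef.h>

#define LINE_SIZE 32u	/* assumed cache line size, as in the example */

/* stands in for the stack frame; assumes sizeof(int) == 4 */
struct frame {
	int i;
	char buf[1024];
	int k;
};

/* index of the cache line holding byte "offset" of an aligned frame */
static size_t line_of(size_t offset)
{
	return offset / LINE_SIZE;
}

/* how many bytes of buf live in the given cache line */
static size_t buf_bytes_in_line(size_t line)
{
	size_t lo = line * LINE_SIZE;
	size_t hi = lo + LINE_SIZE;
	size_t b0 = offsetof(struct frame, buf);
	size_t b1 = b0 + sizeof(((struct frame *)0)->buf);

	if (lo < b0)
		lo = b0;
	if (hi > b1)
		hi = b1;
	return hi > lo ? hi - lo : 0;
}

/* the numbers quoted in the text, under the assumptions above */
void check_example(void)
{
	size_t first = line_of(offsetof(struct frame, i));
	size_t last = line_of(offsetof(struct frame, k));

	assert(buf_bytes_in_line(first) == 28);	/* "i" + 28 bytes of buf */
	assert(buf_bytes_in_line(last) == 4);	/* 4 bytes of buf + "k" */
}
```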

the trouble comes when you attempt to invalidate the contents of "buf".  if the
cache is in writeback mode (which means the cache can hold changes that are
not yet reflected in external RAM), then invalidating buf will also discard
any pending writes to "i" or "k".  this is why Simon put a warning in the core
data cache invalidate function.  if the cache were in writethrough mode (which
also tends to be the default), then most likely things would work fine and no
one would notice.  if the data cache were merely flushed instead, things would
work, but at a cost in performance: you'd be flushing cache lines to external
memory that you know will be overwritten by a following transaction (most
likely DMA from a peripheral such as the USB controller), and you'd be
flushing objects that the DMA wouldn't be touching, so they'd have to get
refetched from external RAM ("i" and "k" in my example above).

simply rounding the address down to the start of the cache line and the length
up to a multiple of a cache line, just to keep the core code from issuing the
warning, doesn't fix the problem i describe above.  you actually get the worst
of both worlds: the warning is gone, but the extra memory still gets
invalidated, so the runtime misbehavior is now silent.
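
to make that concrete, here is a rough sketch of what the rounding does to the range in the example.  ROUND_DOWN, ROUND_UP, struct range, rounded_range() and range_covers() are hypothetical helpers written for this illustration (the patch itself uses ALIGN on the size only), and the 32-byte line size is again an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE 32u	/* assumed cache line size */

/* hypothetical helpers; "a" must be a power of two */
#define ROUND_DOWN(x, a)	((x) & ~((uintptr_t)(a) - 1))
#define ROUND_UP(x, a)		(((x) + (a) - 1) & ~((uintptr_t)(a) - 1))

struct range {
	uintptr_t start;	/* first byte in the range */
	uintptr_t stop;		/* one past the last byte */
};

/* the range a "round everything to cache lines" fix would invalidate */
static struct range rounded_range(uintptr_t addr, size_t size)
{
	struct range r;

	assert((LINE_SIZE & (LINE_SIZE - 1)) == 0);
	r.start = ROUND_DOWN(addr, LINE_SIZE);
	r.stop = ROUND_UP(addr + size, LINE_SIZE);
	return r;
}

/* true if [addr, addr + size) lies entirely inside r */
static int range_covers(struct range r, uintptr_t addr, size_t size)
{
	return addr >= r.start && addr + size <= r.stop;
}
```

with the frame at a made-up address of 0x1000, "i" sits at 0x1000, "buf" at 0x1004 and "k" at 0x1404.  rounded_range(0x1004, 1024) then spans 0x1000 to 0x1420, and range_covers() reports that both "i" and "k" fall inside it, even though neither belongs to the buffer.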

perhaps the warning in the core code could be dropped and all your changes in
fringe code obsoleted (such as these USB patches): when the core detects that
a range starts on an unaligned boundary, *flush* that line first, and then
let it be invalidated.  likewise, when the range ends on an unaligned
boundary, do the same flush-then-invalidate step on the last line.  this should
also make things work without a (significant) loss in performance.  if anything,
i suspect the overhead of doing runtime buffer size calculations and manually
aligning pointers (which is what ALLOC_CACHE_ALIGN_BUFFER does) is a wash
compared to partially flushing cache lines in the core ...
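
a minimal sketch of that flush-then-invalidate idea follows.  it is not the U-Boot implementation: flush_line() and invalidate_line() are hypothetical per-line primitives that only count calls here (a real port would issue the cache insns), invalidate_range_safe() is a made-up name, and the 32-byte line size is assumed.  it also says nothing about ordering against the DMA transfer itself; it only shows which lines would get flushed before being invalidated.

```c
#include <assert.h>
#include <stdint.h>

#define LINE_SIZE 32u	/* assumed cache line size */

/* bookkeeping for the stub primitives below */
static unsigned flushed_lines, invalidated_lines;
static uintptr_t last_flushed[2];

/* hypothetical: write one cache line back to memory (stub: record it) */
static void flush_line(uintptr_t line)
{
	if (flushed_lines < 2)
		last_flushed[flushed_lines] = line;
	flushed_lines++;
}

/* hypothetical: discard one cache line (stub: count it) */
static void invalidate_line(uintptr_t line)
{
	(void)line;
	invalidated_lines++;
}

/* invalidate [start, stop), flushing partially covered edge lines first */
static void invalidate_range_safe(uintptr_t start, uintptr_t stop)
{
	uintptr_t first, last, line;

	assert((LINE_SIZE & (LINE_SIZE - 1)) == 0);
	if (stop <= start)
		return;

	first = start & ~(uintptr_t)(LINE_SIZE - 1);
	last = (stop - 1) & ~(uintptr_t)(LINE_SIZE - 1);

	/* unaligned start: the head line also holds someone else's data */
	if (start != first)
		flush_line(first);

	/* unaligned end: same for the tail line, unless already flushed */
	if ((stop & (LINE_SIZE - 1)) && !(last == first && start != first))
		flush_line(last);

	for (line = first; ; line += LINE_SIZE) {
		invalidate_line(line);
		if (line == last)
			break;
	}
}
```

for the earlier example with "buf" at a made-up address of 0x1004, invalidate_range_safe(0x1004, 0x1404) flushes only the two edge lines (the ones shared with "i" and "k") and then invalidates all 33 lines; a fully aligned range triggers no flush at all.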

Simon: what do you think of this last idea ?
-mike