[U-Boot] ARM - cache and alignment

Jean-Jacques Hiblot jjhiblot at ti.com
Tue Jan 17 10:08:01 CET 2017



On 16/01/2017 20:33, Marek Vasut wrote:
> On 01/16/2017 08:16 PM, Jean-Jacques Hiblot wrote:
>>
>> On 16/01/2017 17:00, Marek Vasut wrote:
>>> On 01/16/2017 02:29 PM, Jean-Jacques Hiblot wrote:
>>>> Tom, Marek
>>> Hi,
>>>
>>>> At the moment, whenever an unaligned address is used in cache operations
>>>> (invalidate_dcache_range, or flush_dcache_range), the whole request is
>>>> discarded  for am926ejs. for armV7 or armV8 only the aligned part is
>>>> maintained. This is probably what is causing the bug addressed in
>>>> 8133f43d1cd. There are a lot of unaligned buffers used in DMA operations
>>>> and for all of them, we're possibly handling the cached partially or not
>>>> at all. I've seen this when using the environment from a file stored in
>>>> a FAT partition. commit 8133f43d1cd addresses this by using a bounce
>>>> buffer at the FAT level but it's only one of many cases.
>>>>
>>>> I think we can do better with unaligned cache operations:
>>>>
>>>> * flush (writeback + invalidate): Suppose we use address p which is
>>>> unaligned, flush_dcache_range() can do the writeback+invalidate on the
>>>> whole range [p & ~(line_sz - 1); p + length | (line_sz - 1)]. There
>>>> should no problem with that since writeback can happen at any point in
>>>> time.
>>>>
>>>> * invalidation
>>>>
>>>> It is a bit trickier. here is a pseudo-code:
>>>> invalidate_dcache_range(p,length)
>>>> {
>>>>            write_back_invalidate(first line)
>>>>            write_back_invalidate(last line)
>>>>            invalidate(all other lines)
>>>> }
>>>>
>>>> Here again this should work fine IF invalidate_dcache_range() is called
>>>> BEFORE the DMA operation (again the writeback can happen at time so it's
>>>> valid do it here). Calling it only AFTER the operation, may corrupt the
>>>> data written by the DMA with old data from CPU. This how I used to
>>>> handle unaligned buffers in some other projects.
>>>>
>>>>
>>>> There is however one loophole: a data sitting in the first or the last
>>>> line is accessed before the memory is updated by the DMA, then the
>>>> first/line will be corrupted. But it's not highly probable as this data
>>>> would have to be used in parallel of the DMA (interrupt handling, SMP?,
>>>> dma mgt related variable). So it's not perfect but it would still be
>>>> better than we have today.
>>> Or just fix all the code which complains about unaligned buffers, done.
>>> That's the way to go without all the complications above.
>> It's not that complex, but it's not perfect. We would need to keep the
>> same warning as we have now, but it would make it work in more cases.
> The warning is there for that exact reason -- to inform you something's
> wrong.
>
>> Tracking every possible unaligned buffer that gets invalidated is not a
>> trivial job. Most of the time the buffer is allocated in a upper layer
>> and passed down to a driver via layers like network stack, block layer
>> etc.And in many cases, the warning will come and go depending on how the
>> variable aligned on the stack or the heap.
> I didn't observe this much in fact. I usually see the buffers coming it
> aligned or being allocated in drivers. Also, I think that's why the RC
> cycle is there, so we can test the next release and fix these issues.
It's not commonly seen but I came across it some times.

Here are two examples:

Network:
U-Boot 2016.09-rc1-00087-gd40ff0a (Jul 27 2016 - 10:04:33 -0500)
CPU  : DRA752-HS ES2.0
Model: TI DRA742
[...]
Booting from network ...
cpsw Waiting for PHY auto negotiation to complete.... done
link up on port 0, speed 1000, full duplex
BOOTP broadcast 1
CACHE: Misaligned operation at range [dffecb40, dffecc96]
CACHE: Misaligned operation at range [dffed140, dffed17e]
BOOTP broadcast 2
[...]
File transfer via NFS from server 10.0.1.26; our IP address is 128.247.83.128; sending through gateway 128.247.82.1
[...]
Load address: 0x82000000
Loading: CACHE: Misaligned operation at range [dffebfc0, dffebfea]


FAT: it has been fixed recently by using a bounce buffer in FAT code by "8133f43d1cd fs/fat/fat_write: Fix buffer alignments".
I've also seen it a few years back on PowerPC platforms when accessing a USB storage.

I'm sure that we could find more examples. I don't mean that they shouldn't be fixed, simply that we could still make things work in most cases at a low cost.
The modifications I proposed do not change the behaviour of the code for aligned buffers, it just make it much more likely that it would work with unaligned buffers. I fail to see the reason why this wouldn't be a good thing.

BTW the L2 uniphier cache implements this for flush and invalidation, and some architectures (pxa, blackfin, openrisc, etc.) do the flush() this way too.

Jean-Jacques



More information about the U-Boot mailing list