[U-Boot] [PATCH 1/2] armv8: add hooks for all cache-wide operations
Stephen Warren
swarren at wwwdotorg.org
Wed Oct 19 19:11:57 CEST 2016
On 10/19/2016 09:39 AM, Tom Rini wrote:
> On Wed, Oct 19, 2016 at 09:36:52AM -0600, Stephen Warren wrote:
>> On 10/18/2016 08:41 PM, Simon Glass wrote:
>>> Hi Stephen,
>>>
>>> On 18 October 2016 at 17:33, Stephen Warren <swarren at wwwdotorg.org> wrote:
>>>> On 10/18/2016 05:08 PM, Simon Glass wrote:
>>>>>
>>>>> Hi Stephen,
>>>>>
>>>>> On 18 October 2016 at 16:54, Stephen Warren <swarren at wwwdotorg.org> wrote:
>>>>>>
>>>>>> On 10/18/2016 01:56 PM, Simon Glass wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi Stephen,
>>>>>>>
>>>>>>> On 18 October 2016 at 13:10, Stephen Warren <swarren at wwwdotorg.org>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/18/2016 01:03 PM, Simon Glass wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi Stephen,
>>>>>>>>>
>>>>>>>>> On 18 October 2016 at 12:58, Stephen Warren <swarren at wwwdotorg.org>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 10/18/2016 10:23 AM, Simon Glass wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hi Stephen,
>>>>>>>>>>>
>>>>>>>>>>> On 17 October 2016 at 15:35, Stephen Warren <swarren at wwwdotorg.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> From: Stephen Warren <swarren at nvidia.com>
>>>>>>>>>>>>
>>>>>>>>>>>> SoC-specific logic may be required for all forms of cache-wide
>>>>>>>>>>>> operations; invalidate and flush of both dcache and icache (note
>>>>>>>>>>>> that
>>>>>>>>>>>> only 3 of the 4 possible combinations make sense, since the icache
>>>>>>>>>>>> never
>>>>>>>>>>>> contains dirty lines). This patch adds an optional hook for all
>>>>>>>>>>>> implemented cache-wide operations, and renames the one existing
>>>>>>>>>>>> hook
>>>>>>>>>>>> to
>>>>>>>>>>>> better represent exactly which operation it is implementing. A
>>>>>>>>>>>> dummy
>>>>>>>>>>>> no-op implementation of each hook is provided. These dummy
>>>>>>>>>>>> implementations are moved into C code, since there's no need to
>>>>>>>>>>>> implement them in assembly.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Stephen Warren <swarren at nvidia.com>
>>>>>>>>>>>> ---
>>>>>>>>>>>> arch/arm/cpu/armv8/cache.S | 6 ------
>>>>>>>>>>>> arch/arm/cpu/armv8/cache_v8.c | 23
>>>>>>>>>>>> ++++++++++++++++++++---
>>>>>>>>>>>> arch/arm/cpu/armv8/fsl-layerscape/lowlevel.S | 4 ++--
>>>>>>>>>>>> arch/arm/include/asm/system.h | 5 ++++-
>>>>>>>>>>>> arch/arm/mach-tegra/tegra186/cache.c | 2 +-
>>>>>>>>>>>> 5 files changed, 27 insertions(+), 13 deletions(-)
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I think we should have a proper interface for this stuff rather than
>>>>>>>>>>> weak functions. Maybe we need a linker-list approach, or a cache
>>>>>>>>>>> uclass?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> What's improper about this interface? Presumably we could argue that
>>>>>>>>>> no
>>>>>>>>>> function in the entire of U-Boot be called by symbol name, which
>>>>>>>>>> doesn't
>>>>>>>>>> seem useful.
>>>>>>>>>>
>>>>>>>>>> I'm not sure exactly what you envisage by a linker-list approach. Can
>>>>>>>>>> you
>>>>>>>>>> provide some background? I understand how the linker can construct
>>>>>>>>>> list
>>>>>>>>>> of
>>>>>>>>>> objects/implementations/..., but that doesn't seem useful here since
>>>>>>>>>> there's
>>>>>>>>>> a known-ahead-of-time single implementation of these functions in a
>>>>>>>>>> single
>>>>>>>>>> build of U-Boot.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Your own commit messages says that this is SoC-specific. I'm
>>>>>>>>> suggesting that we define an interface (which I think you have already
>>>>>>>>> done with your header file additions), and allow SoCs to implement it
>>>>>>>>> via a linker list.
>>>>>>>>>
>>>>>>>>> IMO the cache code in U-Boot is starting to get a bit creaky.
>>>>>>>>>
>>>>>>>>>> A cache uclass seems like /massive/ overkill, especially since I'd
>>>>>>>>>> expect
>>>>>>>>>> these very low-level functions to be required well before any higher
>>>>>>>>>> level
>>>>>>>>>> code like DM/classes/... to be available.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> DM is available very early. But it's not clear from your patch when
>>>>>>>>> this code is actually called.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I believe that weak functions are a perfectly acceptable approach here.
>>>>>>>>
>>>>>>>> Yes, the implementation of these functions is SoC-specific. The
>>>>>>>> Makefiles
>>>>>>>> will pull in the appropriate implementation for that SoC whenever
>>>>>>>> U-Boot
>>>>>>>> is
>>>>>>>> built, just like every other board- or SoC-specific function in the
>>>>>>>> entire
>>>>>>>> of U-Boot.
>>>>>>>>
>>>>>>>> There's no need for linker lists since there is only ever one
>>>>>>>> implementation.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> If there is only ever one implementation, why do you need weak
>>>>>>> functions?
>>>>>>
>>>>>>
>>>>>>
>>>>>> As I explicitly stated above, each SoC can have a different
>>>>>> implementation,
>>>>>> yet only a single implementation is ever needed for a particular U-Boot
>>>>>> build.
>>>>>>
>>>>>>> Just call them directly.
>>>>>>
>>>>>>
>>>>>>
>>>>>> The code is doing that, both before and after my patch.
>>>>>
>>>>>
>>>>> I mean call them without needing weak functions.
>>>>>
>>>>>>
>>>>>>> I think in fact you mean that
>>>>>>> there can be no implementation (but perhaps an empty one), or one
>>>>>>> implementation. You are effectively using multiple weak functions to
>>>>>>> provide default code. I think it would be better if this were
>>>>>>> explicit.
>>>>>>>
>>>>>>> I still think that the cache functions could do with a rethink.
>>>>>>
>>>>>>
>>>>>>
>>>>>> In my opinion, this patch doesn't change the code structure at all. There
>>>>>> is
>>>>>> already an interface between the core (L1/L2) cache management code and
>>>>>> the
>>>>>> SoC-specific cache management code. That interface already uses a weak
>>>>>> function to provide the default no-op implementation. This patch doesn't
>>>>>> change any of that. All this patch does is fix that existing interface to
>>>>>> cover all 3 combinations of dcache_flush, dcache_invalidate, and
>>>>>> icache_invalidate, rather than just one of those combinations. It's more
>>>>>> of
>>>>>> a bug-fix than anything else.
>>>>>
>>>>>
>>>>> Yes I see that.
>>>>>
>>>>>>
>>>>>> If you want to rework this interface sometime, be my guest. However, I
>>>>>> don't
>>>>>> think it's fair to require that someone who simply wants to fix the
>>>>>> existing
>>>>>> code (in a way that is orthogonal to your desired interface refactoring)
>>>>>> do
>>>>>> that refactoring first, rather than doing it yourself.
>>>>>
>>>>>
>>>>> I understand what you are saying, but isn't that how open source
>>>>> software works? Believe me, I have done my fair share of refactoring
>>>>> :-)
>>>>>
>>>>> At least can you look at not making it any harder to fix up later? The
>>>>> more we pile onto this interface, the harder it will be later. We
>>>>> should aim to make ARMv8 really nice as it is the new thing.
>>>>
>>>>
>>>> I believe moving one or three functions into any replacement scheme will
>>>> have an identical level of difficulty. So, I believe the patch already
>>>> satisfies the "no harder" requirement.
>>>
>>> Well it seems your mind is already made up!
>>
>> Well, I don't believe there are any viable or reasonable
>> alternatives. I'm not prolonging this thread because I find it
>> enjoyable, but because of the lack of alternatives to what this
>> patch does.
>>
>> Doing this via DM doesn't simplify anything or make it more
>> maintainable; it's just using DM for the sake of it. DM is great
>> when it makes life simpler and code more maintainable, but we use it
>> because of those benefits, not just for the sake of it. Using DM
>> adds complexity, and so there has to be a benefit of doing so. I
>> don't believe there is here.
>>
>> Doing this by linker scripts simply doesn't make sense:
>>
>> Any given U-Boot binary will only contain a single implementation of
>> these functions, so there's no need for any kind of runtime lookup,
>> and if there was a runtime lookup, what would be the key and where
>> would it come from? I believe we'd still have some
>> compile-time-seledted function/data to drive the runtime lookup
>> process, and so we'd be in exactly the same situation as we already
>> are, just with more complexity added on top.
>>
>> Using linker scripts to switch between implementations at compile
>> time is much more complex than just letting the linker link stuff
>> together directly. That's what it's good at. Using linker scripts
>> would just add extra complexity without any benefit. We'd still end
>> up with a single implementation in the binary, yet to call it code
>> would have to indirect through the linker-defined table, rather than
>> simply calling the same code by symbol name. Uggh!
>>
>> Note that per the discussions in other forks of this thread, it's
>> likely we'll need to code all these cache operations in assembly to
>> avoid accessing DRAM while turning the cache on/off. This implies to
>> me that we'll need to keep all the cache code extremely simple and
>> direct, without any form of runtime or compile time indirection.
>
> For the record, until / unless we move to the point where we can run
> different SoCs within a single binary, doing this via weak functions
> seems to me to be the correct abstraction. If we know that some SoCs
> will be able to nop these, that is. If all SoCs will need something, we
> should simply omit the default functions. Thanks!
I believe there can be[1] SoCs where the architectural by-set/way
instructions are entirely sufficient to flush the dcache. In those
cases, the default no-op implementation of the hook is appropriate.
[1] and likely are; not every existing ARMv8 U-Boot port implements the
current __asm_flush_l3_cache hook, and hopefully they're all correct and
working.
More information about the U-Boot
mailing list