[PATCH v1 0/5] arm64: Add optimized memset/memcpy functions
Stefan Roese
sr at denx.de
Fri Aug 6 16:44:31 CEST 2021
Hi Tom,
On 06.08.21 16:24, Tom Rini wrote:
> On Fri, Aug 06, 2021 at 03:38:38PM +0200, Stefan Roese wrote:
>
>>
>> On an NXP LX2160 based platform it has been noticed that the currently
>> implemented memset/memcpy functions for aarch64 are suboptimal.
>> Especially the memset() for clearing the NXP MC firmware memory is very
>> expensive (time-wise).
>>
>> By using optimized functions, a speedup of roughly a factor of 6 has
>> been measured.
>>
>> This patchset now adds the optimized functions ported from this
>> repository:
>> https://github.com/ARM-software/optimized-routines
>>
>> As these functions make use of opcodes that need the caches to be
>> enabled, they can't be used in the very early boot stage, before the
>> caches are enabled. Because of this, a simple memset() version is also
>> added, in this case memset_simple(), and will be used in very few
>> selected places.
>>
>> Please note that checkpatch.pl complains about some issues with this
>> imported file: arch/arm/lib/asmdefs.h
>> Since it's imported, I explicitly did not make any changes here, to make
>> potential future syncing easier.
>
> Traditionally, we grab the Linux kernel's optimized functions. Is
> there not a set to grab there?
Yes, there is, and I tried those as well. Here is my comment from the
commit log of patch 4/5:
Note:
I also integrated and tested with the Linux versions of these optimized
functions. They are similar to the ones now integrated but these ARM
versions are still a small bit faster.
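For readers unfamiliar with the early-boot restriction mentioned in the cover letter, a byte-wise fallback could look roughly like the sketch below. This is only an illustration and not the code from the series: the name memset_simple() comes from the cover letter, but the body, the signature and the use of volatile are assumptions. The idea is to stick to plain, naturally aligned byte stores, so that nothing in the routine depends on the caches being enabled.

```c
#include <stddef.h>

/*
 * Hypothetical sketch of a cache-independent memset, NOT the code from
 * the patch series. It only issues single-byte stores, which are always
 * naturally aligned.
 */
void *memset_simple(void *s, int c, size_t count)
{
	/*
	 * Assumption: volatile is used here to keep the compiler from
	 * merging the byte stores into wider accesses or replacing the
	 * loop with a call to the optimized memset().
	 */
	volatile unsigned char *p = s;

	while (count--)
		*p++ = (unsigned char)c;

	return s;
}
```

Such a routine is slow, which matches the cover letter's plan to use it only in very few selected places before the caches are enabled.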
Thanks,
Stefan