[PATCH v1 0/5] arm64: Add optimized memset/memcpy functions

Stefan Roese sr at denx.de
Fri Aug 6 16:44:31 CEST 2021


Hi Tom,

On 06.08.21 16:24, Tom Rini wrote:
> On Fri, Aug 06, 2021 at 03:38:38PM +0200, Stefan Roese wrote:
> 
>>
>> On an NXP LX2160 based platform it has been noticed that the currently
>> implemented memset/memcpy functions for aarch64 are suboptimal.
>> Especially the memset() for clearing the NXP MC firmware memory is very
>> expensive (time-wise).
>>
>> By using optimized functions, a speedup of roughly a factor of 6 has
>> been measured.
>>
>> This patchset now adds the optimized functions ported from this
>> repository:
>> https://github.com/ARM-software/optimized-routines
>>
>> As these functions make use of opcodes that need the caches to be
>> enabled, they can't be used in the very early boot stage, before the
>> caches are enabled. Because of this, a simple memset() version is also
>> added, in this case memset_simple(), and will be used in very few
>> selected places.
>>
>> Please note that checkpatch.pl complains about some issues with this
>> imported file: arch/arm/lib/asmdefs.h
>> Since it's imported, I deliberately did not make any changes here, to
>> make potential future sync'ing easier.
> 
> Traditionally, we grab the Linux kernel's optimized functions.  Is
> there not a set to grab there?

Yes, there is, and I tried those as well. Here is my comment from the
commit log of patch 4/5:

Note:
I also integrated and tested the Linux versions of these optimized
functions. They are similar to the ones now integrated, but these ARM
versions are still slightly faster.
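
For illustration only, here is a minimal sketch of what a byte-wise
fallback such as memset_simple() might look like. This is an assumption
about its general shape, not the code from the patch: the signature
simply mirrors the standard memset(), and the volatile qualifier is one
hypothetical way to keep the compiler from turning the loop back into a
call to the optimized memset().

```c
#include <stddef.h>

/*
 * Hypothetical sketch, NOT the implementation from this patchset.
 *
 * Fills 'count' bytes at 's' with the byte value 'c', one byte at a
 * time. Every store is a single, naturally aligned byte access, so the
 * loop does not depend on wide or unaligned stores.
 *
 * The pointer is volatile so that the compiler emits the stores as
 * written instead of recognizing the loop as a memset() idiom and
 * replacing it with a call to memset().
 */
void *memset_simple(void *s, int c, size_t count)
{
	volatile unsigned char *p = s;

	while (count--)
		*p++ = (unsigned char)c;

	return s;
}
```

Such a routine trades speed for robustness, which fits the cover
letter's description of using it only in a few selected places before
the caches are enabled.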

Thanks,
Stefan

