[PATCH v1 0/5] arm64: Add optimized memset/memcpy functions

Stefan Roese sr at denx.de
Fri Aug 6 15:38:38 CEST 2021


On an NXP LX2160 based platform, we noticed that the currently
implemented memset/memcpy functions for aarch64 are suboptimal.
In particular, the memset() that clears the NXP MC firmware memory is
very expensive (time-wise).

By using optimized functions, a speedup of roughly a factor of 6 has
been measured.

This patchset now adds the optimized functions ported from this
repository:
https://github.com/ARM-software/optimized-routines

As these functions make use of opcodes that need the caches to be
enabled, they can't be used in the very early boot stage, before the
caches are enabled. Because of this, a simple memset() version,
memset_simple(), is also added and is used in a few selected places.
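
For illustration only, a simple memset() of this kind could look like
the sketch below. It uses nothing but plain byte stores, so it does not
depend on the caches being enabled. The signature (mirroring the
standard memset()) and the loop body are assumptions made for this
example; the actual memset_simple() in lib/string.c may differ.

```c
#include <stddef.h>

/*
 * Hypothetical sketch of a byte-wise memset. Assumption: the signature
 * mirrors the standard memset(); the real memset_simple() added by this
 * series may be implemented differently.
 */
void *memset_simple(void *s, int c, size_t count)
{
	unsigned char *p = s;

	/* Store one byte at a time using ordinary store instructions only */
	while (count--)
		*p++ = (unsigned char)c;

	/* Like memset(), return the start of the filled region */
	return s;
}
```

A caller in early boot code would then use it as a drop-in replacement,
e.g. memset_simple(ptr, 0, size), where ptr and size are placeholders
for the region to clear.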

Please note that checkpatch.pl complains about some issue with this
imported file: arch/arm/lib/asmdefs.h
Since it's imported, I explicitly did not make any changes here, to make
potential future syncing easier.

Thanks,
Stefan


Stefan Roese (5):
  lib/string: Add memset_simple()
  board_init: Use memset_simple() in board_init_f_init_reserve()
  arm64: cache_v8: Use memset_simple() in create_table()
  arm64: arch/arm/lib: Add optimized memset/memcpy functions
  arm64: Kconfig: Enable usage of optimized memset/memcpy

 arch/arm/Kconfig              |  10 +-
 arch/arm/cpu/armv8/cache_v8.c |   2 +-
 arch/arm/lib/Makefile         |   5 +
 arch/arm/lib/asmdefs.h        |  98 ++++++++++++++
 arch/arm/lib/memcpy-arm64.S   | 241 ++++++++++++++++++++++++++++++++++
 arch/arm/lib/memset-arm64.S   | 116 ++++++++++++++++
 common/init/board_init.c      |   2 +-
 include/linux/string.h        |   2 +
 lib/string.c                  |  10 ++
 9 files changed, 478 insertions(+), 8 deletions(-)
 create mode 100644 arch/arm/lib/asmdefs.h
 create mode 100644 arch/arm/lib/memcpy-arm64.S
 create mode 100644 arch/arm/lib/memset-arm64.S

-- 
2.32.0


