[RFC PATCH 0/4] arm: move inline assembly MRC/MCR P15 to separate .S files

Jerome Forissier jerome.forissier at linaro.org
Tue Jul 8 12:02:46 CEST 2025


This is an attempt at cleaning up some ARM inline assembly, namely the
MRC/MCR instructions accessing coprocessor 15 (CP15), with two objectives:

1. Get rid of some suboptimal changes, including a recent workaround
for kirkwood. I am referring to commits that force -marm (arm32
instructions) on whole C files just because they use MCR/MRC
instructions in inline assembly, which are not supported in Thumb-1,
as well as patches that disable LTO on select files for pretty much the
same reason. Those commits are the following:

62e92077a893 ("arm: support Thumb-1 with CONFIG_SYS_THUMB_BUILD")
8f9696510afc ("ARM: make LTO available")
e5fc9037dd33 ("ARM: fix LTO build for some thumb-interwork cases")
410d59095a9f ("arm: kirkwood: fix freeze on boot")

2. Hopefully get smaller binaries for Arm platforms that use Thumb
and/or LTO, at the cost of marginal increases (a few dozen bytes at
most) for the other Arm boards due to not inlining a few functions. And
of course no change for non-Arm architectures.

There are four patches in this series:

- The first one reverts a change which does solve a linker error but
does not work, as [1] has shown.
- The second patch moves inline assembly into separate .S files.
- With that in place, it is possible to remove the -marm overrides as
well as the C flags overrides selectively disabling LTO. This is what
the third patch does.
- The fourth patch is not directly related, but is trying to reduce the
binary size further for platforms that compile C with
-ffunction-sections -fdata-sections so that the linker can eliminate
unreferenced code and data with --gc-sections. I noticed that .S files
do not follow the same convention. By modifying the ENTRY, WEAK and
ENDPROC macros, each assembly function is emitted in its own section
(.text.<name>) like the C compiler would do if it were C code compiled
with -ffunction-sections, thus allowing dead code elimination for .S
too. The savings are modest though (100 or 200 bytes most of the
time), and the change makes a couple of boards significantly bigger
(imx8mn_beacon_fspi: all +16196 data +16304; stm32* as well as
imxrt1170-evk: all +2K text +2K). This needs further investigation.
See [6] for details.
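
To make the per-function section idea in the fourth patch concrete, here
is a minimal host-side sketch in C. It is not code from the series. It
assumes an ELF target and a GCC-compatible compiler, and it places each
function by hand in a section named .text.<name>, which is what
-ffunction-sections does automatically for C code. The OWN_SECTION macro
and both function names are hypothetical and exist only for this
illustration.

```c
/*
 * Illustration only, not taken from the patches: on ELF targets, put each
 * function in its own .text.<name> section by hand. This mirrors what
 * -ffunction-sections does for C code. OWN_SECTION, used_helper and
 * unused_helper are hypothetical names.
 */
#if defined(__ELF__)
#define OWN_SECTION(name) __attribute__((section(".text." #name)))
#else
#define OWN_SECTION(name) /* non-ELF host: keep the default section */
#endif

/* If a program references this function, the linker keeps .text.used_helper. */
OWN_SECTION(used_helper) int used_helper(int x)
{
	return x + 1;
}

/*
 * If nothing references this function, a link with --gc-sections is free
 * to discard .text.unused_helper as a whole.
 */
OWN_SECTION(unused_helper) int unused_helper(int x)
{
	return x * 2;
}
```

Compiling this to an object file and listing its section headers (for
example with objdump -h) should show .text.used_helper and
.text.unused_helper as separate sections. The linker can only drop an
unreferenced function when it sits in a section of its own, which is
the property the fourth patch tries to give assembly functions too.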

Now for the results:

- CI status can be seen at [2]. All good.
- buildman output showing size differences before and after the series:
  $ unbuffer tools/buildman/buildman -b lto-fixes-squashed -c 2 -sS \
    | ansi2html >buildman-series-squashed-1e2c64f1537-size-summary.html
  See [3] (arch averages)
  $ unbuffer tools/buildman/buildman -b lto-fixes-squashed -c 2 -sSd \
    | ansi2html >buildman-series-squashed-1e2c64f1537-size-details.html
  See [4] (per board)
- buildman output showing size differences for each patch:
  $ unbuffer tools/buildman/buildman -b lto-fixes -c 4 -sS \
    | ansi2html >buildman-series-1e2c64f1537-size-summary.html
  See [5] (arch averages)
  $ unbuffer tools/buildman/buildman -b lto-fixes -c 4 -sSd \
    | ansi2html >buildman-series-1e2c64f1537-size-details.html
  See [6] (per board)

[1] https://lists.denx.de/pipermail/u-boot/2025-June/592682.html
[2] https://source.denx.de/u-boot/custodians/u-boot-net/-/pipelines/26972
[3] https://people.linaro.org/~jerome.forissier/u-boot-07-Jul-2025/buildman-series-squashed-1e2c64f1537-size-summary.html
[4] https://people.linaro.org/~jerome.forissier/u-boot-07-Jul-2025/buildman-series-squashed-1e2c64f1537-size-details.html
[5] https://people.linaro.org/~jerome.forissier/u-boot-07-Jul-2025/buildman-series-1e2c64f1537-size-summary.html
[6] https://people.linaro.org/~jerome.forissier/u-boot-07-Jul-2025/buildman-series-1e2c64f1537-size-details.html


Jerome Forissier (4):
  Revert "arm: asm/system.h: mrc and mcr need .arm if __thumb2__ is not
    set"
  arm: move inline assembly CP15 instructions to separate .S files
  arm: do not force -marm on some C files and allow LTO everywhere
  linkage: use per-function section in ENTRY, WEAK and ENDPROC

 arch/arm/cpu/arm926ejs/Makefile            |   7 +-
 arch/arm/cpu/arm926ejs/cache.c             |  32 ++----
 arch/arm/cpu/arm926ejs/cp15.S              |  46 ++++++++
 arch/arm/cpu/arm926ejs/cpu.c               |  13 +--
 arch/arm/cpu/armv7/lowlevel_init.S         |   4 -
 arch/arm/cpu/armv7/ls102xa/psci.S          |   2 +-
 arch/arm/cpu/armv7/nonsec_virt.S           |  16 +--
 arch/arm/cpu/armv7/psci.S                  | 118 ++++++++++-----------
 arch/arm/cpu/armv7/start.S                 |  20 ++--
 arch/arm/cpu/armv8/cache.S                 |  26 -----
 arch/arm/cpu/armv8/psci.S                  |  22 ++--
 arch/arm/cpu/armv8/tlb.S                   |   2 -
 arch/arm/cpu/armv8/transition.S            |   8 --
 arch/arm/include/asm/system.h              |  36 ++-----
 arch/arm/lib/Makefile                      |  17 ++-
 arch/arm/lib/ashldi3.S                     |   8 +-
 arch/arm/lib/ashrdi3.S                     |   8 +-
 arch/arm/lib/bitops.S                      |   8 --
 arch/arm/lib/cache-cp15.c                  |  62 ++++-------
 arch/arm/lib/cache.c                       |   7 +-
 arch/arm/lib/cp15.S                        |  92 ++++++++++++++++
 arch/arm/lib/crt0.S                        |   4 +-
 arch/arm/lib/div64.S                       |   2 -
 arch/arm/lib/lib1funcs.S                   |  36 ++-----
 arch/arm/lib/lshrdi3.S                     |   8 +-
 arch/arm/lib/muldi3.S                      |   8 +-
 arch/arm/lib/relocate.S                    |   6 +-
 arch/arm/lib/semihosting.S                 |   2 -
 arch/arm/lib/setjmp.S                      |   6 --
 arch/arm/lib/setjmp_aarch64.S              |   6 --
 arch/arm/lib/uldivmod.S                    |   2 -
 arch/arm/mach-imx/mx5/lowlevel_init.S      |   4 +-
 arch/arm/mach-kirkwood/Makefile            |   8 +-
 arch/arm/mach-kirkwood/cp15.S              |  13 +++
 arch/arm/mach-kirkwood/include/mach/cpu.h  |  13 +--
 arch/arm/mach-omap2/omap3/lowlevel_init.S  |  36 +++----
 arch/arm/mach-renesas/lowlevel_init_gen3.S |   6 +-
 arch/arm/mach-tegra/psci.S                 |  12 +--
 arch/riscv/lib/memcpy.S                    |   6 +-
 arch/riscv/lib/memmove.S                   |   7 +-
 arch/riscv/lib/memset.S                    |   6 +-
 arch/riscv/lib/semihosting.S               |   2 -
 arch/riscv/lib/setjmp.S                    |   6 --
 common/Makefile                            |   4 -
 include/linux/linkage.h                    |  20 +++-
 45 files changed, 387 insertions(+), 390 deletions(-)
 create mode 100644 arch/arm/cpu/arm926ejs/cp15.S
 create mode 100644 arch/arm/lib/cp15.S
 create mode 100644 arch/arm/mach-kirkwood/cp15.S

-- 
2.43.0

base-commit: 7027b445cc0bfb86204ecb1f1fe596f5895048d9
branch: lto-fixes

