[PATCH v2 2/7] common: binman: Calling initr_binman() when BINMAN_FDT
Simon Glass
sjg at chromium.org
Tue Dec 10 17:17:18 CET 2024
Hi Michal,
On Tue, 10 Dec 2024 at 05:41, Michal Simek <michal.simek at amd.com> wrote:
>
> Hi Simon,
>
> On 12/9/24 20:27, Simon Glass wrote:
> > Hi Michal,
> >
> > On Mon, 9 Dec 2024 at 11:34, Michal Simek <michal.simek at amd.com> wrote:
> >>
> >>
> >>
> >> On 12/9/24 16:47, Simon Glass wrote:
> >>> Hi,
> >>>
> >>> On Mon, 9 Dec 2024 at 08:32, Tom Rini <trini at konsulko.com> wrote:
> >>>>
> >>>> On Mon, Dec 09, 2024 at 04:26:15PM +0100, Michal Simek wrote:
> >>>>>
> >>>>>
> >>>>> On 12/6/24 20:20, Simon Glass wrote:
> >>>>>> On Fri, 1 Nov 2024 at 03:18, Michal Simek <michal.simek at amd.com> wrote:
> >>>>>>>
> >>>>>>> Calling empty function when BINMAN_FDT is adding +64B for nothing which is
> >>>>>>> not helping on size sensitive configurations as Xilinx mini configurations.
> >>>>>>>
> >>>>>>> Signed-off-by: Michal Simek <michal.simek at amd.com>
> >>>>>>> ---
> >>>>>>>
> >>>>>>> Changes in v2:
> >>>>>>> - new patch
> >>>>>>>
> >>>>>>> From my perspective there is no reason to call empty function. It is just
> >>>>>>> increase footprint for nothing and we are not far from that limit now.
> >>>>>>>
> >>>>>>> ---
> >>>>>>> common/board_r.c | 7 +++----
> >>>>>>> 1 file changed, 3 insertions(+), 4 deletions(-)
> >>>>>>
> >>>>>> Reviewed-by: Simon Glass <sjg at chromium.org>
> >>>>>>
> >>>>>> This is a bit odd, though. Do you have LTO enabled?
> >>>>>>
> >>>>>
> >>>>> yes LTO is enabled. And there are other candidates like this.
> >>>>> Is LTO able to fix function arrays which is calling empty function?
> >>>>>
> >>>>> (without this patch)
> >>>>>
> >>>>> 00000000fffc0eb4 <initr_of_live>:
> >>>>> fffc0eb4: 52800000 mov w0, #0x0 // #0
> >>>>> fffc0eb8: d65f03c0 ret
> >>>>>
> >>>>> 00000000fffc0ebc <initr_dm_devices>:
> >>>>> fffc0ebc: 52800000 mov w0, #0x0 // #0
> >>>>> fffc0ec0: d65f03c0 ret
> >>>>>
> >>>>> 00000000fffc0ec4 <initr_bootstage>:
> >>>>> fffc0ec4: 52800000 mov w0, #0x0 // #0
> >>>>> fffc0ec8: d65f03c0 ret
> >>>>>
> >>>>> 00000000fffc0ecc <power_init_board>:
> >>>>> fffc0ecc: 52800000 mov w0, #0x0 // #0
> >>>>> fffc0ed0: d65f03c0 ret
> >>>>>
> >>>>> 00000000fffc0ed4 <initr_announce>:
> >>>>> fffc0ed4: 52800000 mov w0, #0x0 // #0
> >>>>> fffc0ed8: d65f03c0 ret
> >>>>>
> >>>>> 00000000fffc0edc <initr_binman>:
> >>>>> fffc0edc: 52800000 mov w0, #0x0 // #0
> >>>>> fffc0ee0: d65f03c0 ret
> >>>>>
> >>>>> 00000000fffc0ee4 <initr_status_led>:
> >>>>> fffc0ee4: 52800000 mov w0, #0x0 // #0
> >>>>> fffc0ee8: d65f03c0 ret
> >>>>>
> >>>>> 00000000fffc0eec <initr_boot_led_blink>:
> >>>>> fffc0eec: 52800000 mov w0, #0x0 // #0
> >>>>> fffc0ef0: d65f03c0 ret
> >>>>>
> >>>>> 00000000fffc0ef4 <initr_boot_led_on>:
> >>>>> fffc0ef4: 52800000 mov w0, #0x0 // #0
> >>>>> fffc0ef8: d65f03c0 ret
> >>>>>
> >>>>> 00000000fffc0efc <initr_lmb>:
> >>>>> fffc0efc: 52800000 mov w0, #0x0 // #0
> >>>>> fffc0f00: d65f03c0 ret
> >>>>
> >>>> No, but maybe Simon would prefer if we marked all of the could-be-empty
> >>>> functions as __maybe_unused and did:
> >>>> CONFIG_IS_ENABLED(BINMAN_FDT, initr_binman),
> >>>> etc in the list instead?
> >>>
> >>> Yes that looks better.
> >>
> >> But we are talking about using macro inside array at best with using #ifdefs.
> >> Or maybe I am not seeing what you are saying.
> >>
> >>>
> >>> Michal, see also [1] in case you can work out why it 'stopped
> >>> working'. I could have sworn inlining the function was a win when it
> >>> was applied, but no amount of toolchain juggling could make it be a
> >>> win when I came back to it later.
> >>
> >> Are you saying that it worked in past?
> >
> > I wasn't able to verify that post facto, but I believe I do remember
> > checking it at the time. If you read the original commit message:
> >
> > 47870afab92 initcall: Move to inline function
> >
> > The board_r init function was complaining that we are looping through
> > an array, calling all our tiny init stubs sequentially via indirect
> > function calls (which can't be speculated, so they are slow).
> >
> > The solution to that is pretty easy though. All we need to do is inline
> > the function that loops through the functions and the compiler will
> > automatically convert almost all indirect calls into direct inlined code.
> >
> > With this patch, the overall code size drops (by 40 bytes on riscv64)
> > and boot time should become measurably faster for every target.
> >
> > Signed-off-by: Alexander Graf <agraf at suse.de>
> >
> > Despite this hopeful sentiment, I seriously doubt any improvement in boot time.
>
> I am not able to replicate this observation on arm64 or riscv64.
>
> Loop unrolling is not happening even if you pass -funroll-all-loops flag.
>
> Maybe different toolchains should be used to see this behavior.
Yes, maybe. Or perhaps it changed for some reason.
Regards,
Simon
More information about the U-Boot
mailing list