[PATCH v2 2/7] common: binman: Calling initr_binman() when BINMAN_FDT

Michal Simek michal.simek at amd.com
Tue Dec 10 13:40:50 CET 2024


Hi Simon,

On 12/9/24 20:27, Simon Glass wrote:
> Hi Michal,
> 
> On Mon, 9 Dec 2024 at 11:34, Michal Simek <michal.simek at amd.com> wrote:
>>
>>
>>
>> On 12/9/24 16:47, Simon Glass wrote:
>>> Hi,
>>>
>>> On Mon, 9 Dec 2024 at 08:32, Tom Rini <trini at konsulko.com> wrote:
>>>>
>>>> On Mon, Dec 09, 2024 at 04:26:15PM +0100, Michal Simek wrote:
>>>>>
>>>>>
>>>>> On 12/6/24 20:20, Simon Glass wrote:
>>>>>> On Fri, 1 Nov 2024 at 03:18, Michal Simek <michal.simek at amd.com> wrote:
>>>>>>>
>>>>>>> Calling empty function when BINMAN_FDT is adding +64B for nothing which is
>>>>>>> not helping on size sensitive configurations as Xilinx mini configurations.
>>>>>>>
>>>>>>> Signed-off-by: Michal Simek <michal.simek at amd.com>
>>>>>>> ---
>>>>>>>
>>>>>>> Changes in v2:
>>>>>>> - new patch
>>>>>>>
>>>>>>>    From my perspective there is no reason to call empty function. It is just
>>>>>>> increase footprint for nothing and we are not far from that limit now.
>>>>>>>
>>>>>>> ---
>>>>>>>     common/board_r.c | 7 +++----
>>>>>>>     1 file changed, 3 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> Reviewed-by: Simon Glass <sjg at chromium.org>
>>>>>>
>>>>>> This is a bit odd, though. Do you have LTO enabled?
>>>>>>
>>>>>
>>>>> yes LTO is enabled. And there are other candidates like this.
>>>>> Is LTO able to fix function arrays which is calling empty function?
>>>>>
>>>>> (without this patch)
>>>>>
>>>>> 00000000fffc0eb4 <initr_of_live>:
>>>>>       fffc0eb4:   52800000        mov     w0, #0x0                        // #0
>>>>>       fffc0eb8:   d65f03c0        ret
>>>>>
>>>>> 00000000fffc0ebc <initr_dm_devices>:
>>>>>       fffc0ebc:   52800000        mov     w0, #0x0                        // #0
>>>>>       fffc0ec0:   d65f03c0        ret
>>>>>
>>>>> 00000000fffc0ec4 <initr_bootstage>:
>>>>>       fffc0ec4:   52800000        mov     w0, #0x0                        // #0
>>>>>       fffc0ec8:   d65f03c0        ret
>>>>>
>>>>> 00000000fffc0ecc <power_init_board>:
>>>>>       fffc0ecc:   52800000        mov     w0, #0x0                        // #0
>>>>>       fffc0ed0:   d65f03c0        ret
>>>>>
>>>>> 00000000fffc0ed4 <initr_announce>:
>>>>>       fffc0ed4:   52800000        mov     w0, #0x0                        // #0
>>>>>       fffc0ed8:   d65f03c0        ret
>>>>>
>>>>> 00000000fffc0edc <initr_binman>:
>>>>>       fffc0edc:   52800000        mov     w0, #0x0                        // #0
>>>>>       fffc0ee0:   d65f03c0        ret
>>>>>
>>>>> 00000000fffc0ee4 <initr_status_led>:
>>>>>       fffc0ee4:   52800000        mov     w0, #0x0                        // #0
>>>>>       fffc0ee8:   d65f03c0        ret
>>>>>
>>>>> 00000000fffc0eec <initr_boot_led_blink>:
>>>>>       fffc0eec:   52800000        mov     w0, #0x0                        // #0
>>>>>       fffc0ef0:   d65f03c0        ret
>>>>>
>>>>> 00000000fffc0ef4 <initr_boot_led_on>:
>>>>>       fffc0ef4:   52800000        mov     w0, #0x0                        // #0
>>>>>       fffc0ef8:   d65f03c0        ret
>>>>>
>>>>> 00000000fffc0efc <initr_lmb>:
>>>>>       fffc0efc:   52800000        mov     w0, #0x0                        // #0
>>>>>       fffc0f00:   d65f03c0        ret
>>>>
>>>> No, but maybe Simon would prefer if we marked all of the could-be-empty
>>>> functions as __maybe_unused and did:
>>>>           CONFIG_IS_ENABLED(BINMAN_FDT, initr_binman),
>>>> etc in the list instead?
>>>
>>> Yes that looks better.
>>
>> But we are talking about using macro inside array at best with using #ifdefs.
>> Or maybe I am not seeing what you are saying.
>>
>>>
>>> Michal, see also [1] in case you can work out why it 'stopped
>>> working'. I could have sworn inlining the function was a win when it
>>> was applied, but no amount of toolchain juggling could make it be a
>>> win when I came back to it later.
>>
>> Are you saying that it worked in past?
> 
> I wasn't able to verify that post facto, but I believe I do remember
> checking it at the time. If you read the original commit message:
> 
> 47870afab92 initcall: Move to inline function
> 
>      The board_r init function was complaining that we are looping through
>      an array, calling all our tiny init stubs sequentially via indirect
>      function calls (which can't be speculated, so they are slow).
> 
>      The solution to that is pretty easy though. All we need to do is inline
>      the function that loops through the functions and the compiler will
>      automatically convert almost all indirect calls into direct inlined code.
> 
>      With this patch, the overall code size drops (by 40 bytes on riscv64)
>      and boot time should become measurably faster for every target.
> 
>      Signed-off-by: Alexander Graf <agraf at suse.de>
> 
> Despite this hopeful sentiment, I seriously doubt any improvement in boot time.

I am not able to replicate this observation on arm64 or riscv64.

Loop unrolling is not happening even if you pass -funroll-all-loops flag.

Maybe different toolchains should be used to see this behavior.

Thanks,
Michal





More information about the U-Boot mailing list