[U-Boot] [PATCH v2 2/8] arm: relocation: clear .bss section with arch memset if defined

Simon Glass sjg at chromium.org
Thu Feb 19 19:59:14 CET 2015


On 18 February 2015 at 05:31, Przemyslaw Marczak <p.marczak at samsung.com> wrote:
> Hello,
>
>
> On 02/18/2015 05:32 AM, Simon Glass wrote:
>>
>> Hi Przemyslaw,
>>
>> On 16 February 2015 at 08:21, Przemyslaw Marczak <p.marczak at samsung.com>
>> wrote:
>>>
>>> Hello,
>>>
>>>
>>> On 02/16/2015 04:13 PM, Przemyslaw Marczak wrote:
>>>>
>>>>
>>>> For ARM architecture, enable the CONFIG_USE_ARCH_MEMSET/MEMCPY,
>>>> will highly increase the memset/memcpy performance. This is able
>>>> thanks to the ARM multiple register instructions.
>>>>
>>>> Unfortunatelly the relocation is done without the cache enabled,
>>>> so it takes some time, but zeroing the BSS memory takes much more
>>>> longer, especially for the configs with big static buffers.
>>>>
>>>> A quick test confirms, that the boot time improvement after using
>>>> the arch memcpy for relocation has no significant meaning.
>>>> The same test confirms that enable the memset for zeroing BSS,
>>>> reduces the boot time.
>>>>
>>>> So this patch enables the arch memset for zeroing the BSS after
>>>> the relocation process. For ARM boards, this can be enabled
>>>> in board configs by defining: 'CONFIG_USE_ARCH_MEMSET'.
>>>>
>>>> This was tested on Trats2.
>>>> A quick test with trace. Boot time from start to main_loop() entry:
>>>> - ~1384ms - before this change
>>>> -  ~888ms - after this change
>>>>
>>>> Signed-off-by: Przemyslaw Marczak <p.marczak at samsung.com>
>>>> Cc: Albert Aribaud <albert.u.boot at aribaud.net>
>>>> Cc: Tom Rini <trini at ti.com>
>>>> ---
>>>>    arch/arm/lib/crt0.S | 10 +++++++++-
>>>>    1 file changed, 9 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/arm/lib/crt0.S b/arch/arm/lib/crt0.S
>>>> index 22df3e5..fab3d2c 100644
>>>> --- a/arch/arm/lib/crt0.S
>>>> +++ b/arch/arm/lib/crt0.S
>>>> @@ -115,14 +115,22 @@ here:
>>>>          bl      c_runtime_cpu_setup     /* we still call old routine
>>>> here
>>>> */
>>>>
>>>>          ldr     r0, =__bss_start        /* this is auto-relocated! */
>>>> -       ldr     r1, =__bss_end          /* this is auto-relocated! */
>>>>
>>>> +#ifdef CONFIG_USE_ARCH_MEMSET
>>>> +       ldr     r3, =__bss_end          /* this is auto-relocated! */
>>>> +       mov     r1, #0x00000000         /* prepare zero to clear BSS */
>>>> +
>>>> +       subs    r2, r3, r0              /* r2 = memset len */
>>>> +       bl      memset
>>>> +#else
>>>> +       ldr     r1, =__bss_end          /* this is auto-relocated! */
>>>>          mov     r2, #0x00000000         /* prepare zero to clear BSS */
>>>>
>>>>    clbss_l:cmp   r0, r1                  /* while not at end of BSS */
>>>>          strlo   r2, [r0]                /* clear 32-bit BSS word */
>>>>          addlo   r0, r0, #4              /* move to next */
>>>>          blo     clbss_l
>>>> +#endif
>>>>
>>>>          bl coloured_LED_init
>>>>          bl red_led_on
>>>>
>>>
>>> This commit left unchanged. After boot time test using oscilloscope and
>>> the
>>> clock cycle counter I didn't noticed a time difference in more then one
>>> ms.
>>> In this case I think that insert a duplicated code here, has no sense.
>>
>>
>> I don't understand this comment, sorry.
>>
>> Regards,
>> Simon
>>
>
> Sorry for the misleading message.
> When I send this patch set, I forgot about adding the message-id of the
> previous thread as "in-reply-to".
>
> There was a discussion about insert the memory zeroing routines as an asm
> here, instead of using the 'memset' call. But I tested that there is no
> difference in the performance. So in this case, it's better to use the
> common lib and this commit is the same as it was in the first version.
>
> (I missed the changelog)

I see, thanks.

Reviewed-by: Simon Glass <sjg at chromium.org>


More information about the U-Boot mailing list