[U-Boot] [PATCH 0/3] arm: reduce .bss section clear time
Pantelis Antoniou
panto at antoniou-consulting.com
Wed Jan 28 15:34:23 CET 2015
Hi Przemyslaw,
> On Jan 28, 2015, at 16:30 , Przemyslaw Marczak <p.marczak at samsung.com> wrote:
>
> Hello,
>
> On 01/28/2015 03:18 PM, Pantelis Antoniou wrote:
>> Hi Przemyslaw,
>>
>>> On Jan 28, 2015, at 16:10 , Przemyslaw Marczak <p.marczak at samsung.com> wrote:
>>>
>>> Hello Stefan,
>>>
>>> On 01/28/2015 02:12 PM, Stefan Roese wrote:
>>>> Hi Przemyslaw,
>>>>
>>>> On 28.01.2015 13:55, Przemyslaw Marczak wrote:
>>>>> This patchset reduces the boot time for the ARM architecture,
>>>>> Exynos boards, and boards with DFU enabled (ARM).
>>>>>
>>>>> For the tested Trats2 device, this was done in three steps.
>>>>>
>>>>> The first was to enable the arch memcpy and memset.
>>>>> The second was to use memset for the .bss clear.
>>>>> The third, to shorten this operation further, is to keep the .bss
>>>>> section as small as possible.
>>>>>
>>>>> The .bss section will grow if we have a lot of static variables.
>>>>> This section is cleared before the jump to the relocated U-Boot,
>>>>> and it's done word by word. To reduce the time for this step,
>>>>> we can enable arch memset, which uses multiple ARM registers.
>>>>>
>>>>> For configs with DFU enabled, we can find the dfu buffer in this section,
>>>>> which is at least 8MB (32MB for trats2). This is a lot of useless data,
>>>>> which is not required for standard boot. So this buffer should be
>>>>> dynamically allocated.
>>>>>
>>>>> Przemyslaw Marczak (3):
>>>>> exynos: config: enable arch memcpy and arch memset
>>>>> arm: relocation: clear .bss section with arch memset if defined
>>>>> dfu: mmc: file buffer: remove static allocation
>>>>>
>>>>> arch/arm/lib/crt0.S | 10 +++++++++-
>>>>> drivers/dfu/dfu_mmc.c | 25 ++++++++++++++++++++++---
>>>>> include/configs/exynos-common.h | 3 +++
>>>>> 3 files changed, 34 insertions(+), 4 deletions(-)
>>>>
>>>> Looking at the commit messages of this patchset I can conclude that your
>>>> overall boot time reduction is:
>>>>
>>>> from ~1527ms
>>>> to ~464ms
>>>>
>>>> This is amazing! Congrats. :)
>>>>
>>>
>>> Thank you. I was also amazed.
>>>
>>> The time results are taken from the clock cycle counter, so I think they are reliable. Some day I would like to check them using an oscilloscope.
>>>
>>>> We really should in general make more use of the optimized functions and
>>>> take care that the buffers (e.g. the DFU buffer in this case) are used
>>>> in a sane way.
>>>>
>>>> Thanks,
>>>> Stefan
>>>>
>>>>
>>>
>>> Yes, you're right. I thought the Exynos config had the arch memcpy/memset lib enabled, until I checked this…
>>>
>>
>> Those numbers are indeed incredible; I suppose the caches are disabled?
>>
>>
>
> The caches are enabled after the relocation, in one of board_init_r calls.
>
How big is this .bss section? Are we really talking about 1.5 secs to clear
a few MBs of memory? That is horrible.
Even in the optimized case, 0.5 secs is too much.
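
To put rough numbers on that, here is a minimal sketch of the implied clear
throughput. It assumes the .bss is about 32 MiB (the trats2 DFU buffer size
quoted above) and treats the ~1527 ms and ~464 ms figures as time spent
clearing it; both are assumptions, and the helper name is only illustrative.

```c
/*
 * Implied clear throughput in MiB/s for a region of `size_mib` MiB
 * that takes `time_ms` milliseconds to zero.
 * Hypothetical helper for back-of-the-envelope arithmetic only;
 * it is not part of U-Boot.
 */
static double clear_rate_mib_per_s(double size_mib, double time_ms)
{
    return size_mib / (time_ms / 1000.0);
}
```

Under those assumptions, clear_rate_mib_per_s(32.0, 1527.0) comes out near
21 MiB/s and clear_rate_mib_per_s(32.0, 464.0) near 69 MiB/s.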
>>> Best regards,
>>> --
>>> Przemyslaw Marczak
>>> Samsung R&D Institute Poland
>>> Samsung Electronics
>>> p.marczak at samsung.com
>>
>> Regards
>>
>> — Pantelis
>>
>>
>
> Best regards,
> --
> Przemyslaw Marczak
> Samsung R&D Institute Poland
> Samsung Electronics
> p.marczak at samsung.com
Regards
— Pantelis