[U-Boot] fdt performance
Simon Glass
sjg at chromium.org
Sun Jan 26 17:56:39 CET 2014
Hi Aaron,
On 13 January 2014 23:13, Aaron Williams
<Aaron.Williams at caviumnetworks.com> wrote:
> Hi Simon,
>
> Sorry for the long delay.
>
>
>
> On 10/17/2013 03:27 PM, Simon Glass wrote:
>>
>> Hi Aaron,
>>
>> On Thu, Oct 17, 2013 at 12:24 AM, Aaron Williams
>> <Aaron.Williams at caviumnetworks.com> wrote:
>>>
>>> Hi all,
>>>
>>> In our bootloader based off of 2013.07 we make extensive use of the flat
>>> device tree. In profiling our bootloader in our simulator I found that the
>>> function eating up the most time is fdt_next_tag. Looking at it, especially
>>> fdt_offset_ptr, it looks like there is a lot of room for improvement,
>>> especially in the skip name section.
>>>
>>> Some of the checks in fdt_offset_ptr also look useless, such as
>>> if ((offset + len) < offset), which will always be false, or
>>> if (p + len < p)
>>>
>>> len is always positive.
>>
>> Are you using CONFIG_OF_CONTROL?
>>
>> If so, as a higher-level point, we could bring in an efficient DT
>> library, which converts the FDT into a tree structure for faster
>> parsing. I can point you to a starting point if you like.
>>
>> Regards,
>> Simon
>
> A higher-level point is not desirable since when we are experiencing the
> performance issues we are running out of NOR flash or our simulator. Since
> most of our customers use NOR flash this is a huge issue for us. We have very
> little memory available for holding data structures since basically only the
> stack is available before relocation.
>
> Taking out these checks significantly sped up our boot process.
>
> If you're checking for a wrap-around it should not check for each byte but
> should check only once if it will wrap and handle it accordingly. If we're
> wrapping then the device tree is hosed and we have bigger problems.
Are you scanning through the FDT multiple times before relocation?

Certainly libfdt is designed to be careful about things and there are
many checks. Are you suggesting adding some kind of CONFIG option to
turn them off?

I'm having a hard time understanding why these simple checks (which
would expand to a few machine instructions) should take so long. Have
you confirmed that removing them significantly speeds things up on real
hardware, and that it is not just an artifact of your profiling system?

It is certainly possible to pass a negative number as the device tree
offset to any of the exported functions. This should result in correct
behaviour (returning an error) rather than a crash.
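To illustrate the point about negative offsets, here is a minimal sketch
of a bounds-checked accessor. It is not libfdt's actual code: struct blob
and blob_offset_ptr are names made up for the example, and it assumes the
blob is smaller than 2 GiB. A negative offset converts to a very large
unsigned value, so either the sum wraps and the first check fires, or the
sum lands far past the end and the second check fires.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a flattened blob; not libfdt's real layout. */
struct blob {
	const unsigned char *data;
	uint32_t size;		/* number of valid bytes in data */
};

/*
 * Return a pointer to len bytes at offset, or NULL if the range does not
 * lie entirely inside the blob. Assumes size is below 2^31.
 */
static const void *blob_offset_ptr(const struct blob *b, int offset,
				   uint32_t len)
{
	uint32_t uoffset = (uint32_t)offset;	/* negative -> very large */
	uint32_t end = uoffset + len;		/* may wrap modulo 2^32 */

	if (end < uoffset)	/* sum wrapped around */
		return NULL;
	if (end > b->size)	/* range runs past the end of the blob */
		return NULL;
	return b->data + uoffset;
}
```

With a 16-byte blob, blob_offset_ptr(&b, -4, 8) returns NULL: the sum
wraps to 4, which is less than the converted offset, so the wrap check is
the one that rejects the call.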
Of course anything that speeds up the code is welcome so long as it is
still correct.
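As a rough sketch of what "check once rather than for each byte" could
look like for skipping a NUL-terminated name, compare the two helpers
below. Both names are hypothetical and neither is a libfdt function. The
second one validates the start offset once and then lets the standard
memchr() search only the bytes that remain, so it cannot run off the end
of the buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Per-byte variant: the bounds check is repeated for every character.
 * Returns the offset just past the terminating NUL, or -1 on error.
 */
static long skip_name_per_byte(const unsigned char *buf, size_t size,
			       size_t offset)
{
	for (;;) {
		if (offset >= size)	/* checked on every iteration */
			return -1;
		if (buf[offset++] == '\0')
			return (long)offset;
	}
}

/*
 * Single-check variant: validate the start once, then bound the search
 * by the number of bytes left in the buffer.
 */
static long skip_name_once(const unsigned char *buf, size_t size,
			   size_t offset)
{
	const unsigned char *nul;

	if (offset >= size)
		return -1;
	nul = memchr(buf + offset, '\0', size - offset);
	if (!nul)		/* name is not terminated inside the buffer */
		return -1;
	return (long)(nul - buf) + 1;
}
```

Both helpers return the same result for the same input, including the
error case where the name is not terminated inside the buffer. Whether
the second form is actually faster would need measuring on the target
hardware.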
Regards,
Simon
>
>
> -Aaron
>
> --
> Aaron Williams
> Software Engineer
> Cavium, Inc.
> (408) 943-7198 (510) 789-8988 (cell)
>