[U-Boot] [PATCH 16/22] x86: board_f: Adjust x86 boot order for performance

Simon Glass sjg at chromium.org
Mon Jan 5 18:33:31 CET 2015


Hi,

On 5 January 2015 at 10:22, Graeme Russ <gruss at tss-engineering.com> wrote:
> Hi Simon & Bin,
>
> On Tue, Jan 6, 2015 at 12:40 AM, Bin Meng <bmeng.cn at gmail.com> wrote:
>>
>> Hi Simon,
>>
>> On Mon, Jan 5, 2015 at 9:44 AM, Simon Glass <sjg at chromium.org> wrote:
>> > Hi Bin,
>> >
>> > On 3 January 2015 at 20:26, Bin Meng <bmeng.cn at gmail.com> wrote:
>> >> Hi Simon,
>> >>
>> >> On Fri, Jan 2, 2015 at 6:23 AM, Simon Glass <sjg at chromium.org> wrote:
>> >>> Hi Bin,
>> >>>
>> >>> On 30 December 2014 at 22:51, Bin Meng <bmeng.cn at gmail.com> wrote:
>> >>>>
>> >>>> Hi Simon,
>> >>>>
>> >>>> On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg at chromium.org>
>> >>>> wrote:
>> >>>> > For bare platforms we turn off ROM-caching before calling
>> >>>> > board_init_f_r()
>> >>>> > It is then very slow to copy U-Boot from ROM to RAM. So adjust the
>> >>>> > order so
>> >>>> > that the copying happens before we turn off ROM-caching.
>> >>>> >
>> >>>> > Signed-off-by: Simon Glass <sjg at chromium.org>
>> >>>> > ---
>> >>>> >
>> >>>> >  common/board_f.c | 8 +++++---
>> >>>> >  1 file changed, 5 insertions(+), 3 deletions(-)
>> >>>> >
>> >>>> > diff --git a/common/board_f.c b/common/board_f.c
>> >>>> > index 98c9c72..1b65575 100644
>> >>>> > --- a/common/board_f.c
>> >>>> > +++ b/common/board_f.c
>> >>>> > @@ -983,6 +983,11 @@ static init_fnc_t init_sequence_f[] = {
>> >>>> >         INIT_FUNC_WATCHDOG_RESET
>> >>>> >         reloc_fdt,
>> >>>> >         setup_reloc,
>> >>>> > +#ifdef CONFIG_X86
>> >>>> > +       copy_uboot_to_ram,
>> >>>> > +       clear_bss,
>> >>>> > +       do_elf_reloc_fixups,
>> >>>> > +#endif
>> >>>> >  #if !defined(CONFIG_ARM) && !defined(CONFIG_SANDBOX)
>> >>>> >         jump_to_copy,
>> >>>> >  #endif
>> >>>> > @@ -1042,9 +1047,6 @@ void board_init_f(ulong boot_flags)
>> >>>> >   */
>> >>>> >  static init_fnc_t init_sequence_f_r[] = {
>> >>>> >         init_cache_f_r,
>> >>>> > -       copy_uboot_to_ram,
>> >>>> > -       clear_bss,
>> >>>> > -       do_elf_reloc_fixups,
>> >>>> >
>> >>>> >         NULL,
>> >>>> >  };
>> >>>> > --
>
>
> Wow, doesn't this bring back some memories. I've had a look over this code
> as it is now and it took a while to sink in. Things have moved on in the
> past 2 years :)

Nice to hear from you again!

>
>>
>> >>>>
>> >>>> I don't understand. Isn't the init_cache_f_r() which turns on the
>> >>>> cache?
>> >>>>
>> >>>
>> >>> Yes it turns on the cache, but turns off the ROM cache (they are
>> >>> different things). So this ordering is much faster. I think I might
>> >>> have more tuning to do though.
>> >>>
>> >>
>> >> Got it. Since we moved these 3 routines from init_sequence_f_r[] to
>> >> init_sequence_f[], how about we remove the whole init_sequence_f_r[]
>> >> completely? If not possible, the comment block above
>> >> init_sequence_f_r[] needs to be fixed?
>> >
>> > I'll remove the comment.
>
>
> I think init_sequence_f_r can go... but I need to have a better look at the
> code
>
> If turning off the ROM cache by init_cache_f_r is the problem, then perhaps
> the following order would be better:
>
>   copy_uboot_to_ram,
>   init_cache_f_r,
>   clear_bss,
>   do_elf_reloc_fixups,
>
> Without enabling the CPU's cache, clear_bss and do_elf_reloc_fixups will
> suffer.

Better would be to have init_cache_f_r after all this I think.

>
>> >
>> >>
>> >>  *
>> >>  * The '_f_r' sequence must, as a minimum, copy U-Boot to RAM (if
>> >>  * supported).  It _should_, if possible, copy global data to RAM and
>> >>  * initialise the CPU caches (to speed up the relocation process)
>> >>  *
>> >>  * NOTE: At present only x86 uses this route, but it is intended that
>> >>  * all archs will move to this when generic relocation is implemented.
>> >>  */
>> >>
>> >> So looks that we should either drop this _f_r stage, or make other
>> >> arch use this _f_r?
>> >
>> > I think we should move to generic relocation, meaning that all archs
>> > do relocation the same way with the same code. Then only arch-specific
>> > stuff is the then ELF fixup function, which adjusts a relocation site,
>> > and the code to actually change the stack pointer.
>
>
> This was always my plan - have arch specific do_reloc_fixups and the rest
> would be generic
>
>>
>> >
>> > This board_init_f_r() code is part of one attempt to do this - the
>> > premise was that turning the cache on before relocating U-Boot will
>> > save time. That's true, but it would be even better to turn the cache
>> > on much earlier. With pit (Chromebook 2) we turn on the cache in SPL.
>> > So I think turning it on here is too late if we are trying to save
>> > time. Still it is a good start and if we could make it happen
>> > generally it would be nice.
>
>
> And now you have me thinking board_init_f_r is not needed at all...
>
> If we can setup the stack, clear BSS, copy U-Boot to RAM and perform
> relocation fixups while running from ROM, what is left for board_init_f_r to
> do?
>
> The only thing I can think of is the caveats mentioned in the comment
> ('static' variables are read-only / Global Data (gd->xxx) is read/write).
> But aren't these true when running from ROM?
>
> Looking at non-x86 arches, relocate_code() looks to do what
> copy_uboot_to_ram and clear_bss does in x86 land (not sure about
> do_elf_reloc_fixups - seems to be arch specific as to whether
> relocate_code() does this or it is done later (hence the
> CONFIG_NEEDS_MANUAL_RELOC #define?)

Yes this can be unified. There is still something in there though for
board_init_f_r(), at least as things are now. It just so happens on
x86 that we are running from ROM and CAR so as I understand it we
sort-of have the cache on before relocation. That doesn't apply for
coreboot though, so there is probably an optimisation to be made
somewhere.

>
>> >
>> > See here for my original attempt, which was tied up with generic
>> > board. In the end I untied them and we got one but not the other.
>> >
>> > http://lists.denx.de/pipermail/u-boot/2012-February/118409.html
>> >
>
>
> Ah, generic relocation... I really wish my life hadn't taken a hard-left
> turn when it did. We got so close.
>
> From where I'm looking (fresh eyes - I might be missing something big) we
> should be able to do the ROM->RAM copy, BSS clearing, and relocation fixups
> in board_init_f.
>
> When we return from board_init_f it should be a fairly simple matter of:
>  - Copying the contents of the Global Data structure from CAR to RAM (or
> from RAM to RAM if your on a platform which initialises RAM before U-Boot)
>  - Set the gd pointer (reserved register) to point to the new copy
>  - Figure out where board_init_r is and jump to it
>
> board_init_r should be able to do any remaining cache tweaks. If cache
> tweaks cannot be done while executing from RAM then they need to be done in
> board_init_f
>
> I just cannot see the point of board_init_f_r any more

Yes, it's hard to see, I'm not sure either.

Anyway I'm going to apply this patch while we figure out what further
work can be done.

Regards,
Simon


More information about the U-Boot mailing list