[U-Boot] U-Boot proper(not SPL) relocate option
Joakim Tjernlund
Joakim.Tjernlund at infinera.com
Wed Nov 29 10:48:07 UTC 2017
On Wed, 2017-11-29 at 19:11 +0900, Masahiro Yamada wrote:
> Hi Simon,
>
>
> 2017-11-28 2:13 GMT+09:00 Simon Glass <sjg at chromium.org>:
> > (Tom - any thoughts about a more expansive cc list on this?)
> >
> > Hi Masahiro,
> >
> > On 26 November 2017 at 07:16, Masahiro Yamada
> > <yamada.masahiro at socionext.com> wrote:
> > > 2017-11-26 20:38 GMT+09:00 Simon Glass <sjg at chromium.org>:
> > > > Hi Philipp,
> > > >
> > > > On 25 November 2017 at 16:31, Dr. Philipp Tomsich
> > > > <philipp.tomsich at theobroma-systems.com> wrote:
> > > > > Hi,
> > > > >
> > > > > > On 25 Nov 2017, at 23:34, Simon Glass <sjg at chromium.org> wrote:
> > > > > >
> > > > > > +Tom, Masahiro, Philipp
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On 22 November 2017 at 03:27, Wolfgang Denk <wd at denx.de> wrote:
> > > > > > > Dear Kever Yang,
> > > > > > >
> > > > > > > In message <fd0bb500-80c4-f317-cc18-f7aaf1344fd8 at rock-chips.com> you wrote:
> > > > > > > >
> > > > > > > > > I can understand this feature: we always do dram_init_banks() first,
> > > > > > > > > then we relocate to a 'known' area, so there is no risk in accessing memory.
> > > > > > > > > I believe there must be some historical reason tied to some kind of device,
> > > > > > > > > and the relocate feature is a wonderful idea for it.
> > > > > > >
> > > > > > > > This is actually not so much a feature needed to support some
> > > > > > > > specific device (in this case much simpler approaches would be
> > > > > > > > possible), but to support a whole set of features. Unfortunately
> > > > > > > > these appear to get forgotten / ignored over time.
> > > > > > >
> > > > > > > > many other SoCs should be similar.
> > > > > > > > > - Without relocation we can save many steps; some of our customers
> > > > > > > > > really care about boot time.
> > > > > > > > > * no need to relocate everything
> > > > > > > > > * no need to copy all the code
> > > > > > > > > * no need to init the drivers more than once
> > > > > > >
> > > > > > > > Please have a look at the README, section "Memory Management".
> > > > > > > > The relocation is not done to any _fixed_ address, but the address
> > > > > > > > is actually computed at runtime, depending on a number of features
> > > > > > > > enabled (at least this is how it used to be - apparently little of
> > > > > > > > this is tested on a regular basis, so I would not be surprised if
> > > > > > > > things are broken today).
> > > > > > >
> > > > > > > The basic idea was to reserve areas of memory at the top of RAM,
> > > > > > > that would not be initialized / modified by U-Boot and Linux, not
> > > > > > > even across a reset / warm boot.
> > > > > > >
> > > > > > > > This was used for example for:
> > > > > > >
> > > > > > > - pRAM (Protected RAM) which could be used to store all kind of data
> > > > > > > (for example, using a pramfs [Protected and Persistent RAM
> > > > > > > Filesystem]) that could be kept across reboots of the OS.
> > > > > > >
> > > > > > > - shared frame buffer / video memory. U-Boot and Linux would be able
> > > > > > > to initialize the video memory just once (in U-Boot) and then
> > > > > > > > share it, maybe even across reboots. In particular, this would allow
> > > > > > > for a very early splash screen that gets passed (flicker free) to
> > > > > > > Linux until some Linux GUI takes over (much more difficult today).
> > > > > > >
> > > > > > > - shared log buffer: U-Boot and Linux used to use the same syslog
> > > > > > > buffer mechanism, so you could share it between U-Boot and Linux.
> > > > > > > > This allows, for example, to
> > > > > > > * read the Linux kernel panic messages after reset in U-Boot; this
> > > > > > > is very useful when you bring up a new system and Linux crashes
> > > > > > > before it can display the log buffer on the console
> > > > > > > * pass U-Boot POST results on to Linux, so the application code
> > > > > > > can read and process these
> > > > > > > * process the system log of the previous run (especially after a
> > > > > > > > panic) in Linux after it rebooted.
> > > > > > >
> > > > > > > etc.
> > > > > > >
> > > > > > > > There are a number of such features which require reserving room at
> > > > > > > > the top of RAM, the size of which is calculated at runtime, often
> > > > > > > > depending on user settable environment data.
> > > > > > > >
> > > > > > > > All this cannot be done without relocation to a (dynamically
> > > > > > > > computed) target address.
> > > > > > >
> > > > > > >
> > > > > > > Yes, the code could be simpler and faster without that - but then,
> > > > > > > you cut off a number of features.
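
As a rough illustration of the scheme described above (not U-Boot's actual
code), the relocation target can be computed by carving reservations off the
top of RAM. The struct, its field names, the example sizes and the 4 KiB
alignment are all assumptions made for this sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical inputs; in practice several of these would be derived at
 * runtime from user-settable environment data. */
struct reserve_cfg {
	uint64_t ram_base;     /* start of DRAM */
	uint64_t ram_size;     /* size of DRAM in bytes */
	uint64_t pram_kib;     /* protected RAM to keep untouched, in KiB */
	uint64_t logbuf_size;  /* shared log buffer, in bytes */
	uint64_t fb_size;      /* shared frame buffer, in bytes */
	uint64_t image_size;   /* U-Boot code + data, in bytes */
};

/* Round addr down to a power-of-two alignment. */
static uint64_t align_down(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & ~(align - 1);
}

/* Walk down from the top of RAM, reserving each area in turn; whatever
 * address is left after the last reservation is where the image goes. */
static uint64_t compute_reloc_addr(const struct reserve_cfg *cfg)
{
	uint64_t top = cfg->ram_base + cfg->ram_size;

	top -= cfg->pram_kib * 1024;        /* pRAM stays above everything */
	top -= cfg->logbuf_size;            /* shared log buffer */
	top = align_down(top, 4096);
	top -= cfg->fb_size;                /* shared frame buffer */
	top = align_down(top, 4096);
	top -= cfg->image_size;             /* room for the relocated image */
	return align_down(top, 4096);
}
```

With 256 MiB of RAM at 0x80000000, 1 MiB of pRAM, a 16 KiB log buffer, a
2 MiB frame buffer and a 512 KiB image, this yields 0x8FC7C000. Changing any
of the sizes moves the target, which is why it cannot be a fixed link-time
address.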
> > > > > >
> > > > > > I would be interested in seeing benchmarks showing the cost of
> > > > > > relocation in terms of boot time. Last time I did this was on Exynos 5
> > > > > > and it was some years ago. The time was pretty small provided the
> > > > > > cache was on for the memory copies associated with relocation itself.
> > > > > > Something like 10-20ms but I don't have the numbers handy.
> > > > > >
> > > > > > I think it is useful to be able to allocate memory in board_init_f()
> > > > > > for use by U-Boot for things like the display and the malloc() region.
> > > > > >
> > > > > > Options we might consider:
> > > > > >
> > > > > > 1. Don't relocate the code and data. Thus we could avoid the copy and
> > > > > > relocation cost. This is already supported with the GD_FLG_SKIP_RELOC
> > > > > > used when U-Boot runs as an EFI app
> > > > > >
> > > > > > 2. Rather than throwing away the old malloc() region, keep it around
> > > > > > > so existing allocated blocks work. Then the new malloc() region would be
> > > > > > used for future allocations. We could perhaps ignore free() calls in
> > > > > > that region
> > > > > >
> > > > > > 2a. This would allow us to avoid re-init of driver model in most cases
> > > > > > I think. E.g. we could init serial and timer before relocation and
> > > > > > leave them inited after relocation. We could just init the
> > > > > > 'additional' devices not done before relocation.
> > > > > >
> > > > > > 2b. I suppose we could even extend this to SPL if we wanted to. I
> > > > > > suspect it would just be a pain though, since SPL might use memory
> > > > > > that U-Boot wants.
> > > > > >
> > > > > > 3. We could turn on the cache earlier. This removes most of the
> > > > > > boot-time penalty. Ideally this should be turned on in SPL and perhaps
> > > > > > redone in U-Boot which has more memory available. If SPL is not used,
> > > > > > we could turn on the cache before relocation.
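
Option 2 above (keep the pre-relocation malloc() region and ignore free()
calls that point into it) could look roughly like the following. This is only
a sketch: struct early_arena and arena_aware_free() are hypothetical names,
and the standard free() stands in for the post-relocation allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical description of the old (pre-relocation) malloc() region. */
struct early_arena {
	uintptr_t start;  /* first byte of the region */
	uintptr_t end;    /* one past the last byte */
};

static bool in_early_arena(const struct early_arena *a, const void *p)
{
	uintptr_t addr = (uintptr_t)p;

	assert(a->start <= a->end);
	return addr >= a->start && addr < a->end;
}

/* Returns true if the block was handed to the real allocator, false if the
 * call was ignored (NULL, or a block that lives in the old region). */
static bool arena_aware_free(const struct early_arena *a, void *p)
{
	if (p == NULL || in_early_arena(a, p))
		return false;  /* old blocks stay valid and are never recycled */
	free(p);
	return true;
}
```

The cost of this approach is that memory in the old region is never
reclaimed, which is acceptable only if pre-relocation allocations are small.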
> > > > >
> > > > > Both turning on the cache and initialising the clocking could be of benefit
> > > > > to boot-time.
> > > > >
> > > > > However, the biggest possible gain will come from utilising Falcon mode
> > > > > to skip the full U-Boot stage and directly boot into the OS from SPL. This
> > > > > assumes that the drivers involved are fully optimised, so loading up the
> > > > > OS image does not take longer than necessary.
> > > >
> > > > I'd like to see numbers on that. From my experience, loading and
> > > > running U-Boot does not take very long...
> > > >
> > > > >
> > > > > > > 4. Rather than reserving memory in board_init_f() we could have it
> > > > > > > call malloc() from the expanded region. We could then perhaps
> > > > > > > move this reserve/allocate code into particular drivers or
> > > > > > > subsystems, and drop a good chunk of the init sequence. We would need
> > > > > > > to have a larger malloc() region than is currently the case.
> > > > > >
> > > > > > There are still some arch-specific bits in board_init_f() which make
> > > > > > these sorts of changes a bit tricky to support generically. IMO it
> > > > > > would be best to move to 'generic relocation' written in C, where all
> > > > > > archs work basically the same way, before attempting any of the above.
> > > > > >
> > > > > > Still, I can see some benefits and even some simplifications.
> > > > > >
> > > > > > Regards,
> > > > > > Simon
> > >
> > >
> > >
> > > This discussion should have happened.
> > > U-Boot boot sequence is crazily inefficient.
> > >
> > >
> > >
> > > When we talk about "relocation", two things are happening.
> > >
> > > [1] U-Boot proper copies itself to the very end of DRAM
> > > [2] Fix-up the global symbols
> > >
> > > In my opinion, only [2] is useful.
> > >
> > >
> > > SPL initializes the DRAM, so it knows the base and size of DRAM.
> > > SPL should be able to load the U-Boot proper to the final destination.
> > > So, [1] is unnecessary.
> > >
> > >
> > > [2] is necessary because SPL may load the U-Boot proper
> > > to a different place than CONFIG_SYS_TEXT_BASE.
> > > This feature is useful for platforms
> > > whose DRAM base/size is only known at run-time.
> > > (Of course, it should be user-configurable by CONFIG_RELOCATE
> > > or something.)
> > >
> > > Moreover, board_init_f() is unneeded -
> > > everything in board_init_f() is already done by SPL.
> > > Multiple-time DM initialization is really inefficient and ugly.
> > >
> > >
> > > The following is how the ideal boot loader would work.
> > >
> > >
> > > Requirement for U-Boot proper:
> > > U-Boot never changes the location by itself.
> > > So, SPL or a vendor loader must load U-Boot proper
> > > to the final destination directly.
> > > (You can load it to the very end of DRAM if you like,
> > > but the actual place does not matter here.)
> > >
> > >
> > > Boot sequence of U-Boot proper:
> > > If CONFIG_RELOCATE (or something) is enabled,
> > > it fixes the global symbols at the very beginning
> > > of the boot.
> > > (In this case, CONFIG_SYS_TEXT_BASE can be arbitrary)
> > >
> > > That's it. Proceed to the rest of init code.
> > > (= board_init_r)
> > > board_init_f() is unnecessary.
> > >
> > > This should work for recent platforms.
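
The "fix the global symbols" step proposed above amounts to adding the
difference between the load address and the link address to every stored
absolute address. A minimal sketch, assuming a simplified table of relative
relocations (struct rel_entry and fixup_relative() are hypothetical names,
not U-Boot's actual relocation code or any real ELF record layout):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified relocation record: the offset (from the start of
 * the image) of a pointer-sized slot holding a link-time absolute address. */
struct rel_entry {
	size_t offset;
};

/* Patch every listed slot so that it points into the image at its actual
 * load address instead of the address it was linked for. */
static void fixup_relative(uint8_t *image, size_t image_size,
			   uintptr_t link_base,
			   const struct rel_entry *rel, size_t count)
{
	/* Unsigned wrap-around is fine: the delta is applied modulo 2^N. */
	uintptr_t delta = (uintptr_t)image - link_base;
	size_t i;

	for (i = 0; i < count; i++) {
		uintptr_t value;

		assert(rel[i].offset + sizeof(value) <= image_size);
		memcpy(&value, image + rel[i].offset, sizeof(value));
		value += delta;
		memcpy(image + rel[i].offset, &value, sizeof(value));
	}
}
```

Nothing is copied here: the image stays where the loader put it, and only the
listed slots are rewritten, so the link address (CONFIG_SYS_TEXT_BASE in the
proposal) can be arbitrary.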
> >
> > Yes that sounds reasonable to me.
> >
> > We could do the symbol fixup/relocation in SPL after loading U-Boot,
> > although that would probably push us to using ELF format for U-Boot
> > which is a bit limited.
> >
> > Still I think the biggest performance improvement comes from turning
> > on the cache in SPL. So the above is a simplification, not really a
> > speed-up.
>
>
> Right.
> I am more interested in simplification than in speed-up.
> The boot speed is not a significant problem at least for my boards.
>
>
> > >
> > >
> > >
> > > We should think about old platforms that boot from a NOR flash or something.
> > > There are two solutions:
> > > - execute-in-place: run the code in the flash directly
> > > - use SPL (common/spl/spl-nor.c) if you want to run
> > > it from RAM
> >
> > This seems like a big regression in functionality. For example for x86
> > 32-bit we currently don't have an SPL (we do for 64-bit). So I think
> > this means that everything would be forced to have an SPL?
>
> After a grace period for migration, yes.
> XIP or SPL.
> No relocation in U-Boot proper.
>
> This assumption will allow us to dump a lot of burden.
>
> Remove relocation
> Remove board_init_f()
> Remove pre-reloc DM init
> Perhaps, remove struct global_data
> etc.
I have not managed to keep up with this discussion, but it seems you are suggesting
some radical change for NOR-based boot boards?
We use such boards (ppc) and also use pram etc. Would these still
work?
Jocke