[U-Boot] [PATCH v3 1/6] spl: add Kconfig option to clear bss early

Simon Goldschmidt simon.k.r.goldschmidt at gmail.com
Wed May 22 19:42:34 UTC 2019


Simon Glass <sjg at chromium.org> schrieb am Mi., 22. Mai 2019, 21:34:

> Hi Simon,
>
> On Wed, 22 May 2019 at 02:05, Simon Goldschmidt
> <simon.k.r.goldschmidt at gmail.com> wrote:
> >
> > On Wed, May 22, 2019 at 2:53 AM Simon Glass <sjg at chromium.org> wrote:
> > >
> > > Hi Andreas,
> > >
> > > On Tue, 21 May 2019 at 15:01, Andreas Dannenberg <dannenberg at ti.com>
> wrote:
> > > >
> > > > Hi Simon (Glass),
> > > >
> > > > On Sat, May 18, 2019 at 10:08:19AM -0600, Simon Glass wrote:
> > > > > Hi Andreas,
> > > > >
> > > > > On Mon, 6 May 2019 at 22:49, Andreas Dannenberg <dannenberg at ti.com>
> wrote:
> > > > > >
> > > > > > Hi Simon,
> > > > > >
> > > > > > On Mon, May 06, 2019 at 09:51:56PM -0600, Simon Glass wrote:
> > > > > > > Hi Andreas,
> > > > > > >
> > > > > > > On Fri, 3 May 2019 at 14:25, Andreas Dannenberg <
> dannenberg at ti.com> wrote:
> > > > > > > >
> > > > > > > > Simon,
> > > > > > > >
> > > > > > > > On Sat, Mar 30, 2019 at 09:18:08PM +0100, Simon Goldschmidt
> wrote:
> > > > > > > > > Simon Glass <sjg at chromium.org> schrieb am Sa., 30. März
> 2019, 21:06:
> > > > > > > > >
> > > > > > > > > > Hi Simon,
> > > > > > > > > >
> > > > > > > > > > On Wed, 27 Mar 2019 at 13:40, Simon Goldschmidt
> > > > > > > > > > <simon.k.r.goldschmidt at gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > This introduces a new Kconfig option SPL_CLEAR_BSS_F.
> If enabled, it
> > > > > > > > > > clears
> > > > > > > > > > > the bss before calling board_init_f() instead of
> clearing it before
> > > > > > > > > > calling
> > > > > > > > > > > board_init_r().
> > > > > > > > > > >
> > > > > > > > > > > This also ensures that variables placed in BSS can be
> shared between
> > > > > > > > > > > board_init_f() and board_init_r() in SPL.
> > > > > > > > > > >
> > > > > > > > > > > Such global variables are used, for example, when
> loading things from FAT
> > > > > > > > > > > before SDRAM is available: the full heap required for
> FAT uses global
> > > > > > > > > > > variables and clearing BSS after board_init_f() would
> reset the heap
> > > > > > > > > > state.
> > > > > > > > > > > An example for such a usage is socfpa_arria10 where an
> FPGA configuration
> > > > > > > > > > > is required before SDRAM can be used.
> > > > > > > > > > >
> > > > > > > > > > > Make the new option depend on ARM for now until more
> implementations
> > > > > > > > > > follow.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I still have objections to this series and I think we
> should discuss
> > > > > > > > > > other ways of solving this problem.
> > > > > > > > > >
> > > > > > > > > > Does socfgpa have SRAM that could be used before SDRAM
> is available?
> > > > > > > > > > If so, can we not use that for the configuration? What
> various are
> > > > > > > > > > actually in BSS that are needed before board_init_r() is
> called? Can
> > > > > > > > > > they not be in a struct created from malloc()?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > The problem is the board needs to load an FPGA
> configuration from FAT
> > > > > > > > > before SDRAM is available. Yes, this is loaded into SRAM
> of course, but the
> > > > > > > > > whole code until that is done uses so many malloc/free
> iterations that The
> > > > > > > > > simple mall of implementation would require too much
> memory.
> > > > > > > > >
> > > > > > > > > And it's the full malloc state variables only that use
> BSS, not the FAT
> > > > > > > > > code.
> > > > > > > >
> > > > > > > > I've actually faced very similar issues working on our TI
> AM654x "System
> > > > > > > > Firmware Loader" implementation (will post upstream soon),
> where I need
> > > > > > > > to load this firmware and other files from media such as
> MMC/FAT in a very
> > > > > > > > memory-constrained SPL pre-relocation environment *before* I
> can bring up
> > > > > > > > DDR.
> > > > > > > >
> > > > > > > > Initially, I modified the fat.c driver to re-use memory so
> it is not as
> > > > > > > > wasteful during SYS_MALLOC_SIMPLE. While I'm not proud of
> this solution [1]
> > > > > > > > this allowed us to get going, allowing to load multiple
> files without
> > > > > > > > issues in pre-relocation SPL.
> > > > > > >
> > > > > > > That seems to point the way to a useful solution I think. We
> could
> > > > > > > have a struct containing allocated pointers which is private
> to FAT,
> > > > > > > and just allocate them the first time.
> > > > > >
> > > > > > The board_init_f()-based loader solution we use extends beyond
> MMC/FAT,
> > > > > > but also for OSPI, X/Y-Modem, and (later) USB, network, etc.
> > > > > >
> > > > > > Background:
> > > > > > On our "TI K3" devices we need to do a whole bunch of stuff
> before
> > > > > > DDR is up with limited memory, namely loading and installing a
> firmware
> > > > > > that controls the entire SoC called "System Firmware". It is
> only after
> > > > > > this FW is loaded from boot media and successfully started that
> I can
> > > > > > bring up DDR. So all this is done in SPL board_init_f(), which
> as the
> > > > > > last step brings up DDR.
> > > > > >
> > > > > > Not having BSS available to carry over certain state to the
> > > > > > board_init_r() world would lead to a bunch of hacky changes
> across
> > > > > > the board I'm afraid, more below.
> > > > >
> > > > > This is really unfortunate.
> > > > >
> > > > > It seems to me that we have two choises:
> > > > >
> > > > > 1. Hack around with board_init_f() such as to remove the
> distinction
> > > > > between this and board_init_r().
> > > > >
> > > > > 2. Enter board_init_r() without DRAM ready, and deal with setting
> it up there.
> > > > >
> > > > > I feel that the second solution is worth exploring. We could have
> some
> > > > > board-specific init in board_init_r(). We already have
> > > > > spl_board_init() so perhaps we could have spl_early_board_init()
> > > > > called right near the top?
> > > > >
> > > > > We can refactor a few of the functions in spl/spl.c so they can be
> > > > > called from board-specific code if necessary. We could also add new
> > > > > flags to global_data to control the behaviour of the SPL code, and
> the
> > > > > board code could set these.
> > > >
> > > > Let me explore this option. I can probably make something work but I
> > > > don't yet see how to do it cleanly, maybe it becomes clearer once I
> put
> > > > some code down. Currently by definition board_init_r() has DDR
> > > > available, and much of the code is geared towards it (for example the
> > > > calling of spl_relocate_stack_gd() before entering board_init_r()
> which
> > > > will already switch over to DDR).
> > > >
> > > > Also, and not to discourage that we can't improve upon the status
> quo,
> > > > but there is already a ton of boards using such an "early BSS"
> scheme...
> > > >
> > > > $ git grep --show-function 'memset.*bss' | grep board_init_f | wc -l
> > > > 35
> > >
> > > Yes I know :-(
> > >
> > > We should migrate these boards to use the generic SPL framework.
> >
> > socfpga_gen5 is one of the architectures listed here. I'm not even sure
> > whether that's actually needed. However, it's hard to test, isn't it? How
> > do you actually tell BSS isn't used before entering board_init_r?
>
> One way might be to link the SPL code without the call to
> board_init_r() and then check the map to make sure BSS is empty.
>

That would be worth a try.


> >
> > To be sure, we'd need to initialize unused memory to some magic
> > constant and check that it has been left untouched later (on boards
> > where BSS is available in board_init_r and remains in place when
> > moving on).
>
> I prefer a build-time check. We might even be able to automate it.
>

Of course a build-time check is better here than a runtime check. I just
couldn't come up with one.

I think it would be really beneficial to add such a check to all boards so
we know what we're discussing: I'll bet there are many of them just using
bss early because No one noticed...

And in board-local code, that could even be ok, it's just not ok for code
shared with boards not having access to bss early.

Regards,
Simon


> Regards,
> Simon
>
> >
> > Regards,
> > Simon
> >
> > >
> > > >
> > > > > > > I wonder if that would be enough for
> > > > > > >
> > > > > > > >
> > > > > > > > In the quest of creating something more upstream-friendly I
> had then
> > > > > > > > switched to using full malloc in pre-relocation SPL so that
> I didn't
> > > > > > > > have to hack the FAT driver, encountering similar issues
> like you
> > > > > > > > brought up and got this working, but ultimately abandoned
> this
> > > > > > > > approach after bundling all files needed to get loaded into
> a single
> > > > > > > > image tree blob which no longer required any of those
> solutions.
> > > > > > > >
> > > > > > > > What remained till today however is a need to preserve
> specific BSS
> > > > > > > > state from pre-relocation SPL over to post-relocation SPL
> environment,
> > > > > > > > namely flags set to avoid the (expensive) re-probing of
> peripheral
> > > > > > > > drivers by the SPL loader. For that I introduced a Kconfig
> option that
> > > > > > > > allows skipping the automatic clearing of BSS during
> relocation [2].
> > > > > > > >
> > > > > > > > Seeing this very related discussion here got me thinking
> about how else
> > > > > > > > I can carry over this "state" from pre- to post relocation
> but that's
> > > > > > > > probably a discussion to be had once I post my "System
> Firmware Loader
> > > > > > > > Series", probably next week.
> > > > > > >
> > > > > > > Since this is SPL I don't you mean 'relocation' here. I think
> you mean
> > > > > > > board_init_f() to board_init_r()?
> > > > > >
> > > > > > Yes that's what I mean. AFAIK relocation in SPL is still called
> relocation
> > > > > > from what I have seen working on U-boot, it just relocates gd
> and stack
> > > > > > but not the actual code (personally I find it misleading calling
> what SPL
> > > > > > does "relocation", but I got used to it).
> > > > > >
> > > > > > > You can use global_data to store state,
> > > > > >
> > > > > > I thought the idea was to stay away from gd, so we can
> eventually get
> > > > > > rid of it altogether?
> > > > >
> > > > > Not that I know of. It is how we communicate state before we have
> BSS.
> > > >
> > > > Oh ok I see now, I guess I have taken the comment [1] from
> > > > arch/arm/lib/spl.c out of context.
> > > >
> > > > [1]
> https://github.com/u-boot/u-boot/blob/master/arch/arm/lib/spl.c#L22
> > >
> > > Ah yes. That is referring to putting global_data in the data section.
> > > Perhaps we can delete that code now and see what breaks?
> > >
> > > >
> > > > > It is how we communicate state before we have BSS.
> > > >
> > > > Understood.
> > > >
> > > > > > > or malloc() to allocate memory and put things there.
> > > > > >
> > > > > > The challenge with potentially having to incorporate such a
> custom
> > > > > > solution for state preservation into several drivers as explained
> > > > > > earlier (SPI, USB, network, etc.) is that it does not appear to
> scale
> > > > > > well. I think using BSS instead would make all those additions
> cleaner
> > > > > > and simpler.
> > > > > >
> > > > > > > But using BSS seems wrong to me.
> > > > > >
> > > > > > I've seen you saying this a few times :)
> > > > > > Why?
> > > > >
> > > > > Driver model does its own allocate of memory and this is all
> attached
> > > > > to the DM structures.
> > > > >
> > > > > Drivers themselves cannot assume they are the only instance
> running,
> > > > > so data should be attached to their private-data pointers.
> Similarly
> > > > > for uclasses, if we put everything in the uclass-private data,
> then we
> > > > > don't need BSS and don't have any problems dealing with whether it
> is
> > > > > available yet. In general, BSS creates a lot of problems early in
> > > > > U-Boot's execution, and we don't actually need to use it.
> > > >
> > > > Yes, understood. The DM concept with private-data pointers is quite
> > > > clean from an OOP point of view.
> > > >
> > > > > If you look at the DM design you'll see that we try to avoid
> malloc()
> > > > > and BSS as much as possible. I suppose this series is another
> example
> > > > > of why :-)
> > > > >
> > > > > >
> > > > > > > If you are doing something in board_init_f() in SPL that needs
> BSS,
> > > > > > > can you not just move that code to board_init_r()?
> > > > > >
> > > > > > I need to access media drivers in board_init_f(), for which
> currently
> > > > > > I'm using BSS so I can preserve some limited state, such as that
> the
> > > > > > peripheral init was done, so that it doesn't get re-done by the
> actual
> > > > > > SPL loader later. board_init_r() requires DDR to be available
> which I
> > > > > > can't use without doing all that work in board_init_f() first to
> > > > > > load/start the system controller firmware, so it's a bit of a
> chicken
> > > > > > and egg issue here.
> > > > >
> > > > > Let's try moving the egg into board_init_r()  and putting the
> chicken
> > > > > after it, as mentioned above.
> > > >
> > > > Well I'll give this a shot.
> > > >
> > > > > >
> > > > > > My system firmware loader patch series is about ready and I was
> planning
> > > > > > on posting it tomorrow. How about with the entire approach being
> in the
> > > > > > open we use this as an opportunity to re-look at potential
> alternative
> > > > > > solutions...
> > > > >
> > > > > Sure. I hope I've explained my POV above.
> > > >
> > > > Yes thanks for the review and comments.
> > >
> > > You're welcome, and good luck with it.
> > >
> > > Regards,
> > > Simon
>


More information about the U-Boot mailing list