Requiring SPL_DM for new boards?

Simon Glass sjg at chromium.org
Tue Nov 8 00:36:17 CET 2022


Hi,

On Tue, 1 Nov 2022 at 19:54, Samuel Holland <samuel at sholland.org> wrote:
>
> Hi all,
>
> Thanks for CCing me, Andre.
>
> On 11/1/22 13:15, Andre Przywara wrote:
> > On Mon, 31 Oct 2022 15:43:01 -0400
> > Tom Rini <trini at konsulko.com> wrote:
> >
> > Hi Tom, Simon,
> >
> >> On Mon, Oct 31, 2022 at 01:27:06PM -0600, Simon Glass wrote:
> >>> Hi Tom,
> >>>
> >>> On Sun, 30 Oct 2022 at 11:53, Tom Rini <trini at konsulko.com> wrote:
> >>>>
> >>>> On Sat, Oct 29, 2022 at 07:44:01PM -0600, Simon Glass wrote:
> >>>>> Hi Tom,
> >>>>>
> >>>>> On Fri, 21 Oct 2022 at 10:26, Tom Rini <trini at konsulko.com> wrote:
> >>>>>
> >>>>>>
> >>>>>> On Fri, Oct 14, 2022 at 09:56:44AM -0600, Simon Glass wrote:
> >>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> What do people think about requiring SPL_DM for new boards?
> >>>>>>> Would that cause any problems?
> >>>>>>>
> >>>>>>> There is not much use of of-platdata (compiling the DT into C
> >>>>>>> to save space) - is that because it doesn't work for people?
>
> Indeed, I tried enabling of-platdata on sunxi at one time, and it was
> not feasible to get working. Some of the issues I saw:
>
>  - There was no support for udevice_id::data -> udevice::driver_data.
>    Several of our drivers share one U_BOOT_DRIVER for the entire SoC
>    family; DM_DRIVER_ALIAS would need to plumb through a data argument.

We could add that if you like.

>
>  - Functions like clk_get_by_index and gpio_request_by_name would need
>    to work at least for index zero, where you could sort of get away
>    with not being able to read #foo-cells from the supplier. (The
>    OF_PLATDATA code would need to rewrite phandles to device IDs.)
>    Otherwise, you need totally separate code for OF_PLATDATA vs OF_REAL.

Can you use clk_get_by_phandle()?
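
A rough standalone sketch of the idea behind that path, for anyone following along. The types and names below are simplified stand-ins (prefixed sketch_), not the real U-Boot definitions, and the device table is hypothetical. It assumes the build step has already replaced each phandle with a device index plus its argument cell, so nothing has to read #clock-cells at runtime:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a phandle with one argument cell, as emitted at build time */
struct sketch_phandle_1_arg {
	unsigned int idx;	/* index of the supplier device, not a DT phandle */
	int arg[1];		/* the single specifier cell, e.g. a clock ID */
};

/* Stand-in for a bound device */
struct sketch_udevice {
	const char *name;
};

/* Stand-in for a clock handle */
struct sketch_clk {
	struct sketch_udevice *dev;
	unsigned long id;
};

/* Hypothetical device table; in a real build this would be generated */
static struct sketch_udevice sketch_devices[] = {
	{ "osc24M" },
	{ "ccu" },
};

#define SKETCH_NUM_DEVICES (sizeof(sketch_devices) / sizeof(sketch_devices[0]))

/* Resolve a build-time phandle to a clock handle; returns 0 or -1 */
static int sketch_clk_get_by_phandle(const struct sketch_phandle_1_arg *cells,
				     struct sketch_clk *clk)
{
	assert(cells && clk);

	/* The index was fixed at build time, so only a bounds check is needed */
	if (cells->idx >= SKETCH_NUM_DEVICES)
		return -1;

	clk->dev = &sketch_devices[cells->idx];
	clk->id = (unsigned long)cells->arg[0];

	return 0;
}
```

The point of the sketch is only that the supplier lookup becomes an array index, which is why index zero (or any index) can work without the supplier's cell count.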

>
>  - We recently converted our platform to use the pinctrl nodes from the
>    devicetree (PINCTRL_GENERIC) in U-Boot proper. That does not work at
>    all with OF_PLATDATA.

Right, and that is a pain. The use of strings and the big pinmux
tables make it quite a challenge.

Can you hard-code pinctrl for TPL, perhaps, then use the full driver in SPL?
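
A minimal sketch of what "hard-coded" could look like, under stated assumptions: the register layout (four mux bits per pin, eight pins per 32-bit register) is assumed for illustration and modelled with a plain array rather than MMIO, and the pin numbers and mux values in the table are made up. All names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PINS_PER_REG	8	/* assumed: eight pins per register */
#define SKETCH_BITS_PER_PIN	4	/* assumed: four mux bits per pin */

struct sketch_pin_cfg {
	uint8_t pin;	/* pin number within the port */
	uint8_t mux;	/* mux function to select */
};

/* Made-up table: only the pins early boot needs, e.g. a debug UART */
static const struct sketch_pin_cfg sketch_tpl_pins[] = {
	{ 8, 4 },	/* hypothetical UART TX */
	{ 9, 4 },	/* hypothetical UART RX */
};

/* Apply a table of pin settings to one port's mux registers */
static void sketch_apply_pins(uint32_t *cfg_regs,
			      const struct sketch_pin_cfg *pins, size_t count)
{
	assert(cfg_regs);

	for (size_t i = 0; i < count; i++) {
		unsigned int reg = pins[i].pin / SKETCH_PINS_PER_REG;
		unsigned int shift = (pins[i].pin % SKETCH_PINS_PER_REG) *
				     SKETCH_BITS_PER_PIN;
		uint32_t mask = (uint32_t)0xf << shift;

		/* Read-modify-write so neighbouring pins are left alone */
		cfg_regs[reg] = (cfg_regs[reg] & ~mask) |
				((uint32_t)pins[i].mux << shift);
	}
}
```

A table like this needs no strings and no devicetree lookup, so it stays small; the full driver with its pinmux tables would then only be needed in the later stage.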

>
> >>>>>>> I am particularly keen to drop the old block interface from
> >>>>>>> SPL. It seems to me that boards that can use that might have
> >>>>>>> enough space to enable SPL_DM and SPL_DM_BLK? What do people
> >>>>>>> think?
> >>>>>>
> >>>>>> I don't think this works. The problem is we aren't seeing new
> >>>>>> SoCs that have a large initial amount of memory but rather many
> >>>>>> continuing to have 32KiB or similar tiny sizes. So, I'd rather
> >>>>>> continue to go with saying it's optional, but that we won't
> >>>>>> introduce new SPL functionality that can be DM or not DM, but
> >>>>>> only new functionality that needs SPL_DM and if platforms want
> >>>>>> it, but have limited memory, we need to go TPL->SPL in that
> >>>>>> case.
> >>>>>
> >>>>> OK I see.
> >>>>>
> >>>>> What do you think of a migration method for boards which don't use
> >>>>> SPL_DM, so they migrate to TPL? Would that cause a lot of
> >>>>> problems?
> >>>>
> >>>> I'm not sure what it gains us. Maybe the first step here is to see
> >>>> what the list of non-DM_SPL platforms / SoCs are?
> >>>
> >>> OK:
> >>>
> >>> $ ./tools/moveconfig.py -b
> >>>
> >>> $ ./tools/moveconfig.py -f SPL ~SPL_DM
> >>> 323 matches
> >>> ...
> >>>
> >>> $ ./tools/moveconfig.py -f SPL_DM
> >>> 333 matches
> >>> ...
> >>
> >> OK, if we start parsing things out, PowerPC is one chunk of that and
> >> won't change. Another chunk of that is sunxi which is a "still making
> >> new SoCs with very small SRAM" and it's worth talking with Andre for
> >> thoughts there.
> >
> > Most newer SoCs are not that seriously limited anymore - though that's
> > not universal, since Allwinner is still making a lot of new "small"
> > SoCs (with single Cortex-A7s (or older!) and embedded DRAM). And
> > regardless of that, until recently the BROM wouldn't load more than 32 KB
> > into SRAM, anyway.
>
> Starting with H6, and as far as I know for every SoC after that, the
> normal BROM will load up to the full size of SRAM A1 + SRAM C, which is
> more than 96 KB. That is plenty of space for SPL_DM + SPL_OF_REAL. I can
> look back at my BROM analyses if you want the exact numbers.
>
> So I don't think it is correct to say Allwinner is "still making new
> SoCs with very small SRAM".
>
> > So I cannot say for sure what the situation for new "boards" (rather
> > SoCs?) will be, but we are stuck with the current legacy SPL for existing
> > SoCs, for sure.
>
> That is mostly true, with the exception that H6 and H616 can upgrade to
> SPL_DM. As you mention, A64 and H5 are the most problematic cases, since
> they are aarch64 but still have the 32 KB limit.
>
> > Samuel has been working on SPL_DM for the D1 (RISC-V) port, though I am
> > not a big fan of it:
> > - We would still need the legacy SPL code, since older SoCs are still
> >   bound to the 32K limit. That is already a stretch for SoCs like the
> >   A64, where we are already very close to that limit.
> > - It adds to the test matrix, since we now need to support and
> >   maintain DM/proper, legacy SPL and SPL-DM.
>
> I agree that there is some additional complication here, with the two
> main sources of surprising differences between the SPL and U-Boot DM
> execution environment being inconsistent Kconfig options and malloc.
>
> However, on the driver side, the code running under SPL_DM and U-Boot is
> exactly the same. There is not a third driver variant.
>
> > - Forcing a DT and DM code into that very restricted space requires too
> >   many compromises for my taste. I like nice driver frameworks and love
> >   DT, but one must be able to afford all of this. If you have 100s of
> >   KBs or MBs available, that's all fine, but cutting corners to make it
> >   fit into 32K takes away much of the beauty and flexibility. The DT
> >   changes (u-boot,dm-pre-reloc) we need to make are some sign of it.
>
> Yes, I would *really* appreciate the ability to skip fdtgrep and just
> put the whole devicetree in SPL, or at least have it be much more
> conservative about what it drops (certain properties, disabled nodes) so
> that annotating the devicetree is not necessary.
>
> The current Makefile logic does not allow us to have a per-SoC
> "-u-boot.dtsi" file, so we would need to annotate per board. But even if
> it did, I don't want to pick and choose nodes; that's what Kconfig is for!
>
> > - We actually don't gain much, because the information the SPL needs is
> >   mostly not in the DT to begin with:
> >   - The whole DT clock node is opaque, it basically just says "it's this
> >     SoC's CCU". That is OK for a single image kernel like Linux, but the
> >     SPL knows that already - either by build time config or by reading the
> >     SOCID register. And the SPL does *basic* clock setup, which we cannot
> >     really describe in the DT at all.
> >   - The situation is similar for pinctrl: the actual mux value for a
> >     certain function is not in the DT, but hardcoded in the driver. We
> >     already tried to hack this down for U-Boot, and only got away with
> >     quite some squinting.
>
> You say "we don't gain much", but I see clk_enable() and the automatic
> pinctrl_select_state() Just Working as an absolutely massive improvement.
>
> Currently, SoC bringup requires updating the DT driver... and then 20+
> ifdefs in various files. If new SoCs use SPL_DM, then new SoCs just need
> an updated DT driver. They don't have to touch any of the ifdefs.
>
> >   - The DRAM controller isn't even mentioned in the DT. And while we could
> >     add that, the information we need is very minimal.
>
> It is; the DRAM controller is the mbus node.
>
> Writing a DM driver for the DRAM controller is effectively the same
> amount of effort as doing it the legacy way. You just wrap it in a
> U_BOOT_DRIVER and do the init in the .probe function. The change in the
> board code is similarly trivial:
>
> -       sunxi_dram_init();
> +       uclass_get_device(UCLASS_RAM, 0, &dev);
>
> >   - For storage devices (MMC, SPI-NOR) we can use the same fixed per-SoC
> >     values as the BROM does, so just need the base address. There are only
> >     like four different values across all Allwinner SoCs. The rest of the
>
> There are even fewer values to keep track of if the only MMIO addresses
> hardcoded in some header are the ones for pre-H6 SoCs ;-).
>
> >     DT node is either not useful (opaque clock handles) or not needed
> >     (interrupts).
>
> As long as you ignore raw NAND, maybe you can get away with using slow,
> safe code like the BROM does. But sometimes you really do need per-board
> information like ECC parameters from the DT node.
>
> > Yes, there are some boards which require regulator setup in the SPL, which
> > is described in the DT, but again this still requires regulator
> > knowledge in the code, and is also quite universal (mostly by SoC again).
> >
> > So in summary: it would be a lot of work, which we cannot extend to older
> > SoCs because of technical limitations. But more importantly I think we
> > don't gain enough to make it worth it.
>
> I agree that old SoCs pre-H6 are stuck for now. But I do see a lot of
> benefit for new SoCs. Especially when starting from a blank slate on
> RISC-V, it was much easier porting one set of drivers and SPL_DM, than
> porting two sets of drivers.
>
> > Historically we more naturally shared code between SPL and U-Boot proper,
> > because U-Boot proper used to look much like the SPL looks today (clock
> > code, for instance). But much of this is mostly obsolete, because there is
> > not much overlap, code-wise, the only exception being the common MMC protocol
> > handling, maybe. So I am actually more tempted to spell this out more
> > openly, and separate and trim down the SPL code, avoiding full-featured
> > (DM) drivers at all, if possible (like the SPI NOR code does).
> >
> > We can look into parsing the DT to gather base addresses (and putting them
> > into generated headers), or to enable Kconfig options (board needs a
> > regulator), but I would very much like to keep the SPL lean and mean.
> >
> > The BROM is able to do all the loading without *any* board information
> > whatsoever. All that the BROM is missing is the DRAM init, which requires
> > just two or three parameters (LPDDR vs. DDR, frequency). So we could
> > actually live with a *per-SoC* SPL: we know where the BROM booted from, so
> > can continue doing so using the same fixed settings as the BROM used (for
> > SD card, eMMC, SPI, FEL). We actually exercise this idea already in
> > arch/arm/mach-sunxi/spl_spi_sunxi.c, which is separate from the normal
> > SPI-NOR code, just focusing on some conservative read-only command to get
> > the FIT image into DRAM.
> > I would rather go into this direction than forcing DM into the SPL.
>
> I suppose there are two ways of thinking about SPL:
>  1) Exactly like U-Boot proper, except we removed all of the interactive
>     parts to make it smaller.
>  2) Load U-Boot from a fixed location on disk as fast as possible using
>     as little code as possible, and do nothing else.
>
> The second view ignores things like disk/MTD partitions, verified boot,
> falcon mode, reboot modes, multi-DTB FITs, etc. that would benefit from
> having all of the DM and FDT infrastructure available. But on the other
> hand, it gets you a highly-optimized program that does one thing and
> does it well. I do like the idea of only needing one binary per SoC.
>
> I guess my question is, what do the U-Boot maintainers want U-Boot SPL
> to be? If it's more like the first description, then maybe it makes more
> sense to build the lean and mean SPL outside the U-Boot infrastructure.
> Then if the SPL_DM migration is forced at some point, we could just
> disable SUPPORT_SPL for anything older than H6, and treat SPL as a blob.
> But that still seems like quite a lot of unnecessary work when we have a
> working U-Boot SPL for those SoCs today. What do you think?

I don't think of SPL as minimal anymore. It needs to load U-Boot,
sometimes from a block device / filesystem. It probably uses FIT and
may even load things like OP-TEE.

Should TPL be where SoCs do their very early non-DM init, if needed?

As I mentioned earlier, my goal is to require SPL_BLK if block devices
are needed in SPL. Perhaps this might be possible for newer SoCs?

Regards,
Simon

