[U-Boot] Cavium/Marvell Octeon Support

Daniel Schwierzeck daniel.schwierzeck at gmail.com
Fri Oct 25 15:13:57 UTC 2019

Hi Aaron,

On 23.10.19 at 05:50, Aaron Williams wrote:
> Hi all,
> I have been tasked with porting our Octeon U-Boot to the latest U-Boot
> and merging it upstream. This will involve a very significant amount of
> code that generally will not be compatible with other MIPS processors
> due to our needs and requirements. For example, the start.S will need to
> be completely different than what is present. For example, our existing
> start.S is 3577 lines of code in order to deal with things like RAS,
> exceptions, virtual memory and more. We need to use virtual memory since
> U-Boot can be loaded at any 4MB boundary in memory, not just 0xbfc00000.
> A number of drivers will need to be updated in order to properly map
> pointers to physical addresses. This is needed anyway, since I see
> numerous drivers that assume that a pointer is a DMA address. For MIPS
> this is never the case (I'm looking at XHCI).

Good to see some progress in mainline Octeon support. Could you briefly
describe the differences and commonalities in booting an Octeon CPU
compared to other "generic" MIPS cores? Or could you point me to a
public Git tree? It can't be that different because the Linux kernel is
also able to share most of the code ;)

In principle you could compile your own start.S in your mach-octeon
directory, but you should try to use the generic start.S which is
already customisable and extensible. If needed, we could add more
extension points to it. Booting from any custom memory address is
already supported and very common for other MIPS-based SoCs. Exception
support is also already there.

> The new Octeon U-Boot will be native 64-bit instead of how the earlier
> one was 32-bit using the N32 ABI (so 64-bit addresses could be
> accessed). We had to jump through some hoops to make a 32-bit U-Boot
> fully support 64-bit hardware.

We have 64-bit support for MIPS. I even sync'ed the asm/io stuff from
Linux in the past (which includes support for Octeon) so that you would
be able to use the standard IO primitives and ioremap stuff and hook in
your platform-specific memory mappings.
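
To illustrate why a pointer is not a DMA address on MIPS, here is a
minimal sketch covering only the classic 32-bit KSEG0/KSEG1 windows. The
helper names are made up for this mail and are not the U-Boot or Linux
API; real drivers should go through the asm/io helpers, and the 64-bit
Octeon mappings will look different.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural 32-bit MIPS segment bases. */
#define KSEG0_BASE     0x80000000u /* unmapped, cached */
#define KSEG1_BASE     0xa0000000u /* unmapped, uncached */
#define KSEG2_BASE     0xc0000000u /* start of the TLB-mapped segments */
#define KSEG_PHYS_MASK 0x1fffffffu /* low 512 MiB of physical memory */

/* Hypothetical helper: true if vaddr lies in KSEG0 or KSEG1. */
static int is_kseg0_or_kseg1(uint32_t vaddr)
{
	return vaddr >= KSEG0_BASE && vaddr < KSEG2_BASE;
}

/* Hypothetical helper: strip the segment bits to get the physical address. */
static uint32_t kseg_to_phys(uint32_t vaddr)
{
	assert(is_kseg0_or_kseg1(vaddr));
	return vaddr & KSEG_PHYS_MASK;
}

/* Hypothetical helper: uncached KSEG1 alias of a low physical address. */
static uint32_t phys_to_kseg1(uint32_t paddr)
{
	assert(paddr <= KSEG_PHYS_MASK);
	return paddr | KSEG1_BASE;
}
```

With these, the reset vector 0xbfc00000 corresponds to physical address
0x1fc00000, which is the value a DMA engine would need to see instead
of the pointer.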

> I think we can shrink the code by removing support for starting "simple
> executive" tasks. Simple executive tasks are bare metal applications
> that can run on dedicated cores beside Linux (or without Linux). I will
> also not be porting any support for anything older than Octeon3.
> We also make heavy use of our SDK in order to perform hardware
> initialization and networking. In our old U-Boot, we have almost 900K
> lines of code. I can cut out much of this but much will remain.
> We also have added extensive infrastructure for handling SFP and QSFP
> cables as well as very extensive phy support for phys from
> Aquantia/Marvell, Vitesse/Microsemi, Inphi/Cortina and an Avago gearbox.
> Our customer wants us to port all of this to the new U-Boot and upstream
> it. I'm worried about the sheer amount of code since it is absolutely
> massive. 

Maybe you should cut down your customer's expectations a bit. According
to sloccount we currently have 1.6M SLOC for the whole U-Boot. I guess
Tom or Wolfgang wouldn't agree with adding another 900k just for one
CPU. Actually what should be upstream is the basic CPU, driver and board
support to be able to boot a mainline kernel. Everything else like
custom bare metal applications or the SFP/PHY handling stuff mentioned
below could also be maintained in a downstream tree. Maybe Wolfgang is
willing to host one on gitlab.denx.de.

> Some of these phy drivers are extremely complex and need to tie
> into the SFP management. We also need to use a background polling thread
> while at the command prompt. A fair bit of our phy code is not in the
> normal phy drivers because it did not fit the model. Some of these phy
> drivers need to interact with the SFP support code in order to handle
> hot plug events in order to reconfigure themselves based on the cable
> type. The existing SFP code handles everything from SFP to SFP28 as well
> as QSFP and 100G QSFP (never tested).
> In the old U-Boot the PHY support had to be significantly enhanced due
> to requirements for hot-plugging and how some of the PHYs are
> configured. It gets quite complicated with phys like the Inphi where one
> phy can handle either four ports (XFI/SGMII) or a single 4-lane port
> (XLAUI). It gets even worse since in some boards we use reclocking chips
> and there is one chip that handles the receive path of a QSFP and
> another that handles the transmit path. Further complicating things,
> with a QSFP it can be treated either as XLAUI or as four XFI ports, so
> you can have four ports spread across two chips, with each port using
> different slices of each chip. In the case of the Inphi/Cortina chip, a
> single device can handle one or four ports based on the configuration
> and it is configured by "slice" which is basically an offset into the
> MDIO register space. We had to jump through hoops in order to have this
> stuff work in a sane way in the device tree. We added entries for SFP
> and QSFP slots in the device tree which point to the MACs, GPIOs and I2C
> bus because pointing them to the phys just got too insane. This will
> need to be ported to the new U-Boot. It should not break the existing
> support since most of it was implemented outside of the core PHY
> handling code. In the port, it would be far better if this could be
> integrated in. The SFP management code is architecture agnostic as is
> all of the PHY support. The callbacks for the SFP support are used by
> the MAC which then notifies the PHY since the MAC often needs to
> reconfigure itself. It can handle some crazy configurations.
> While I see some phy drivers that we also support, i.e. Cortina, our
> drivers tend to have a lot more functionality. For example, all of our
> phy drivers that support firmware support commands for upgrading the
> firmware as well as things like cable testing and other features.

PHY drivers and Ethernet drivers should really be reduced to the
functionality required to enable basic networking like ping, DHCP, TFTP.
U-Boot is still "just" a bootloader and not a system management tool ;)
You should do that stuff either in Linux or in a downstream fork.

> Our bootloader needs to be able to be booted from a variety of sources,
> including SPI, eMMC, NOR flash and booting over the PCI bus from a host
> system. This is one reason we use virtual memory. The other reason is
> that it eliminates the need to perform relocation. Our start.S code
> handles all of these different cases as well as exception handling.

This is already supported for MIPS. You should try to use the generic
SPL framework for that. Whether you like the relocation or not, it's one
of the basic design principles of U-Boot. I guess it likely won't be
accepted if you circumvent this. In fact by now we're sharing the same
technology as Linux to have relocatable binaries without using gcc's
-fPIC or -mabicalls to reduce the binary footprint. You can configure
gd->ram_top to any address of your liking as reference address for the
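
As a rough sketch of the idea behind relocation, assuming the image is
simply placed just below gd->ram_top: the function names and the 4 KiB
alignment are assumptions for illustration, and the real init sequence
reserves further regions (malloc area, board info, global data, stack)
below the top of RAM as well.

```c
#include <assert.h>
#include <stdint.h>

#define RELOC_ALIGN 0x1000u /* assumed 4 KiB alignment, for illustration */

/* Round addr down to a power-of-two boundary. */
static uint64_t align_down(uint64_t addr, uint64_t align)
{
	return addr & ~(align - 1);
}

/* Hypothetical: choose where the image is copied, just below ram_top. */
static uint64_t pick_reloc_addr(uint64_t ram_top, uint64_t image_len)
{
	assert(image_len <= ram_top);
	return align_down(ram_top - image_len, RELOC_ALIGN);
}

/* Hypothetical: offset applied to every relocated address in the image. */
static int64_t reloc_offset(uint64_t reloc_addr, uint64_t link_addr)
{
	return (int64_t)(reloc_addr - link_addr);
}
```

For example, with ram_top at 0x88000000 and an image of 0x7f123 bytes
this picks 0x87f80000 as the relocation address.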

> I will also say up front that the memory initialization code is a mess
> and quite large (it was written by a hardware engineer who never heard
> of functions).
> One thing is that this will break mips unless it is refactored like ARM
> is, for example, separating armv7 and armv8. This way we could have
> arch/mips/cpu/octeon. I did this with the old bootloader to separate our
> stuff. I'm open to suggestions as for the naming. I don't see how we can
> share much of the code with the other MIPS CPUs.

We have the same mach directory handling as in Linux MIPS. So you could
easily add all your platform-specific code (except drivers) to
arch/mips/mach-octeon or (-cavium). Inside that directory you can have
an include directory for your custom header files; you can even override
the generic files from arch/mips/include like in Linux. arch/mips/cpu
and arch/mips/lib should only contain generic code. As already mentioned
you could provide your own start.S inside arch/mips/mach-octeon but if
possible you should try to reuse or extend the generic variant.

> All in all, I think the final port will add between 500K-1M lines of
> code for the Octeon CPU. It is much more extensive than what is required
> for OcteonTX since in the latter case most of the hardware
> initialization is done by earlier stage bootloaders and the ATF handles
> things like SFP port management and many of the networking operations.
> I'm not sure how well I'll be able to upstream all of this code at this
> point since I was just handed this task. We already have at least 1M
> lines of code added to the old U-Boot which is based off of 2013.08 with
> a lot of backports.

- Daniel
