[RFC] Load U-Boot without LK on DragonBoard 410c (+ DB820c?)

Stephan Gerhold stephan at gerhold.net
Fri Jul 2 12:28:44 CEST 2021


On Fri, Jul 02, 2021 at 01:04:42PM +0300, Ramon Fried wrote:
> On Thu, Jul 1, 2021 at 4:24 PM Stephan Gerhold <stephan at gerhold.net> wrote:
> > On Thu, Jul 01, 2021 at 01:27:30PM +0200, Nicolas Dechesne wrote:
> > > On Thu, Jul 1, 2021 at 11:07 AM Stephan Gerhold <stephan at gerhold.net> wrote:
> > > >
> > > > Hi!
> > > >
> > > > at the moment the U-Boot ports for both DragonBoard 410c and 820c are
> > > > designed to be loaded as an Android boot image after Qualcomm's LK
> > > > bootloader. This is simple to set up but LK is redundant in this case,
> > > > since everything done by LK can be also done directly by U-Boot.
> > > >
> > > > Dropping LK entirely would have at least the following advantages:
> > > >   - Easier installation/board code (no need for Android boot images)
> > > >   - (Slightly) faster boot
> > > >   - Boot directly in 64-bit without a round trip to 32-bit for LK
> > > >
> > > > This was not possible so far because of some unsolved problems.
> > > > For clarity I try to describe them together with some background here,
> > > > but I want to apologize for the long text. It's all quite complicated. :)
> > > >
> > > > 1. "Signing" 64-bit U-Boot
> > > > ==========================
> > > >
> > > > Ramon already tried to eliminate LK for DB410c 3 years ago [1].
> > > > One of the open problems back then was to have a proper "signing"
> > > > tool with 64-bit support. The firmware expects an ELF image with a few
> > > > Qualcomm-specific ELF headers. Normally this is used for secure boot
> > > > setups. This is not used on DragonBoards, but the firmware still insists
> > > > on having a dummy (self-signed) certificate chain in the ELF images.
> > >
> > > Yeah, the signing was the last step we missed. We were able to sign
> > > using internal / non open source tools.. but never finalized the boot
> > > process completely.. I am very happy you persisted with that!
> > >
> > > >
> > > > Linaro uses signlk [2] to sign their builds of LK. It looks like Nicolas
> > > > extended it with ELF64 support after Ramon's mail [3]. However, for some
> > > > reason signlk literally works only for LK for me. I tried to "sign"
> > > > U-Boot and some other firmware, but everything except LK is always
> > > > rejected with the following message on boot:
> > > >
> > > >     B -   1031113 - Error code 302e at boot_config.c Line 296
> > > >
> > > > I tried to track down the issue in the source code for quite some time
> > > > but did not manage to find the problem. Perhaps it's some subtle mistake
> > > > with some of the ELF modifications, I'm not sure. (For some reason,
> > > > signlk makes subtle changes to all of the existing ELF headers...)
> > > >
> > > > After reading about the image format myself I decided to try to make my
> > > > own "signing" tool, qtestsign: https://github.com/msm8916-mainline/qtestsign
> > > > It's based on a mixture of the specification [4] and some missing bits
> > > > taken from signlk, put in a simple and clean Python tool. I still don't
> > > > know what exactly qtestsign does differently, but unlike signlk it can
> > > > successfully "sign" U-Boot and all other firmware from DragonBoard 410c.
> > >
> > > There is no specific reason to restrict ourselves to using signlk.. if
> > > you have something better, which works, that's perfect!
> > >
> > > >
> > > > [1]: https://lore.kernel.org/u-boot/CA+Kvs9kS=DbJKNAixk_3tz+3iWnRaSP0gJdZ8eKrzasKOr6wcw@mail.gmail.com/
> > > > [2]: https://git.linaro.org/landing-teams/working/qualcomm/signlk.git/
> > > > [3]: https://git.linaro.org/landing-teams/working/qualcomm/signlk.git/commit/?id=1f61c03322c3728f35b3f0cd4ff04f73522f1e67
> > > > [4]: https://www.qualcomm.com/media/documents/files/secure-boot-and-image-authentication-technical-overview-v1-0.pdf
> > > >
> > > > My solution
> > > > -----------
> > > >
> > > > Now we have all we need to install U-Boot without LK. For DragonBoard 410c
> > > > the following steps end up in the U-Boot prompt without going through LK:
> > > >
> > > >     1. Change dragonboard410c_defconfig as follows:
> > > >
> > > >        -CONFIG_SYS_TEXT_BASE=0x80080000
> > > >        +CONFIG_SYS_TEXT_BASE=0x8F600000
> > > >        +CONFIG_OF_EMBED=y (I discuss this at the end of the mail)
> > > >
> > > >     2. $ make
> > > >     3. Sign the ELF image: $ qtestsign.py aboot <out>/u-boot [5]
> > > >     4. Flash "<out>/u-boot-test-signed.mbn" to the "aboot" partition
> > > >
> > > > [5]: https://github.com/msm8916-mainline/qtestsign
> > > >
> > > > 2. Linux gets stuck when loaded by 64-bit U-Boot without LK
> > > > ===========================================================
> > > >
> > > > This should work well enough to get the U-Boot prompt on serial.
> > > > However, once you load Linux you will likely notice a problem:
> > > >
> > > >     [    0.059043] smp: Bringing up secondary CPUs ...
> > > >     [    5.120691] CPU1: failed to come online
> > > >     [   10.246760] CPU2: failed to come online
> > > >     [   15.372848] CPU3: failed to come online
> > > >     [   15.406275] CPU: All CPU(s) started at EL1
> > > >      ...
> > > >     [   16.185527] genirq: irq_chip msmgpio did not update eff. affinity mask of irq 79
> > > >      Board freezes forever. :(
> > > >
> > > > My investigations have shown this is a bug in the PSCI implementation on
> > > > DB410c (part of the TrustZone/"tz" firmware). In short, since we
> > > > have never done the 32-bit -> 64-bit switch in LK, the PSCI implementation
> > > > seems to believe we are still running in 32-bit mode and starts all
> > > > further CPUs in 32-bit mode. The other CPU cores crash immediately when
> > > > coming up and CPU 0 hangs once CPU idle suspends it for the first time.
> > > >
> > > > I have described this problem together with a workaround in detail here:
> > > > https://github.com/msm8916-mainline/qhypstub#boot-flow
> > > >
> > > > The idea is to execute the TZ syscall to switch from 32-bit -> 64-bit
> > > > even though we are already running in 64-bit mode. This will make the
> > > > PSCI implementation aware that we want all further CPU cores booted in
> > > > 64-bit mode as well.
> > >
> > > You haven't asked.. but just in case.. the chances of getting a fix
> > > for this firmware are close to 0 (really close). I am glad you have a
> > > workaround.
> > >
> >
> > Yeah, that's what I expected to be honest. :)
> >
> > > >
> > > > My solution
> > > > -----------
> > > >
> > > > The workaround is applied automatically when using my open-source "hyp"
> > > > firmware replacement qhypstub: https://github.com/msm8916-mainline/qhypstub
> > > > As a bonus, both U-Boot and Linux start in EL2, making it possible to
> > > > use virtualization (e.g. KVM in Linux).
> > > >
> > > >     $ git clone https://github.com/msm8916-mainline/qhypstub.git
> > > >     $ cd qhypstub
> > > >     $ make CROSS_COMPILE=aarch64-linux-gnu-
> > > >     $ qtestsign.py hyp qhypstub.elf
> > > >     # Flash "qhypstub-test-signed.mbn" to "hyp" partition and reboot.
> > > >
> > > > Now it works:
> > > >
> > > >     [    0.063411] CPU1: Booted secondary processor 0x0000000001 [0x410fd030]
> > > >     [    0.064184] CPU2: Booted secondary processor 0x0000000002 [0x410fd030]
> > > >     [    0.064906] CPU3: Booted secondary processor 0x0000000003 [0x410fd030]
> > > >     [    0.123032] CPU: All CPU(s) started at EL2
> > > >     [    0.448743] kvm [1]: Hyp mode initialized successfully
> > > >      ...
> > > >
> > > > And with that U-Boot is fully working as far as I can tell.
> > > > (I have only tested serial, SD card and USB so far. If something is
> > > >  broken, it's likely some missing register initialization that should
> > > >  be ported from LK/Linux...)
> > >
> > > you mean you tested serial, SD and USB from u-boot, or from Linux once
> > > booted from uboot?
> > >
> >
> > I tested both, but I meant in U-Boot here.
> >
> > > what's the overall status in Linux when you boot with this new boot flow?
> > >
> >
> > I'm not sure about typical use cases for DB410c, but I was not able to
> > notice any problems with the new boot flow. dmesg looks fine, I tested
> > USB, eMMC/SD card, display, GPU, WiFi, HDMI audio, everything seems fine. :)
> >
> > KVM/QEMU works fine too although performance isn't exceptional of course.
> > You're probably not going to build a cloud hosting cluster with this. ;)
> >
> > > >
> > > > 3. Remaining open questions
> > > > ===========================
> > > >
> > > > I still see 3 questions that we need to discuss:
> > > >
> > > >   1. This is a quite fundamental change.
> > > >      Can we just make this change in dragonboard410c_defconfig?
> > > >      Does it make sense to keep the old setup with LK?
> > > >      When would it be used?
> > >
> > > I believe it's used by distro. iirc, at least Archlinux, Fedora and
> > > Ubuntu have some level of support (and instructions) for the DB410c,
> > > and they are using this uboot config. So we need to check at least
> > > that we are not breaking any Linux features with this boot flow. There
> > > is indeed no reason to keep LK in the boot flow if nothing breaks once
> > > we remove it. However it's going to change their installation
> > > instructions, since uboot becomes 'aboot' and 'boot' is no longer
> > > used. In other words, this change is not transparent for users.
> > >
> >
> > As written above I think Linux is fine. So if other distributions were
> > using the LK as-is (as provided by Linaro) they should be fine after
> > adjusting the installation instructions. If they made modifications to
> > LK (e.g. bringing up some screen in the bootloader) they might need to
> > port those to U-Boot, though.
> >
> > Is there any way we can make this change more obvious?
> > Will someone check before upgrading U-Boot in those distributions?
> > If not, can we make them aware of this somehow?
> >
> > > >
> > > >   2. Workaround for PSCI bug: I'm not sure if we want to make qhypstub [6]
> > > >      a requirement for U-Boot. On the one hand it's open-source, solves
> > > >      the problem nicely without changes in U-Boot and provides EL2
> > > >      additionally. I'm also not aware of any problem/disadvantage when
> > > >      using it (if you find a problem, please let me know!).
> > > >
> > > >      But I realize it's unofficial. If we want to support using Qualcomm's
> > > >      "hyp" firmware as well I could try porting the PSCI workaround
> > > >      from qhypstub to U-Boot. It should be ~10 lines of ARM64 assembly [7]
> > > >      placed e.g. in board/qualcomm/dragonboard410c/head.S.
> > > >
> > > >      However, I will need to make sure to detect if U-Boot was started
> > > >      in EL2 by qhypstub because otherwise doing the workaround twice
> > > >      will conflict and U-Boot might demote itself back to EL1.
> > >
> > > I think we want (we need?) to support both HYP implementations, I
> > > would prefer to have a workaround in u-boot to support existing users
> > > (with QCOM hyp). In everything we've done in linux for qcom, we always
> > > tend to (try to) support the default firmware released by QCOM, to get
> > > a chance to support more users...
> > >
> >
> > OK, I will check if I can port the workaround without too much effort.
> > It should be simple in theory, but I need to try to be sure.
> >
> > > >
> > > >   3. CONFIG_OF_EMBED: There is a big warning about this in the build log:
> > > >      "This option should only be used for debugging purposes. Please use
> > > >       CONFIG_OF_SEPARATE for boards in mainline."
> > > >
> > > >      The important part here is that we need an ELF image with both
> > > >      U-Boot and the DTB. CONFIG_OF_EMBED is convenient for that because
> > > >      we can just use the ELF image built by the linker and it already
> > > >      contains the DTB.
> > > >
> > > >      If CONFIG_OF_EMBED is really so bad it might be possible to build
> > > >      a new boot image based on "u-boot-dtb.bin" (which is U-Boot with
> > > >      DTB appended). I'm not sure if this is really much better though.
> > > >
> > > > Bonus question: Could something similar also work for DB820c? I don't
> > > > have one myself but I think a similar setup should also work on it.
> > > > If someone is interested in testing this I would be happy to help. :)
> > >
> > > The clk, regulators, ... implementation on 820 are different from 410
> > > in general, and I don't remember how we left things on 820.. but in
> > > general it should work.
> > > If you want to get a DB820c, I should be able to help with that (ping
> > > me privately ;-).
> > >
> >
> > Oh, thanks for the offer! I will contact you later although I cannot
> > promise much since I have my hands full with things related to MSM8916.
> > (Actually not focused around DB410c but smartphones/tablets based on
> >  MSM8916. DB410c is just "close enough" that it can easily make use of
> >  the work for those...) :)
> >
> > Thanks!
> > Stephan
> Hi Stephan, awesome work.
> The 32bit to 64bit jump part was the last straw for me, I debugged
> this damn thing for hours without any progress.

Yeah, I've actually found it only by a lucky coincidence because it also
occurred in other situations while I was working on qhypstub. The advantage
I had there was that I could insert some (assembly) debugging code between
PSCI and Linux. So I noticed quite quickly that PSCI is telling my firmware
to boot all other CPU cores in 32-bit mode for some reason.

> I concur with everything Nico said, I want to see the fix in U-boot,
> and if we need to detect if the PSCI is already aware of this status,
> let's add logic for that as well.

OK, I will try to port it and might just post some patches soon
if everything goes smoothly.
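
To make the detection idea a bit more concrete, here is a rough model of
the decision in Python (the real thing would be a few lines of ARM64
assembly). It relies on the architectural layout of the CurrentEL system
register, where the exception level sits in bits [3:2]. The function
names are made up for illustration, and the rule "EL2 means qhypstub has
already applied the workaround" is an assumption based on the boot flow
described above, not tested code.

```python
def exception_level(current_el: int) -> int:
    """Decode the exception level from a raw CurrentEL register value.

    On AArch64 the level is stored in bits [3:2]; the other bits are
    reserved. So EL1 reads as 0x4 and EL2 reads as 0x8.
    """
    return (current_el >> 2) & 0x3


def needs_psci_workaround(current_el: int) -> bool:
    """Hypothetical helper: decide whether U-Boot should apply the workaround.

    Assumption: when started in EL2, qhypstub has already made the
    32-bit -> 64-bit switch call, so repeating it could demote U-Boot
    back to EL1. Only an EL1 entry (Qualcomm's "hyp" firmware) needs it.
    """
    return exception_level(current_el) == 1
```

With Qualcomm's "hyp" firmware U-Boot would see CurrentEL = 0x4 and apply
the workaround; with qhypstub it would see 0x8 and skip it.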

> Regarding the signing tool, I don't recall I had issues signing both
> the 64bit and 32bit images, it's strange, I'll test it later.
> 

Thanks, perhaps I'm just using signlk wrong. :)
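
If someone wants to compare what the two tools do to the ELF headers,
something like the following sketch might help. It only reads the
standard ELF32/ELF64 program header table (nothing Qualcomm-specific)
and returns one dict per entry, so the output for a signlk image and a
qtestsign image can be diffed. The function name is made up for this
mail; it is not part of signlk or qtestsign.

```python
import struct

ELF_MAGIC = b"\x7fELF"


def read_program_headers(data: bytes) -> list:
    """Return the program headers of an ELF32/ELF64 image as dicts."""
    if data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF image")
    is_64 = data[4] == 2                   # EI_CLASS: 1 = ELF32, 2 = ELF64
    endian = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little, 2 = big

    if is_64:
        # ELF64: e_phoff at 0x20, e_phentsize/e_phnum at 0x36/0x38
        (phoff,) = struct.unpack_from(endian + "Q", data, 0x20)
        phentsize, phnum = struct.unpack_from(endian + "HH", data, 0x36)
        fmt = endian + "IIQQQQQQ"
        names = ("type", "flags", "offset", "vaddr",
                 "paddr", "filesz", "memsz", "align")
    else:
        # ELF32: e_phoff at 0x1C, e_phentsize/e_phnum at 0x2A/0x2C
        (phoff,) = struct.unpack_from(endian + "I", data, 0x1C)
        phentsize, phnum = struct.unpack_from(endian + "HH", data, 0x2A)
        fmt = endian + "IIIIIIII"
        names = ("type", "offset", "vaddr", "paddr",
                 "filesz", "memsz", "flags", "align")

    headers = []
    for i in range(phnum):
        fields = struct.unpack_from(fmt, data, phoff + i * phentsize)
        headers.append(dict(zip(names, fields)))
    return headers
```

Calling it as read_program_headers(open(path, "rb").read()) on both
signed images and printing the results side by side should show which
fields differ between them.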

> Regarding 820c, The BIG problem is that U-boot is missing a UFS driver
> / infrastructure, so without that you can only boot from SD.
> If you're up to the challenge you're more than welcome to try porting
> UFS to u-boot.

Uh, that sounds like a lot of work. :)

As I mentioned I'm mainly interested in MSM8916/APQ8016 because of other
devices I'm working on. I'm curious if it's fairly easy to port my PSCI
workaround and perhaps qhypstub to APQ8096 for DB820c. But otherwise
I already have too much on my plate, it wouldn't help to add another
SoC to it.

Thanks!
Stephan

