[BUG regression in HEAD] efi: memory: use the lmb API's for allocating and freeing memory

Sughosh Ganu sughosh.ganu at linaro.org
Thu Nov 21 12:53:32 CET 2024


On Thu, 21 Nov 2024 at 13:34, Sughosh Ganu <sughosh.ganu at linaro.org> wrote:
>
> On Wed, 20 Nov 2024 at 13:49, E Shattow <lucent at gmail.com> wrote:
> >
> > Hi all, (on-list)
> >
> > On Tue, Nov 19, 2024 at 9:14 PM Sughosh Ganu <sughosh.ganu at linaro.org> wrote:
> > >
> > > On Wed, 20 Nov 2024 at 10:09, E Shattow <lucent at gmail.com> wrote:
> > > >
> > > > On Tue, Nov 19, 2024 at 6:42 PM Sughosh Ganu <sughosh.ganu at linaro.org> wrote:
> > > > >
> > > > > On Wed, 20 Nov 2024 at 04:48, E Shattow <lucent at gmail.com> wrote:
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > > > On HEAD, some commits after v2024.10, I encounter a regression:
> > > > > > > `bootefi bootmgr` fails with the error "Not a PE-COFF file". The
> > > > > > > fall-through case of global EFI boot is successful.
> > > > > >
> > > > > > Having run a git bisect I discover the first bad commit 22f2c9ed:
> > > > > >
> > > > > > $ git checkout -b master origin/master
> > > > > > branch 'master' set up to track 'origin/master'.
> > > > > > Switched to a new branch 'master'
> > > > > > $ git bisect start
> > > > > > status: waiting for both good and bad commits
> > > > > > $ git bisect bad HEAD
> > > > > > status: waiting for good commit(s), bad commit known
> > > > > > $ git bisect good v2024.10
> > > > > > Bisecting: 850 revisions left to test after this (roughly 10 steps)
> > > > > > [82686e678e1587ddbd9570f82c58cdc3aecf2dbe] Merge branch 'staging' of
> > > > > > https://source.denx.de/u-boot/custodians/u-boot-tegra
> > > > > > $ git bisect good
> > > > > > Bisecting: 422 revisions left to test after this (roughly 9 steps)
> > > > > > [8963d433eb5d4a9f3a9def84e9c61a45c13e72bc] Merge tag
> > > > > > 'u-boot-rockchip-20241026' of
> > > > > > https://gitlab.denx.de/u-boot/custodians/u-boot-rockchip
> > > > > > $ git bisect bad
> > > > > > Bisecting: 214 revisions left to test after this (roughly 8 steps)
> > > > > > [0a504585d1cefeaf35ae8f860a5e5aa44dfffed5] arm: dts: k3-j722s-binman:
> > > > > > Add support for HS-SE
> > > > > > $ git bisect bad
> > > > > > Bisecting: 106 revisions left to test after this (roughly 7 steps)
> > > > > > [88057dab2cde8710ccc95d12fb312184e0b023ca] mtd: spi-nor: Allow flashes
> > > > > > to specify MTD writesize
> > > > > > $ git bisect good
> > > > > > Bisecting: 53 revisions left to test after this (roughly 6 steps)
> > > > > > [625d40ab120dbc6f45dbd975857f8f87e422bd0f] test: boot: fix
> > > > > > bootflow_cmd_label for when DSA_SANDBOX is disabled
> > > > > > $ git bisect bad
> > > > > > Bisecting: 26 revisions left to test after this (roughly 5 steps)
> > > > > > [5b9261fb0b1ed087387f2036d279fd3f4bb20a61] Makefile: Drop
> > > > > > SPL_FIT_GENERATOR support
> > > > > > $ git bisect good
> > > > > > Bisecting: 13 revisions left to test after this (roughly 4 steps)
> > > > > > [e1b6822d6522d94d579d53092342b542d368a04b] efi_memory: do not add RAM
> > > > > > memory to the memory map
> > > > > > $ git bisect bad
> > > > > > Bisecting: 6 revisions left to test after this (roughly 3 steps)
> > > > > > [2f6191526a1325b6ddb59795a093eca69dbf8976] lmb: notify of any changes
> > > > > > to the LMB memory map
> > > > > > $ git bisect bad
> > > > > > Bisecting: 2 revisions left to test after this (roughly 2 steps)
> > > > > > [3c6896ad2fb876b0a23202f62a83c0d44380c9ea] lmb: add a flag to allow
> > > > > > suppressing memory map change notification
> > > > > > $ git bisect good
> > > > > > Bisecting: 0 revisions left to test after this (roughly 1 step)
> > > > > > [22f2c9ed9f533a56bed09bd4e0e37852b6b9f3b1] efi: memory: use the lmb
> > > > > > API's for allocating and freeing memory
> > > > > > $ git bisect bad
> > > > > > Bisecting: 0 revisions left to test after this (roughly 0 steps)
> > > > > > [eb052cbb896fee6f947765b44b0d80a54b19ce1a] lmb: add and reserve memory
> > > > > > above ram_top
> > > > > > $ git bisect good
> > > > > > 22f2c9ed9f533a56bed09bd4e0e37852b6b9f3b1 is the first bad commit
> > > > > >
> > > > > > > A commit is good if Star64 boots without the error "Not a PE-COFF
> > > > > > > file" (confirmed by using eficonfig to adjust the boot order so that
> > > > > > > removable media with an OS installer image on SD Card has
> > > > > > > priority, and verifying that the installer runs as expected).  A commit is
> > > > > > > bad if U-Boot crashes and/or has the error "Not a PE-COFF file".
> > > >
> > > > >
> > > > > Can you post the output of the following. Thanks.
> > > >
> > > > >
> > > > > 1) running the 'bdinfo' command
> > > >
> > > > U-Boot SPL 2024.10-00989-geb052cbb896f (Nov 19 2024 - 14:39:43 -0800)
> > > > ...
> > > > StarFive # bdinfo
> > > > boot_params = 0x0000000000000000
> > > > DRAM bank   = 0x0000000000000000
> > > > -> start    = 0x0000000040000000
> > > > -> size     = 0x0000000100000000
> > > > flashstart  = 0x0000000000000000
> > > > flashsize   = 0x0000000000000000
> > > > flashoffset = 0x0000000000000000
> > > > baudrate    = 115200 bps
> > > > relocaddr   = 0x00000000fff46000
> > > > reloc off   = 0x00000000bfd46000
> > > > Build       = 64-bit
> > > > current eth = ethernet@16030000
> > > > ethaddr     = 6c:cf:39:00:75:63
> > > > IP addr     = <NULL>
> > > > fdt_blob    = 0x00000000ff72da20
> > > > lmb_dump_all:
> > > >  memory.count = 0x1
> > > >  memory[0]      [0x40000000-0x13fffffff], 0x100000000 bytes flags: none
> > > >  reserved.count = 0x2
> > > >  reserved[0]    [0x40000000-0x4005ffff], 0x00060000 bytes flags: no-map
> > > >  reserved[1]    [0xfe729630-0xffffffff], 0x018d69d0 bytes flags: no-overwrite
> > > > devicetree  = board
> > > > serial addr = 0x0000000010000000
> > > >  width      = 0x0000000000000004
> > > >  shift      = 0x0000000000000002
> > > >  offset     = 0x0000000000000000
> > > >  clock      = 0x00000000016e3600
> > > > boot hart   = 0x0000000000000001
> > > > firmware fdt= 0x0000000042200000
> > > >
> > > > U-Boot SPL 2025.01-rc2-00129-g7fe55182d926 (Nov 19 2024 - 19:56:55 -0800)
> > > > ...
> > > > StarFive # bdinfo
> > > > boot_params = 0x0000000000000000
> > > > DRAM bank   = 0x0000000000000000
> > > > -> start    = 0x0000000040000000
> > > > -> size     = 0x0000000100000000
> > > > flashstart  = 0x0000000000000000
> > > > flashsize   = 0x0000000000000000
> > > > flashoffset = 0x0000000000000000
> > > > baudrate    = 115200 bps
> > > > relocaddr   = 0x00000000fff46000
> > > > reloc off   = 0x00000000bfd46000
> > > > Build       = 64-bit
> > > > current eth = ethernet@16030000
> > > > ethaddr     = 6c:cf:39:00:75:63
> > > > IP addr     = <NULL>
> > > > fdt_blob    = 0x00000000ff72da10
> > > > lmb_dump_all:
> > > >  memory.count = 0x1
> > > >  memory[0]      [0x40000000-0x13fffffff], 0x100000000 bytes, flags: none
> > > >  reserved.count = 0x3
> > > >  reserved[0]    [0x40000000-0x4005ffff], 0x60000 bytes, flags: no-map
> > > >  reserved[1]    [0xfe72d620-0xffffffff], 0x18d29e0 bytes, flags: no-overwrite
> > > >  reserved[2]    [0x13fffb000-0x13fffffff], 0x5000 bytes, flags: no-notify, no-overwrite
> > > > devicetree  = board
> > > > serial addr = 0x0000000010000000
> > > >  width      = 0x0000000000000004
> > > >  shift      = 0x0000000000000002
> > > >  offset     = 0x0000000000000000
> > > >  clock      = 0x00000000016e3600
> > > > boot hart   = 0x0000000000000001
> > > > firmware fdt= 0x0000000042200000
> > > >
> > > > Differences in bdinfo output between working (parent of the
> > > > regression) and non-working (origin/master) version:
> > > >
> > > > -fdt_blob    = 0x00000000ff72da20
> > > > +fdt_blob    = 0x00000000ff72da10
> > > >  lmb_dump_all:
> > > >   memory.count = 0x1
> > > > - memory[0]      [0x40000000-0x13fffffff], 0x100000000 bytes flags: none
> > > > - reserved.count = 0x2
> > > > - reserved[0]    [0x40000000-0x4005ffff], 0x00060000 bytes flags: no-map
> > > > - reserved[1]    [0xfe729630-0xffffffff], 0x018d69d0 bytes flags: no-overwrite
> > > > + memory[0]      [0x40000000-0x13fffffff], 0x100000000 bytes, flags: none
> > > > + reserved.count = 0x3
> > > > + reserved[0]    [0x40000000-0x4005ffff], 0x60000 bytes, flags: no-map
> > > > + reserved[1]    [0xfe72d620-0xffffffff], 0x18d29e0 bytes, flags: no-overwrite
> > > > + reserved[2]    [0x13fffb000-0x13fffffff], 0x5000 bytes, flags: no-notify, no-overwrite
> > > >
> > > > >
> > > > > 2) do you get any errors when running the 'bootefi bootmgr' command
> > > > > other than what you mention above
> > > > >
> > > >
> > > > U-Boot SPL 2024.10-00989-geb052cbb896f (Nov 19 2024 - 14:39:43 -0800)
> > > > ...
> > > > StarFive # bootefi bootmgr
> > > > Booting: mmc 0
> > > > error: no suitable video mode found.
> > > >
> > > > U-Boot SPL 2025.01-rc2-00129-g7fe55182d926 (Nov 19 2024 - 19:56:55 -0800)
> > > > ...
> > > > StarFive # bootefi bootmgr
> > > > Card did not respond to voltage select! : -110
> > > > Not a PE-COFF file
> > > > Loading Boot0000 'mmc 0' failed
> > > > Loading Boot0001 'nvme 0' failed
> > > > EFI boot manager: Cannot load any image
> > > >
> > > > > 3) What exactly do you mean by "global EFI boot is successful"
> > > > >
> > > >
> > > > I don't know what the correct name of it is. EFI can boot with what I
> > > > am labeling as global EFI boot and searching for fixed path names (?),
> > > > or I guess it can decide from EFI variables what to do which I
> > > > consider to be the user-configured EFI boot. One of these (the
> > > > user-configured EFI boot) is broken since the regression.
> > > >
> > > > U-Boot SPL 2024.10-00989-geb052cbb896f (Nov 19 2024 - 14:39:43 -0800)
> > > > ...
> > > > Card did not respond to voltage select! : -110
> > > > Failed to load EFI variables
> > > > ** Booting bootflow '<NULL>' with efi_mgr
> > > > Booting: mmc 0
> > > > error: no suitable video mode found.
> > > >
> > > > U-Boot SPL 2025.01-rc2-00129-g7fe55182d926 (Nov 19 2024 - 19:56:55 -0800)
> > > > ...
> > > > Card did not respond to voltage select! : -110
> > > > Failed to load EFI variables
> > > > ** Booting bootflow '<NULL>' with efi_mgr
> > > > Not a PE-COFF file
> > > > Loading Boot0000 'mmc 0' failed
> > > > Loading Boot0001 'nvme 0' failed
> > > > EFI boot manager: Cannot load any image
> > > > Boot failed (err=-14)
> > > > ** Booting bootflow 'mmc@16010000.bootdev.part_1' with efi
> > > > Booting /\EFI\BOOT\BOOTRISCV64.EFI
> > > > error: no suitable video mode found.
> > >
> > > Based on the logs above, it seems like you are booting using the
> > > bootmeth efi_mgr? If so, can you try disabling the bootstd config. I
> > > might be wrong, but I remember some issues with the bootstd efi_mgr
> > > method on some other platform. Also, are you available on irc?
> > >
> > > -sughosh
> > >
> > > > > -sughosh
> > > > >
> > > > > >
> > > > > > For context, the Star64 eMMC contains here an installed Debian Linux
> > > > > > OS in the usual way with Grub2 EFI on the EFI System Partition there,
> > > > > > and that image boots fine from U-Boot v2024.10 also when loaded into
> > > > > > memory and using 'bootefi' directly on that memory address.
> > > > > >
> > > >
> > > > Thanks,
> > > >
> > > > -E Shattow
> >
> > Confirming that some discussion about this happened off-list with a
> > positive result and now awaiting a fix.
>
> I tried to reproduce this issue on the qemu arm64 virt platform, with
> 4GB of DRAM spanning 0x4000_0000 - 0x1_4000_0000. This is
> the exact same DRAM memory map as on your board. I also
> modified the value of ram_top to 4GB, as on your board. But I am
> unable to hit this on the qemu arm64 platform when I try to boot the
> Debian image with the 'bootefi bootmgr' command. The only difference that
> I see is that on the qemu emulator, the OS is on a virtio disk, as
> opposed to mmc in your case. So I think we should try to get to the root
> cause of this. I think you mentioned on irc yesterday that you observe
> this on two boards, so there is definitely something going on here.

Heinrich figured out that this is an issue with the mmc driver on
these platforms. It looks like the mmc driver is unable to load any
data to addresses above 4GB. E Shattow confirmed this by trying to
load an image from the mmc to an address above 4GB with the 'load mmc'
command, which failed. Loading the same image to an address below 4GB
works. I am not familiar with the mmc controller IP, so why the mmc
driver cannot access memory addresses above 4GB still has to be
investigated. Thanks.
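
As a rough illustration of the address arithmetic involved, the sketch
below takes the lmb regions from the non-working bdinfo output quoted
above and checks which of them reach above the 4GB boundary. It is
plain Python, not U-Boot code: the helper names (region_end,
reaches_above_4gib) are made up for this example, and it says nothing
about why the mmc driver fails there.

```python
# Illustrative only: these helpers are hypothetical, not U-Boot APIs.
FOUR_GIB = 1 << 32  # first address that does not fit in 32 bits


def region_end(start, size):
    """Return the inclusive end address, as printed by lmb_dump_all."""
    return start + size - 1


def reaches_above_4gib(start, size):
    """True if any byte of the region lies at or above 4GB."""
    return region_end(start, size) >= FOUR_GIB


# (start, size) pairs copied from the 2025.01-rc2 bdinfo output above.
regions = {
    "memory[0]": (0x40000000, 0x100000000),
    "reserved[0]": (0x40000000, 0x60000),
    "reserved[1]": (0xFE72D620, 0x18D29E0),
    "reserved[2]": (0x13FFFB000, 0x5000),
}

for name, (start, size) in regions.items():
    end = region_end(start, size)
    flag = reaches_above_4gib(start, size)
    print(f"{name}: [{start:#x}-{end:#x}] reaches above 4GB: {flag}")
```

A load address could be checked the same way before running the
'load mmc' test, by passing the address and the file size.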

-sughosh

>
> -sughosh
>
> >
> > Thanks very much!  -E

