BISECTED f3866909e350 ("distro_bootcmd: call EFI bootmgr even without having /EFI/boot")

AKASHI Takahiro takahiro.akashi at linaro.org
Sun Jun 13 07:28:43 CEST 2021


On Sat, Jun 12, 2021 at 04:50:46PM +0300, Matwey V. Kornilov wrote:
> On Sat, Jun 12, 2021 at 05:05, AKASHI Takahiro <takahiro.akashi at linaro.org> wrote:
> >
> > Hi Matwey,
> >
> > On Fri, Jun 11, 2021 at 06:08:28PM +0300, Matwey V. Kornilov wrote:
> > > On Thu, Jun 10, 2021 at 23:05, Heinrich Schuchardt <xypron.glpk at gmx.de> wrote:
> > > >
> > > > On 6/7/21 7:51 PM, Matwey V. Kornilov wrote:
> > > > > On Sun, Jun 6, 2021 at 19:47, Heinrich Schuchardt <xypron.glpk at gmx.de> wrote:
> > > > >>
> > > > >> On 6/6/21 6:21 PM, Heinrich Schuchardt wrote:
> > > > >>> On 6/6/21 5:42 PM, Matwey V. Kornilov wrote:
> > > > >>>> On Sun, Jun 6, 2021 at 18:20, Heinrich Schuchardt <xypron.glpk at gmx.de> wrote:
> > > > >>>>>
> > > > >>>>> On 6/6/21 4:37 PM, Matwey V. Kornilov wrote:
> > > > >>>>>> Hi,
> > > > >>>>>>
> > > > >>>>>> I've found that
> > > > >>>>>>
> > > > >>>>>> f3866909e350 ("distro_bootcmd: call EFI bootmgr even without having
> > > > >>>>>> /EFI/boot")
> > > > >>>>>>
> > > > >>>>>> breaks running EFI application from USB device on BeagleBone Black
> > > > >>>>>> (am335x) device.
> > > > >>>>>>
> > > > >>>>>> With this patch I see the following:
> > > > >>>>>>
> > > > >>>>>> Booting /efi\boot\bootarm.efi
> > > > >>>>>> Welcome to GRUB!
> > > > >>>>>>
> > > > >>>>>> data abort
> > > > >>>>>> pc : [<9ce0b6d0>]          lr : [<9ffab7c7>]
> > > > >>>>>> reloc pc : [<7d69d6d0>]    lr : [<8083d7c7>]
> > > > >>>>>> sp : 9df44e28  ip : 9ffdfe90     fp : 00000003
> > > > >>>>>> r10: 9ffe3300  r9 : 00000000     r8 : 9df6fe88
> > > > >>>>>> r7 : 00000000  r6 : 9ce5da08     r5 : 9ce571f8  r4 : 9ce2c040
> > > > >>>>>> r3 : 00000000  r2 : 00000001     r1 : 9ce56598  r0 : 00000000
> > > > >>>>>> Flags: NzCv  IRQs off  FIQs on  Mode SVC_32
> > > > >>>>>> Code: e3500000 0a000015 e590000c eb00f96e (e5d03000)
> > > > >>>>>> UEFI image [0x9ce46000:0x9cf28fff] '/efi\boot\bootarm.efi'
> > > > >>>>>> Resetting CPU ...
> > > > >>>>>
> > > > >>>>> Hello Matwey,
> > > > >>>>>
> > > > >>>>> thank you for reporting the issue.
> > > > >>>>>
> > > > >>>>> $ echo 'Code: e3500000 0a000015 e590000c eb00f96e (e5d03000)' |
> > > > >>>>> CROSS_COMPILE=arm-linux-gnueabihf- ARCH=arm scripts/decodecode
> > > > >>>>>
> > > > >>>>> Code: e3500000 0a000015 e590000c eb00f96e (e5d03000)
> > > > >>>>> All code
> > > > >>>>> ========
> > > > >>>>>       0:   e3500000        cmp     r0, #0
> > > > >>>>>       4:   0a000015        beq     0x60
> > > > >>>>>       8:   e590000c        ldr     r0, [r0, #12]
> > > > >>>>>       c:   eb00f96e        bl      0x3e5cc
> > > > >>>>>      10:*  e5d03000        ldrb    r3, [r0]                <-- trapping
> > > > >>>>> instruction
> > > > >>>>>
> > > > >>>>> Code starting with the faulting instruction
> > > > >>>>> ===========================================
> > > > >>>>>       0:   e5d03000        ldrb    r3, [r0]
> > > > >>>>>
> > > > >>>>> Looking at the disassembly above we see that reading memory location
> > > > >>>>> NULL fails.
> > > > >>>>>
> > > > >>>>> We need to find out where the exception occurs. The code position is
> > > > >>>>> neither in bootarm.efi nor in U-Boot (9ce0b6d0 is lower than the load
> > > > >>>>> position of bootarm.efi, so it is below the relocated U-Boot code).
> > > > >>>>>
> > > > >>>>> Please, add the following line at the start of grub.cfg to get more
> > > > >>>>> output from GRUB:
> > > > >>>>>
> > > > >>>>>           debug=all
> > > > >>>>
> > > > >>>> This doesn't provide any additional output from GRUB :(
> > > > >>>>
> > > > >>>>>
> > > > >>>>> When building U-Boot, please, add
> > > > >>>>>
> > > > >>>>>           #define DEBUG 1
> > > > >>>>>
> > > > >>>>> in lib/efi_loader/efi_disk.c and lib/efi_loader/efi_file.c a line before
> > > > >>>>> #include <common.h>.
> > > > >>>>
> > > > >>>>
> > > > >>>> This doesn't provide much output either:
> > > > >>>>
> > > > >>>> Scanning disk mmc at 48060000.blk...
> > > > >>>> EFI: Call: efi_install_multiple_protocol_interfaces( &handle,
> > > > >>>> &efi_guid_device_path, diskobj->dp, &efi_block_io_guid, &diskobj->ops,
> > > > >>>> NULL)
> > > > >>>> EFI: 0 returned by efi_install_multiple_protocol_interfaces( &handle,
> > > > >>>> &efi_guid_device_path, diskobj->dp, &efi_block_io_guid, &diskobj->ops,
> > > > >>>> NULL)
> > > > >>>> ** Unrecognized filesystem type **
> > > > >>>> Scanning disk mmc at 481d8000.blk...
> > > > >>>> EFI: Call: efi_install_multiple_protocol_interfaces( &handle,
> > > > >>>> &efi_guid_device_path, diskobj->dp, &efi_block_io_guid, &diskobj->ops,
> > > > >>>> NULL)
> > > > >>>> EFI: 0 returned by efi_install_multiple_protocol_interfaces( &handle,
> > > > >>>> &efi_guid_device_path, diskobj->dp, &efi_block_io_guid, &diskobj->ops,
> > > > >>>> NULL)
> > > > >>>> EFI: Call: efi_install_multiple_protocol_interfaces( &handle,
> > > > >>>> &efi_guid_device_path, diskobj->dp, &efi_block_io_guid, &diskobj->ops,
> > > > >>>> NULL)
> > > > >>>> EFI: 0 returned by efi_install_multiple_protocol_interfaces( &handle,
> > > > >>>> &efi_guid_device_path, diskobj->dp, &efi_block_io_guid, &diskobj->ops,
> > > > >>>> NULL)
> > > > >>>> Found 3 disks
> > > > >>>
> > > > >>> This implies that GRUB is crashing before even accessing the file system
> > > > >>> (including grub.cfg).
> > > > >>>
> > > > >>> On an OrangePi PC I deleted /boot.scr and moved grubarm.efi to
> > > > >>> /EFI/boot/bootarm.efi. It boots without problem.
> > > > >>>
> > > > >>> What version of GRUB are you using?
> > > > >>> How were you booting before updating U-Boot?
> > > > >>> What version of U-Boot are you using where the error occurs?
> > > > >>> Why do you have grub in /EFI/boot/bootarm.efi and not in a distro
> > > > >>> specific path, e.g. /EFI/debian/grubarm.efi? /EFI/boot is typically only
> > > > >>> used by installers.
> > > > >>>
> > > > >>> If the boot manager is started by distroboot it may not have an
> > > > >>> appropriate device path. It tries to load the file given by environment
> > > > >>> variable $fdtfile from the boot device.
> > > > >>>
> > > > >>>   From the U-Boot console could you, please, try:
> > > > >>>
> > > > >>> 1)
> > > > >>> load usb 0:1 $kernel_addr_r EFI/boot/bootarm.efi
> > > > >>> bootefi bootmgr
> > > > >>>
> > > > >>>
> > > > >>> 2)
> > > > >>> load usb 0:1 $kernel_addr_r EFI/boot/bootarm.efi
> > > > >>> load usb 0:2 $fdt_addr_r dtb
> > > > >>> bootefi bootmgr $fdt_addr_r
> > > > >>>
> > > > >>> where you need to replace dtb by the correct device tree file and adjust
> > > > >>> the partition numbers.
> > > > >>>
> > > > >>> Best regards
> > > > >>>
> > > > >>> Heinrich
> > > > >>
> > > > >> To catch the earlier EFI API calls you can add
> > > > >>
> > > > >> #define DEBUG 1
> > > > >>
> > > > >> to lib/efi_loader/efi_boottime.c
> > > > >
> > > > >
> > > > > Welcome to GRUB!
> > > > >
> > > > >      EFI: Entry efi_locate_handle_ext(2,
> > > > > 9042a9de-23dc-4a38-96fb-7aded080516a, 00000000, 9df40dfc, 9ce2d660)
> > > >
> > > > EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
> > > >
> > > > This call could be from grub-core/video/efi_gop.c, check_protocol().
> > > >
> > > > >      EFI: Exit: efi_locate_handle_ext: 14
> > > >
> > > > EFI_NOT_FOUND
> > > >
> > > > >      EFI: Entry efi_open_protocol(9df5f298,
> > > > > 5b1b31a1-9562-11d2-8e3f-00a0c969723b, 9df40e14, 9df5f298, 00000000,
> > > > > 0x2)
> > > >
> > > > EFI_LOADED_IMAGE_PROTOCOL_GUID
> > > >
> > > > This call could be from grub-core/kern/efi/efi.c,
> > > > grub_efi_get_loaded_image().
> > > >
> > > > >      EFI: Exit: efi_open_protocol: 0
> > > > >      EFI: Entry efi_open_protocol(00000000,
> > > >
> > > > The parameter @handle must not be NULL.
> > > >
> > > > > 09576e91-6d3f-11d2-8e39-00a0c969723b, 9df40e14, 9df5f298, 00000000,
> > > >
> > > > EFI_DEVICE_PATH_PROTOCOL_GUID
> > > >
> > > > This could be called from grub-core/kern/efi/efi.c,
> > > > grub_efi_get_device_path() which is invoked from
> > > > grub_machine_get_bootlocation().
> > > >
> > > > > 0x2)
> > > >
> > > > EFI_INVALID_PARAMETER is returned because the handle is NULL.
> > > >
> > > > I could partially reproduce the problem by setting
> > > >
> > > >     info->device_handle = NULL;
> > > >
> > > > at the end of efi_setup_loaded_image():
> > > >
> > > > Welcome to GRUB!
> > > >
> > > >      EFI: Entry efi_open_protocol(79fdea40,
> > > > 5b1b31a1-9562-11d2-8e3f-00a0c969723b, 79f570e4, 79fdea40, 00000000, 0x2)
> > > >      EFI: Exit: efi_open_protocol: 0
> > > >      EFI: Entry efi_open_protocol(00000000,
> > > > 09576e91-6d3f-11d2-8e39-00a0c969723b, 79f570b4, 79fdea40, 00000000, 0x2)
> > > >      EFI: Exit: efi_open_protocol: 2
> > > > error: disk `,msdos2' not found.
> > > > grub rescue> >
> > > >
> > > > This leaves me with two questions:
> > > >
> > > > Why does GRUB not handle
> > > >
> > > >      *device = grub_efidisk_get_device_name (NULL);
> > > >
> > > > gracefully? Maybe it is because it tries to print via the graphical
> > > > output protocol which does not exist?
> > > >
> > > > Why is image->device_handle NULL?
> > > >
> > > > Next step is to verify that image->device_handle is really NULL.
> > > >
> > > > Please apply the following change to lib/efi_loader/efi_boottime.c
> > > >
> > > > @@ -2060,6 +2069,7 @@ efi_status_t EFIAPI efi_load_image(bool boot_policy,
> > > >                  free(info);
> > > >          }
> > > >   error:
> > > > +       printf("*** %p\n", info->device_handle);
> > > >          return EFI_EXIT(ret);
> > > >   }
> > >
> > >
> > > Booting /efi\boot\bootarm.efi
> > > EFI: Entry efi_load_image(0, 9df56d90,
> > > /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)/UsbClass(0x13fe,0x4123,0x0,0x0,0x0)/HD(1,GPT,c3d6ff69-0ec0-4be1-91a3-e5fbeed5c4b9,0x2000,0x8000)/efi\boot\bootarm.efi,
> > > 82000000, 929792, 9df40ea0)
> > > *** 00000000
> >
> > The commit message of f3866909e350 ("distro_bootcmd: call EFI bootmgr
> > even without having /EFI/boot") says:
> >     Thus, call the bootmgr before we try to boot the EFI binary inside
> >     the removable media path.
> >
> > Given the above, your log messages suggest:
> >     - bootmgr didn't boot grub
> >     - So grub in USB:/efi/boot/bootarm.efi was booted by bootefi
> >       after enumerating/probing USB devices
> >     - Then, grub tried to open a protocol using the handle of
> >       the (boot?) device (i.e. USB partition) to access some file
> >       and got corrupted.
> >     - the handle was NULL because efi_dp_find_obj() in
> >       efi_setup_loaded_image() failed and returned NULL
> >       when booting grub
> >
> > The last problem happened because:
> > - When bootmgr was first called, the UEFI subsystem was initialized
> >   and all the U-Boot disks known at that point were enumerated as UEFI
> >   disks. Unfortunately, no USB devices had been enumerated yet due to
> >   the commit above.
> > - Then, bootefi was called after enumerating USB devices in
> >   distro_bootcmd, but the result was never reflected in the UEFI
> >   subsystem due to a limitation of the current UEFI implementation.
> > - So the grub image was successfully loaded from USB, yet UEFI never
> >   recognized the device.
> > - Please note that efi_set_bootdev() doesn't care whether the device is
> >   already recognized by UEFI or not.
> >
> > If my guess is right, this is another example of the defect in
> > the UEFI disk implementation.
> > FYI, see:
> > https://lists.denx.de/pipermail/u-boot/2021-June/451828.html
> 
> Seems to make sense. Thank you for the explanation.
> But how to fix it?

If I'm correct, you need to change the boot device order and
put USB first in BOOT_TARGET_DEVICES() in am335x_*.h.

But please note that this is a quick hack, and the same issue can happen
to anybody on any platform. The ultimate solution is to fix the defect
in the UEFI disk implementation, as I proposed in the message above.

-Takahiro Akashi


> 
> 
> >
> > -Takahiro Akashi
> >
> >
> > > EFI: Exit: efi_load_image: 0
> > > EFI: Entry efi_start_image(9df5f298, 9df40e78, 9df40e7c)
> > >   EFI: Call: efi_open_protocol(image_handle, &efi_guid_loaded_image,
> > > &info, NULL, NULL, EFI_OPEN_PROTOCOL_GET_PROTOCOL)
> > >     EFI: Entry efi_open_protocol(9df5f298,
> > > 5b1b31a1-9562-11d2-8e3f-00a0c969723b, 9df40e60, 00000000, 00000000,
> > > 0x2)
> > >     EFI: Exit: efi_open_protocol: 0
> > >
> > >
> > > >
> > > > Best regards
> > > >
> > > > Heinrich
> > > >
> > > > >      EFI: Exit: efi_open_protocol: 2
> > > > > data abort
> > > > > pc : [<9ce076d0>]          lr : [<9ffa85b3>]
> > > > > reloc pc : [<7d69d6d0>]    lr : [<8083e5b3>]
> > > > > sp : 9df40e28  ip : 00000000     fp : 00000003
> > > > > r10: 9ffe2df8  r9 : 00000000     r8 : 9df5f298
> > > > > r7 : 00000000  r6 : 9ce59a08     r5 : 9ce531f8  r4 : 9ce28040
> > > > > r3 : 00000000  r2 : 9ffeb328     r1 : 00000000  r0 : 00000000
> > > > > Flags: NzCv  IRQs off  FIQs on  Mode SVC_32
> > > > > Code: e3500000 0a000015 e590000c eb00f96e (e5d03000)
> > > > > UEFI image [0x9ce42000:0x9cf24fff] '/efi\boot\bootarm.efi'
> > > > > Resetting CPU ...
> > > > >
> > > > >
> > > > >>
> > > > >> Best regards
> > > > >>
> > > > >> Heinrich
> > > > >>
> > > > >>
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> > > --
> > > With best regards,
> > > Matwey V. Kornilov
> 
> 
> 
> --
> With best regards,
> Matwey V. Kornilov

