EFI from usb HDD
Michal Simek
michal.simek at xilinx.com
Thu Aug 19 07:38:04 CEST 2021
On 8/19/21 6:14 AM, AKASHI Takahiro wrote:
> On Wed, Aug 18, 2021 at 11:07:09AM +0200, Michal Simek wrote:
>>
>>
>> On 8/18/21 7:13 AM, AKASHI Takahiro wrote:
>>> On Tue, Aug 17, 2021 at 09:20:31AM +0200, Michal Simek wrote:
>>>>
>>>>
>>>> On 8/12/21 11:43 AM, AKASHI Takahiro wrote:
>>>>> On Fri, Jul 30, 2021 at 08:22:18AM +0200, Michal Simek wrote:
>>>>>>
>>>>>>
>>>>>> On 7/30/21 7:33 AM, AKASHI Takahiro wrote:
>>>>>>> On Fri, Jul 30, 2021 at 06:41:01AM +0200, Michal Simek wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 7/30/21 4:35 AM, AKASHI Takahiro wrote:
>>>>>>>>> On Thu, Jul 29, 2021 at 04:09:32PM +0200, Michal Simek wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> On 6/10/21 2:59 PM, AKASHI Takahiro wrote:
>>>>>>>>>>> On Thu, Jun 10, 2021 at 02:31:46PM +0200, Michal Simek wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 6/10/21 12:51 PM, Heinrich Schuchardt wrote:
>>>>>>>>>>>>> On 6/10/21 12:04 PM, Michal Simek wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 6/10/21 11:47 AM, Heinrich Schuchardt wrote:
>>>>>>>>>>>>>>> On 6/10/21 10:44 AM, Michal Simek wrote:
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am playing with booting from USB via EFI. And I see very weird
>>>>>>>>>>>>>>>> behavior. I have burnt image with grub to USB flashdisk and I have
>>>>>>>>>>>>>>>> tested it on 3 zynqmp boards. zcu102, zcu104 and SOM Kria board.
>>>>>>>>>>>>>>>> On zcu102 grub is going to boot menu and everything is working fine as
>>>>>>>>>>>>>>>> expected.
>>>>>>>>>>>>>>>> On zcu104 and SOM Kria I am able to get to grub but not to the menu.
>>>>>>>>>>>>>>>> When I list partitions in grub I see that only SDs are listed:
>>>>>>>>>>>>>>>> grub> ls
>>>>>>>>>>>>>>>> (hd0) (hd0,msdos1) (hd1) (hd1,msdos1)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hello Michal,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> thanks for sharing your observations.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What devices do hd0 and hd1 relate to?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On zcu102(working board) I also see usb(gpt) partitions and SD.
>>>>>>>>>>>>>>>> grub> ls
>>>>>>>>>>>>>>>> (hd0) (hd0,gpt2) (hd0,gpt1) (hd1) (hd1,msdos1)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> GPT and MBR partitioning are independent of the device type.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On zcu104 I see one more error message
>>>>>>>>>>>>>>>> "PE image measurement failed"
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This is related to CONFIG_EFI_TCG2_PROTOCOL=y. Do you have a TPMv2? This
>>>>>>>>>>>>>>> will not stop disk enumeration.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> But I can't see it on SOM.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> U-Boot image is just the same for all boards. I am using generic
>>>>>>>>>>>>>>>> xilinx_zynqmp_virt_defconfig.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> When I compare DT description for USB between zcu102 and zcu104 they
>>>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>> the same. SOM doesn't have usb enabled by default (but I enabled it)
>>>>>>>>>>>>>>>> but
>>>>>>>>>>>>>>>> grub starts which means that communication with USB is fine.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It is based on my latest patches available here.
>>>>>>>>>>>>>>>> u-boot/custodians/u-boot-microblaze.git (usb-efi-issue branch)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Also when I list usb I see all partitions just fine.
>>>>>>>>>>>>>>>> ZynqMP> part list usb 0
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Partition Map for USB device 0 -- Partition Type: EFI
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Part Start LBA End LBA Name
>>>>>>>>>>>>>>>> Attributes
>>>>>>>>>>>>>>>> Type GUID
>>>>>>>>>>>>>>>> Partition GUID
>>>>>>>>>>>>>>>> 1 0x00000800 0x001007fe "Microsoft basic data"
>>>>>>>>>>>>>>>> attrs: 0x0000000000000000
>>>>>>>>>>>>>>>> type: ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
>>>>>>>>>>>>>>>> type: data
>>>>>>>>>>>>>>>> guid: 0e7f8b3d-296b-4720-be9d-c4687d3c4a77
>>>>>>>>>>>>>>>> 2 0x00100800 0x001197fe "Microsoft basic data"
>>>>>>>>>>>>>>>> attrs: 0x0000000000000000
>>>>>>>>>>>>>>>> type: ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
>>>>>>>>>>>>>>>> type: data
>>>>>>>>>>>>>>>> guid: 8892eddc-231a-4e6e-a5e1-c310f4482fb7
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Do you have any idea why one system gets to the menu fine while on
>>>>>>>>>>>>>>>> the others grub cannot see all partitions, even though U-Boot is
>>>>>>>>>>>>>>>> able to see them and can work with them?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Michal
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Where is the GRUB binary? - If it is in EFI/boot/bootaa64.efi, it could
>>>>>>>>>>>>>>> be that the USB sub-system is simply not initialized yet when the boot
>>>>>>>>>>>>>>> manager is called by distroboot.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For testing partition detection in the UEFI sub-system it is enough
>>>>>>>>>>>>>>> to run
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> efidebug devices
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Until yesterday we had a problem with partition numbers >= 10, cf.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> efi_loader: partition numbers are hexadecimal
>>>>>>>>>>>>>>> https://source.denx.de/u-boot/u-boot/-/commit/3dca77b1dc1b6dbf9c8b51572fe4b0553cef009f
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Block devices are enumerated in efi_disk_register(). Please, try to add
>>>>>>>>>>>>>>> debug output there to elucidate the problem.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I found where the problem is. First of all zcu102 didn't use the same
>>>>>>>>>>>>>> image as others (it wasn't updated properly).
>>>>>>>>>>>>>> When you have CONFIG_EFI_CAPSULE_ON_DISK_EARLY enabled, efi_disk_register()
>>>>>>>>>>>>>> is called before usb block devices are detected and registered; that's
>>>>>>>>>>>>>> why grub doesn't see them.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The problem is CONFIG_EFI_SETUP_EARLY=y required by
>>>>>>>>>>>>> CONFIG_EFI_CAPSULE_ON_DISK_EARLY.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Why is USB initialized later than MMC?
>>>>>>>>>>>>
>>>>>>>>>>>> It is not just usb. SCSI/sata are behaving in the same way too.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Overall we have a deficiency in the UEFI implementation in that we
>>>>>>>>>>>>> cannot deal with block devices added or removed after initialization.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here integration with the driver model is missing.
>>>>>>>>>>>>
>>>>>>>>>>>> Right. And also there are commands which can create MBR partitions, and I
>>>>>>>>>>>> expect that when you write an image to SD and then run a rescan or so, you
>>>>>>>>>>>> could get other partitions too.
>>>>>>>>>>>> Maybe hook in via part_init() and remove efi_disk_register()?
>>>>>>>>>>>
>>>>>>>>>>> For the record, I have proposed my ideas several times[1], [2].
>>>>>>>>>>> I'm, however, no longer working on this issue as I have shifted
>>>>>>>>>>> my focus to UEFI secure boot and capsule update.
>>>>>>>>>>>
>>>>>>>>>>> -Takahiro Akashi
>>>>>>>>>>>
>>>>>>>>>>> [1] https://lists.denx.de/pipermail/u-boot/2018-November/347491.html
>>>>>>>>>>> [2] https://lists.denx.de/pipermail/u-boot/2019-February/357923.html
>>>>>>>>>>
>>>>>>>>>> I want to continue on this thread. I disabled
>>>>>>>>>> EFI_CAPSULE_ON_DISK_EARLY some time ago and am trying to work around the
>>>>>>>>>> usb/scsi detection issue by simply calling usb reset and scsi reset as
>>>>>>>>>> part of PREBOOT. Then all disks are recorded and visible to grub.
>>>>>>>>>>
>>>>>>>>>> But I found another issue which is kind of weird. We are using
>>>>>>>>>> distroboot with a sort of fixed sequence. The important part of the
>>>>>>>>>> sequence is sd, usb, scsi.
>>>>>>>>>>
>>>>>>>>>> I have added grub on scsi and when I boot directly via run bootcmd_scsi0
>>>>>>>>>> everything is working fine. When I let distroboot do the job, or
>>>>>>>>>> run printenv -e before bootcmd_scsi0, I get an exception.
>>>>>>>>>> From debugging it is visible that the exception is raised from
>>>>>>>>>> efi_disk_read_blocks.
>>>>>>>>>>
>>>>>>>>>> 0 0x7ff5d188 hang()+20: include/bootstage.h, line 389
>>>>>>>>>> 1 0x7ff5f908 __assert_fail(): lib/panic.c, line 25
>>>>>>>>>> 2 0x7fe976a8 do_irq(): arch/arm/lib/interrupts_64.c, line 123
>>>>>>>>>> 3 0x7fe96a0c _restore_regs()+124: arch/arm/cpu/armv8/exceptions.S,
>>>>>>>>>> line 141
>>>>>>>>>> 4 0x7ff43740 efi_disk_read_blocks()+160: lib/efi_loader/efi_disk.c,
>>>>>>>>>> line 102
>>>>>>>>>
>>>>>>>>> How and when did you get this stack trace?
>>>>>>>>
>>>>>>>> When the abort happened I connected the Xilinx debugger via jtag and
>>>>>>>> looked at the cpu backtrace.
>>>>>>>
>>>>>>> OK, but we are already in grub here and such a trace (in U-Boot)
>>>>>>> doesn't make sense. Right?
>>>>>>
>>>>>> Correct, grub had already started. But I expect it is still using U-Boot
>>>>>> drivers and all exception handlers from u-boot are still in place.
>>>>>
>>>>> Yeah, but what I didn't understand was:
>>>>>
>>>>> !"Synchronous Abort" handler, esr 0x02000000
>>>>> !elr: ffffffffa816c5b0 lr : 000000000805e218 (reloc)
>>>>> !elr: 00000000200005b0 lr : 000000007fef2218
>>>>> (snip)
>>>>> !Code: 000165fa 0b2d05de 0000ffff 00000000 (20000590)
>>>>> !UEFI image [0x0000000077d48000:0x0000000077de5fff] '/efi\boot\bootaa64.efi'
>>>>>
>>>>> "Code:" at the exception doesn't seem to be sane assembler, and
>>>>> "elr" is not within the code of either U-Boot or shim/grub(bootaa64.efi).
>>>>> ("esr" doesn't tell us anything.)
>>>>> So I wondered where the backtrace came from.
>>>>>
>>>>> BTW, can you please confirm which function sits at the address of
>>>>> "lr" (=0x7fef2218)?
>>>>
>>>> I don't have those images anymore.
>>>>
>>>>>
>>>>>> Maybe it is just a sata/scsi related issue in EFI, but the weird thing is
>>>>>> that when disks are scanned just before the command everything is working fine.
>>>>>
>>>>> What do you mean by "when disks are scanned just before the command"?
>>>>> The case when you ran "run bootcmd_scsi" without "printenv -e"?
>>>>>
>>>>> Do you reproduce the problem even if you revert the patch,
>>>>> "xilinx: zynqmp: Initialize usb and scsi via preboot", and
>>>>> run the commands, "run scsi_init; [printenv -e;] run bootcmd_scsi"?
>>>>>
>>>>> Can you also try other EFI commands, like "efidebug devices"?
>>>>
>>>> I found that there is a difference if you run scsi reset or run
>>>> scsi_init. When scsi_init is used I can't see any issue.
>>>
>>> Here you have tried three cases:
>>> (1) scsi reset; efidebug devices; boot (hence distro_bootcmd)
>>> (2) run scsi_init; efidebug devices; boot
>>> (3) scsi rescan; efidebug devices; boot
>>>
>>> Only case(2) succeeded to boot the system. Right?
>>>
>>> Please double-check that you don't see this problem
>>> in all those cases if you don't execute "efidebug devices"
>>> (or "printenv -e").
>>> # make sure that no efi command will be executed before
>>> # booting from scsi.
>>
>> I tested these 3 cases and all of them work fine.
>
> Thank you for the confirmation.
>
>> scsi reset
>> devtype=scsi
>> run scan_dev_for_boot_part
>>
>> run scsi_init
>> devtype=scsi
>> run scan_dev_for_boot_part
>>
>> scsi rescan
>> devtype=scsi
>> run scan_dev_for_boot_part
>>
>>
>>
>>>
>>>> Variable looks like this
>>>> scsi_init=if ${scsi_need_init}; then scsi_need_init=false; scsi scan; fi
>>>>
>>>> And when you run scsi scan (last log) you see that problem again. It
>>>> means the issue is observed when scsi reset/scan is called twice. In all
>>>
>>> If this is true, my guess is:
>>>
>>> * In the scenarios above, all the block devices are enumerated by
>>> scsi_scan() in the first "scsi reset" or "scsi rescan" and
>>> new blk_desc's are created.
>>> * efidebug is expected to execute efi_init_obj_list().
>>> Please note:
>>> EFI subsystem uses U-Boot's blk_desc internally to access block devices.
>>> Mapping between U-Boot's blk_desc and UEFI's efi_disk_obj (aka handle)
>>> is created only once and statically at the initialization in
>>> efi_init_obj_list().
>>>
>>> * Now that scsi_scan() is executed again in the second scsi command, all
>>> the block devices, hence blk_desc structures, will be freed by
>>> blk_unbind_all() and blk_desc's will be *re-created* by scsi probing.
>>> * Nevertheless, the binding between blk_desc and efi_disk_obj is
>>> maintained even at this point, so any succeeding r/w operations
>>> via UEFI interfaces can point to bogus data of old blk_desc and
>>> therefore block accesses will get corrupted.
>>>
>>> My guess above seems to be likely, but it doesn't explain well
>>> why loading/starting the "grub" binary succeeds anyway.
>
> # The implementation of LoadImage interface doesn't use block_io_protocol,
> # and so we won't see this problem when 'grub' is started.
>
>> What you described makes sense. I printed desc, and after reset a
>> new desc is created at a different address, and the original location is
>> freed in device_unbind. Log is below.
>> The question is how to fix this behavior.
>
> It is a matter of *integration* of U-Boot's DM and UEFI implementation.
> It can, however, be a somewhat difficult/complicated task to achieve this goal
> in the way that Simon expects (for example, see [1]).
>
> -Takahiro Akashi
>
> [1] https://lists.denx.de/pipermail/u-boot/2021-June/452847.html
Ok. Then for me the only reasonable solution available now is
to call scsi_init via preboot to get the block devices before any EFI
initialization.
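
To illustrate the ordering problem, here is a rough model in plain POSIX sh
(not U-Boot hush and not U-Boot code; every name in it is made up). It only
assumes what was described above: a scan re-creates the block descriptors,
and EFI binds to them once at initialization.

```sh
# Model only: the counters stand in for blk_desc instances and EFI disk objects.
blk_generation=0         # bumped whenever a scan re-creates the descriptors
efi_bound_generation=""  # generation EFI bound to; set once, never refreshed

model_scan() {
    blk_generation=$((blk_generation + 1))
}

model_efi_init() {
    # Binding happens only on the first call, mirroring the one-time mapping.
    if [ -z "$efi_bound_generation" ]; then
        efi_bound_generation=$blk_generation
    fi
}

model_binding_ok() {
    [ "$efi_bound_generation" = "$blk_generation" ]
}

model_scan        # preboot: scan before anything touches EFI
model_efi_init    # first EFI command binds to the current descriptors
if model_binding_ok; then state_after_init=current; else state_after_init=stale; fi

model_scan        # a second scan re-creates the descriptors
model_efi_init    # no effect: EFI is already initialized
if model_binding_ok; then state_after_rescan=current; else state_after_rescan=stale; fi

echo "after init: $state_after_init, after rescan: $state_after_rescan"
```

In this model the binding is fine as long as no scan happens after the first
EFI command, which is what scanning from preboot is meant to guarantee.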
And likely usb_boot should be updated in the same way as scsi, to use a
variable and not call usb start/reset twice.
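
A minimal sketch of that guard, again as a plain POSIX sh model rather than a
real environment change. The names usb_init and usb_need_init are hypothetical
(chosen to mirror scsi_init/scsi_need_init quoted above), and usb_start is a
stub standing in for the real "usb start" command:

```sh
# Stub that only counts how often the (modelled) bus start would run.
usb_start_calls=0
usb_start() {
    usb_start_calls=$((usb_start_calls + 1))
}

# Run-once guard, same shape as the scsi_init variable.
usb_need_init=true
usb_init() {
    if ${usb_need_init}; then
        usb_need_init=false
        usb_start
    fi
}

usb_init    # e.g. from preboot
usb_init    # e.g. later from the boot sequence; does nothing now

echo "usb start ran ${usb_start_calls} time(s)"
```

The second call is a no-op, so the descriptors created by the first start
would not be re-created behind EFI's back.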
Thanks,
Michal