EFI from usb HDD

Michal Simek michal.simek at xilinx.com
Wed Aug 18 11:07:09 CEST 2021



On 8/18/21 7:13 AM, AKASHI Takahiro wrote:
> On Tue, Aug 17, 2021 at 09:20:31AM +0200, Michal Simek wrote:
>>
>>
>> On 8/12/21 11:43 AM, AKASHI Takahiro wrote:
>>> On Fri, Jul 30, 2021 at 08:22:18AM +0200, Michal Simek wrote:
>>>>
>>>>
>>>> On 7/30/21 7:33 AM, AKASHI Takahiro wrote:
>>>>> On Fri, Jul 30, 2021 at 06:41:01AM +0200, Michal Simek wrote:
>>>>>>
>>>>>>
>>>>>> On 7/30/21 4:35 AM, AKASHI Takahiro wrote:
>>>>>>> On Thu, Jul 29, 2021 at 04:09:32PM +0200, Michal Simek wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On 6/10/21 2:59 PM, AKASHI Takahiro wrote:
>>>>>>>>> On Thu, Jun 10, 2021 at 02:31:46PM +0200, Michal Simek wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 6/10/21 12:51 PM, Heinrich Schuchardt wrote:
>>>>>>>>>>> On 6/10/21 12:04 PM, Michal Simek wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> On 6/10/21 11:47 AM, Heinrich Schuchardt wrote:
>>>>>>>>>>>>> On 6/10/21 10:44 AM, Michal Simek wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am playing with booting from USB via EFI. And I see very weird
>>>>>>>>>>>>>> behavior. I have burnt image with grub to USB flashdisk and I have
>>>>>>>>>>>>>> tested it on 3 zynqmp boards. zcu102, zcu104 and SOM Kria board.
>>>>>>>>>>>>>> On zcu102 grub is going to boot menu and everything is working fine as
>>>>>>>>>>>>>> expected.
>>>>>>>>>>>>>> On zcu104 and SOM Kria I get to grub but not to its menu. When I list
>>>>>>>>>>>>>> partitions in grub I see that only SDs are listed:
>>>>>>>>>>>>>> grub> ls
>>>>>>>>>>>>>> (hd0) (hd0,msdos1) (hd1) (hd1,msdos1)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hello Michal,
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks for sharing your observations.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What devices do hd0 and hd1 relate to?
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On zcu102(working board) I also see usb(gpt) partitions and SD.
>>>>>>>>>>>>>> grub> ls
>>>>>>>>>>>>>> (hd0) (hd0,gpt2) (hd0,gpt1) (hd1) (hd1,msdos1)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> GPT and MBR partitioning are independent of the device type.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On zcu104 I see one more error message
>>>>>>>>>>>>>> "PE image measurement failed"
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is related to CONFIG_EFI_TCG2_PROTOCOL=y. Do you have a TPMv2? This
>>>>>>>>>>>>> will not stop disk enumeration.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> But I can't see it on SOM.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> U-Boot image is just the same for all boards. I am using generic
>>>>>>>>>>>>>> xilinx_zynqmp_virt_defconfig.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When I compare DT description for USB between zcu102 and zcu104 they
>>>>>>>>>>>>>> are the same. SOM doesn't have usb enabled by default (but I enabled it)
>>>>>>>>>>>>>> but grub starts which means that communication with USB is fine.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It is based on my latest patches available here.
>>>>>>>>>>>>>> u-boot/custodians/u-boot-microblaze.git (usb-efi-issue branch)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also when I list usb I see all partitions just fine.
>>>>>>>>>>>>>> ZynqMP> part list usb 0
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Partition Map for USB device 0  --   Partition Type: EFI
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Part    Start LBA       End LBA         Name
>>>>>>>>>>>>>>           Attributes
>>>>>>>>>>>>>>           Type GUID
>>>>>>>>>>>>>>           Partition GUID
>>>>>>>>>>>>>>     1     0x00000800      0x001007fe      "Microsoft basic data"
>>>>>>>>>>>>>>           attrs:  0x0000000000000000
>>>>>>>>>>>>>>           type:   ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
>>>>>>>>>>>>>>           type:   data
>>>>>>>>>>>>>>           guid:   0e7f8b3d-296b-4720-be9d-c4687d3c4a77
>>>>>>>>>>>>>>     2     0x00100800      0x001197fe      "Microsoft basic data"
>>>>>>>>>>>>>>           attrs:  0x0000000000000000
>>>>>>>>>>>>>>           type:   ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
>>>>>>>>>>>>>>           type:   data
>>>>>>>>>>>>>>           guid:   8892eddc-231a-4e6e-a5e1-c310f4482fb7
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Do you have any idea why one system gets to the menu fine while on
>>>>>>>>>>>>>> the others not all partitions are visible, even though U-Boot is
>>>>>>>>>>>>>> able to see them and can work with them?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Michal
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Where is the GRUB binary? - If it is in EFI/boot/bootaa64.efi, it could
>>>>>>>>>>>>> be that the USB sub-system is simply not initialized yet when the boot
>>>>>>>>>>>>> manager is called by distroboot.
>>>>>>>>>>>>>
>>>>>>>>>>>>> For testing partition detection in the UEFI sub-system it is enough
>>>>>>>>>>>>> to run
>>>>>>>>>>>>>
>>>>>>>>>>>>>      efidebug devices
>>>>>>>>>>>>>
>>>>>>>>>>>>> Until yesterday we had a problem with partition numbers >= 10, cf.
>>>>>>>>>>>>>
>>>>>>>>>>>>> efi_loader: partition numbers are hexadecimal
>>>>>>>>>>>>> https://source.denx.de/u-boot/u-boot/-/commit/3dca77b1dc1b6dbf9c8b51572fe4b0553cef009f
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Block devices are enumerated in efi_disk_register(). Please, try to add
>>>>>>>>>>>>> debug output there to elucidate the problem.
>>>>>>>>>>>>
>>>>>>>>>>>> I found where the problem is. First of all zcu102 didn't use the same
>>>>>>>>>>>> image as others (it wasn't updated properly).
>>>>>>>>>>>> When you have CONFIG_EFI_CAPSULE_ON_DISK_EARLY, efi_disk_register()
>>>>>>>>>>>> is called before usb block devices are detected and registered; that's
>>>>>>>>>>>> why grub doesn't see them.
>>>>>>>>>>>
>>>>>>>>>>> The problem is CONFIG_EFI_SETUP_EARLY=y required by
>>>>>>>>>>> CONFIG_EFI_CAPSULE_ON_DISK_EARLY.
>>>>>>>>>>>
>>>>>>>>>>> Why is USB initialized later than MMC?
>>>>>>>>>>
>>>>>>>>>> It is not just usb. SCSI/sata are behaving in the same way too.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Overall we have a deficiency in the UEFI implementation in that we
>>>>>>>>>>> cannot deal with block devices added or removed after initialization.
>>>>>>>>>>>
>>>>>>>>>>> Here integration with the driver model is missing.
>>>>>>>>>>
>>>>>>>>>> Right. And also there are commands which can create MBR partitions and I
>>>>>>>>>> expect when you write image to SD and then run rescan or so you could
>>>>>>>>>> get other partitions too.
>>>>>>>>>> Maybe hook via part_init()? with removing efi_disk_register.
>>>>>>>>>
>>>>>>>>> For the record, I have proposed my ideas several times[1], [2].
>>>>>>>>> I'm, however, no longer working on this issue as I have shifted
>>>>>>>>> my focus to UEFI secure boot and capsule update.
>>>>>>>>>
>>>>>>>>> -Takahiro Akashi
>>>>>>>>>
>>>>>>>>> [1] https://lists.denx.de/pipermail/u-boot/2018-November/347491.html
>>>>>>>>> [2] https://lists.denx.de/pipermail/u-boot/2019-February/357923.html
>>>>>>>>
>>>>>>>> I want to continue on this thread. I disabled
>>>>>>>> EFI_CAPSULE_ON_DISK_EARLY some time ago and have been working around the
>>>>>>>> usb/scsi detection issue by simply calling usb reset and scsi reset as
>>>>>>>> part of PREBOOT. Then all disks are registered and visible to grub.
>>>>>>>>
>>>>>>>> But I found another issue which is kind of weird. We are using
>>>>>>>> distroboot with a sort of fixed sequence. The important part of the sequence is
>>>>>>>> sd, usb, scsi.
>>>>>>>>
>>>>>>>> I have added grub on scsi and when I boot directly via run bootcmd_scsi0
>>>>>>>> everything is working fine. When I let distroboot do the job, or
>>>>>>>> run printenv -e before bootcmd_scsi0, I get an exception.
>>>>>>>> From debugging it is visible that the exception is raised from
>>>>>>>> efi_disk_read_blocks.
>>>>>>>>
>>>>>>>>     0  0x7ff5d188 hang()+20: include/bootstage.h, line 389
>>>>>>>>     1  0x7ff5f908 __assert_fail(): lib/panic.c, line 25
>>>>>>>>     2  0x7fe976a8 do_irq(): arch/arm/lib/interrupts_64.c, line 123
>>>>>>>>     3  0x7fe96a0c _restore_regs()+124: arch/arm/cpu/armv8/exceptions.S, line 141
>>>>>>>>     4  0x7ff43740 efi_disk_read_blocks()+160: lib/efi_loader/efi_disk.c, line 102
>>>>>>>
>>>>>>> How and when did you get this stack trace?
>>>>>>
>>>>>> When the abort happened I connected the Xilinx debugger via JTAG and looked
>>>>>> at the CPU backtrace.
>>>>>
>>>>> OK, but we are already in grub here and such a trace (in U-Boot)
>>>>> doesn't make sense. Right?
>>>>
>>>> Correct, grub had already started. But I expect it is still using U-Boot
>>>> drivers, and all exception handlers from U-Boot are still in place.
>>>
>>> Yeah, but what I didn't understand was:
>>>
>>> !"Synchronous Abort" handler, esr 0x02000000
>>> !elr: ffffffffa816c5b0 lr : 000000000805e218 (reloc)
>>> !elr: 00000000200005b0 lr : 000000007fef2218
>>> (snip)
>>> !Code: 000165fa 0b2d05de 0000ffff 00000000 (20000590)
>>> !UEFI image [0x0000000077d48000:0x0000000077de5fff] '/efi\boot\bootaa64.efi'
>>>
>>> "Code:" at the exception doesn't seem to be sane assembler, and
>>> "elr" is within the code of neither U-Boot nor shim/grub (bootaa64.efi).
>>> ("esr" doesn't tell us anything.)
>>> So I wondered where the backtrace came from.
>>>
>>> BTW, can you please confirm which function sits at the address of
>>> "lr" (=0x7fef2218)?
>>
>> I don't have those images anymore.
>>
>>>
>>>> Maybe it is just a sata/scsi related issue in EFI, but what is weird is that
>>>> when disks are scanned just before the command everything is working fine.
>>>
>>> What do you mean by "when disks are scanned just before the command"?
>>> The case when you ran "run bootcmd_scsi" without "printenv -e"?
>>>
>>> Do you reproduce the problem even if you revert the patch,
>>> "xilinx: zynqmp: Initialize usb and scsi via preboot", and
>>> run the commands, "run scsi_init; [printenv -e;] run bootcmd_scsi"?
>>>
>>> Can you also try other EFI commands, like "efidebug devices"?
>>
>> I found that there is a difference if you run scsi reset or run
>> scsi_init. When scsi_init is used I can't see any issue.
> 
> Here you have tried three cases:
> (1) scsi reset; efidebug devices; boot (hence distro_bootcmd)
> (2) run scsi_init; efidebug devices; boot
> (3) scsi rescan; efidebug devices; boot
> 
> Only case (2) succeeded in booting the system. Right?
> 
> Please double-check that you don't see this problem
> in all those cases if you don't execute "efidebug devices"
> (or "printenv -e").
> # make sure that no efi command will be executed before
> # booting from scsi.

I tested these 3 cases and all of them work fine.

scsi reset
devtype=scsi
run scan_dev_for_boot_part

run scsi_init
devtype=scsi
run scan_dev_for_boot_part

scsi rescan
devtype=scsi
run scan_dev_for_boot_part



> 
>> Variable looks like this
>> scsi_init=if ${scsi_need_init}; then scsi_need_init=false; scsi scan; fi
>>
>> And when you run scsi scan (last log) you see that problem again. It
>> means when scsi reset/scan is called twice issue is observed. In all
> 
> If this is true, my guess is:
> 
> * In the scenarios above, all the block devices are enumerated by
>   scsi_scan() in the first "scsi reset" or "scsi rescan" and
>   new blk_desc's are created.
> * efidebug is expected to execute efi_init_obj_list().
>   Please note:
>   EFI subsystem uses U-Boot's blk_desc internally to access block devices.
>   Mapping between U-Boot's blk_desc and UEFI's efi_disk_obj (aka handle)
>   is created only once and statically at the initialization in
>   efi_init_obj_list().
> 
> * Now that scsi_scan() is executed again in the second scsi command, all
>   the block devices, hence blk_desc structures, will be freed by
>   blk_unbind_all() and blk_desc's will be *re-created* by scsi probing.
> * Nevertheless, the binding between blk_desc and efi_disk_obj is
>   maintained even at this point, so any succeeding r/w operations
>   via UEFI interfaces can point to bogus data of old blk_desc and
>   therefore block accesses will get corrupted.
> 
> My guess above seems likely, but it doesn't explain well why
> loading/starting the "grub" binary succeeds anyway.

What you described makes sense. I printed desc, and after the reset a
new desc is created at a different address, while the original one is freed in
device_unbind. The log is below.
The question is how to fix this behavior.
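
To make the failure mode and one possible direction concrete, here is a
minimal standalone C model, not U-Boot code. It assumes the EFI-side disk
object stores a lookup key (interface type and device number) and resolves
the block descriptor on every access instead of caching the pointer that a
later scsi reset frees. All names (blk_desc_model, efi_disk_model and the
model_* helpers) are hypothetical and are not existing U-Boot structures
or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a block descriptor; NOT U-Boot's struct blk_desc. */
struct blk_desc_model {
	int if_type;		/* interface type, e.g. 2 as printed in the log */
	int devnum;		/* device number within that interface */
	unsigned long lba;	/* capacity in blocks */
};

#define MAX_DESCS 8

/* Table of live descriptors, playing the role of the driver-model list. */
static struct blk_desc_model *live[MAX_DESCS];

/* Create a descriptor and record it as live; NULL if the table is full. */
struct blk_desc_model *model_blk_create(int if_type, int devnum,
					unsigned long lba)
{
	for (size_t i = 0; i < MAX_DESCS; i++) {
		if (!live[i]) {
			struct blk_desc_model *d = malloc(sizeof(*d));

			assert(d);
			d->if_type = if_type;
			d->devnum = devnum;
			d->lba = lba;
			live[i] = d;
			return d;
		}
	}
	return NULL;
}

/* Free every descriptor of one interface type, as a bus rescan would. */
void model_blk_unbind_all(int if_type)
{
	for (size_t i = 0; i < MAX_DESCS; i++) {
		if (live[i] && live[i]->if_type == if_type) {
			free(live[i]);
			live[i] = NULL;
		}
	}
}

/* Find a live descriptor by key; NULL if the device is currently gone. */
struct blk_desc_model *model_blk_lookup(int if_type, int devnum)
{
	for (size_t i = 0; i < MAX_DESCS; i++) {
		if (live[i] && live[i]->if_type == if_type &&
		    live[i]->devnum == devnum)
			return live[i];
	}
	return NULL;
}

/* EFI-side object that remembers the key, not the descriptor pointer. */
struct efi_disk_model {
	int if_type;
	int devnum;
};

/* Resolve the descriptor at access time; -1 if the device is gone. */
long model_efi_disk_capacity(const struct efi_disk_model *disk)
{
	struct blk_desc_model *d = model_blk_lookup(disk->if_type,
						    disk->devnum);

	return d ? (long)d->lba : -1;
}
```

In this model a second scan frees and re-creates the descriptor, yet an
access through efi_disk_model either finds the new descriptor or fails
cleanly, because no freed pointer is ever dereferenced. Whether a key
lookup, a notification from the block layer to the EFI layer, or full
driver-model integration is the right fix for U-Boot is left open here.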

Thanks,
Michal

ZynqMP> scsi reset

Reset SCSI
scanning bus for devices...
blk_unbind_all: if_type 2
SATA link 0 timeout.
Target spinup took 0 ms.
AHCI 0001.0301 32 slots 2 ports 6 Gbps 0x3 impl SATA mode
flags: 64bit ncq pm clo only pmp fbss pio slum part ccc apst
blk_create_device: devnum -1
blk_create_device: name ahci_scsi.id1lun0, desc 000000007be21340
  Device 0: (1:0) Vendor: ATA Prod.: Maxtor 7V300F0 Rev: VA11
            Type: Hard Disk
            Capacity: 286188.8 MB = 279.4 GB (586114704 x 512)
ZynqMP> efidebug devices
Scanning disk mmc at ff170000.blk...
efi_disk_add_dev: desc 000000007be15b30
efi_disk_add_dev: desc 000000007be15b30
Scanning disk ahci_scsi.id1lun0...
efi_disk_add_dev: desc 000000007be21340
efi_disk_add_dev: desc 000000007be21340
efi_disk_add_dev: desc 000000007be21340
Found 5 disks
** Unable to read file ubootefi.var **
Failed to load EFI variables
Unable to find TPMv2 device
DFU alt info setting: done
Device           Device Path
================ ====================
000000007be21590 /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)
000000007be218a0 /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)/SD(0)/SD(0)
000000007be21a70 /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)/SD(0)/SD(0)/HD(1,0x01,0,0x2000,0x1cd2000)
000000007be21f00 /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)/Scsi(1,0)
000000007be222e0 /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)/Scsi(1,0)/HD(1,GPT,85b731b6-a4b2-47f4-b1c6-aef6e0f2ce81,0x800,0xfffff)
000000007be22730 /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)/Scsi(1,0)/HD(2,GPT,ac600dc7-3160-4f3c-a824-496d00e3d007,0x100800,0x18fff)
000000007be22c80 /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)/MAC(000a350370f6,1)
ZynqMP> scsi reset

Reset SCSI
scanning bus for devices...
blk_unbind_all: if_type 2
Removing/unbinding device ahci_scsi.id1lun0
device_unbind: free desc 000000007be21340
blk_create_device: devnum -1
blk_create_device: name ahci_scsi.id1lun0, desc 000000007be3e070
  Device 0: (1:0) Vendor: ATA Prod.: Maxtor 7V300F0 Rev: VA11
            Type: Hard Disk
            Capacity: 286188.8 MB = 279.4 GB (586114704 x 512)
ZynqMP>

