[PATCH] cmd: usb: Prevent reset in usb tree/info command
Marek Vasut
marex at denx.de
Fri Jun 9 03:20:33 CEST 2023
On 6/8/23 09:39, Xavier Drudis Ferran wrote:
> El Thu, Jun 08, 2023 at 12:05:18AM +0200, Marek Vasut deia:
>> On 6/5/23 17:20, Xavier Drudis Ferran wrote:
>>> Add a check to avoid dommed (by null pointer dereference) recursive
>>> call, not only for UCLASS_BLK.
>>>
>>> When booting my Rock Pi 4B+ with a USB mass storage stick plugged
>>> into one of the USB 2 ports (EHCI), when it is plugged before power
>>> on, when issuing a
>>>
>>> usb tree
>>>
>>> or
>>>
>>> usb info
>>
>> I cannot reproduce the problem. Do you perform any other interaction with
>> the USB stack, like e.g. 'usb start' or 'usb reset' before issuing the
>> aforementioned commands ?
>>
>
> No. Well, in some tests yes and some no, but I got the error in all cases.
This is doubtful. It is mandatory to run 'usb start' or 'usb reset'
before you would get any meaningful result out of 'usb info'. Without
either, you would get 'USB is stopped.' message. Could it be there are
some extra scripts in your environment that manipulate the USB ?
> Btw, I was testing on the next branch. I had those two commits on top
> https://patchwork.ozlabs.org/project/uboot/patch/202013db5a47ecbac4a53c360ed1ca91ca663996.1685974993.git.xdrudis@tinet.cat/
> https://patchwork.ozlabs.org/project/uboot/patch/464111fca83008503022e8ada5305e69ffd1afbd.1685974993.git.xdrudis@tinet.cat/
>
> And minor configuration changes (but I had bootstage active, might or might not be related)
>
> U-Boot was in a microSD card.
Can you test with stock U-Boot ?
Can you test with another USB stick, i.e. is the issue specific to this
USB stick ?
Is the issue specific to this partition layout of this USB stick, i.e.
if you clone (dd if=... of=...) the content of the USB stick to another
USB stick, does the error still occur.
[...]
> Model: Radxa ROCK Pi 4B
> Net: eth0: ethernet at fe300000
> Hit any key to stop autoboot: 2 1 0
> rockchip_pcie pcie at f8000000: PCIe link training gen1 timeout!
> Bus usb at fe380000: USB EHCI 1.00
> Bus usb at fe3c0000: USB EHCI 1.00
> Bus usb at fe800000: Register 2000140 NbrPorts 2
> Starting the controller
> USB XHCI 1.10
> Bus usb at fe900000: Register 2000140 NbrPorts 2
> Starting the controller
> USB XHCI 1.10
> scanning bus usb at fe380000 for devices... 1 USB Device(s) found
> scanning bus usb at fe3c0000 for devices... 2 USB Device(s) found
> scanning bus usb at fe800000 for devices... 1 USB Device(s) found
> scanning bus usb at fe900000 for devices... 1 USB Device(s) found
> rockchip_pcie pcie at f8000000: failed to find ep-gpios property
> ethernet at fe300000 Waiting for PHY auto negotiation to complete......... TIMEOUT !
> Could not initialize PHY ethernet at fe300000
> rockchip_pcie pcie at f8000000: failed to find ep-gpios property
> ethernet at fe300000 Waiting for PHY auto negotiation to complete......... TIMEOUT !
> Could not initialize PHY ethernet at fe300000
Is this some $preboot script which initializes your hardware ?
=> printenv preboot
> => usb tree
> USB device tree:
> 1 Hub (480 Mb/s, 0mA)
> u-boot EHCI Host Controller
>
> 1 Hub (480 Mb/s, 0mA)
> | u-boot EHCI Host Controller
> |
> uclass_id=64
> |+-2 Mass Storage (480 Mb/s, 200mA)
> TDK LoR TF10 07032B6B1D0ACB96
>
> uclass_id=22
> uclass_id=25
> "Synchronous Abort" handler, esr 0x96000010, far 0x948
> elr: 00000000002157d4 lr : 00000000002157d4 (reloc)
> elr: 00000000f3f4f7d4 lr : 00000000f3f4f7d4
Take the u-boot (unstripped elf) which matches this binary, and run
aarch64-...objdump -lSD on it, then search for the $lr value, see
doc/develop/crash_dumps.rst for details. That should tell you where
exactly the crash occurred. Where did it occur ?
> x0 : 0000000000000005 x1 : 0000000000000000
> x2 : 0000000000000020 x3 : 00000000ff1a0000
> x4 : 00000000ff1a0000 x5 : 0000000000000000
> x6 : 00000000f1f1c000 x7 : 0000000000000001
> x8 : 0000000000000001 x9 : 00000000ffffffd0
> x10: 0000000000000006 x11: 000000000001869f
> x12: 0000000000000200 x13: 0000000000000000
> x14: 00000000ffffffff x15: 00000000f1f1bbc9
> x16: 000000007e4f2113 x17: 000000009a11f13e
> x18: 00000000f1f31d80 x19: 0000000000000000
> x20: 00000000f1f1c110 x21: 0000000000000004
> x22: 00000000f1f1c114 x23: 00000000f1f65398
> x24: 00000000f1f653b8 x25: 0000000000000000
> x26: 0000000000000000 x27: 0000000000000000
> x28: 0000000000000000 x29: 00000000f1f1c000
>
> Code: aa0003f5 900003c0 9123b800 940191f5 (f944a660)
> Resetting CPU ...
>
> resetting ...
>
>
>>> command I get a "Synchronous Error" and a reset just after printing the
>>> mass storage device in the usb tree or usb info. It might depend on the
>>> contents of the USB stick too, I'm not sure.
>>>
>>> It seems like I have two devices as children of the mass storage
>>> device. When there's only a UCLASS_BLK it works fine, but when there's
>>> a UCLASS_BLK and a UCLASS_BOOTDEV, it recurses with a null udev as
>>> first parameter and fails.
>>>
>>> Likewise for usb_show_info().
>>>
>>> Not sure if this should be a patch, an RFC or a bug report. There may
>>> be a better way to solve this, I haven't researched commit
>>> 201417d700a2ab09 ("bootstd: Add the bootdev uclass") and bootdev flow
>>> properly, or thought about cases where udev is not null but the
>>> recursive call might need preventing too. Feel free to think it over
>>> before merging (or after).
>>
>> It is unclear what the real issue is, so until that is sorted out, no
>> merging will occur, sorry.
>>
>
> That's ok.
> I'm guessing the issue may be some mismatched assumption on what is
> under USB devices between the bootdev code and the cmd/usb.c code.
>
>
>>> But this at least fixes a reset at an
>>> innocent looking usb tree or usb info command. Maybe we can improve it
>>> later?
>>
>> NAK, please let's not add ad-hoc poorly understood changes into core code.
>
> Ok, sorry.
>
> I'll reply back if/when I get a clearer theory.
The NAK is really only to prevent this from accidentally going on.
Please see above, maybe it could be narrowed down ?
More information about the U-Boot
mailing list