[PATCH v1] usb: xhci: Check return value of wait for TRB_TRANSFER event
Minda Chen
minda.chen at starfivetech.com
Thu Oct 19 04:46:37 CEST 2023
On 2023/10/18 18:55, Marek Vasut wrote:
> On 10/18/23 12:16, Minda Chen wrote:
>>
>>
>> On 2023/10/18 18:11, Marek Vasut wrote:
>>> On 10/18/23 05:46, Minda Chen wrote:
>>>>
>>>>
>>>> On 2023/10/18 10:35, Marek Vasut wrote:
>>>>> On 10/18/23 03:22, Minda Chen wrote:
>>>>>>
>>>>>>
>>>>>> On 2023/10/17 19:20, Marek Vasut wrote:
>>>>>>> On 10/17/23 08:20, Minda Chen wrote:
>>>>>>>> xhci_wait_for_event() may return NULL while waiting for a
>>>>>>>> TRB_TRANSFER event. Check the return value to avoid a crash.
>>>>>>>>
>>>>>>>> Signed-off-by: Minda Chen <minda.chen at starfivetech.com>
>>>>>>>
>>>>>>> How did you trigger this error ? Is there a reproducer ? Details please ...
>>>>>>
>>>>>> While scanning a Lenovo USB 2.0 udisk. It does not reproduce 100% of the time.
>>>>>
>>>>> Can you include Linux
>>>>>
>>>>> lsusb -vvv
>>>>>
>>>>> output for this device and include that information in the commit message ? (or the U-Boot info below, that works too, just please add it into the commit message, it is important for future reference).
>>>>>
>>>> OK, I will add the Linux lsusb -vvv output for the udisk and the crash dump info to the commit message.
>>>
>>> Thank you
>>>
>>>>>> This is log.
>>>>>>
>>>>>> StarFive # usb reset
>>>>>> resetting USB...
>>>>>> Bus xhci_pci: Register 5000420 NbrPorts 5
>>>>>> Starting the controller
>>>>>> USB XHCI 1.00
>>>>>> scanning bus xhci_pci for devices... WARN halted endpoint, queueing URB anyway.
>>>>>> Unexpected XHCI event TRB, skipping... (f77141f0 00000000 13000000 02008401)
>>>>>> Unhandled exception: Load access fault
>>>>>> EPC: 00000000f7f563c6 RA: 00000000f7f563c6 TVAL: 000000000000000c
>>>>>> EPC: 000000004024a3c6 RA: 000000004024a3c6 reloc adjusted
>>>>>
>>>>> Where does the crash point to in code, can you disassemble the PC pointer ? (or maybe you can use scripts/decodecode I think)
>>>>>
>>>> OK, I will add the disassembly at the EPC pointer to the commit message.
>>>
>>> This part probably doesn't need to be in the commit message. I'd like to know where the crash occurred in the code.
>>
>>
>> 000000004024a376 <abort_td>:
>> {
>> 4024a376: 7179 addi sp,sp,-48
>> 4024a378: f406 sd ra,40(sp)
>> 4024a37a: f022 sd s0,32(sp)
>> 4024a37c: ec26 sd s1,24(sp)
>> 4024a37e: e84a sd s2,16(sp)
>> 4024a380: e44e sd s3,8(sp)
>> 4024a382: e052 sd s4,0(sp)
>> 4024a384: 89ae mv s3,a1
>> 4024a386: 84aa mv s1,a0
>> struct xhci_ctrl *ctrl = xhci_get_ctrl(udev);
>> 4024a388: 8c4fe0ef jal ra,4024844c <xhci_get_ctrl>
>> struct xhci_ring *ring = ctrl->devs[udev->slot_id]->eps[ep_index].ring;
>> 4024a38c: 6785 lui a5,0x1
>> 4024a38e: 94be add s1,s1,a5
>> 4024a390: 9444a603 lw a2,-1724(s1)
>> 4024a394: 00198713 addi a4,s3,1
>> 4024a398: 0712 slli a4,a4,0x4
>> 4024a39a: 02061793 slli a5,a2,0x20
>> 4024a39e: 9381 srli a5,a5,0x20
>> 4024a3a0: 07c9 addi a5,a5,18
>> 4024a3a2: 078e slli a5,a5,0x3
>> 4024a3a4: 97aa add a5,a5,a0
>> 4024a3a6: 679c ld a5,8(a5)
>> xhci_queue_command(ctrl, NULL, udev->slot_id, ep_index, TRB_STOP_RING);
>> 4024a3a8: 2981 sext.w s3,s3
>> 4024a3aa: 86ce mv a3,s3
>> struct xhci_ring *ring = ctrl->devs[udev->slot_id]->eps[ep_index].ring;
>> 4024a3ac: 97ba add a5,a5,a4
>> xhci_queue_command(ctrl, NULL, udev->slot_id, ep_index, TRB_STOP_RING);
>> 4024a3ae: 4581 li a1,0
>> 4024a3b0: 473d li a4,15
>> struct xhci_ring *ring = ctrl->devs[udev->slot_id]->eps[ep_index].ring;
>> 4024a3b2: 0087ba03 ld s4,8(a5) # 1008 <_start-0x401feff8>
>> struct xhci_ctrl *ctrl = xhci_get_ctrl(udev);
>> 4024a3b6: 842a mv s0,a0
>> xhci_queue_command(ctrl, NULL, udev->slot_id, ep_index, TRB_STOP_RING);
>> 4024a3b8: d75ff0ef jal ra,4024a12c <xhci_queue_command>
>> event = xhci_wait_for_event(ctrl, TRB_TRANSFER);
>> 4024a3bc: 02000593 li a1,32
>> 4024a3c0: 8522 mv a0,s0
>> 4024a3c2: ebdff0ef jal ra,4024a27e <xhci_wait_for_event>
>> field = le32_to_cpu(event->trans_event.flags);
>> epc-> 4024a3c6: 455c lw a5,12(a0)
>
> So the fault occurs when reading the controller register(s), do I understand it right ?
>
I think that is right. Actually, this error occurs in the error path: the control transfer hits a TRB_TRANSFER error and jumps to the error path,
which then waits for a TRB_TRANSFER event again.
> Could it be the problem is rather some clock, which are turned off after a fault ?
I don't think so. Only this udisk reproduces the issue.
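
For reference, here is a minimal sketch of the kind of check the patch title describes. It is not the actual patch and not the U-Boot driver code: the struct layout, the stub, and every sketch_* name are hypothetical and exist only for illustration. It assumes a transfer event laid out as a 64-bit buffer followed by two 32-bit words, which puts flags at byte offset 12. That would match the "lw a5,12(a0)" at the EPC and TVAL 0xc in the log if a0 held NULL, but that reading is an inference from the dump, not something verified on hardware.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a transfer event: 64-bit buffer, two 32-bit words. */
struct sketch_transfer_event {
	uint64_t buffer;
	uint32_t transfer_len;
	uint32_t flags;
};

/* With this layout, flags sits 12 bytes into the event. */
static_assert(offsetof(struct sketch_transfer_event, flags) == 12,
	      "flags expected at byte offset 12");

/* Hypothetical stub for the wait helper: returns NULL when no event arrives. */
static struct sketch_transfer_event *sketch_wait_for_event(int timed_out)
{
	static struct sketch_transfer_event ev = { 0, 0, 0x8000 };

	return timed_out ? NULL : &ev;
}

/* Read the event flags only after confirming an event was returned. */
static int sketch_abort_td(int timed_out, uint32_t *flags_out)
{
	struct sketch_transfer_event *event = sketch_wait_for_event(timed_out);

	if (!event)
		return -ETIMEDOUT;	/* assumed error code, chosen for the sketch */

	*flags_out = event->flags;
	return 0;
}
```

Calling sketch_abort_td(1, &flags) returns an error instead of dereferencing a NULL event, which is the behaviour the check is meant to provide.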