[PATCH v1] usb: xhci: Check return value of wait for TRB_TRANSFER event

Wed Oct 18 12:55:37 CEST 2023

On 10/18/23 12:16, Minda Chen wrote:
> 
> 
> On 2023/10/18 18:11, Marek Vasut wrote:
>> On 10/18/23 05:46, Minda Chen wrote:
>>>
>>>
>>> On 2023/10/18 10:35, Marek Vasut wrote:
>>>> On 10/18/23 03:22, Minda Chen wrote:
>>>>>
>>>>>
>>>>> On 2023/10/17 19:20, Marek Vasut wrote:
>>>>>> On 10/17/23 08:20, Minda Chen wrote:
>>>>>>> xhci_wait_for_event() may return NULL while waiting for a
>>>>>>> TRB_TRANSFER event. Check the return value to avoid a crash.
>>>>>>>
>>>>>>> Signed-off-by: Minda Chen <minda.chen at starfivetech.com>
>>>>>>
>>>>>> How did you trigger this error ? Is there a reproducer ? Details please ...
>>>>>
>>>>> While scanning a Lenovo USB 2.0 udisk; it does not reproduce 100% of the time.
>>>>
>>>> Can you include Linux
>>>>
>>>> lsusb -vvv
>>>>
>>>> output for this device and include that information in the commit message ? (or the U-Boot info below, that works too, just please add it into the commit message, it is important for future reference).
>>>>
>>> OK, I will add the Linux lsusb -vvv output for the udisk and the crash dump info to the commit message.
>>
>> Thank you
>>
>>>>> This is log.
>>>>>
>>>>> StarFive # usb reset
>>>>> resetting USB...
>>>>> Bus xhci_pci: Register 5000420 NbrPorts 5
>>>>> Starting the controller
>>>>> USB XHCI 1.00
>>>>> scanning bus xhci_pci for devices... WARN halted endpoint, queueing URB anyway.
>>>>> Unexpected XHCI event TRB, skipping... (f77141f0 00000000 13000000 02008401)
>>>>> Unhandled exception: Load access fault
>>>>> EPC: 00000000f7f563c6 RA: 00000000f7f563c6 TVAL: 000000000000000c
>>>>> EPC: 000000004024a3c6 RA: 000000004024a3c6 reloc adjusted
>>>>
>>>> Where does the crash point to in code, can you disassemble the PC pointer ? (or maybe you can use scripts/decodecode I think)
>>>>
>>> OK, I will add the disassembly at the EPC pointer to the commit message.
>>
>> This part probably doesn't need to be in the commit message. I'd like to know where the crash occurred in the code.
> 
> 
> 000000004024a376 <abort_td>:
> {
>      4024a376:   7179                    addi    sp,sp,-48
>      4024a378:   f406                    sd      ra,40(sp)
>      4024a37a:   f022                    sd      s0,32(sp)
>      4024a37c:   ec26                    sd      s1,24(sp)
>      4024a37e:   e84a                    sd      s2,16(sp)
>      4024a380:   e44e                    sd      s3,8(sp)
>      4024a382:   e052                    sd      s4,0(sp)
>      4024a384:   89ae                    mv      s3,a1
>      4024a386:   84aa                    mv      s1,a0
>          struct xhci_ctrl *ctrl = xhci_get_ctrl(udev);
>      4024a388:   8c4fe0ef                jal     ra,4024844c <xhci_get_ctrl>
>          struct xhci_ring *ring =  ctrl->devs[udev->slot_id]->eps[ep_index].ring;
>      4024a38c:   6785                    lui     a5,0x1
>      4024a38e:   94be                    add     s1,s1,a5
>      4024a390:   9444a603                lw      a2,-1724(s1)
>      4024a394:   00198713                addi    a4,s3,1
>      4024a398:   0712                    slli    a4,a4,0x4
>      4024a39a:   02061793                slli    a5,a2,0x20
>      4024a39e:   9381                    srli    a5,a5,0x20
>      4024a3a0:   07c9                    addi    a5,a5,18
>      4024a3a2:   078e                    slli    a5,a5,0x3
>      4024a3a4:   97aa                    add     a5,a5,a0
>      4024a3a6:   679c                    ld      a5,8(a5)
>          xhci_queue_command(ctrl, NULL, udev->slot_id, ep_index, TRB_STOP_RING);
>      4024a3a8:   2981                    sext.w  s3,s3
>      4024a3aa:   86ce                    mv      a3,s3
>          struct xhci_ring *ring =  ctrl->devs[udev->slot_id]->eps[ep_index].ring;
>      4024a3ac:   97ba                    add     a5,a5,a4
>          xhci_queue_command(ctrl, NULL, udev->slot_id, ep_index, TRB_STOP_RING);
>      4024a3ae:   4581                    li      a1,0
>      4024a3b0:   473d                    li      a4,15
>          struct xhci_ring *ring =  ctrl->devs[udev->slot_id]->eps[ep_index].ring;
>      4024a3b2:   0087ba03                ld      s4,8(a5) # 1008 <_start-0x401feff8>
>          struct xhci_ctrl *ctrl = xhci_get_ctrl(udev);
>      4024a3b6:   842a                    mv      s0,a0
>          xhci_queue_command(ctrl, NULL, udev->slot_id, ep_index, TRB_STOP_RING);
>      4024a3b8:   d75ff0ef                jal     ra,4024a12c <xhci_queue_command>
>          event = xhci_wait_for_event(ctrl, TRB_TRANSFER);
>      4024a3bc:   02000593                li      a1,32
>      4024a3c0:   8522                    mv      a0,s0
>      4024a3c2:   ebdff0ef                jal     ra,4024a27e <xhci_wait_for_event>
>          field = le32_to_cpu(event->trans_event.flags);
> epc-> 4024a3c6:   455c                    lw      a5,12(a0)

So the fault occurs when reading the controller register(s), do I 
understand it right ?

Could the problem rather be some clock that is turned off after 
a fault ?
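
For reference, the faulting instruction is lw a5,12(a0) and TVAL is 0xc. That would be consistent with reading event->trans_event.flags through a NULL event pointer, provided flags sits at byte offset 12 of the event. Below is a minimal sketch of that layout and of the check the patch describes. The sketch_ names, the struct layout and the -1 return value are assumptions for illustration, not the U-Boot code, and the le32_to_cpu() conversion is left out:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a transfer event: a 64-bit buffer pointer,
 * a 32-bit length word and a 32-bit flags word. Not copied from the
 * U-Boot headers.
 */
struct sketch_transfer_event {
	uint64_t buffer;
	uint32_t transfer_len;
	uint32_t flags;
};

/* With this layout, flags is read with a 12-byte offset from the pointer. */
static_assert(offsetof(struct sketch_transfer_event, flags) == 12,
	      "flags is expected at byte offset 12");

/*
 * Hypothetical stand-in for xhci_wait_for_event(): hands back the pending
 * event, or NULL when nothing arrived (for example on timeout).
 */
static struct sketch_transfer_event *
sketch_wait_for_event(struct sketch_transfer_event *pending)
{
	return pending;
}

/*
 * Guarded read: returns 0 and stores the flags word on success, or -1
 * (an assumed error value) when no event was returned.
 */
static int sketch_read_event_flags(struct sketch_transfer_event *pending,
				   uint32_t *flags)
{
	struct sketch_transfer_event *event = sketch_wait_for_event(pending);

	if (!event)
		return -1;

	*flags = event->flags;
	return 0;
}
```

Without the NULL check, the read of event->flags in the sketch would be a load from address 12 whenever the wait returns NULL.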