[PATCH] usb: xhci: fix event trb handling missed

Bin Meng bmeng.cn at gmail.com
Tue Nov 10 10:02:40 CET 2020


Hi Ran,

On Tue, Nov 10, 2020 at 4:31 PM Ran Wang <ran.wang_1 at nxp.com> wrote:
>
> Hi Bin,
>
> On Tuesday, November 10, 2020 4:06 PM Ran Wang wrote:
> > Hi Bin,
> >
> > On Tuesday, November 10, 2020 3:47 PM Bin Meng wrote:
> > >
> > > Hi Ran,
> > <snip>
> > > > > > > send request in more than 1 Transfer TRB by chaining them, but
> > > > > > > then handle only 1 event TRB to mark request completed.
> > > > > > >
> > > > > > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we
> > > > > > > observe xhci controller will generated more than 1 event TRB
> > > > > > > sometimes, this cause that
> > > > > >
> > > > > > I am not sure if it's fair to say this is Layerscape specific
> > > > > > behavior. Based on the xHCI spec, the spec indicates 1 event trb
> > > > > > should be generated when IOC/ISP flag is set to 1 or an error occurs.
> > > > >
> > > > > Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we have:
> > > > >
> > > > >         /* Only set interrupt on short packet for IN endpoints */
> > > > >         if (usb_pipein(pipe))
> > > > >             field |= TRB_ISP;
> > > > >
> > > > > Can you verify that if removing the above codes, and without your
> > > > > changes in this patch, the original issue can be resolved on LS1028?
> > > >
> > > > Bingo, removing above codes can resolve my issue, too
> > >
> > > Thank you for your testing.
> > >
> > > >
> > > > > > I will see if I can reproduce your issue on an x86 board.
> > > > > >
> > > > >
> > > > > Note this patch does not apply on top of u-boot/master. Please rebase.
> > > >
> > > > Sure, I can rebase my patch, but I am nor sure my solution is still worth:
> > > > xHCI says "The detection of a USB Short Packet (i.e. the actual
> > > > number of bytes received was less than the expected number of bytes
> > > > defined by the Transfer TRB) during a transfer does not necessarily generate
> > an Event.
> > > "
> > > >
> > >
> > > Yes, that's what the spec says. So in your case, can you add some logs
> > > to verify that there is a transfer event trb generated by the xHC and
> > > the completion code is 13 (short packet)? You can add some debug
> > > output in record_transfer_result().
> >
> > Sure, let me try this can get back to you later.
>
> Confirm completion code is 13 when issue happen (please search ' complete_code:13'):

Thank you very much for your testing.

>
> dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80, length=2048----------------------------buffer would cross 64KB boundary, so we will send request in more than 1 TRB, this is the key issue trigger condition
> xhci_bulk_tx()#0.1.running_total:0x280
> xhci_bulk_tx()#0.2.trb_buff_len:0x280
> xhci_bulk_tx()#0.3.running_total:0x280
> xhci_bulk_tx()#0.4.num_trbs:0x2--------------------------2 Transfer TRB
> xhci_bulk_tx()#0.5.running_total:0x10280
> xhci_bulk_tx()#0.start_trb:0x00000000fbb47140
> ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47140----------Assemble 1st Transfer TRB
> xhci_bulk_tx()#0.addr:0xfbb4fd80
> xhci_bulk_tx()#0.trb_buff_len:0x280
> xhci_bulk_tx()#0.running_total:0x280
> xhci_bulk_tx()#0.length:0x800
> xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
> ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47150----------Assemble 2nd Transfer TRB
> xhci_bulk_tx()#0.addr:0xfbb50000
> xhci_bulk_tx()#0.trb_buff_len:0x580
> xhci_bulk_tx()#0.running_total:0x800
> xhci_bulk_tx()#0.length:0x800
> xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
> xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb47140----------------handle event TRB for 1st Transfer TRB
> xhci_bulk_tx()#0.event->trans_event.transfer_len:0xd000180
> xhci_bulk_tx()#0.event->trans_event.flags:0x1058001
> xhci_bulk_tx()#0.event->len:384
> xhci_bulk_tx()#0.event->complete_code:13
> xhci_bulk_tx()#1.field:0x1058001
> xhci_bulk_tx()#2.TRB_TO_EP_ENDEX(field):0x4
> xhci_bulk_tx()#3.ep_index:0x4----------------------------------------------------------xhci_bulk_tx() return
>
> dev=00000000fbb4f3c0, pipe=c0018203, buffer=00000000fbb2f9c0, length=110----------an other call comming, and for different EP (ep_index = 05)
> xhci_bulk_tx()#0.1.running_total:0x640
> xhci_bulk_tx()#0.2.trb_buff_len:0x640
> xhci_bulk_tx()#0.3.running_total:0x640
> xhci_bulk_tx()#0.4.num_trbs:0x1
> xhci_bulk_tx()#0.5.running_total:0x640
> xhci_bulk_tx()#0.start_trb:0x00000000fbb47610
> ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47610---------queue 1st transfer TRB
> xhci_bulk_tx()#0.addr:0xfbb2f9c0
> xhci_bulk_tx()#0.trb_buff_len:0x6e
> xhci_bulk_tx()#0.running_total:0x6e
> xhci_bulk_tx()#0.length:0x6e
> xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
> xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb47150------------hand event TRB, note that this buffer is for last Transfer TRB in last call (see above '2nd Transfer TRB')
> xhci_bulk_tx()#0.event->trans_event.transfer_len:0x1000000
> xhci_bulk_tx()#0.event->trans_event.flags:0x1058000
> xhci_bulk_tx()#0.event->len:0
> xhci_bulk_tx()#0.event->complete_code:1
> xhci_bulk_tx()#1.field:0x1058000
> xhci_bulk_tx()#2.TRB_TO_EP_ENDEX(field):0x4----------------ep_index is 4 rather than 5, then cause BUG()
> xhci_bulk_tx()#3.ep_index:0x5-------------------------------------
> ep_index mis-match wait for event again!!!!!!!!!----------------------------------
> BUG at drivers/usb/host/xhci-ring.c:759/xhci_bulk_tx()!
> BUG!
> ### ERROR ### Please RESET the board ###
>

Based on your log, it matches what I suspected and we are getting
close to the root cause.

Could you please try the following simple test case to see if it can
trigger the issue with a USB disk?
After U-Boot boot on LS1028, with a USB flash disk attached, then type
"usb start" to see if the USB disk can be recognized by U-Boot?

commit 60e68ed667362870c20b36ae26dacc1af903e81e
Author: Bin Meng <bmeng.cn at gmail.com>
Date:   Tue Nov 10 16:19:06 2020 +0800

    WIP: usb: A simple test case to trigger the TRB 64K boundary issue
with mass storage device

    This should not be applied as it aims to provide a simple test case to
    trigger the TRB 64K boundary issue as reported by Ran Wang @ NXP.

    Signed-off-by: Bin Meng <bmeng.cn at gmail.com>

diff --git a/common/usb_storage.c b/common/usb_storage.c
index ff25441..c8aec2e 100644
--- a/common/usb_storage.c
+++ b/common/usb_storage.c
@@ -710,7 +710,15 @@ static int usb_stor_BBB_transport(struct scsi_cmd
*srb, struct us_data *us)
        int dir_in;
        int actlen, data_actlen;
        unsigned int pipe, pipein, pipeout;
+#if 0
        ALLOC_CACHE_ALIGN_BUFFER(struct umass_bbb_csw, csw, 1);
+#else
+       struct umass_bbb_csw *csw_org, *csw;
+       csw_org = malloc(0x10000 + UMASS_BBB_CSW_SIZE);
+       csw = (struct umass_bbb_csw *)roundup((ulong)csw_org, 0x10000);
+       csw = (struct umass_bbb_csw *)((ulong)csw - 2);
+       printf("csw org %p, adjusted to %p\n", csw_org, csw);
+#endif
 #ifdef BBB_XPORT_TRACE
        unsigned char *ptr;
        int index;
@@ -824,6 +832,7 @@ again:
                return USB_STOR_TRANSPORT_FAILED;
        }

+       free(csw_org);
        return result;
 }


Regards,
Bin


More information about the U-Boot mailing list