[PATCH] usb: ehci: Fix "EHCI timed out on TD - token=XXXX" error on ehci-hcd

Lukasz Majewski lukma at denx.de
Sun Mar 1 18:59:53 CET 2020


Hi Marek,

> On 3/1/20 6:19 PM, Lukasz Majewski wrote:
> > Hi Marek,  
> 
> Hi,
> 
> >> On 2/26/20 12:29 PM, Lukasz Majewski wrote:  
> >>> This patch aims to improve robustness of 'usb' command operation
> >>> on the ehci-hcd IP block as it ports to contemporary U-Boot the
> >>> patch described and provided in [1] (originally applicable to
> >>> U-Boot 2016.05).
> >>>
> >>> According to the fix author - "rayvt" (from [1]):    
> >>
> >> [...]
> >>  
> >>> diff --git a/common/usb_storage.c b/common/usb_storage.c
> >>> index 097b6729c1..48c8c2ae64 100644
> >>> --- a/common/usb_storage.c
> >>> +++ b/common/usb_storage.c
> >>> @@ -111,6 +111,18 @@ int usb_stor_get_info(struct usb_device *dev,
> >>> struct us_data *us, struct blk_desc *dev_desc);
> >>>  int usb_storage_probe(struct usb_device *dev, unsigned int ifnum,
> >>>  		      struct us_data *ss);
> >>> +
> >>> +#ifdef CONFIG_USB_EHCI_HCD
> >>> +	/*
> >>> +	 * The U-Boot EHCI driver can handle any transfer length
> >>> as long as
> >>> +	 * there is enough free heap space left, but the SCSI
> >>> READ(10) and
> >>> +	 * WRITE(10) commands are limited to 65535 blocks.
> >>> +	 */
> >>> +int usb_max_xfer_blk = 4096;
> >>> +#else
> >>> +int usb_max_xfer_blk = 20;
> >>> +#endif    
> >>
> >> This all looks horribly wrong and exactly what
> >> 7d6fd7f0ba71cd93d94079132f958d9630f27a89 and
> >> 02b0e1a36c5bc20174299312556ec4e266872bd680 fixed properly.
> >>
> >> All those "dynamic reduction of transfer size" attempts are
> >> nonsensical, the real solution (sadly) is to reduce the transfer
> >> size to cater for the most limited devices and profile/fix the
> >> remaining delays in the USB stack (which should have already been
> >> done, see the commits above). This is also what the Linux USB
> >> stack does.
> >>
> >> What is the problem you are trying to solve here ?  
> > 
> > As it was stated in the commit message:
> > 
> > On TPC70 (i.MX6Q) once per ~10 times (without this patch):
> > 
> > Bus usb at 2184200: USB EHCI 1.00
> > scanning bus usb at 2184200 for devices... 2 USB Device(s) found
> >        scanning usb for storage devices... 1 Storage Device(s) found
> > EHCI timed out on TD - token=0x1f8c80
> > 
> > 
> > The problem is that even with the mentioned fix in:
> > 7d6fd7f0ba71cd93d94079132f958d9630f27a89 from time to time I do see
> > the:
> > 
> > EHCI timed out on TD - token=0x1f8c80
> > 
> > error. The same issue was already reported by Tom.  
> 
> OK. Why do you see the error ? Can you dig in further ?

It is all explained in the link:
https://forum.doozan.com/read.php?3,35295,35295#msg-35295

> 
> > The description of the fix written by the original author:
> > https://forum.doozan.com/read.php?3,35295,35295#msg-35295
> > 
> > states that the reduction of the transfer is done just for "spin
> > up"/"detection" of the pen drive. The issue is that not all pen
> > drive disks are able to be properly detected in the first place.  
> 
> But the commit message claims this patch reduces the transfer rate by
> 20% always. So that's a NAK right away, sorry, 

Marek, now we do have from time to time NOT functional USB (once per
ten attempts on iMX6Q - tpc70 board). If somebody relies on 'usb start'
for recovery/normal operation - it is bad.

In such a use case - the 20% performance drop is acceptable.

> you need to find a
> better solution which doesn't have such an enormous impact.
> 
> I'm fine with the spin-up handling, but this auto-detection of
> transfer size should be pulled into separate patch and is likely
> wrong anyway.

I cannot agree that is it "likely wrong anyway" - please correct me if
I'm wrong, but up till now we were fiddling with transfer size up till
having "good enough" results.

The above link describes what needs to be done in case of the "EHCI
timed out on TD - token=0x1f8c80" error.

Do we handle it now? In my opinion we don't.

> 
> So try to separate this into two patches, one for handling the spin
> up, another one for this auto-detection of the transfer size. And I
> think you will end up needed only the first one.

This is an open question if only the spin up fixes the error. It would
be great if you could look for the changes introduced in this patch and
compare it with your experience of ehci driver development.

> 
> > Last but not least - the mainline is still broken. The
> > "token=0x1f8c80" errors pops up from time to time. However, after
> > applying the approach from this fix - the error is gone (and pass
> > some acceptance tests).  
> 
> It works for me.

Also works for Tom.

> Do you have a specific device which fails ?

Yes. On my desk - tpc70 with i.MX6Q. And hdc with i.MX53. I can confirm
that from U-Boot 2018.04 up till now the rate of:

"EHCI timed out on TD - token=0x1f8c80" error has been reduced
significantly for their use case (reading ~16MiB of binary from several
pen drives). 

Unfortunately the errors are still present from time to time.

> Which
> device and on which hardware ?
> 
> Also, what is "some acceptance test" ?
> 

The set of automated tests to assess if "usb start" don't break.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma at denx.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20200301/e180647c/attachment.sig>


More information about the U-Boot mailing list