[U-Boot] [PATCH v2 2/5] ehci-hcd: Boost transfer speed

Stefan Herbrechtsmeier stefan at herbrechtsmeier.net
Mon Jul 23 15:35:25 CEST 2012


On 20.07.2012 17:35, Benoît Thébaudeau wrote:
> On Friday 20 July 2012 17:15:13 Stefan Herbrechtsmeier wrote:
>> On 20.07.2012 17:03, Benoît Thébaudeau wrote:
>>> On Friday 20 July 2012 16:51:33 Stefan Herbrechtsmeier wrote:
>>>> On 20.07.2012 15:56, Benoît Thébaudeau wrote:
>>>>> Dear Marek Vasut,
>>>>>
>>>>> On Friday 20 July 2012 15:44:01 Marek Vasut wrote:
>>>>>>> On Friday 20 July 2012 13:37:37 Stefan Herbrechtsmeier wrote:
>>>>>>>> On 20.07.2012 13:26, Benoît Thébaudeau wrote:
>>>>>>>>> +			int xfr_bytes = min(left_length,
>>>>>>>>> +					    (QT_BUFFER_CNT * 4096 -
>>>>>>>>> +					     ((uint32_t)buf_ptr & 4095)) &
>>>>>>>>> +					    ~4095);
>>>>>>>> Why do you align the length to 4096?
>>>>>>> It's to guarantee that each transfer length is a multiple of the
>>>>>>> max packet length. Otherwise, early short packets are issued,
>>>>>>> which breaks the transfer and results in time-out error messages.
>>>>>> Early short packets? What do you mean?
>>>>> During a USB transfer, all packets must have a length of max packet
>>>>> length for the pipe/endpoint, except the final one that can be a
>>>>> short packet. Without the alignment I make for xfr_bytes, short
>>>>> packets can occur within a transfer, because the hardware starts a
>>>>> new packet for each new queued qTD it handles.
>>>> But if I am right, the max packet length is 512 for bulk and 1024 for
>>>> Interrupt transfer.
>>> There are indeed different max packet lengths for different transfer
>>> types, but it does not matter since the chosen alignment guarantees a
>>> multiple of all these possible max packet lengths.
>> But thereby you limit the transfer to 4 qT buffers for unaligned
>> transfers.
> Not exactly. The 5 qt_buffers are used for page-unaligned buffers, but that
> results in only 4 full pages of unaligned data, requiring 5 aligned pages.
Sorry, I meant 4 full pages of unaligned data.
>
> For page-aligned buffers, the 5 qt_buffers result in 5 full pages of aligned
> data.
Sure.
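
The two cases above can be checked with a minimal sketch in C. It assumes
only what the patch hunk shows (QT_BUFFER_CNT buffers of 4096 bytes each,
truncation with ~4095). The helper name qtd_max_xfr_bytes and the use of a
plain integer as the buffer address are hypothetical, for illustration only.

```c
#include <stdint.h>

#define QT_BUFFER_CNT 5	/* 5 qt_buffers per qTD, as stated in this thread */

/*
 * Hypothetical helper: largest payload one qTD can carry when its first
 * buffer starts at addr and the length is truncated to whole pages, as
 * in the patch hunk quoted above.
 */
static uint32_t qtd_max_xfr_bytes(uint32_t addr)
{
	/* Room left in the 5 pages, counted from the offset inside the first page. */
	uint32_t room = QT_BUFFER_CNT * 4096u - (addr & 4095u);

	/* Round down to a multiple of 4096. */
	return room & ~4095u;
}
```

For a page-aligned address this gives 5 * 4096 = 20480 bytes. For an address
512 bytes into a page it gives 4 * 4096 = 16384 bytes.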
>
> The unaligned case could be a little bit improved to always use as many packets
> as possible per qTD, but that would over-complicate things for a very negligible
> speed and memory gain.
In my use case (fragmented file on USB storage) the gain would be
nearly 20%. The reason is that the data is block-aligned (512 bytes) and
could be aligned to 4096 bytes by the first transfer (5 qt_buffers).

My suggestion would be to truncate xfr_bytes to a multiple of the maximum
wMaxPacketSize (1024) and to compute qtd_count as follows:

if ((uint32_t)buffer & 1023)    /* wMaxPacketSize unaligned */
     qtd_count += DIV_ROUND_UP(((uint32_t)buffer & 4095) +
             length, (QT_BUFFER_CNT - 1) * 4096);
else                /* wMaxPacketSize aligned */
     qtd_count += DIV_ROUND_UP(((uint32_t)buffer & 4095) +
             length, QT_BUFFER_CNT * 4096);

This allows 50% of the unaligned block data (512-byte blocks) to be
transferred with the minimum number of qTDs.
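
To compare both truncation schemes, here is a rough sketch in C that simply
walks a buffer and counts the qTDs needed. It is not the driver code: the
names count_qtds, qtd_xfr_bytes and EHCI_PAGE_SIZE are hypothetical, the
buffer address is a plain integer, and the only assumptions are 5 qt_buffers
of 4096 bytes per qTD and a power-of-two truncation value.

```c
#include <stdint.h>

#define QT_BUFFER_CNT	5	/* qt_buffers per qTD, as discussed above */
#define EHCI_PAGE_SIZE	4096u	/* hypothetical name for the 4096-byte page */

/*
 * Bytes carried by one qTD starting at addr. The room left in the 5
 * pages is truncated down to a multiple of align (a power of two);
 * only the last qTD of a transfer may carry fewer bytes than that.
 */
static uint32_t qtd_xfr_bytes(uint32_t addr, uint32_t left, uint32_t align)
{
	uint32_t room = QT_BUFFER_CNT * EHCI_PAGE_SIZE -
			(addr & (EHCI_PAGE_SIZE - 1));
	uint32_t chunk = room & ~(align - 1);

	return left < chunk ? left : chunk;
}

/* Number of qTDs needed to transfer length bytes starting at addr. */
static unsigned int count_qtds(uint32_t addr, uint32_t length, uint32_t align)
{
	unsigned int qtd_count = 0;

	while (length > 0) {
		uint32_t xfr_bytes = qtd_xfr_bytes(addr, length, align);

		addr += xfr_bytes;
		length -= xfr_bytes;
		qtd_count++;
	}

	return qtd_count;
}
```

For example, with a buffer starting 1024 bytes into a page and a length of
80896 bytes, count_qtds(..., 4096) needs 5 qTDs, while count_qtds(..., 1024)
needs 4, because the first qTD ends on a page boundary and the following
ones use all 5 pages. With a buffer starting 512 bytes into a page, both
variants need 5 qTDs for the same length.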


