[U-Boot] [PATCH] fastboot: Fix OUT transaction length alignment

Steve Rae steve.rae at broadcom.com
Mon Apr 18 17:43:39 CEST 2016


On Mon, Apr 18, 2016 at 6:55 AM, Roger Quadros <rogerq at ti.com> wrote:
> On 18/04/16 16:47, Lukasz Majewski wrote:
>> Hi Roger, Steve,
>>
>>> +Lukazs
>>>
>>> On 18/04/16 10:56, Roger Quadros wrote:
>>>> On 15/04/16 22:44, Steve Rae wrote:
>>>>>
>>>>>
>>>>> On Thu, Apr 14, 2016 at 3:18 AM, Roger Quadros <rogerq at ti.com
>>>>> <mailto:rogerq at ti.com>> wrote:
>>>>>
>>>>>     Hi,
>>>>>
>>>>>     On 13/04/16 19:56, Sam Protsenko wrote:
>>>>>     > On Wed, Apr 13, 2016 at 3:32 PM, Roger Quadros
>>>>>     > <rogerq at ti.com <mailto:rogerq at ti.com>> wrote:
>>>>>     >> Hi,
>>>>>     >>
>>>>>     >> On 13/04/16 15:01, Semen Protsenko wrote:
>>>>>     >>> From: Sam Protsenko <semen.protsenko at linaro.org
>>>>>     >>> <mailto:semen.protsenko at linaro.org>>
>>>>>     >>>
>>>>>     >>> Some UDC controllers may require buffer size to be aligned
>>>>>     >>> to wMaxPacketSize. It's indicated by
>>>>>     >>> gadget->quirk_ep_out_aligned_size field being set to
>>>>>     >>> "true" (in UDC driver code). In that case
>>>>>     >>> rx_bytes_expected must be aligned to wMaxPacket size,
>>>>>     >>> otherwise stuck on transaction will happen. For example,
>>>>>     >>> it's required by DWC3 controller data manual:
>>>>>     >>>
>>>>>     >>>     section 8.2.3.3 Buffer Size Rules and Zero-Length
>>>>>     >>> Packets:
>>>>>     >>>
>>>>>     >>>     For OUT endpoints, the following rules apply:
>>>>>     >>>     - The BUFSIZ field must be ≥ 1 byte.
>>>>>     >>>     - The total size of a Buffer Descriptor must be a
>>>>>     >>> multiple of MaxPacketSize
>>>>>     >>>     - A received zero-length packet still requires a
>>>>>     >>> MaxPacketSize buffer. Therefore, if the expected amount of
>>>>>     >>> data to be received is a multiple of MaxPacketSize,
>>>>>     >>> software should add MaxPacketSize bytes to the buffer to
>>>>>     >>> sink a possible zero-length packet at the end of the
>>>>>     >>> transfer.
>>>>>     >>>
>>>>>     >>> But other UDC controllers don't need such alignment, so
>>>>>     >>> mentioned field is set to "false". If buffer size is
>>>>>     >>> aligned to wMaxPacketSize, those controllers may stuck on
>>>>>     >>> transaction. The example is DWC2.
>>>>>     >>>
>>>>>     >>> This patch checks gadget->quirk_ep_out_aligned_size field
>>>>>     >>> and aligns rx_bytes_expected to wMaxPacketSize only when
>>>>>     >>> it's needed.
>>>>>     >>>
>>>>>     >>> Signed-off-by: Sam Protsenko <semen.protsenko at linaro.org
>>>>>     >>> <mailto:semen.protsenko at linaro.org>> ---
>>>>>     >>>  drivers/usb/gadget/f_fastboot.c | 9 +++++++++
>>>>>     >>>  1 file changed, 9 insertions(+)
>>>>>     >>>
>>>>>     >>> diff --git a/drivers/usb/gadget/f_fastboot.c
>>>>>     >>> b/drivers/usb/gadget/f_fastboot.c index 2e87fee..54dcce0
>>>>>     >>> 100644 --- a/drivers/usb/gadget/f_fastboot.c
>>>>>     >>> +++ b/drivers/usb/gadget/f_fastboot.c
>>>>>     >>> @@ -58,6 +58,7 @@ static unsigned int
>>>>>     >>> fastboot_flash_session_id; static unsigned int
>>>>>     >>> download_size; static unsigned int download_bytes;
>>>>>     >>>  static bool is_high_speed;
>>>>>     >>> +static bool quirk_ep_out_aligned_size;
>>>>>     >>>
>>>>>     >>>  static struct usb_endpoint_descriptor fs_ep_in = {
>>>>>     >>>       .bLength            = USB_DT_ENDPOINT_SIZE,
>>>>>     >>> @@ -240,6 +241,8 @@ static int fastboot_set_alt(struct
>>>>>     >>> usb_function *f, debug("%s: func: %s intf: %d alt: %d\n",
>>>>>     >>>             __func__, f->name, interface, alt);
>>>>>     >>>
>>>>>     >>> +     quirk_ep_out_aligned_size =
>>>>>     >>> gadget->quirk_ep_out_aligned_size; +
>>>>>     >>>       /* make sure we don't enable the ep twice */
>>>>>     >>>       if (gadget->speed == USB_SPEED_HIGH) {
>>>>>     >>>               ret = usb_ep_enable(f_fb->out_ep,
>>>>>     >>> &hs_ep_out); @@ -435,12 +438,18 @@ static unsigned int
>>>>>     >>> rx_bytes_expected(unsigned int maxpacket) return 0;
>>>>>     >>>       if (rx_remain > EP_BUFFER_SIZE)
>>>>>     >>>               return EP_BUFFER_SIZE;
>>>>>     >>> +
>>>>>     >>> +     if (!quirk_ep_out_aligned_size)
>>>>>     >>> +             goto out;
>>>>>     >>> +
>>>>>     >>>       if (rx_remain < maxpacket) {
>>>>>     >>>               rx_remain = maxpacket;
>>>>>     >>>       } else if (rx_remain % maxpacket != 0) {
>>>>>     >>>               rem = rx_remain % maxpacket;
>>>>>     >>>               rx_remain = rx_remain + (maxpacket - rem);
>>>>>     >>>       }
>>>>>     >>> +
>>>>>     >>> +out:
>>>>>     >>>       return rx_remain;
>>>>>     >>>  }
>>>>>     >>>
>>>>>     >>>
>>>>>     >>
>>>>>     >> Why do we need a special flag for this driver if other
>>>>>     >> drivers e.g. mass storage are doing perfectly fine without
>>>>>     >> it?
>>>>>     >>
>>>>>     >
>>>>>     > I don't know how it works in other gadgets, but please see
>>>>>     > this patch in kernel: [1]. That patch is doing just the same
>>>>>     > as I did (and also in gadget code), using
>>>>>     > usb_ep_align_maybe() function for alignment.
>>>>>
>>>>>     NOTE: there haven't been such quirks in the kernel drivers
>>>>> except for that one driver that has a user mode interface and
>>>>> needs more moral policing for user provided buffers. So that
>>>>> example is not something we should be using as reference. Our
>>>>> buffers are alreay aligned to maxpacket size. The only thing we
>>>>> need to worry about is the length of the last transaction that is
>>>>> not integral multiple of maxpacket size.
>>>>>
>>>>>     If my understanding is right all USB controllers should work
>>>>> fine with bulk OUT requests that are integral multiple of
>>>>> maxpacket size. So we shouldn't be needing any quirk flags.
>>>>>
>>>>>     >
>>>>>     >> This patch is just covering up the real problem, by
>>>>>     >> bypassing the faulty code with a flag.
>>>>>     >>
>>>>>     >> The buffer size is EP_BUFFER_SIZE and is already aligned to
>>>>>     >> wMaxPacketSize so the problem shouldn't have happened in
>>>>>     >> the first place. But it is happening indicating something
>>>>>     >> else is wrong.
>>>>>     >>
>>>>>     >
>>>>>     > There is what I'm observing on platform with DWC3 controller:
>>>>>     >  - when doing "fastboot flash xloader MLO":
>>>>>     >  - the whole size of data to send is 70964 bytes
>>>>>     >  - the size of all packets (except of last packet) is 4096
>>>>>     > bytes
>>>>>     >  - so those are being sent just fine (in req->complete,
>>>>>     > which is rx_handler_dl_image() function)
>>>>>     >  - but last packet has size of 1332 bytes (because 70964 %
>>>>>     > 4096 = 1332)
>>>>>     >  - when its req->length is aligned to wMaxPacketSize (so
>>>>>     > it's 1536 bytes): after we send it using usb_ep_queue(), the
>>>>>     > req->complete callback is called one last time and we see
>>>>>     > that transmission is finished (download_bytes >=
>>>>>     > download_size)
>>>>>     >  - but when its req->length is not aligned to wMaxPacketSize
>>>>>     > (so it's 1332 bytes): req->complete callback is not called
>>>>>     > last time, so transaction is not finished and we are stuck
>>>>>     > in "fastboot flash"
>>>>>     >
>>>>>     > So I guess the issue is real and related to DWC3 quirk. If
>>>>>     > you have any thoughts regarding other possible causes of
>>>>>     > this problem -- please share. I can't predict all possible
>>>>>     > causes as I'm not USB expert.
>>>>>
>>>>>     I've tried to clean up the bulk out handling code in the below
>>>>> patch. Note you will need to apply this on top of the 2 patches I
>>>>> sent earlier. https://patchwork.ozlabs.org/patch/609417/
>>>>>     https://patchwork.ozlabs.org/patch/609896/
>>>>>
>>>>>     Steve, do let me know if it works for you. If it doesn't then
>>>>> we need to figure out why. Can you please share details about the
>>>>> USB controller on your board? Does it not like OUT requests that
>>>>> are aligned to maxpacket size? What does ep->maxpacket show for
>>>>> your board?
>>>>>
>>>>>
>>>>> This (still) does not work....
>>>>> - using the standard U-Boot DWC2 driver,
>>>>> - do not know if it doesn't like "OUT requests that are aligned to
>>>>> maxpacket size" -- perhaps @Lukasz could answer this....
>>>>> -  ep->maxpacket is 512
>>>>>
>>>>> with logging in "drivers/usb/gadget/dwc2_udc_otg.c" enabled,
>>>>> output is (for the last transactions...):
>>>>>
>>>>> *** dwc2_udc_irq : GINTSTS=0x14088028(on state WAIT_FOR_SETUP),
>>>>> GINTMSK : 0x800c3800,DAINT : 0x40000, DAINTMSK : 0x50003 ***
>>>>> process_ep_out_intr: EP OUT interrupt : DAINT = 0x40000 EP2-OUT :
>>>>> DOEPINT = 0x2011 complete_rx: RX DMA done : ep = 2, rx bytes =
>>>>> 4096/4096, is_short = 0, DOEPTSIZ = 0x0, remained bytes = 4096
>>>>> complete_rx: Next Rx request start... setdma_rx: EP2 RX DMA
>>>>> start : DOEPDMA = 0xffb84f80,DOEPTSIZ = 0x401000, DOEPCTL =
>>>>> 0x80098200 buf = 0xffb84f80, pktcnt = 8, xfersize = 4096
>>>>
>>
>> It is _real_ hard for me to debug protocol which I'm not using on HW
>> which I don't posses.
>>
>> However, I will do my best to fix this bug. Hence please be patient with
>> my questions.
>>
>> Steve, how much bytes do you send?
>>
>>>> OK so we asked for 4096 bytes and looks like we received 3968 bytes,
>>>> which is probably the end of transfer.
>>
>> I think that you refer to the below code.
>>
>>>>
>>>>>
>>>>> *** dwc2_udc_irq : GINTSTS=0x14088028(on state WAIT_FOR_SETUP),
>>>>> GINTMSK : 0x800c3800,DAINT : 0x40000, DAINTMSK : 0x50003 ***
>>>>> process_ep_out_intr: EP OUT interrupt : DAINT = 0x40000 EP2-OUT :
>>>>> DOEPINT = 0x2011 complete_rx: RX DMA done : ep = 2, rx bytes =
>>>>> 3968/4096,
>>
>> So we have following situation here:
>> We received 3968B from 4096B (missing 128B)
>>
>>> is_short = 0
>>
>> This should be equal to 1.
>
> This should be fixed by the patch I sent inline.
>
>>
>>> , DOEPTSIZ = 0x80,
>>
>> We are supposed (now) to receive 128 B more.
>>
>>> remained bytes = 3968
>>
>> This is totally wrong. Here we should have 128 B.
>
> Exactly. The debug print is using the old xfersize to print remaining size.
> That must be fixed as well.
> I can do that along with the official patch once Steve confirms that it works.
>
>>
>>>>> setdma_rx: EP2 RX DMA start : DOEPDMA = 0xffb85f00,DOEPTSIZ =
>>>>> 0x80080, DOEPCTL = 0x80098200 buf = 0xffb85f00, pktcnt = 1,
>>>>> xfersize = 128
>>>>>
>>>>>>>>> and it hangs here!!!
>>>>
>>>> The dwc2 driver should have returned then and not queued another
>>>> 128 bytes. IMO there is a bug in the dwc2 driver.
>>>>
>>>> 3968 = 7 x 512 + 384
>>>> This means the last packet (384 bytes) was a short packet and it
>>>> signals end of transfer so the dwc2 driver shouldn't be queuing
>>>> another transfer. It should end the usb ep_queue request.
>>>>
>>>> So, question to dwc2 mantainers.
>>>> Can we modify dwc2 driver to not automatically queue a pending
>>>> transfer if the transfer ended in a short packet?
>>>
>>> BTW, this is mentioned in the USB Specification.
>>> Ref: USB2.0 Secification: Section. 5.3.2 Pipes
>>>
>>> "An IRP (I/O request packet) may require multiple data payloads to
>>> move the client data over the bus. The data payloads for such a
>>> multiple data payload IRP are expected to be of the maximum packet
>>> size until the last data payload that contains the remainder of the
>>> overall IRP. See the description of each transfer type for more
>>> details. For such an IRP, short packets (i.e., less than
>>> maximum-sized data payloads)
>>> on input that do not completely fill an
>>> IRP data buffer can have one of two possible meanings, depending upon
>>> the expectations of a client: • A client can expect a variable-sized
>>> amount of data in an IRP. In this case, a short packet that does not
>>> fill an IRP data buffer can be used simply as an in-band delimiter to
>>> indicate “end of unit of data.” The IRP should be retired without
>>> error and the Host Controller should advance to the next IRP. • A
>>> client can expect a specific-sized amount of data. In this case, a
>>> short packet that does not fill an IRP data buffer is an indication
>>> of an error. The IRP should be retired, the pipe should be stalled,
>>> and any pending IRPs associated with the pipe should also be retired."
>>>
>>> I think we want the controller to behave as the first case since we
>>> haven't set the short_not_ok flag in the USB request.
>>
>> short_not_ok flag is used solely in USB Mass Storage gadget (which is
>> correct for it).
>>
>> However, dwc2 UDC driver is not explicitly supporting it (but
>> unfortunately does it implicitly).
>>
>> This of course should be fixed.
>>
>>
>> One question to Steve - could you check if below Roger's change fixes
>> your problem?
>>
>> I will also test it - however, current tests aren't covering this
>> situation:
>>
>> 1. DFU uses EP0 (not EPn bulk) - this works since we can send or receive
>> e.g. 65 B
>>
>> 2. USB Mass Storage expects (UMS) transmission to be a multiple
>> of wMaxPacketSize (there is some caching done) and uses short_not_ok
>> flag.
>>
>> 3. THOR protocol is padded to LBA size (512 B) as it is convenient for
>> eMMC flashing.
>>
>> I do need to come up with new one.
>>
>>
>> Roger, thanks for your support and effort.
>
> No problem :).
>>
>>>
>>> So the below patch to the dwc2 driver should fix it.
>>>
>>> --
>>> cheers,
>>> -roger
>>>
>>> diff --git a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>> b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c index bce9c30..a31d875
>>> 100644 --- a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>> +++ b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>> @@ -229,7 +229,7 @@ static void complete_rx(struct dwc2_udc *dev, u8
>>> ep_num) ROUND(xfer_size, CONFIG_SYS_CACHELINE_SIZE));
>>>
>>>      req->req.actual += min(xfer_size, req->req.length -
>>> req->req.actual);
>>> -    is_short = (xfer_size < ep->ep.maxpacket);
>>> +    is_short = xfer_size % ep->ep.maxpacket;
>>>
>>>      debug_cond(DEBUG_OUT_EP != 0,
>>>                 "%s: RX DMA done : ep = %d, rx bytes = %d/%d, "
>>>

Roger, Lukasz - this gets past the "hang" on my boards, and it
completes successfully! BTW, the last packet is now:

+++++ SRAE: (new) rx_remain=520, ep->maxpacket=512

complete_rx: Next Rx request start...
setdma_rx: EP2 RX DMA start : DOEPDMA = 0xffb84f80,DOEPTSIZ =
0x100400, DOEPCTL = 0x80098200
        buf = 0xffb84f80, pktcnt = 2, xfersize = 1024

*** dwc2_udc_irq : GINTSTS=0x14088028(on state WAIT_FOR_SETUP),
GINTMSK : 0x800c3800,DAINT : 0x40000, DAINTMSK : 0x50003
*** process_ep_out_intr: EP OUT interrupt : DAINT = 0x40000
        EP2-OUT : DOEPINT = 0x2011
complete_rx: RX DMA done : ep = 2, rx bytes = 520/1024, is_short = 8,
DOEPTSIZ = 0x1f8, remained bytes = 520
dwc2_queue: ep_is_in, DWC2_UDC_OTG_GINTSTS=0x14008028
setdma_tx:EP1 TX DMA start : DIEPDMA0 = 0xffb85fc0,DIEPTSIZ0 =
0x80004, DIEPCTL0 = 0x80498200
        buf = 0xffb85fc0, pktcnt = 1, xfersize = 4

downloading of 258568 bytes finished


>>
>> I will test this change.
>>

Lukasz: would you _like_ to be able to test fastboot?
- do you have a board that uses a GPT table, and has USB-OTG support?
- do you want a Linux (Ubuntu) version of the fastboot client (Windows
is much more difficult - but it does work...)
Let me know, and I'll help you get it setup....

>
> Thanks.
>
> cheers,
> -roger


More information about the U-Boot mailing list