[U-Boot] [PATCH] fastboot: Fix OUT transaction length alignment
Lukasz Majewski
l.majewski at samsung.com
Mon Apr 18 15:47:44 CEST 2016
Hi Roger, Steve,
> +Lukazs
>
> On 18/04/16 10:56, Roger Quadros wrote:
> > On 15/04/16 22:44, Steve Rae wrote:
> >>
> >>
> >> On Thu, Apr 14, 2016 at 3:18 AM, Roger Quadros <rogerq at ti.com
> >> <mailto:rogerq at ti.com>> wrote:
> >>
> >> Hi,
> >>
> >> On 13/04/16 19:56, Sam Protsenko wrote:
> >> > On Wed, Apr 13, 2016 at 3:32 PM, Roger Quadros
> >> > <rogerq at ti.com <mailto:rogerq at ti.com>> wrote:
> >> >> Hi,
> >> >>
> >> >> On 13/04/16 15:01, Semen Protsenko wrote:
> >> >>> From: Sam Protsenko <semen.protsenko at linaro.org
> >> >>> <mailto:semen.protsenko at linaro.org>>
> >> >>>
> >> >>> Some UDC controllers may require buffer size to be aligned
> >> >>> to wMaxPacketSize. It's indicated by
> >> >>> gadget->quirk_ep_out_aligned_size field being set to
> >> >>> "true" (in UDC driver code). In that case
> >> >>> rx_bytes_expected must be aligned to wMaxPacket size,
> >> >>> otherwise stuck on transaction will happen. For example,
> >> >>> it's required by DWC3 controller data manual:
> >> >>>
> >> >>> section 8.2.3.3 Buffer Size Rules and Zero-Length
> >> >>> Packets:
> >> >>>
> >> >>> For OUT endpoints, the following rules apply:
> >> >>> - The BUFSIZ field must be ≥ 1 byte.
> >> >>> - The total size of a Buffer Descriptor must be a
> >> >>> multiple of MaxPacketSize
> >> >>> - A received zero-length packet still requires a
> >> >>> MaxPacketSize buffer. Therefore, if the expected amount of
> >> >>> data to be received is a multiple of MaxPacketSize,
> >> >>> software should add MaxPacketSize bytes to the buffer to
> >> >>> sink a possible zero-length packet at the end of the
> >> >>> transfer.
> >> >>>
> >> >>> But other UDC controllers don't need such alignment, so
> >> >>> mentioned field is set to "false". If buffer size is
> >> >>> aligned to wMaxPacketSize, those controllers may stuck on
> >> >>> transaction. The example is DWC2.
> >> >>>
> >> >>> This patch checks gadget->quirk_ep_out_aligned_size field
> >> >>> and aligns rx_bytes_expected to wMaxPacketSize only when
> >> >>> it's needed.
> >> >>>
> >> >>> Signed-off-by: Sam Protsenko <semen.protsenko at linaro.org
> >> >>> <mailto:semen.protsenko at linaro.org>> ---
> >> >>> drivers/usb/gadget/f_fastboot.c | 9 +++++++++
> >> >>> 1 file changed, 9 insertions(+)
> >> >>>
> >> >>> diff --git a/drivers/usb/gadget/f_fastboot.c
> >> >>> b/drivers/usb/gadget/f_fastboot.c index 2e87fee..54dcce0
> >> >>> 100644 --- a/drivers/usb/gadget/f_fastboot.c
> >> >>> +++ b/drivers/usb/gadget/f_fastboot.c
> >> >>> @@ -58,6 +58,7 @@ static unsigned int
> >> >>> fastboot_flash_session_id; static unsigned int
> >> >>> download_size; static unsigned int download_bytes;
> >> >>> static bool is_high_speed;
> >> >>> +static bool quirk_ep_out_aligned_size;
> >> >>>
> >> >>> static struct usb_endpoint_descriptor fs_ep_in = {
> >> >>> .bLength = USB_DT_ENDPOINT_SIZE,
> >> >>> @@ -240,6 +241,8 @@ static int fastboot_set_alt(struct
> >> >>> usb_function *f, debug("%s: func: %s intf: %d alt: %d\n",
> >> >>> __func__, f->name, interface, alt);
> >> >>>
> >> >>> + quirk_ep_out_aligned_size =
> >> >>> gadget->quirk_ep_out_aligned_size; +
> >> >>> /* make sure we don't enable the ep twice */
> >> >>> if (gadget->speed == USB_SPEED_HIGH) {
> >> >>> ret = usb_ep_enable(f_fb->out_ep,
> >> >>> &hs_ep_out); @@ -435,12 +438,18 @@ static unsigned int
> >> >>> rx_bytes_expected(unsigned int maxpacket) return 0;
> >> >>> if (rx_remain > EP_BUFFER_SIZE)
> >> >>> return EP_BUFFER_SIZE;
> >> >>> +
> >> >>> + if (!quirk_ep_out_aligned_size)
> >> >>> + goto out;
> >> >>> +
> >> >>> if (rx_remain < maxpacket) {
> >> >>> rx_remain = maxpacket;
> >> >>> } else if (rx_remain % maxpacket != 0) {
> >> >>> rem = rx_remain % maxpacket;
> >> >>> rx_remain = rx_remain + (maxpacket - rem);
> >> >>> }
> >> >>> +
> >> >>> +out:
> >> >>> return rx_remain;
> >> >>> }
> >> >>>
> >> >>>
> >> >>
> >> >> Why do we need a special flag for this driver if other
> >> >> drivers e.g. mass storage are doing perfectly fine without
> >> >> it?
> >> >>
> >> >
> >> > I don't know how it works in other gadgets, but please see
> >> > this patch in kernel: [1]. That patch is doing just the same
> >> > as I did (and also in gadget code), using
> >> > usb_ep_align_maybe() function for alignment.
> >>
> >> NOTE: there haven't been such quirks in the kernel drivers
> >> except for that one driver that has a user mode interface and
> >> needs more moral policing for user provided buffers. So that
> >> example is not something we should be using as reference. Our
> >> buffers are alreay aligned to maxpacket size. The only thing we
> >> need to worry about is the length of the last transaction that is
> >> not integral multiple of maxpacket size.
> >>
> >> If my understanding is right all USB controllers should work
> >> fine with bulk OUT requests that are integral multiple of
> >> maxpacket size. So we shouldn't be needing any quirk flags.
> >>
> >> >
> >> >> This patch is just covering up the real problem, by
> >> >> bypassing the faulty code with a flag.
> >> >>
> >> >> The buffer size is EP_BUFFER_SIZE and is already aligned to
> >> >> wMaxPacketSize so the problem shouldn't have happened in
> >> >> the first place. But it is happening indicating something
> >> >> else is wrong.
> >> >>
> >> >
> >> > There is what I'm observing on platform with DWC3 controller:
> >> > - when doing "fastboot flash xloader MLO":
> >> > - the whole size of data to send is 70964 bytes
> >> > - the size of all packets (except of last packet) is 4096
> >> > bytes
> >> > - so those are being sent just fine (in req->complete,
> >> > which is rx_handler_dl_image() function)
> >> > - but last packet has size of 1332 bytes (because 70964 %
> >> > 4096 = 1332)
> >> > - when its req->length is aligned to wMaxPacketSize (so
> >> > it's 1536 bytes): after we send it using usb_ep_queue(), the
> >> > req->complete callback is called one last time and we see
> >> > that transmission is finished (download_bytes >=
> >> > download_size)
> >> > - but when its req->length is not aligned to wMaxPacketSize
> >> > (so it's 1332 bytes): req->complete callback is not called
> >> > last time, so transaction is not finished and we are stuck
> >> > in "fastboot flash"
> >> >
> >> > So I guess the issue is real and related to DWC3 quirk. If
> >> > you have any thoughts regarding other possible causes of
> >> > this problem -- please share. I can't predict all possible
> >> > causes as I'm not USB expert.
> >>
> >> I've tried to clean up the bulk out handling code in the below
> >> patch. Note you will need to apply this on top of the 2 patches I
> >> sent earlier. https://patchwork.ozlabs.org/patch/609417/
> >> https://patchwork.ozlabs.org/patch/609896/
> >>
> >> Steve, do let me know if it works for you. If it doesn't then
> >> we need to figure out why. Can you please share details about the
> >> USB controller on your board? Does it not like OUT requests that
> >> are aligned to maxpacket size? What does ep->maxpacket show for
> >> your board?
> >>
> >>
> >> This (still) does not work....
> >> - using the standard U-Boot DWC2 driver,
> >> - do not know if it doesn't like "OUT requests that are aligned to
> >> maxpacket size" -- perhaps @Lukasz could answer this....
> >> - ep->maxpacket is 512
> >>
> >> with logging in "drivers/usb/gadget/dwc2_udc_otg.c" enabled,
> >> output is (for the last transactions...):
> >>
> >> *** dwc2_udc_irq : GINTSTS=0x14088028(on state WAIT_FOR_SETUP),
> >> GINTMSK : 0x800c3800,DAINT : 0x40000, DAINTMSK : 0x50003 ***
> >> process_ep_out_intr: EP OUT interrupt : DAINT = 0x40000 EP2-OUT :
> >> DOEPINT = 0x2011 complete_rx: RX DMA done : ep = 2, rx bytes =
> >> 4096/4096, is_short = 0, DOEPTSIZ = 0x0, remained bytes = 4096
> >> complete_rx: Next Rx request start... setdma_rx: EP2 RX DMA
> >> start : DOEPDMA = 0xffb84f80,DOEPTSIZ = 0x401000, DOEPCTL =
> >> 0x80098200 buf = 0xffb84f80, pktcnt = 8, xfersize = 4096
> >
It is _real_ hard for me to debug protocol which I'm not using on HW
which I don't posses.
However, I will do my best to fix this bug. Hence please be patient with
my questions.
Steve, how much bytes do you send?
> > OK so we asked for 4096 bytes and looks like we received 3968 bytes,
> > which is probably the end of transfer.
I think that you refer to the below code.
> >
> >>
> >> *** dwc2_udc_irq : GINTSTS=0x14088028(on state WAIT_FOR_SETUP),
> >> GINTMSK : 0x800c3800,DAINT : 0x40000, DAINTMSK : 0x50003 ***
> >> process_ep_out_intr: EP OUT interrupt : DAINT = 0x40000 EP2-OUT :
> >> DOEPINT = 0x2011 complete_rx: RX DMA done : ep = 2, rx bytes =
> >> 3968/4096,
So we have following situation here:
We received 3968B from 4096B (missing 128B)
> is_short = 0
This should be equal to 1.
> , DOEPTSIZ = 0x80,
We are supposed (now) to receive 128 B more.
> remained bytes = 3968
This is totally wrong. Here we should have 128 B.
> >> setdma_rx: EP2 RX DMA start : DOEPDMA = 0xffb85f00,DOEPTSIZ =
> >> 0x80080, DOEPCTL = 0x80098200 buf = 0xffb85f00, pktcnt = 1,
> >> xfersize = 128
> >>
> >>>>>> and it hangs here!!!
> >
> > The dwc2 driver should have returned then and not queued another
> > 128 bytes. IMO there is a bug in the dwc2 driver.
> >
> > 3968 = 7 x 512 + 384
> > This means the last packet (384 bytes) was a short packet and it
> > signals end of transfer so the dwc2 driver shouldn't be queuing
> > another transfer. It should end the usb ep_queue request.
> >
> > So, question to dwc2 mantainers.
> > Can we modify dwc2 driver to not automatically queue a pending
> > transfer if the transfer ended in a short packet?
>
> BTW, this is mentioned in the USB Specification.
> Ref: USB2.0 Secification: Section. 5.3.2 Pipes
>
> "An IRP (I/O request packet) may require multiple data payloads to
> move the client data over the bus. The data payloads for such a
> multiple data payload IRP are expected to be of the maximum packet
> size until the last data payload that contains the remainder of the
> overall IRP. See the description of each transfer type for more
> details. For such an IRP, short packets (i.e., less than
> maximum-sized data payloads)
> on input that do not completely fill an
> IRP data buffer can have one of two possible meanings, depending upon
> the expectations of a client: • A client can expect a variable-sized
> amount of data in an IRP. In this case, a short packet that does not
> fill an IRP data buffer can be used simply as an in-band delimiter to
> indicate “end of unit of data.” The IRP should be retired without
> error and the Host Controller should advance to the next IRP. • A
> client can expect a specific-sized amount of data. In this case, a
> short packet that does not fill an IRP data buffer is an indication
> of an error. The IRP should be retired, the pipe should be stalled,
> and any pending IRPs associated with the pipe should also be retired."
>
> I think we want the controller to behave as the first case since we
> haven't set the short_not_ok flag in the USB request.
short_not_ok flag is used solely in USB Mass Storage gadget (which is
correct for it).
However, dwc2 UDC driver is not explicitly supporting it (but
unfortunately does it implicitly).
This of course should be fixed.
One question to Steve - could you check if below Roger's change fixes
your problem?
I will also test it - however, current tests aren't covering this
situation:
1. DFU uses EP0 (not EPn bulk) - this works since we can send or receive
e.g. 65 B
2. USB Mass Storage expects (UMS) transmission to be a multiple
of wMaxPacketSize (there is some caching done) and uses short_not_ok
flag.
3. THOR protocol is padded to LBA size (512 B) as it is convenient for
eMMC flashing.
I do need to come up with new one.
Roger, thanks for your support and effort.
>
> So the below patch to the dwc2 driver should fix it.
>
> --
> cheers,
> -roger
>
> diff --git a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
> b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c index bce9c30..a31d875
> 100644 --- a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
> +++ b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
> @@ -229,7 +229,7 @@ static void complete_rx(struct dwc2_udc *dev, u8
> ep_num) ROUND(xfer_size, CONFIG_SYS_CACHELINE_SIZE));
>
> req->req.actual += min(xfer_size, req->req.length -
> req->req.actual);
> - is_short = (xfer_size < ep->ep.maxpacket);
> + is_short = xfer_size % ep->ep.maxpacket;
>
> debug_cond(DEBUG_OUT_EP != 0,
> "%s: RX DMA done : ep = %d, rx bytes = %d/%d, "
>
I will test this change.
--
Best regards,
Lukasz Majewski
Samsung R&D Institute Poland (SRPOL) | Linux Platform Group
More information about the U-Boot
mailing list