[U-Boot] [PATCH] net: fec: Avoid MX28 bus sync issue

Marek Vasut marex at denx.de
Thu Sep 12 12:50:36 CEST 2013


Dear Hector Palacios,

> Hello,
> 
> Going back to this old thread I have some news regarding the problem with
> TFTP transmissions blocking (timed out) after 10 seconds on the FEC of the
> MX28. See below:
> 
> On 07/17/2013 05:55 PM, Hector Palacios wrote:
> > Dear Marek,
> > 
> > On 07/16/2013 06:44 AM, Marek Vasut wrote:
> >> Dear Fabio Estevam,
> >> 
> >>> On Tue, Jul 16, 2013 at 12:51 AM, Fabio Estevam <festevam at gmail.com> 
wrote:
> >>>> On Mon, Jul 15, 2013 at 12:09 PM, Hector Palacios
> >>>> 
> >>>> <hector.palacios at digi.com> wrote:
> >>>>> @Fabio: could you manually run the command 'tftp ${loadaddr}
> >>>>> file100M' in your EVK?
> >>>> 
> >>>> Yes, this is what I have been running since the beginning.
> >>>> 
> >>>>> If it doesn't fail, could you try running it again after playing with
> >>>>> the environment (setting/printing some variables).
> >>>> 
> >>>> I can't reproduce the problem here.
> >>>> 
> >>>>> As I said, this issue appeared with different TFTP servers and is
> >>>>> independent of whether the dcache is or not enabled.
> >>>> 
> >>>> Can you transfer from a PC to another PC via TFTP?
> > 
> > Yes I can.
> > 
> >> Another thing of interest would be a 'tcpdump' pcap capture of that
> >> connection.
> > 
> > I was initially filtering out only TFTP packets of my wireshark trace and
> > all looked correct. After taking a second look to the full trace I see
> > now a hint. Around 7 seconds after starting the TFTP transfer the server
> > is sending an ARP to the target asking for the owner of the target's IP.
> > The target is receiving this ARP and apparently responding (at least
> > this is what my debug code shows as it gets into arp.c:ArpReceive(),
> > case ARPOP_REQUEST and sending a packet), but this ARP reply from the
> > target is not reaching the network. My sniffer does not capture this
> > reply.
> > 
> > The server resends the ARP request twice more (seconds 8 and 9) to the
> > target and since it doesn't get a reply then sends a broadcast ARP
> > (seconds 10) asking who has that IP. Since nobody responds it stops
> > sending data.
> > 
> > The times that it works (and I don't know the magic behind using a
> > numeric address versus using ${loadaddr} when they have the same value),
> > the ARP replies do reach the network and the server continues the
> > transmission normally.
> > 
> > Using a v2009 U-Boot, the behaviour is exactly the same, but the target's
> > ARP replies always reach the network, and the transfers always succeed.
> > 
> > Since Fabio cannot reproduce it I guess it must be a local ghost. :o(
> 
> We tracked down the issue to an ARP request from the server that was never
> answered by the target. We later noticed that the problem did not happen
> anymore when building U-Boot with a different toolchain and that the issue
> seemed to be in the alignment of the RX buffer in the stack, which old GCC
> compilers seem to do wrong.
> 
> Here is a patch:
> 
> From: Robert Hodaszi <robert.hodaszi at digi.com>
> Date: Fri, 6 Sep 2013 09:50:52 +0200
> Subject: [PATCH] net: fec: fix invalid temporary RX buffer alignment
> because of GCC bug
> 
> Older GCC versions don't handle well alignment on stack variables.
> The temporary RX buffer is a local variable, so it is on the stack.
> Because the FEC driver is using DMA for transmission, receive and
> transmit buffers should be aligned on 64 byte. The transmit buffers are
> not allocated in the driver internally, it sends the packets directly
> as it gets them. So these packets should be aligned.
> When the ARP layer wants to reply to an ARP request, it uses the FEC
> driver's temporary RX buffer (used to pass data to the ARP layer) to
> store the new packet, and pass it back to the FEC driver's send function.
> Because of a GCC bug

Can you point to this GCC bug in the GCC bugzilla or something?

> this buffer is not aligned well, and when the
> driver tries to send it, it first rounds the address down to the
> alignment boundary. That causes invalid data.
> 
> To fix it, don't put the temporary onto the stack.
> 
> Signed-off-by: Robert Hodaszi <robert.hodaszi at digi.com>
> Signed-off-by: Hector Palacios <hector.palacios at digi.com>
> ---
>   drivers/net/fec_mxc.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/fec_mxc.c b/drivers/net/fec_mxc.c
> index f4f72b7..315017e 100644
> --- a/drivers/net/fec_mxc.c
> +++ b/drivers/net/fec_mxc.c
> @@ -828,7 +828,10 @@ static int fec_recv(struct eth_device *dev)
>          uint16_t bd_status;
>          uint32_t addr, size, end;
>          int i;
> -       uchar buff[FEC_MAX_PKT_SIZE] __aligned(ARCH_DMA_MINALIGN);
> +       /* Don't place this variable on the stack, because older GCC
> version +        * doesn't handle alignement on stack well.
> +        */
> +       static uchar buff[FEC_MAX_PKT_SIZE] __aligned(ARCH_DMA_MINALIGN);

The buffer might as well be allocated using ALLOC_CACHE_ALIGN_BUFFER() from 
include/common.h . Still, are you _really_ sure the buffer is unaligned ? Do you 
have a testcase maybe ?

btw. I am able to replicate this issue sometimes even using GCC 4.8.0 .

Best regards,
Marek Vasut


More information about the U-Boot mailing list