[U-Boot] tegra124 jetson-tk1 ethernet problems.

Stephen Warren swarren at wwwdotorg.org
Tue Aug 2 18:56:02 CEST 2016


On 08/01/2016 09:20 PM, Peter Chubb wrote:
>
> Hi Folks,
>    Since patch 96350f729c42 "dm: tegra: net: Convert tegra boards to
>    driver model for Ethernet" booting via dhcp has been broken on the
>    Jetson TK1.

Best to Cc the net and DM maintainers (Joe Hershberger and Simon Glass) 
on this to make sure they see the issue. I've done so here.

>    I tried applying "net: Probe PCI before looking for ethernet
>    devices"; this `works' in that the ethernet device is detected and
>    works,  but I end up with huge numbers of
>       CACHE: Misaligned operation at range [fffb8c00, fffb8c2e]
>    messages on the serial console.
 >
>    These come from the flush_cache() calls in net/rtl8169.c.  I
>    suggest the attached patch (or something like it):

Interesting; in my automated testing system, I do see these cache error 
messages, yet ping and TFTP work for me. Admittedly my test setup uses a 
static IP configuration rather than DHCP.

> diff --git a/drivers/net/rtl8169.c b/drivers/net/rtl8169.c

>  static void rtl_flush_rx_desc(struct RxDesc *desc)
>  {
>  #ifndef CONFIG_SYS_NONCACHED_MEMORY
> -	flush_cache((unsigned long)desc, sizeof(*desc));
> +    	unsigned long start = (unsigned long)desc & ~(ARCH_DMA_MINALIGN - 1);
> +	unsigned long size = ALIGN(sizeof(*desc), ARCH_DMA_MINALIGN);
> +
> +	flush_cache(start, size);
>  #endif
>  }
>
> @@ -493,21 +496,28 @@ static void rtl_inval_tx_desc(struct TxDesc *desc)
>  static void rtl_flush_tx_desc(struct TxDesc *desc)
>  {
>  #ifndef CONFIG_SYS_NONCACHED_MEMORY
> -	flush_cache((unsigned long)desc, sizeof(*desc));
> +	unsigned long start = (unsigned long)desc & ~(ARCH_DMA_MINALIGN - 1);
> +	unsigned long sz = ALIGN(sizeof *desc, ARCH_DMA_MINALIGN);
> +
> +	flush_cache(start, sz);
>  #endif
>  }

Those two are wrong. Hopefully neither of those changes do anything on 
Jetson TK1, since CONFIG_SYS_NONCACHED_MEMORY is enabled there.

The cache line is likely larger than the individual descriptor size, so 
rounding up the flush length to a whole cache line will likely end up 
flushing more descriptors than you want. This will eventually result in 
SW over-writing a HW update to another descriptor, and so at least lose 
packets. For this reason, CONFIG_SYS_NONCACHED_MEMORY must be set if the 
descriptor and cache line sizes don't match, except for systems where IO 
is fully cache-coherent and hence the cache operations are no-ops.

>  static void rtl_inval_buffer(void *buf, size_t size)
>  {
> -	unsigned long start = (unsigned long)buf & ~(ARCH_DMA_MINALIGN - 1);
> -	unsigned long end = ALIGN(start + size, ARCH_DMA_MINALIGN);
> +	unsigned long end = ALIGN((unsigned long)buf + size, ARCH_DMA_MINALIGN);
>
> -	invalidate_dcache_range(start, end);
> +        /* buf is aligned to RTL8169_ALIGN,
> +         * which is a multiple of ARCH_DMA_ALIGN
> +         */
> +	invalidate_dcache_range((unsigned long)buf, end);
>  }
>
>  static void rtl_flush_buffer(void *buf, size_t size)
>  {
> -	flush_cache((unsigned long)buf, size);
> +	unsigned long sz = ALIGN(size, ARCH_DMA_MINALIGN);
> +
> +	flush_cache((unsigned long)buf, sz);
>  }

I believe the correct approach is for the caller (network core code) to 
provide cache-aligned buffers, rather than for each driver to align the 
start/size when performing cache operations. Again, this is to ensure 
that cache operations don't affect any other data adjacent to the 
buffer. Can you see why the network core code isn't using cache-aligned 
buffers when DM_ETH is enabled? Perhaps DM_ETH isn't the trigger, but 
just changed something about the memory layout that exposed some other 
pre-existing issue.


More information about the U-Boot mailing list