[U-Boot] tegra124 jetson-tk1 ethernet problems.
Joe Hershberger
joe.hershberger at gmail.com
Thu Aug 4 23:48:08 CEST 2016
On Thu, Aug 4, 2016 at 4:34 PM, Joe Hershberger
<joe.hershberger at gmail.com> wrote:
> On Thu, Aug 4, 2016 at 4:22 PM, Joe Hershberger
> <joe.hershberger at gmail.com> wrote:
>> Hi Stephen,
>>
>> On Tue, Aug 2, 2016 at 11:56 AM, Stephen Warren <swarren at wwwdotorg.org> wrote:
>>> On 08/01/2016 09:20 PM, Peter Chubb wrote:
>>>>
>>>>
>>>> Hi Folks,
>>>> Since patch 96350f729c42 "dm: tegra: net: Convert tegra boards to
>>>> driver model for Ethernet" booting via dhcp has been broken on the
>>>> Jetson TK1.
>>>
>>>
>>> Best to Cc the net and DM maintainers (Joe Hershberger and Simon Glass) on
>>> this to make sure they see the issue. I've done so here.
>>>
>>>> I tried applying "net: Probe PCI before looking for ethernet
>>>> devices"; this `works' in that the ethernet device is detected and
>>>> works, but I end up with huge numbers of
>>>> CACHE: Misaligned operation at range [fffb8c00, fffb8c2e]
>>>> messages on the serial console.
>>>
>>>>
>>>>
>>>> These come from the flush_cache() calls in net/rtl8169.c. I
>>>> suggest the attached patch (or something like it):
>>>
>>>
>>> Interesting; in my automated testing system, I do see these cache error
>>> messages, yet ping and TFTP work for me. Admittedly my test setup uses a
>>> static IP configuration rather than DHCP.
>>>
>>>> diff --git a/drivers/net/rtl8169.c b/drivers/net/rtl8169.c
>>>
>>>
>>>> static void rtl_flush_rx_desc(struct RxDesc *desc)
>>>> {
>>>> #ifndef CONFIG_SYS_NONCACHED_MEMORY
>>>> - flush_cache((unsigned long)desc, sizeof(*desc));
>>>> + unsigned long start = (unsigned long)desc & ~(ARCH_DMA_MINALIGN -
>>>> 1);
>>>> + unsigned long size = ALIGN(sizeof(*desc), ARCH_DMA_MINALIGN);
>>>> +
>>>> + flush_cache(start, size);
>>>> #endif
>>>> }
>>>>
>>>> @@ -493,21 +496,28 @@ static void rtl_inval_tx_desc(struct TxDesc *desc)
>>>> static void rtl_flush_tx_desc(struct TxDesc *desc)
>>>> {
>>>> #ifndef CONFIG_SYS_NONCACHED_MEMORY
>>>> - flush_cache((unsigned long)desc, sizeof(*desc));
>>>> + unsigned long start = (unsigned long)desc & ~(ARCH_DMA_MINALIGN -
>>>> 1);
>>>> + unsigned long sz = ALIGN(sizeof *desc, ARCH_DMA_MINALIGN);
>>>> +
>>>> + flush_cache(start, sz);
>>>> #endif
>>>> }
>>>
>>>
>>> Those two are wrong. Hopefully neither of those changes do anything on
>>> Jetson TK1, since CONFIG_SYS_NONCACHED_MEMORY is enabled there.
>>>
>>> The cache line is likely larger than the individual descriptor size, so
>>> rounding up the flush length to a whole cache line will likely end up
>>> flushing more descriptors than you want. This will eventually result in SW
>>> over-writing a HW update to another descriptor, and so at least lose
>>> packets. For this reason, CONFIG_SYS_NONCACHED_MEMORY must be set if the
>>> descriptor and cache line sizes don't match, except for systems where IO is
>>> fully cache-coherent and hence the cache operations are no-ops.
>>
>> I agree... it this should (and does) require
>> CONFIG_SYS_NONCACHED_MEMORY in that case. This is in the top of this
>> driver:
>>
>> /*
>> * Warn if the cache-line size is larger than the descriptor size. In such
>> * cases the driver will likely fail because the CPU needs to flush the cache
>> * when requeuing RX buffers, therefore descriptors written by the hardware
>> * may be discarded.
>> *
>> * This can be fixed by defining CONFIG_SYS_NONCACHED_MEMORY which will cause
>> * the driver to allocate descriptors from a pool of non-cached memory.
>> */
>> #if RTL8169_DESC_SIZE < ARCH_DMA_MINALIGN
>> #if !defined(CONFIG_SYS_NONCACHED_MEMORY) && \
>> !defined(CONFIG_SYS_DCACHE_OFF) && !defined(CONFIG_X86)
>> #warning cache-line size is larger than descriptor size
>> #endif
>> #endif
>>
>>>> static void rtl_inval_buffer(void *buf, size_t size)
>>>> {
>>>> - unsigned long start = (unsigned long)buf & ~(ARCH_DMA_MINALIGN -
>>>> 1);
>>>> - unsigned long end = ALIGN(start + size, ARCH_DMA_MINALIGN);
>>>> + unsigned long end = ALIGN((unsigned long)buf + size,
>>>> ARCH_DMA_MINALIGN);
>>>>
>>>> - invalidate_dcache_range(start, end);
>>>> + /* buf is aligned to RTL8169_ALIGN,
>>>> + * which is a multiple of ARCH_DMA_ALIGN
>>>> + */
>>>> + invalidate_dcache_range((unsigned long)buf, end);
>>>> }
>>>>
>>>> static void rtl_flush_buffer(void *buf, size_t size)
>>>> {
>>>> - flush_cache((unsigned long)buf, size);
>>>> + unsigned long sz = ALIGN(size, ARCH_DMA_MINALIGN);
>>>> +
>>>> + flush_cache((unsigned long)buf, sz);
>>>> }
>>>
>>>
>>> I believe the correct approach is for the caller (network core code) to
>>> provide cache-aligned buffers, rather than for each driver to align the
>>> start/size when performing cache operations. Again, this is to ensure that
>>> cache operations don't affect any other data adjacent to the buffer. Can you
>>> see why the network core code isn't using cache-aligned buffers when DM_ETH
>>> is enabled? Perhaps DM_ETH isn't the trigger, but just changed something
>>> about the memory layout that exposed some other pre-existing issue.
>>
>> This should already be the case. The transmit buffer is setup like this:
>>
>> net_tx_packet = &net_pkt_buf[0] + (PKTALIGN - 1);
>> net_tx_packet -= (ulong)net_tx_packet % PKTALIGN;
>>
>> in net/net.c:net_init().
>>
>> The recv buffers are defined by the drivers and passed back to the
>> core network code. In this case, the misalignment is caused by the
>> rtl8169 driver...
>>
>> rtl8169_init_ring() does this:
>>
>> tpc->RxBufferRing[i] = &rxb[i * RX_BUF_SIZE];
>>
>> ...but the size breaks alignment for all but the first entry:
>>
>> #define RX_BUF_SIZE 1536 /* Rx Buffer size */
>>
>> This should be fixed by defining this instead:
>>
>> #define RX_BUF_SIZE ALIGN(1536, RTL8169_ALIGN) /* Rx
>> Buffer size */
>>
>> Also, there is an extra buffer that is memcpy'ed to, and then that is
>> passed back to the core net code instead of the actual buffer that was
>> recv'd into; I don't know why:
>>
>> static unsigned char rxdata[RX_BUF_LEN];
>
> I also noticed that the code to setup the tx ring buffers are
> completely wrong. It's a good thing that NUM_TX_DESC is defined to be
> 1. From rtl8169_init_ring():
>
> for (i = 0; i < NUM_TX_DESC; i++) {
> tpc->Tx_skbuff[i] = &txb[i];
> }
>
> That would set the buffer not only to be unaligned after the index 0,
> but also to be one byte after the previous, thus they would all stomp
> on each other.
>
> It should be:
>
> tpc->Tx_skbuff[i] = &txb[i * RX_BUF_SIZE];
>
> It's probably also worth making another define for TX_BUF_SIZE even if
> it is just defined to RX_BUF_SIZE to keep it from looking like a bug.
Essentially this patch was incomplete; only ensuring that the first of
the buffers in each ring were aligned:
https://patchwork.ozlabs.org/patch/419406/
Cheers,
-Joe
More information about the U-Boot
mailing list