[PATCH] net: dwc_eth_qos: Pad descriptors to cacheline size

Marek Vasut marex at denx.de
Wed Apr 29 23:41:38 CEST 2020


On 4/29/20 11:25 PM, Stephen Warren wrote:
> On 4/29/20 1:56 PM, Marek Vasut wrote:
>> On 4/29/20 9:51 PM, Stephen Warren wrote:
>>> On 4/29/20 1:14 PM, Marek Vasut wrote:
>>>> The DWMAC4 IP has the possibility to skip up to 7 64bit words after
>>>> the descriptor, use this to pad the descriptors to cacheline size.
>>>
>>>> diff --git a/drivers/net/dwc_eth_qos.c b/drivers/net/dwc_eth_qos.c
>>>
>>>>  /* Descriptors */
>>>> +/*
>>>
>>> Nit: Blank line between those two comments to match the existing style.
>>>
>>>> + * #if EQOS_DESCRIPTOR_SIZE < ARCH_DMA_MINALIGN fails if we use sizeof()
>>>> + * or offsetof() to calculate descriptor size, since neither is allowed
>>>> + * in C preprocessor macros, so just hard-code this to 16=4*sizeof(u32).
>>>> + */
>>>> +#define EQOS_DESCRIPTOR_SIZE	16
>>>> +#define EQOS_DESCRIPTOR_PAD	((ARCH_DMA_MINALIGN - EQOS_DESCRIPTOR_SIZE) / 4)
>>>> +struct eqos_desc {
>>>> +	u32 des0;
>>>> +	u32 des1;
>>>> +	u32 des2;
>>>> +	u32 des3;
>>>> +	u32 __pad[EQOS_DESCRIPTOR_PAD];
>>>> +};
>>>
>>> I think the padding needs to be optional based on the same condition in
>>> the following existing warning logic, and the warning logic may need to
>>> be removed/updated to account for the fact that padding is now being
>>> used to avoid the issue:
>>>
>>> (quoting this from existing code)
>>>> /*
>>>>  * Warn if the cache-line size is larger than the descriptor size. In such
>>>>  * cases the driver will likely fail because the CPU needs to flush the cache
>>>>  * when requeuing RX buffers, therefore descriptors written by the hardware
>>>>  * may be discarded. Architectures with full IO coherence, such as x86, do not
>>>>  * experience this issue, and hence are excluded from this condition.
>>>>  *
>>>>  * This can be fixed by defining CONFIG_SYS_NONCACHED_MEMORY which will cause
>>>>  * the driver to allocate descriptors from a pool of non-cached memory.
>>>>  */
>>>> #if EQOS_DESCRIPTOR_SIZE < ARCH_DMA_MINALIGN
>>>> #if !defined(CONFIG_SYS_NONCACHED_MEMORY) && \
>>>>         !CONFIG_IS_ENABLED(SYS_DCACHE_OFF) && !defined(CONFIG_X86)
>>>> #warning Cache line size is larger than descriptor size
>>>> #endif
>>>> #endif
>>>
>>> (back to quoting from the patch)
>>>> -#define EQOS_DESCRIPTOR_WORDS	4
>>>> -#define EQOS_DESCRIPTOR_SIZE	(EQOS_DESCRIPTOR_WORDS * 4)
>>> ...
>>>> -struct eqos_desc {
>>>> -	u32 des0;
>>>> -	u32 des1;
>>>> -	u32 des2;
>>>> -	u32 des3;
>>>> -};
>>>
>>> This patch would be a lot smaller if those definitions were kept in
>>> their existing location instead of moving them...
>>
>> Could you at least test it on the tegra ?
> 
> This patch (applied on top of current u-boot/master) does cause network
> failures on Jetson TX2 (Tegra186), which obviously uses this driver.

That's what I was afraid of.

> The docs for our version of the IP block do indicate that the DSL shift
> is supported, so I think the patch should work. I wonder if it's because
> of the bus width of our EQoS IP block?

The STM32MP1 docs say that the shift is in 64bit words, maybe that
actually translate into bus words rather than 8 bytes indeed ?

btw what is your ARCH_DMA_MINALIGN on t186, 64 ?
I'm worried that if something has it set to 128 or more, then this DSL
padding won't be able to cover it, since it can only skip up to 7 64bit
words.

> Our docs indicate that DSL is a
> multiple of the bus width, so describes a skip of word/dword/lword count
> based on 32/64/128 bus width. Perhaps the bus width needs to be
> parametrized when calculating the value? Our docs don't seem to indicate
> which bus width our HW uses:-( Aha, but changing the "/2" to "/4" in the
> code in this patch which calculates DSL value does yield a working
> system, so I guess we have a 128-bit bus width. And indeed this matches
> a comment I wrote in the driver:
> 
>         /*
>          * Burst length must be < 1/2 FIFO size.
>          * FIFO size in tqs is encoded as (n / 256) - 1.
>          * Each burst is n * 8 (PBLX8) * 16 (AXI width) == 128 bytes.
> 
> 16 byte AXI width == 128 bit width.

So how do we parametrize that, based on compatible string I guess ?
We already have params for the tegra186 and stm32mp1 variants of the IP,
so adding one more should be OK.

Also, thanks for looking into it.

-- 
Best regards,
Marek Vasut


More information about the U-Boot mailing list