[PATCH] net: dwc_eth_qos: Pad descriptors to cacheline size
Stephen Warren
swarren at wwwdotorg.org
Wed Apr 29 23:51:03 CEST 2020
On 4/29/20 3:41 PM, Marek Vasut wrote:
> On 4/29/20 11:25 PM, Stephen Warren wrote:
>> On 4/29/20 1:56 PM, Marek Vasut wrote:
>>> On 4/29/20 9:51 PM, Stephen Warren wrote:
>>>> On 4/29/20 1:14 PM, Marek Vasut wrote:
>>>>> The DWMAC4 IP has the possibility to skip up to 7 64bit words after
>>>>> the descriptor, use this to pad the descriptors to cacheline size.
>>>>
>>>>> diff --git a/drivers/net/dwc_eth_qos.c b/drivers/net/dwc_eth_qos.c
>>>>
>>>>> /* Descriptors */
>>>>> +/*
>>>>
>>>> Nit: Blank line between those two comments to match the existing style.
>>>>
>>>>> + * #if EQOS_DESCRIPTOR_SIZE < ARCH_DMA_MINALIGN fails if we use sizeof()
>>>>> + * or offsetof() to calculate descriptor size, since neither is allowed
>>>>> + * in C preprocessor macros, so just hard-code this to 16=4*sizeof(u32).
>>>>> + */
>>>>> +#define EQOS_DESCRIPTOR_SIZE 16
>>>>> +#define EQOS_DESCRIPTOR_PAD ((ARCH_DMA_MINALIGN - EQOS_DESCRIPTOR_SIZE) / 4)
>>>>> +struct eqos_desc {
>>>>> + u32 des0;
>>>>> + u32 des1;
>>>>> + u32 des2;
>>>>> + u32 des3;
>>>>> + u32 __pad[EQOS_DESCRIPTOR_PAD];
>>>>> +};
>>>>
>>>> I think the padding needs to be optional based on the same condition in
>>>> the following existing warning logic, and the warning logic may need to
>>>> be removed/updated to account for the fact that padding is now being
>>>> used to avoid the issue:
>>>>
>>>> (quoting this from existing code)
>>>>> /*
>>>>> * Warn if the cache-line size is larger than the descriptor size. In such
>>>>> * cases the driver will likely fail because the CPU needs to flush the cache
>>>>> * when requeuing RX buffers, therefore descriptors written by the hardware
>>>>> * may be discarded. Architectures with full IO coherence, such as x86, do not
>>>>> * experience this issue, and hence are excluded from this condition.
>>>>> *
>>>>> * This can be fixed by defining CONFIG_SYS_NONCACHED_MEMORY which will cause
>>>>> * the driver to allocate descriptors from a pool of non-cached memory.
>>>>> */
>>>>> #if EQOS_DESCRIPTOR_SIZE < ARCH_DMA_MINALIGN
>>>>> #if !defined(CONFIG_SYS_NONCACHED_MEMORY) && \
>>>>> !CONFIG_IS_ENABLED(SYS_DCACHE_OFF) && !defined(CONFIG_X86)
>>>>> #warning Cache line size is larger than descriptor size
>>>>> #endif
>>>>> #endif
>>>>
>>>> (back to quoting from the patch)
>>>>> -#define EQOS_DESCRIPTOR_WORDS 4
>>>>> -#define EQOS_DESCRIPTOR_SIZE (EQOS_DESCRIPTOR_WORDS * 4)
>>>> ...
>>>>> -struct eqos_desc {
>>>>> - u32 des0;
>>>>> - u32 des1;
>>>>> - u32 des2;
>>>>> - u32 des3;
>>>>> -};
>>>>
>>>> This patch would be a lot smaller if those definitions were kept in
>>>> their existing location instead of moving them...
>>>
>>> Could you at least test it on the tegra ?
>>
>> This patch (applied on top of current u-boot/master) does cause network
>> failures on Jetson TX2 (Tegra186), which obviously uses this driver.
>
> That's what I was afraid of.
>
>> The docs for our version of the IP block do indicate that the DSL shift
>> is supported, so I think the patch should work. I wonder if it's because
>> of the bus width of our EQoS IP block?
>
> The STM32MP1 docs say that the shift is in 64bit words, maybe that
> actually translate into bus words rather than 8 bytes indeed ?
I assume the STM32 doc writer only wrote the specific case that applies
to their HW, which presumably has a 64-bit bus. Our docs are IIRC
unmodified from the generic IP docs that Synopsis supplied, so mention
all the cases and leave the reader to figure out which option applies!
> btw what is your ARCH_DMA_MINALIGN on t186, 64 ?
It's 64 I believe, since ARM64 selects SYS_CACHE_SHIFT_6, which then
triggers the logic below, and I don't think Tegra overrides any of this:
config SYS_CACHELINE_SIZE
int
default 64 if SYS_CACHE_SHIFT_6
> I'm worried that if something has it set to 128 or more, then this DSL
> padding won't be able to cover it, since it can only skip up to 7 64bit
> words.
Oh yes, I meant to mention that in the patch; it would be good to
validate the calculated DSL value doesn't overflow its field width.
>> Our docs indicate that DSL is a
>> multiple of the bus width, so describes a skip of word/dword/lword count
>> based on 32/64/128 bus width. Perhaps the bus width needs to be
>> parametrized when calculating the value? Our docs don't seem to indicate
>> which bus width our HW uses:-( Aha, but changing the "/2" to "/4" in the
>> code in this patch which calculates DSL value does yield a working
>> system, so I guess we have a 128-bit bus width. And indeed this matches
>> a comment I wrote in the driver:
>>
>> /*
>> * Burst length must be < 1/2 FIFO size.
>> * FIFO size in tqs is encoded as (n / 256) - 1.
>> * Each burst is n * 8 (PBLX8) * 16 (AXI width) == 128 bytes.
>>
>> 16 byte AXI width == 128 bit width.
>
> So how do we parametrize that, based on compatible string I guess ?
> We already have params for the tegra186 and stm32mp1 variants of the IP,
> so adding one more should be OK.
Yes, adding another field to struct eqos_config would make sense.
> Also, thanks for looking into it.
More information about the U-Boot
mailing list