[U-Boot] [PATCH 7/9] net: tsec: Use portable types and accessors for BDs

Wed Oct 2 16:16:38 CEST 2013

On 10/1/2013 9:50 PM, Scott Wood wrote:
> On Tue, 2013-10-01 at 14:38 +0300, Claudiu Manoil wrote:
>>
>> On 10/1/2013 2:22 AM, Scott Wood wrote:
>>> On Mon, 2013-09-30 at 12:44 +0300, Claudiu Manoil wrote:
>>>> +static RTXBD rtx __aligned(8);
>>>> +#define RXBD(i) rtx.rxbd[i]
>>>> +#define TXBD(i) rtx.txbd[i]
>>>> +#define GET_BD_STAT(T, i) be16_to_cpu((__force __be16)T##BD(i).status)
>>>> +#define SET_BD_STAT(T, i, v) T##BD(i).status = (__force __u16)cpu_to_be16(v)
>>>> +#define GET_BD_BLEN(T, i) be16_to_cpu((__force __be16)T##BD(i).length)
>>>> +#define SET_BD_BLEN(T, i, v) T##BD(i).length = (__force __u16)cpu_to_be16(v)
>>>> +#define GET_BD_BPTR(T, i) be32_to_cpu((__force __be32)T##BD(i).bufptr)
>>>> +#define SET_BD_BPTR(T, i, v) T##BD(i).bufptr = (__force __u32)cpu_to_be32(v)
>>>
>>> Why the forcing?  Can't you declare the data with __be16/__be32?
>>
>> The __force annotation is obviously needed to suppress the sparse
>> warnings due to BD data declaration type not being __bexx, but the
>> generic uintxx_t, as you noticed as well.
>> Now, why I didn't use __bexx instead?  The main reason would be
>> maintainability: the DMA descriptors may not be in big endian format
>> exclusively.  The eTSEC hw may actually have an option to interpret
>> the DMA descriptors in little endian format.
>
> May have or does have?
>
> If it does have such a feature, do you plan to use it?  Usually I have
> not seen such features used for (e.g.) little-endian PCI devices on big
> endian hosts.
>
I has that option, but I don't really plan to use it, clearly not for 
big endian hosts. The "may have" was for future little endian hosts.
But I think this option is not really needed by the uboot driver
so doing the byte swapping in software should be ok (i.e. performance
wise).

>>> What's wrong with:
>>>
>>> for (t = 0; in_be16(&rtx.rxbd[rx_idx].status) & RXBD_EMPTY; t++) {
>>>
>>> Or if you don't want to use I/O accessors on the DMA descriptors (Is
>>> synchronization ensured some other way?  We had problems with this in
>>> the Linux driver before...):
>>>
>>
>> Synchronization here is is insured by declaring the RTXBD structure
>> type as "volatile" (see RTXBD declaration, a couple of lines above).
>
> That does not achieve hardware synchronization, and even the effects on
> the compiler are questionable due to volatile's vague semantics.
>
>> The existing code has been working this way for quite a while on the
>> mpc85xx platforms,
>
> It was "working" for a while in Linux as well, until we encountered a
> workload where it didn't (though granted, there was no volatile in that
> case).  See Linux commit 3b6330ce2a3e1f152f79a6203f73d23356e243a7
>

Good point, I guess it would be safer too use some memory barriers
around accesses to BD fields in the uboot driver too.  However some
portable barriers would be needed, eieio() doesn't have an equivalent
for ARM.

> FWIW, I see some other places in U-Boot's TSEC driver that use
> out_be32() on the descriptors (e.g. startup_tsec after "Point to the
> buffer descriptors").
>

That's only to program the tbase and rbase registers with the physical
address of the Tx/Rx BD rings' base.

>>   so I thought it would be better not to change this
>> approach.  Using i/o accessors for the Buffer Descriptors would be a
>> significant change, and I don't see how to argue such a change: why
>> would the I/O accessors be better than the current approach - i.e.
>> declaring the DMA descriptors as volatile?  Is there something wrong
>> with about volatile usage in this case? (afaik, this is a case were
>> the volatile declaration is permitted)
>
>
>
>>
>> Also, there doesn't seem to be other net drivers using I/O accessors
>> for the BD rings.
>
> I picked some random examples, and the first driver in Linux in which I
> could quickly find the BD rings uses I/O accessors
> (drivers/net/ethernet/realtek/8319too.c).  I then checked its U-Boot
> eqivalent (drivers/net/rtl8139.c) and it also uses I/O accessors for the
> descriptors.
>

I actually meant accessing buffer descriptor fields (like status, 
length, dma address). As you can see in this example from linux
8319too.c's Rx routine, there's no I/O accessor involved:

/* read size+status of next frame from DMA ring buffer */
rx_status = le32_to_cpu (*(__le32 *) (rx_ring + ring_offset));

However the driver does use a rmb() just before this line.

>>> for (t = 0; be16_to_cpup(&rtx.rxbd[rx_idx].status) & RXBD_EMPTY; t++) {
>>>
>>
>> I opted for using macros instead due to code maintainability,
>
> Obfuscatory macros do not help.
>
>>   and to avoid overly long lines (80)
>
> You could factor out an rxbd_empty() function, or factor that loop out
> to be its own function, or have a local variable point to
> rtx.rxbd[rx_idx]...
>
>> and to better control these changes.
>> For instance, if etsec were to access it's BDs in little endian format
>> in the future,
>
> Either don't do that (preferred option), or at that point add
> tsec16_to_cpup() and friends.
>

I'll have a try with the portable I/O accessors (in_be, out_be) to see
how it works that way.

Thanks,
Claudiu