[U-Boot] [PATCH 2/4] net: fec_mxc: allow use with cache enabled

Eric Nelson eric.nelson at boundarydevices.com
Mon Mar 5 00:13:59 CET 2012


On 03/02/2012 04:39 PM, Marek Vasut wrote:
>> 	ensure that transmit and receive buffers are cache-line aligned
>>          invalidate cache after each packet received
>>          flush cache before transmitting
>>
>> Signed-off-by: Eric Nelson<eric.nelson at boundarydevices.com>
 >> ---
 >>   drivers/net/fec_mxc.c |  248
 >> ++++++++++++++++++++++++++++++++++++------------- drivers/net/fec_mxc.h |
 >>   19 +----
 >>   2 files changed, 184 insertions(+), 83 deletions(-)
 >>
 >> diff --git a/drivers/net/fec_mxc.c b/drivers/net/fec_mxc.c
 >> index 1fdd071..f72304b 100644
 >> --- a/drivers/net/fec_mxc.c
 >> +++ b/drivers/net/fec_mxc.c
 >>
 >> <snip>
 >>
>>   	/*
>>   	 * This routine transmits one frame.  This routine only accepts
>> @@ -650,15 +708,21 @@ static int fec_send(struct eth_device *dev, volatile
>> void* packet, int length) }
>>
>>   	/*
>> -	 * Setup the transmit buffer
>> -	 * Note: We are always using the first buffer for transmission,
>> -	 * the second will be empty and only used to stop the DMA engine
>> +	 * Setup the transmit buffer. We are always using the first buffer for
>> +	 * transmission, the second will be empty and only used to stop the DMA
>> +	 * engine. We also flush the packet to RAM here to avoid cache trouble.
>>   	 */
>>   #ifdef	CONFIG_FEC_MXC_SWAP_PACKET
>>   	swap_packet((uint32_t *)packet, length);
>>   #endif
>> -	writew(length,&fec->tbd_base[fec->tbd_index].data_length);
>> -	writel((uint32_t)packet,&fec->tbd_base[fec->tbd_index].data_pointer);
>> +
>> +	addr = (uint32_t)packet;
>> +	size = roundup(length, CONFIG_FEC_DATA_ALIGNMENT);
>> +	flush_dcache_range(addr, addr + size);
>
> What if size of the packet isn't aligned on cacheline boundary? Won't it choke
> here?
>

Here's a quandary...

flush_dcache_range() can potentially force writes into unintended areas,
which could conceivably include areas owned by the ethernet receiver if
the source object weren't aligned to a cache-line boundary and size.

In practice, it appears that net/net.c only transmits from one of two
places:
	- a dedicated transmit buffer (NetTxPacket), which is guaranteed
	to be aligned to PKTALIGN (32).
	- a receive buffer (ICMP and ARP replies)

The networking API certainly allows for transmits from arbitrary
areas in memory, but I'm not seeing where or how this can be invoked.

Because there's no way to query how the memory for a packet is
allocated, the only way around this I can see is to memcpy to
a dedicated transmit buffer within fec_mxc.c, which would defeat
any benefit of cache.

AFAICT, all of the other calls to 'flush_dcache_range()' are okay because
they're dealing with buffers allocated by the driver.

>> +
>> +	fec->tbd_base[fec->tbd_index].data_length = length;
>> +	fec->tbd_base[fec->tbd_index].data_pointer = addr;
>> +
>>   	/*
>>   	 * update BD's status now
>>   	 * This block:
>> @@ -667,9 +731,18 @@ static int fec_send(struct eth_device *dev, volatile
>> void* packet, int length) * - might be the last BD in the list, so the
>> address counter should *   wrap (->  keep the WRAP flag)
>>   	 */
>> -	status = readw(&fec->tbd_base[fec->tbd_index].status)&  FEC_TBD_WRAP;
>> +	status = fec->tbd_base[fec->tbd_index].status&  FEC_TBD_WRAP;
>>   	status |= FEC_TBD_LAST | FEC_TBD_TC | FEC_TBD_READY;
>> -	writew(status,&fec->tbd_base[fec->tbd_index].status);
>> +	fec->tbd_base[fec->tbd_index].status = status;
>> +
>> +	/*
>> +	 * Flush data cache. This code flushes both TX descriptors to RAM.
>> +	 * After this code this code, the descritors will be safely in RAM
>> +	 * and we can start DMA.
>> +	 */
>> +	size = roundup(2 * sizeof(struct fec_bd), CONFIG_FEC_DATA_ALIGNMENT);
>> +	addr = (uint32_t)fec->tbd_base;
>> +	flush_dcache_range(addr, addr + size);
>
> Same concern here and everywhere else ... I believe aligning it like this is
> quite unsafe :-(

This one looks safe because you've controlled the allocation of tbd_base.

>>
>> <snip>
>>
>> @@ -751,6 +846,13 @@ static int fec_recv(struct eth_device *dev)
>>   			frame = (struct nbuf *)readl(&rbd->data_pointer);
>>   			frame_length = readw(&rbd->data_length) - 4;
>>   			/*
>> +			 * Invalidate data cache over the buffer
>> +			 */
>> +			addr = (uint32_t)frame;
>> +			size = roundup(frame_length, CONFIG_FEC_DATA_ALIGNMENT);
>> +			invalidate_dcache_range(addr, addr + size);
>
> DTTO here, frame length might not be aligned properly, or will it be? Network
> stack must be properly analyzed here.
>

The hardware won't return an unaligned value here, so this should be good.

>>
 >> <snip>
 >>
>> @@ -765,11 +867,27 @@ static int fec_recv(struct eth_device *dev)
>>   						(ulong)rbd->data_pointer,
>>   						bd_status);
>>   		}
>> +
>>   		/*
>> -		 * free the current buffer, restart the engine
>> -		 * and move forward to the next buffer
>> +		 * Free the current buffer, restart the engine and move forward
>> +		 * to the next buffer. Here we check if the whole cacheline of
>> +		 * descriptors was already processed and if so, we mark it free
>> +		 * as whole.
>>   		 */
>> -		fec_rbd_clean(fec->rbd_index == (FEC_RBD_NUM - 1) ? 1 : 0, rbd);
>> +		size = (CONFIG_FEC_DATA_ALIGNMENT / sizeof(struct fec_bd)) - 1;
>> +		if ((fec->rbd_index&  size) == size) {
>> +			i = fec->rbd_index - size;
>> +			addr = (uint32_t)&fec->rbd_base[i];
>> +			for (; i<= fec->rbd_index + size; i++) {
>> +				if (i == (FEC_RBD_NUM - 1))
>> +					fec_rbd_clean(1,&fec->rbd_base[i]);
>> +				else
>> +					fec_rbd_clean(0,&fec->rbd_base[i]);
>> +			}
>> +			flush_dcache_range(addr,
>> +				addr + CONFIG_FEC_DATA_ALIGNMENT);
>> +		}
>> +
>
> This was the worst part. I don't quite remember how and why I did those
> alignment decisions (well it's obvious here, you flush rxdesc once whole
> cacheline is done), but some of the pieces are far less obvious now that I read
> the code.
>

I'm not grokking this one either. Definitely needs fresher eyes than I have at
the moment.


More information about the U-Boot mailing list