[U-Boot] [PATCH 2/4] net: fec_mxc: allow use with cache enabled
Marek Vasut
marex at denx.de
Mon Mar 5 02:49:33 CET 2012
> On 03/02/2012 04:39 PM, Marek Vasut wrote:
> >> ensure that transmit and receive buffers are cache-line aligned
> >>
> >> invalidate cache after each packet received
> >> flush cache before transmitting
> >>
> >> Signed-off-by: Eric Nelson<eric.nelson at boundarydevices.com>
> >>
> >> ---
> >>
> >> drivers/net/fec_mxc.c | 248
> >>
> >> ++++++++++++++++++++++++++++++++++++------------- drivers/net/fec_mxc.h
> >> |
> >>
> >> 19 +----
> >> 2 files changed, 184 insertions(+), 83 deletions(-)
> >>
> >> diff --git a/drivers/net/fec_mxc.c b/drivers/net/fec_mxc.c
> >> index 1fdd071..f72304b 100644
> >> --- a/drivers/net/fec_mxc.c
> >> +++ b/drivers/net/fec_mxc.c
> >>
> >> <snip>
> >>
> >> /*
> >>
> >> * This routine transmits one frame. This routine only accepts
> >>
> >> @@ -650,15 +708,21 @@ static int fec_send(struct eth_device *dev,
> >> volatile void* packet, int length) }
> >>
> >> /*
> >>
> >> - * Setup the transmit buffer
> >> - * Note: We are always using the first buffer for transmission,
> >> - * the second will be empty and only used to stop the DMA engine
> >> + * Setup the transmit buffer. We are always using the first buffer for
> >> + * transmission, the second will be empty and only used to stop the
> >> DMA + * engine. We also flush the packet to RAM here to avoid cache
> >> trouble.
> >>
> >> */
> >>
> >> #ifdef CONFIG_FEC_MXC_SWAP_PACKET
> >>
> >> swap_packet((uint32_t *)packet, length);
> >>
> >> #endif
> >>
> >> - writew(length,&fec->tbd_base[fec->tbd_index].data_length);
> >> - writel((uint32_t)packet,&fec->tbd_base[fec->tbd_index].data_pointer);
> >> +
> >> + addr = (uint32_t)packet;
> >> + size = roundup(length, CONFIG_FEC_DATA_ALIGNMENT);
> >> + flush_dcache_range(addr, addr + size);
> >
> > What if size of the packet isn't aligned on cacheline boundary? Won't it
> > choke here?
>
> Here's a quandary...
>
> flush_dcache_range() can potentially force writes into unintended areas,
> which could conceivably include areas owned by the ethernet receiver if
> the source object weren't aligned to a cache-line boundary and size.
>
> In practice, it appears that net/net.c only transmits from one of two
> places:
You can also invalidate memory and loose some information. Though I'm not quite
sure this driver can be the case.
> - a dedicated transmit buffer (NetTxPacket), which is guaranteed
> to be aligned to PKTALIGN (32).
> - a receive buffer (ICMP and ARP replies)
>
> The networking API certainly allows for transmits from arbitrary
> areas in memory, but I'm not seeing where or how this can be invoked.
OK
>
> Because there's no way to query how the memory for a packet is
> allocated, the only way around this I can see is to memcpy to
> a dedicated transmit buffer within fec_mxc.c, which would defeat
> any benefit of cache.
Indeed ... a bounce buffer. But such a bounce buffer should be implemented close
to the place where the misalignment originates. If the net core can align
packets everywhere well, we're safe and happy.
>
> AFAICT, all of the other calls to 'flush_dcache_range()' are okay because
> they're dealing with buffers allocated by the driver.
>
I'd love to see someone else review this. This is very important to be made
right.
> >> +
> >> + fec->tbd_base[fec->tbd_index].data_length = length;
> >> + fec->tbd_base[fec->tbd_index].data_pointer = addr;
> >> +
> >>
> >> /*
> >>
> >> * update BD's status now
> >>
> >> * This block:
> >> @@ -667,9 +731,18 @@ static int fec_send(struct eth_device *dev,
> >> volatile void* packet, int length) * - might be the last BD in the
> >> list, so the address counter should * wrap (-> keep the WRAP flag)
> >>
> >> */
> >>
> >> - status = readw(&fec->tbd_base[fec->tbd_index].status)& FEC_TBD_WRAP;
> >> + status = fec->tbd_base[fec->tbd_index].status& FEC_TBD_WRAP;
> >>
> >> status |= FEC_TBD_LAST | FEC_TBD_TC | FEC_TBD_READY;
> >>
> >> - writew(status,&fec->tbd_base[fec->tbd_index].status);
> >> + fec->tbd_base[fec->tbd_index].status = status;
> >> +
> >> + /*
> >> + * Flush data cache. This code flushes both TX descriptors to RAM.
> >> + * After this code this code, the descritors will be safely in RAM
> >> + * and we can start DMA.
> >> + */
> >> + size = roundup(2 * sizeof(struct fec_bd), CONFIG_FEC_DATA_ALIGNMENT);
> >> + addr = (uint32_t)fec->tbd_base;
> >> + flush_dcache_range(addr, addr + size);
> >
> > Same concern here and everywhere else ... I believe aligning it like this
> > is quite unsafe :-(
>
> This one looks safe because you've controlled the allocation of tbd_base.
>
> >> <snip>
> >>
> >> @@ -751,6 +846,13 @@ static int fec_recv(struct eth_device *dev)
> >>
> >> frame = (struct nbuf *)readl(&rbd->data_pointer);
> >> frame_length = readw(&rbd->data_length) - 4;
> >> /*
> >>
> >> + * Invalidate data cache over the buffer
> >> + */
> >> + addr = (uint32_t)frame;
> >> + size = roundup(frame_length, CONFIG_FEC_DATA_ALIGNMENT);
> >> + invalidate_dcache_range(addr, addr + size);
> >
> > DTTO here, frame length might not be aligned properly, or will it be?
> > Network stack must be properly analyzed here.
>
> The hardware won't return an unaligned value here, so this should be good.
Are you sure? You can't receive frame aligned to 8 bytes boundary?
>
> >> <snip>
> >>
> >> @@ -765,11 +867,27 @@ static int fec_recv(struct eth_device *dev)
> >>
> >> (ulong)rbd->data_pointer,
> >> bd_status);
> >>
> >> }
> >>
> >> +
> >>
> >> /*
> >>
> >> - * free the current buffer, restart the engine
> >> - * and move forward to the next buffer
> >> + * Free the current buffer, restart the engine and move forward
> >> + * to the next buffer. Here we check if the whole cacheline of
> >> + * descriptors was already processed and if so, we mark it free
> >> + * as whole.
> >>
> >> */
> >>
> >> - fec_rbd_clean(fec->rbd_index == (FEC_RBD_NUM - 1) ? 1 : 0, rbd);
> >> + size = (CONFIG_FEC_DATA_ALIGNMENT / sizeof(struct fec_bd)) - 1;
> >> + if ((fec->rbd_index& size) == size) {
> >> + i = fec->rbd_index - size;
> >> + addr = (uint32_t)&fec->rbd_base[i];
> >> + for (; i<= fec->rbd_index + size; i++) {
> >> + if (i == (FEC_RBD_NUM - 1))
> >> + fec_rbd_clean(1,&fec->rbd_base[i]);
> >> + else
> >> + fec_rbd_clean(0,&fec->rbd_base[i]);
> >> + }
> >> + flush_dcache_range(addr,
> >> + addr + CONFIG_FEC_DATA_ALIGNMENT);
> >> + }
> >> +
> >
> > This was the worst part. I don't quite remember how and why I did those
> > alignment decisions (well it's obvious here, you flush rxdesc once whole
> > cacheline is done), but some of the pieces are far less obvious now that
> > I read the code.
>
> I'm not grokking this one either. Definitely needs fresher eyes than I have
> at the moment.
Indeed, I'm thinking the same thing. Let's sleep on this!
Thanks for your good work on putting this into shape!
M
More information about the U-Boot
mailing list