[RFC PATCH 2/2] net: Add Support for GENET Ethernet controller

Andre Przywara andre.przywara at arm.com
Mon Dec 16 16:08:11 CET 2019


On Mon, 16 Dec 2019 18:15:01 +0530
Amit Tomer <amittomer25 at gmail.com> wrote:

Hi,

> Thanks for having the look.

thanks for the reply and the comments.

[ ... ]

> > > +#define TDMA_RING_REG_BASE(QUEUE_NUMBER) (GENET_TDMA_REG_OFF \
> > > +                        + (DMA_RING_SIZE * (QUEUE_NUMBER)))  
> >
> > So it seems like we are *always* using the default queue number. For improved readability we could drop this parameter and hardcode this here, so that users just need to use TDMA_RING_REG_BASE.  
> 
> Yeah, this can be done, only thing if in future we would like to work
> on some specific queue , keeping that in my mind I put it.

As I mentioned for Florian earlier, there is absolutely no reason for U-Boot to ever support multiple queues. So we can very safely just ignore everything other than the default queue.

[ ... ]

> > > +static void bcmgenet_umac_reset(struct bcmgenet_eth_priv *priv)
> > > +{
> > > +     u32 reg;
> > > +
> > > +     reg = readl(priv->mac_reg + SYS_RBUF_FLUSH_CTRL);
> > > +     reg |= BIT(1);
> > > +     writel(reg, (priv->mac_reg + SYS_RBUF_FLUSH_CTRL));  
> >
> > Please use setbits_le32() and friends for those manipulations. This is much easier to read and less error-prone. This applies to all other bit manipulations below as well.  
> 
> Sure.
> 
> > > +     udelay(10);
> > > +
> > > +     reg &= ~BIT(1);
> > > +     writel(reg, (priv->mac_reg + SYS_RBUF_FLUSH_CTRL));
> > > +     udelay(10);
> > > +}
> > > +
> > > +static void reset_umac(struct bcmgenet_eth_priv *priv)  
> >
> > What is the difference to the function above? The name is confusingly similar.  
> I think only function is needed, this is something I took my older
> version of  Linux driver and is leftover
> which is not needed.

So what I figured is this: Each of those three functions (reset_umac(), bcmgenet_umac_reset(), init_umac()) has exactly one caller:
bcmgenet_eth_init() calls:
	....
	bcmgenet_umac_reset(priv);
        init_umac(priv);
	(which calls:)
		reset_umac(priv);
		
	...

So that means we can unify those into one function, which will be called from eth_init().
 
> Will check this again.
> > > +{
> > > +     writel(0, (priv->mac_reg + SYS_RBUF_FLUSH_CTRL));
> > > +     udelay(10);
> > > +
> > > +     writel(0, priv->mac_reg + UMAC_CMD);
> > > +
> > > +     writel(CMD_SW_RESET | CMD_LCL_LOOP_EN, priv->mac_reg + UMAC_CMD);
> > > +     udelay(2);
> > > +     writel(0, priv->mac_reg + UMAC_CMD);
> > > +}
> > > +
> > > +static void init_umac(struct bcmgenet_eth_priv *priv)
> > > +{
> > > +     u32 reg;
> > > +
> > > +     reset_umac(priv);
> > > +
> > > +     /* clear tx/rx counter */
> > > +     writel(MIB_RESET_RX | MIB_RESET_TX | MIB_RESET_RUNT,
> > > +            priv->mac_reg + UMAC_MIB_CTRL);
> > > +     writel(0, priv->mac_reg + UMAC_MIB_CTRL);
> > > +
> > > +     writel(ENET_MAX_MTU_SIZE, priv->mac_reg + UMAC_MAX_FRAME_LEN);
> > > +
> > > +     /* init rx registers, enable ip header optimization */
> > > +     reg = readl(priv->mac_reg + RBUF_CTRL);
> > > +     reg |= RBUF_ALIGN_2B;
> > > +     writel(reg, (priv->mac_reg + RBUF_CTRL));
> > > +
> > > +     writel(1, (priv->mac_reg + RBUF_TBUF_SIZE_CTRL));
> > > +}
> > > +
> > > +static int _bcmgenet_write_hwaddr(struct bcmgenet_eth_priv *priv,
> > > +                               unsigned char *addr)
> > > +{
> > > +     writel_relaxed(((addr[0] << 24) | (addr[1] << 16) | (addr[2] << 8)
> > > +                    | addr[3]), priv->mac_reg + UMAC_MAC0);
> > > +     writel_relaxed((addr[4] << 8) | addr[5], priv->mac_reg + UMAC_MAC1);
> > > +
> > > +     return 0;
> > > +}  
> >
> > Why this function? Can't you just use the one below and put the actual code in there?  
> 
> Actually , we are calling it after reset as well and is trigger from two places.

Yes, but in all cases we pass the very same platdata->enetaddr value in, so you can call the ops function directly.

> I would place it in single function and test it.

[ ... ]

> > > +static int _bcmgenet_gmac_eth_recv(struct bcmgenet_eth_priv *priv, uchar **packetp)
> > > +{
> > > +     u32 len_stat;
> > > +     u32 len;
> > > +     u32 addr;
> > > +     u32 length = 0;
> > > +     void *desc_base = (priv->rx_desc_base + (priv->rx_index * DMA_DESC_SIZE));
> > > +
> > > +     len_stat = readl(desc_base + DMA_DESC_LENGTH_STATUS);
> > > +
> > > +     if (!(len_stat & DMA_OWN)) {  
> >
> > I think it would be cleaner to negate this and just bail out early.
> > But also: Is this bit safe to use? Does the MAC update this?
> >  
> > > +             len  = readl(desc_base + DMA_DESC_LENGTH_STATUS);
> > > +             addr = readl(desc_base + DMA_DESC_ADDRESS_LO);
> > > +
> > > +             length = (len >> DMA_BUFLENGTH_SHIFT) & DMA_BUFLENGTH_MASK;
> > > +
> > > +             invalidate_dcache_range((uintptr_t)addr,
> > > +                                     (addr + RX_TOTAL_BUFSIZE));  
> >
> > There is no need to invalidate *all* buffers, so just use RX_BUF_LENGTH.
> > Actually it looks like we might need lose data (our indexes!), since you invalidate cache lines beyond the buffers.  
> 
> Ok, I tried this and seen some performance improvement while tftp but
> it didn't solve the issue of tftp larger files.

Please keep in mind that it's a correctness issue, not a performance one. Just using invalidate might lose data. I am actually surprised that this didn't explode before.

> > > +
> > > +             /*
> > > +              * two dummy bytes are added for IP alignment, this can be
> > > +              * avoided by not programming RBUF_ALIGN_2B bit in RBUF_CTRL
> > > +              */
> > > +             *packetp = (uchar *)(ulong)addr + RX_BUF_OFFSET;
> > > +
> > > +             return (length - RX_BUF_OFFSET);
> > > +     }
> > > +
> > > +     return -EAGAIN;
> > > +}
> > > +
> > > +static int _bcmgenet_free_pkt(struct bcmgenet_eth_priv *priv, int len)
> > > +{
> > > +     priv->c_index = (priv->c_index + 1) & 0xFFFF;
> > > +     writel(priv->c_index,
> > > +            priv->mac_reg + RDMA_RING_REG_BASE(DEFAULT_Q) + RDMA_CONS_INDEX);
> > > +     udelay(1000);  
> >
> > What is this udelay about?  
> 
> If I don't use this, tftp never breaks and hashes are keep coming.But
> since we have moved to polling prod/cons stuff
> it is not required anymore.

Odd, that sounds like it covered a bug then.

> > > +
> > > +     priv->rx_index++;
> > > +     if (!(priv->rx_index % TOTAL_DESC)) {
> > > +             priv->rx_index = 0;
> > > +     }  

[ ... ]

> > > +static void rx_ring_init(struct bcmgenet_eth_priv *priv)
> > > +{
> > > +     writel(DMA_MAX_BURST_LENGTH,
> > > +            (priv->mac_reg + RDMA_REG_BASE + DMA_SCB_BURST_SIZE));
> > > +     writel(0x0,
> > > +            (priv->mac_reg + RDMA_RING_REG_BASE(DEFAULT_Q) + DMA_START_ADDR));
> > > +     writel(0x0,
> > > +            (priv->mac_reg + RDMA_RING_REG_BASE(DEFAULT_Q) + RDMA_READ_PTR));
> > > +     writel(0x0,
> > > +            (priv->mac_reg + RDMA_RING_REG_BASE(DEFAULT_Q) + RDMA_WRITE_PTR));
> > > +     writel((TOTAL_DESC * (DMA_DESC_SIZE - 1)),  
> >
> > If I get the Linux code correctly, this is supposed to point at the last *byte* of the descriptors.  
> 
> Thanks for stressing on this.Actually done some experiments with
> DMA_END_ADDRESSES and found
> that its three worlds per descriptor and counted in numbers.

Ah, so that's a good find. I think this is the bug that prevented it from working. Will check later tonight.

> 
> This allowed to tftp Bigger files and I can now tftp Kernel Image file
> of (19 Mb) with data transfer rate
> of 4-5 MB/s.
> 
> > > +            (priv->mac_reg + RDMA_RING_REG_BASE(DEFAULT_Q) + DMA_END_ADDR));
> > > +     writel(0x0,
> > > +            (priv->mac_reg + RDMA_RING_REG_BASE(DEFAULT_Q) + RDMA_PROD_INDEX));
> > > +     writel(0x0,
> > > +            (priv->mac_reg + RDMA_RING_REG_BASE(DEFAULT_Q) + RDMA_CONS_INDEX));
> > > +     writel(((TOTAL_DESC << DMA_RING_SIZE_SHIFT) | RX_BUF_LENGTH),
> > > +            (priv->mac_reg + RDMA_RING_REG_BASE(DEFAULT_Q) + DMA_RING_BUF_SIZE));
> > > +     writel(((DMA_FC_THRESH_LO << DMA_XOFF_THRESHOLD_SHIFT) | DMA_FC_THRESH_HI),
> > > +            (priv->mac_reg + RDMA_RING_REG_BASE(DEFAULT_Q) + RDMA_XON_XOFF_THRESH));
> > > +     writel((1 << DEFAULT_Q),
> > > +            (priv->mac_reg + RDMA_REG_BASE + DMA_RING_CFG));
> > > +}
> > > +
> > > +static void tx_ring_init(struct bcmgenet_eth_priv *priv)
> > > +{
> > > +     writel(DMA_MAX_BURST_LENGTH,
> > > +            (priv->mac_reg + TDMA_REG_BASE + DMA_SCB_BURST_SIZE));  
> >
> > I think would be better to include both offsets into one value, since you always seem to use them together. You can keep the base in the definition:
> > #define DMA_SCB_BURST_SIZE      (TDMA_REG_BASE + 0x0c)  
> 
> But this is used for both Rx and Tx , having two instances(one for rx
> and one for tx) looks bit odd to me.

But there are two registers, one for the TX burst size, and one for the RX burst size, right? So two symbols look like the right thing to do.

> Shall , we keep this as it is ?
> 
> > > +     writel(0x0,
> > > +            (priv->mac_reg + TDMA_RING_REG_BASE(DEFAULT_Q) + DMA_START_ADDR));  
> >
> > As mentioned above, the DEFAULT_Q parameter is redundant in our case (this deserves to be mentioned somewhere, btw.).
> > So at the very least drop this parameter, but also consider to include the mandatory offset into the definition of DMA_START_ADDR, as above.
> >  
> > > +     writel(0x0,
> > > +            (priv->mac_reg + TDMA_RING_REG_BASE(DEFAULT_Q) + TDMA_READ_PTR));
> > > +     writel(0x0,
> > > +            (priv->mac_reg + TDMA_RING_REG_BASE(DEFAULT_Q) + TDMA_WRITE_PTR));
> > > +     writel((TOTAL_DESC * (DMA_DESC_SIZE - 1)),  
> >
> > Same as above, I don't think this is right.  
> Yeah :)
> 
> I wish some one has written Linux code this way.
> 
> diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> index d857df8..a967a7d 100644
> --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> @@ -2102,7 +2102,7 @@ static void bcmgenet_init_tx_ring(struct
> bcmgenet_priv *priv,
>                                   TDMA_READ_PTR);
>         bcmgenet_tdma_ring_writel(priv, index, start_ptr * words_per_bd,
>                                   TDMA_WRITE_PTR);
> -       bcmgenet_tdma_ring_writel(priv, index, end_ptr * words_per_bd - 1,
> +       bcmgenet_tdma_ring_writel(priv, index, (end_ptr * words_per_bd) - 1,

But this is basic math operator precedence, so not even C specific. I think redundant parentheses are frowned upon.

Cheers,
Andre.

>                                   DMA_END_ADDR);
>  }
> 
> But yeah, its my stupidity not able to read operators precedence here.
> 


More information about the U-Boot mailing list