[BUG] network is broken on Orange Pi PC

Andre Przywara andre.przywara at arm.com
Thu Jun 3 18:57:49 CEST 2021


On Thu, 3 Jun 2021 18:09:25 +0200
Heinrich Schuchardt <xypron.glpk at gmx.de> wrote:

Hi,

> On 6/3/21 1:04 PM, Andre Przywara wrote:
> > On Thu, 3 Jun 2021 12:20:34 +0200
> > Heinrich Schuchardt <xypron.glpk at gmx.de> wrote:
> >
> > Hi,
> >  
> >> On 6/3/21 11:04 AM, Andre Przywara wrote:  
> >>> On Thu, 3 Jun 2021 09:46:48 +0200
> >>> Heinrich Schuchardt <xypron.glpk at gmx.de> wrote:
> >>>
> >>> Hi Heinrich,
> >>>  
> >>>> On 6/2/21 3:08 PM, Ramon Fried wrote:  
> >>>>> On Tue, Jun 1, 2021 at 12:35 AM Heinrich Schuchardt <xypron.glpk at gmx.de> wrote:  
> >>>>>>
> >>>>>> Dear all,
> >>>>>>
> >>>>>> network is broken in U-Boot on orangepi_pc_defconfig:  
> >>>
> >>> Thanks for the report!
> >>>  
> >>>>>>
> >>>>>> U-Boot 2021.07-rc3-00059-gd8729a114e (May 31 2021 - 21:26:56 +0000)
> >>>>>> Allwinner Technology
> >>>>>> eth0: ethernet at 1c30000  
> >>>>>> => dhcp  
> >>>>>> sun8i_emac_eth_start: Timeout
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>> Heinrich
> >>>>>>  
> >>>>> Hi Heinrich, I don't have OrangePi. can you bisect and tell me when it broke ?
> >>>>> Thanks.
> >>>>> Ramon.
> >>>>>  
> >>>>
> >>>> Git bisect points to:
> >>>>
> >>>> commit 4f0278dac56a658ef1e0967fec0bb95372a875bd
> >>>> Author: Andre Przywara <andre.przywara at arm.com>
> >>>> Date:   Mon Jul 6 01:40:45 2020 +0100
> >>>>
> >>>>        net: sun8i-emac: Lower MDIO frequency
> >>>>
> >>>> Reverting the patch solves the problem for the OrangePi PC.
> >>>>
> >>>> According to the commit message the change was only needed for needed
> >>>> for external PHYs.  
> >>>
> >>> The external PHY on the Pine64 (non-plus) was the trigger, however
> >>> both the manual and the Linux driver point to that we definitely need a
> >>> higher divider. From what I can see, AHB2 (the EMAC clock) runs at 200
> >>> MHz (AHB2=AHB1/1=PLL6/3=200 MHz). So just having "/ 16" results in 12.5
> >>> MHz MDIO frequency. Can you check whether any other divider values fix
> >>> this for you as well?  
> >>
> >> MDIO_CMD_MII_CLK_CSR_DIV_16 = 0 and this is the value that was used
> >> before your patch.
> >>
> >> MDIO_CMD_MII_CLK_CSR_DIV_16 = 0 =>
> >> 	Waiting for PHY auto negotiation to complete. done
> >> MDIO_CMD_MII_CLK_CSR_DIV_32 = 1 => sun8i_emac_eth_start: Timeout
> >> MDIO_CMD_MII_CLK_CSR_DIV_64 = 2 => sun8i_emac_eth_start: Timeout
> >> MDIO_CMD_MII_CLK_CSR_DIV_128 = 3 => sun8i_emac_eth_start: Timeout
> >>
> >> What is wrong about the approach in
> >>
> >> [PATCH 1/1] net: sun8i-emac: fix MDIO frequency
> >> https://lists.denx.de/pipermail/u-boot/2021-June/451305.html ?  
> >
> > The MDIO spec says that the maximum frequency on the MDIO bus should be
> > 2.5 MHz. On the H3 with 200 MHz AHB and with a divider of 16 it's 12.5  
> 
> The dts says for the MDIO:
> 
> int_mii_phy: ethernet-phy at 1 {
> 	clocks = <&ccu CLK_BUS_EPHY>;

I think that is the clock to drive the PHY logic, not the MDIO bus.
It's a gate for AHB1, in this case, which seems to be 200 MHz as well
(because rate(AHB2)=rate(AHB1)/1).

> "ahb" refers to CLK_BUS_SPI1:
> 
> spi1: spi at 1c69000 {
> 	clocks = <&ccu CLK_BUS_SPI1>, <&ccu CLK_SPI1>;
> 	clock-names = "ahb", "mod";
> 
> Why are you mentioning AHB?

AHB stands for "Advanced High-performance Bus", which is a traditional
term for a (faster, data centric) bus interface in the ARM
bus world. I am not sure the exact term is actually correct anymore
with recent bus implementations, but the H3 manual refers to AHB
everywhere, in particular as the bus name for the EMAC IP (figure 4-1 System
Bus Tree). Apparently the bus clock is used as the clock source for the
MDIO bus as well - after dividing it.

> > MHz. Apparently the internal H3 PHY can tolerate those high
> > frequencies, but it's beyond the spec. And certainly it should not fail
> > with lower frequencies. And indeed it works for me on the H2+ (which I
> > understand is just a binned version of the H3).
> >  
> >> The Linux 5.10 driver runs fine no matter what value we choose for the
> >> divider in U-Boot.  
> >
> > Because it programs the divider correctly:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c#n316
> > I take it you have no problems with Linux, so the divider being too
> > high cannot be the root cause of the problem.  
> 
> Function stmmac_clk_csr_set() depends on the rate of the clock named
> "stmmaceth":
> 
> ethernet at 1c30000 {
> 	clocks = <&ccu CLK_BUS_EMAC>;
> 	clock-names = "stmmaceth";
> 
> How do we get the frequency of this clock?

Have you checked /sys/kernel/debug/clk/clk_summary? You could also
insert some printk to be sure. Otherwise the definition of bus_emac_clk
and CLK_BUS_EMAC in drivers/clk/sunxi-ng/ccu-sun8i-h3.c tell you that
it's a gate for ahb2. All of those point to 200 MHz on my OPi Zero.

Hope that helps!

Cheers,
Andre
> 
> Best regards
> 
> Heinrich
> 
> >
> > So I'd rather fix this problem properly and correctly, especially as it
> > seems to be more related to the EMAC soft reset (where your error
> > message comes from).
> >  
> >> But I see a message:
> >>
> >> dwmac-sun8i 1c30000.ethernet:
> >> Current syscon value is not the default 148000 (expect 58000)
> >>
> >> @default_syscon_value:
> >> The default value of the EMAC register in syscon
> >> This value is used for disabling properly EMAC
> >> and used as a good starting value in case of the
> >> boot process(uboot) leave some stuff.  
> >
> > Yeah, I never got the reason for this message. If Linux wants a
> > certain value, it should program that. Whatever was left by someone
> > else (be it U-Boot or some other software/OS) should not matter.
> >
> > Cheers,
> > Andre
> >  
> >>
> >> Best regards
> >>
> >> Heinrich
> >>  
> >>>  
> >>>> Can't we let the change depend on priv->use_internal_phy?
> >>>>
> >>>> diff --git a/drivers/net/sun8i_emac.c b/drivers/net/sun8i_emac.c
> >>>> index 5a1b38bf80..d7553fe163 100644
> >>>> --- a/drivers/net/sun8i_emac.c
> >>>> +++ b/drivers/net/sun8i_emac.c
> >>>> @@ -211,7 +211,9 @@ static int sun8i_mdio_read(struct mii_dev *bus, int
> >>>> addr, int devad, int reg)
> >>>>             * The EMAC clock is either 200 or 300 MHz, so we need a divider
> >>>>             * of 128 to get the MDIO frequency below the required 2.5 MHz.
> >>>>             */
> >>>> -       mii_cmd |= MDIO_CMD_MII_CLK_CSR_DIV_128 <<
> >>>> MDIO_CMD_MII_CLK_CSR_SHIFT;
> >>>> +       if (!priv->use_internal_phy)
> >>>> +               mii_cmd |= MDIO_CMD_MII_CLK_CSR_DIV_128 <<
> >>>> +                          MDIO_CMD_MII_CLK_CSR_SHIFT;
> >>>>
> >>>>            mii_cmd |= MDIO_CMD_MII_BUSY;
> >>>>
> >>>> @@ -242,7 +244,9 @@ static int sun8i_mdio_write(struct mii_dev *bus, int
> >>>> addr, int devad, int reg,
> >>>>             * The EMAC clock is either 200 or 300 MHz, so we need a divider
> >>>>             * of 128 to get the MDIO frequency below the required 2.5 MHz.
> >>>>             */
> >>>> -       mii_cmd |= MDIO_CMD_MII_CLK_CSR_DIV_128 <<
> >>>> MDIO_CMD_MII_CLK_CSR_SHIFT;
> >>>> +       if (!priv->use_internal_phy)
> >>>> +               mii_cmd |= MDIO_CMD_MII_CLK_CSR_DIV_128 <<
> >>>> +                          MDIO_CMD_MII_CLK_CSR_SHIFT;
> >>>>
> >>>>            mii_cmd |= MDIO_CMD_MII_WRITE;
> >>>>            mii_cmd |= MDIO_CMD_MII_BUSY;
> >>>>
> >>>> Best regards
> >>>>
> >>>> Heinrich
> >>>>
> >>>> I would assume the problem hits all H3 boards.  
> >>>
> >>> And that's the confusing part: it does not. I tested this on my
> >>> OrangePi Zero (H2+ with internal PHY), both back then with the original
> >>> MDIO frequency patch and also now after your report. It always worked
> >>> reliably for me.
> >>> Also: I am still puzzled how one influences the other: The error you
> >>> get is from the *MAC* soft reset: I would think this is an independent
> >>> operation from any communication attempts with the PHY.
> >>>
> >>> There is this thread here about the same symptom on a V3s:
> >>> https://lists.denx.de/pipermail/u-boot/2021-May/450315.html
> >>> v2 of this original patch (in the same thread) proposes some other
> >>> solution, which I am also not very happy with. But just to get some more
> >>> data points: can you check whether skipping the soft reset fixes this
> >>> for you? I will have a closer look tonight to check the order of soft
> >>> reset and PHY communication. Maybe we should only do the soft reset
> >>> *once* when the MAC probes, and not on every .start call?
> >>>
> >>> Thanks,
> >>> Andre
> >>>  
> >>  
> >  
> 



More information about the U-Boot mailing list