[PATCH] net: phy: xilinx: Break while loop over ethernet phy

Bin Meng bmeng.cn at gmail.com
Wed Apr 28 18:36:40 CEST 2021


Hi Michal,

On Thu, Apr 29, 2021 at 12:17 AM Michal Simek <michal.simek at xilinx.com> wrote:
>
> Hi Bin,
>
> On 4/28/21 5:57 PM, Bin Meng wrote:
> > Hi Michal,
> >
> > On Wed, Apr 28, 2021 at 11:03 PM Michal Simek <michal.simek at xilinx.com> wrote:
> >>
> >> Hi Bin,
> >>
> >> On 4/28/21 4:37 PM, Bin Meng wrote:
> >>> Hi Michal,
> >>>
> >>> On Tue, Apr 27, 2021 at 7:22 PM Michal Simek <michal.simek at xilinx.com> wrote:
> >>>>
> >>>> Hi Bin,
> >>>>
> >>>> On 4/27/21 7:17 AM, Bin Meng wrote:
> >>>>> Hi Michal,
> >>>>>
> >>>>> On Mon, Apr 26, 2021 at 8:31 PM Michal Simek <michal.simek at xilinx.com> wrote:
> >>>>>>
> >>>>>> The commit 6c993815bbea ("net: phy: xilinx: Be compatible with live OF
> >>>>>> tree") change driver behavior to while loop which wasn't correct because
> >>>>>> the driver was looping over again and again. The reason was that
> >>>>>> ofnode_valid() is taking 0 as correct value.
> >>>>>
> >>>>> I am still trying to understand the problem. The changes in
> >>>>> 6c993815bbea sound correct from an fdtdec <=> OF API mapping
> >>>>> perspective. If the new OF API does not work, the old fdtdec may fail
> >>>>> too. Could you please explain a little bit?
> >>>>
> >>>> here is behavior of origin code.
> >>>>
> >>>> ZYNQ GEM: ff0e0000, mdio bus ff0e0000, phyaddr 12, interface rgmii-id
> >>>> phy_connect_gmii2rgmii sn 11348
> >>>> phy_connect_gmii2rgmii 1off -1
> >>>> phy_connect_gmii2rgmii 2off -1
> >>>> phy_connect_gmii2rgmii sn2 11752
> >>>> phy_connect_gmii2rgmii 1off -1
> >>>> phy_connect_gmii2rgmii 2off -1
> >>>> phy_connect_gmii2rgmii sn2 -1
> >>>> phy_connect_gmii2rgmii phydev 0000000000000000
> >>>> eth0: ethernet at ff0e0000
> >>>> Scanning disk mmc at ff170000.blk...
> >>>> Found 4 disks
> >>>> Hit any key to stop autoboot:  0
> >>>> ZynqMP>
> >>>>
> >>>>
> >>>> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> >>>> index 89e3076bfd25..d0960d93ae08 100644
> >>>> --- a/drivers/net/phy/phy.c
> >>>> +++ b/drivers/net/phy/phy.c
> >>>> @@ -956,22 +956,28 @@ static struct phy_device
> >>>> *phy_connect_gmii2rgmii(struct mii_dev *bus,
> >>>>         int sn = dev_of_offset(dev);
> >>>>         int off;
> >>>>
> >>>> +       printf("%s sn %d\n", __func__, sn);
> >>>>         while (sn > 0) {
> >>>>                 off = fdt_node_offset_by_compatible(gd->fdt_blob, sn,
> >>>>
> >>>> "xlnx,gmii-to-rgmii-1.0");
> >>>> +               printf("%s 1off %d\n", __func__, off);
> >>>>                 if (off > 0) {
> >>>>                         phydev = phy_device_create(bus, off,
> >>>>                                                    PHY_GMII2RGMII_ID, false,
> >>>>                                                    interface);
> >>>>                         break;
> >>>>                 }
> >>>> -               if (off == -FDT_ERR_NOTFOUND)
> >>>> +               printf("%s 2off %d\n", __func__, off);
> >>>> +
> >>>> +               if (off == -FDT_ERR_NOTFOUND) {
> >>>>                         sn = fdt_first_subnode(gd->fdt_blob, sn);
> >>>> -               else
> >>>> +                       printf("%s sn2 %d\n", __func__, sn);
> >>>> +               } else
> >>>>                         printf("%s: Error finding compat string:%d\n",
> >>>>                                __func__, off);
> >>>>         }
> >>>>
> >>>> +       printf("%s phydev %p\n", __func__, phydev);
> >>>>         return phydev;
> >>>>  }
> >>>>  #endif
> >>>>
> >>>>
> >>>> With latest code and some prints
> >>>> ZYNQ GEM: ff0e0000, mdio bus ff0e0000, phyaddr 12, interface rgmii-id
> >>>> ofnode_valid: node.of_offset 11348
> >>>> ofnode_valid: node.of_offset 11348
> >>>> ofnode_valid: node.of_offset 11348
> >>>> ofnode_valid: node.of_offset 2952
> >>>> ofnode_valid: node.of_offset 11348
> >>>> ofnode_valid: node.of_offset -1
> >>>> ofnode_valid: node.of_offset 11348
> >>>> ofnode_valid: node.of_offset 11348
> >>>> phy_connect_gmii2rgmii 1valid 1 ethernet at ff0e0000
> >>>> ofnode_valid: node.of_offset 11348
> >>>> ofnode_valid: node.of_offset 11348
> >>>> ofnode_valid: node.of_offset 11348
> >>>> phy_connect_gmii2rgmii 2valid 1 ethernet at ff0e0000
> >>>> ofnode_valid: node.of_offset -1
> >>>> ofnode_valid: node.of_offset -1
> >>>> ofnode_valid: node.of_offset 0
> >>>> ofnode_valid: node.of_offset 0
> >>>> phy_connect_gmii2rgmii 3valid 1
> >>>> ofnode_valid: node.of_offset 0
> >>>> ofnode_valid: node.of_offset 0
> >>>> ofnode_valid: node.of_offset 0
> >>>> phy_connect_gmii2rgmii 2valid 1
> >>>> ofnode_valid: node.of_offset -1
> >>>> ofnode_valid: node.of_offset -1
> >>>> ofnode_valid: node.of_offset 0
> >>>> ofnode_valid: node.of_offset 0
> >>>> phy_connect_gmii2rgmii 3valid 1
> >>>> ...
> >>>>
> >>>>
> >>>> [u-boot](debian-sent)$ git diff
> >>>> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> >>>> index dcdef9e661d6..56072ad55216 100644
> >>>> --- a/drivers/net/phy/phy.c
> >>>> +++ b/drivers/net/phy/phy.c
> >>>> @@ -950,7 +950,10 @@ static struct phy_device
> >>>> *phy_connect_gmii2rgmii(struct mii_dev *bus,
> >>>>         struct phy_device *phydev = NULL;
> >>>>         ofnode node = dev_ofnode(dev);
> >>>>
> >>>> +       printf("%s 1valid %d %s\n", __func__, ofnode_valid(node),
> >>>> ofnode_get_name(node));
> >>>> +
> >>>>         while (ofnode_valid(node)) {
> >>>> +               printf("%s 2valid %d %s\n", __func__,
> >>>> ofnode_valid(node), ofnode_get_name(node));
> >>>>                 node = ofnode_by_compatible(node, "xlnx,gmii-to-rgmii-1.0");
> >>>>                 if (ofnode_valid(node)) {
> >>>>                         phydev = phy_device_create(bus, 0,
> >>>> @@ -962,6 +965,7 @@ static struct phy_device
> >>>> *phy_connect_gmii2rgmii(struct mii_dev *bus,
> >>>>                 }
> >>>>
> >>>>                 node = ofnode_first_subnode(node);
> >>>> +               printf("%s 3valid %d %s\n", __func__,
> >>>> ofnode_valid(node), ofnode_get_name(node));
> >>>>         }
> >>>>
> >>>>         return phydev;
> >>>>
> >>>>
> >>>> diff --git a/include/dm/ofnode.h b/include/dm/ofnode.h
> >>>> index 2c0597c40739..7bfc06165e92 100644
> >>>> --- a/include/dm/ofnode.h
> >>>> +++ b/include/dm/ofnode.h
> >>>> @@ -128,8 +128,10 @@ static inline bool ofnode_valid(ofnode node)
> >>>>  {
> >>>>         if (of_live_active())
> >>>>                 return node.np != NULL;
> >>>> -       else
> >>>> +       else {
> >>>> +               printf("ofnode_valid: node.of_offset %ld\n",
> >>>> node.of_offset);
> >>>>                 return node.of_offset >= 0;
> >>>> +       }
> >>>>  }
> >>>>
> >>>>  /**
> >>>>
> >>>>
> >>>> It means ofnode_first_subnode is returning any node which has
> >>>> node.of_offset == 0 which is consider based on ofnode_valid() as valid
> >>>
> >>> This sounds suspicious to me that ofnode_first_subnode() returns a
> >>> node with node.of_offset being zero. I think it should return 11752?
> >>
> >> Gem driver is pointing via phy-handle directly to now which should be
> >> used. That's why there is no subnode but maybe it can be.
> >>
> >> Do you have any idea how to debug this to see that node content for
> >> OF_LIVE and !OF_LIVE?
> >
> > I created a test case below, and reproduced the endless loop you
> > mentioned, but on 2021.04, IOW, without my OF API change patch.
> >
> > diff --git a/arch/sandbox/dts/test.dts b/arch/sandbox/dts/test.dts
> > index 2600360224..8543fbe497 100644
> > --- a/arch/sandbox/dts/test.dts
> > +++ b/arch/sandbox/dts/test.dts
> > @@ -1360,6 +1360,18 @@
> >
> >         mdio: mdio-test {
> >                 compatible = "sandbox,mdio";
> > +               #address-cells = <1>;
> > +               #size-cells = <0>;
> > +
> > +               phy: ethernet-phy at 0 {
> > +                       reg = <0>;
> > +               };
> > +
> > +               gmiitorgmii: gmiitorgmii at 8 {
> > +                       compatible = "xlnx,gmii-to-rgmii-1.0";
> > +                       reg = <8>;
> > +                       phy-handle = <&phy>;
> > +               };
> >         };
> >
> >         pm-bus-test {
> > diff --git a/test/dm/mdio.c b/test/dm/mdio.c
> > index 64347e1275..19fa5a01e9 100644
> > --- a/test/dm/mdio.c
> > +++ b/test/dm/mdio.c
> > @@ -25,8 +25,9 @@
> >  static int dm_test_mdio(struct unit_test_state *uts)
> >  {
> >         struct uclass *uc;
> > -       struct udevice *dev;
> > +       struct udevice *dev, *eth_dev;
> >         struct mdio_ops *ops;
> > +       struct phy_device *phy;
> >         u16 reg;
> >
> >         ut_assertok(uclass_get(UCLASS_MDIO, &uc));
> > @@ -52,6 +53,10 @@ static int dm_test_mdio(struct unit_test_state *uts)
> >                         SANDBOX_PHY_REG);
> >         ut_asserteq(reg, 0);
> >
> > +       ut_assertok(uclass_first_device(UCLASS_ETH, &eth_dev));
> > +       phy = dm_mdio_phy_connect(dev, 0, eth_dev, PHY_INTERFACE_MODE_MII);
> > +       ut_assertnonnull(phy);
> > +
> >         return 0;
> >  }
> >
> > So this means the original fdtdec_ version codes also have an issue,
> > but I have no idea why you did not encounter such an issue before.
>
> I think it is related that zynq gem driver has all the time pointing to
> bridge directly not generic mdio node with several subnodes.
> And when you point directly to one node without others you will never
> end up in that loop.
>
> And for that changes you have done it looks like they behaves
> differently in sense of accepting 0 as valid node which is causing that
> issue.
>
> Maybe because of direct pointing to phy we shouldn't really change any
> other node and remove that loop completely because it is not needed in
> our scenario.

I might know what is wrong.

In original codes, fdt_node_offset_by_compatible() is used, and this
API will go through each DT node one by one, so this API will always
return OK and break the while() loop.

        if (off == -FDT_ERR_NOTFOUND)
            sn = fdt_first_subnode(gd->fdt_blob, sn);
        else
            printf("%s: Error finding compat string:%d\n",
                   __func__, off);

So above codes to call fdt_first_subnode() is never reachable. These
are dead codes.

With the new change, ofnode_by_compatible() only checks one node, and
does not go through every DT node, so calls to ofnode_first_subnode()
is now reachable and is causing issue if the first subnode is not the
bridge. I will see if I can create a complete test case for this.

For this patch,
Reviewed-by: Bin Meng <bmeng.cn at gmail.com>

Regards,
Bin


More information about the U-Boot mailing list