[PATCH v3] dm: core: Do not stop uclass iteration on error

Michal Suchánek msuchanek at suse.de
Wed Aug 31 09:39:03 CEST 2022


Hello,

On Tue, Aug 30, 2022 at 09:15:12PM -0600, Simon Glass wrote:
> Hi Michal,
> 
> On Tue, 30 Aug 2022 at 10:48, Michal Suchánek <msuchanek at suse.de> wrote:
> >
> > On Tue, Aug 30, 2022 at 09:56:52AM -0600, Simon Glass wrote:
> > > Hi Michal,
> > >
> > > On Tue, 30 Aug 2022 at 04:23, Michal Suchánek <msuchanek at suse.de> wrote:
> > > >
> > > > On Sat, Aug 27, 2022 at 07:52:27PM -0600, Simon Glass wrote:
> > > > > Hi Michal,
> > > > >
> > > > > On Fri, 19 Aug 2022 at 14:23, Michal Suchanek <msuchanek at suse.de> wrote:
> > > > > >
> > > > > > > When probing a device fails, a NULL pointer is returned and the
> > > > > > > remaining devices cannot be iterated. Skip to the next device on
> > > > > > > error instead.
> > > > > >
> > > > > > Fixes: 6494d708bf ("dm: Add base driver model support")
> > > > >
> > > > > I think you should drop this as you are doing a change of behaviour,
> > > > > not fixing a bug!
> > > >
> > > > You can hardly fix a bug without a change in behavior.
> > > >
> > > > These functions are used for iterating devices, and are not iterating
> > > > devices. That's clearly a bug.
> > >
> > > If it were clear I would have changed this long ago. The new way you
> > > have this function ignores errors, so they cannot be reported.
> > >
> > > We should almost always report errors, which is why I think your
> > > methods should be named differently.
> > >
> > > >
> > > > > > Signed-off-by: Michal Suchanek <msuchanek at suse.de>
> > > > > > ---
> > > > > > v2: - Fix up tests
> > > > > > v3: - Fix up API doc
> > > > > >     - Correctly forward error from uclass_get
> > > > > >     - Do not return an error when last device fails to probe
> > > > > >     - Drop redundant initialization
> > > > > >     - Wrap at 80 columns
> > > > > > ---
> > > > > >  drivers/core/uclass.c | 32 ++++++++++++++++++++++++--------
> > > > > >  include/dm/uclass.h   | 13 ++++++++-----
> > > > > >  test/dm/test-fdt.c    | 20 ++++++++++++++++----
> > > > > >  3 files changed, 48 insertions(+), 17 deletions(-)
> > > > >
> > > > > Unfortunately this still fails one test. Try 'make qcheck' to see it -
> > > > > it is ethernet.
> > > >
> > > > I will look at that.
> > > >
> > > > > > I actually think you should create new functions for this feature,
> > > > > > e.g. uclass_first_device_ok(), since it makes it impossible to see
> > > > > > what went wrong with a device in the middle.
> > > > >
> > > > > I have long had all this in my mind. One idea for a future change is
> > > > > to return the error, but set dev, so that the caller knows there is a
> > > > > device, which failed. When we are at the end, dev is set to NULL.
> > > >
> > > > We already have uclass_first_device_check() and
> > > > uclass_next_device_check() to iterate all devices, including broken
> > > > ones, and getting the errors as well.
> > > >
> > > > That's for the case you want all the details, and these are for the case
> > > > you just want to get devices and don't care about the details.
> > > >
> > > > That's AFAICT as much as this iteration interface can provide, and we
> > > > have both cases covered.
> > >
> > > I see three cases:
> > > - want to see the next device, returning the error if it cannot be
> > > probed - uclass_first_device()
> >
> > And the point of this is what exactly?
> 
> Please can you adjust your tone? It seems too aggressive for this
> mailing list. Thank you.
> 
> >
> > The device order in the uclass is not well defined - at any time a new
> > device which will become the first can be added, fail probe, and block
> > what was assumed a loop iterating the uclass from returning any devices
> > at all. That's exactly what happened with the new sysreset.
> 
> The order only changes if the device is unbound and rebound. Otherwise
> the order set by the device tree is used.

So the order is defined by device tree. That does not make it
well-defined from the point of view of any kind of code.

The point of device tree is that it can be replaced with another device
tree describing another board and the code should still work. Otherwise
we would not need device trees, and could keep using board files.

> > What is exactly the point of returning the error and not the pointer to
> > the next device?
> 
> Partly, we have existing code which uses the interface, checking 'dev'
> to see if the device is valid. I would be happy to change that, so
> that the device is always returned. In fact I think it would be
> better. But it does need a bit of work with coccinelle, etc.

I suppose changing the return type to void would catch the users that do
something with the return value but it would still need building all
the code.

And it does not work for users of uclass_first_device_err, which is
basically useless after this change, yet pretty much all of its users
use the return value.

> > The only point of these simplified iterators is that the caller can
> > check only one value (device pointer) and then not check the error
> > because they don't care. If they do care, uclass_first_device_check()
> > provides all the details available.
> 
> Yes I think we can have just two sets of iterators, but in that case
> it should be:
> 
> - want to see the next device, returning the error if it cannot be
> probed, with dev updated to the next device in any case - new version
> of uclass_first_device() - basically rename
> uclass_first_device_check() to that

About 2/3 of users of uclass_first_device don't use the return value at
all in current code. Changing uclass_first_device to
uclass_first_device_check is counterproductive. The current
documentation basically implies the new behavior, and there are a lot of
examples in the core code that use uclass_first_device in a for loop
without assigning the return value at all.
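
To make the difference concrete, here is a minimal sketch using mock
types. struct mock_dev, next_stop_on_error(), next_skip_failed() and
count_visited() are made up for illustration and are not the U-Boot API;
the sketch only assumes that a caller loops on the device pointer alone.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a uclass device; not U-Boot's struct udevice. */
struct mock_dev {
	const char *name;
	int probe_err;		/* 0 if probe succeeds, negative errno otherwise */
	struct mock_dev *next;	/* next device in the mock uclass list */
};

/* Mock probe: just report the canned result. */
static int mock_probe(struct mock_dev *dev)
{
	return dev->probe_err;
}

/*
 * Stop-on-error behaviour: on a probe failure return NULL, so a loop
 * that only checks the pointer ends and later devices are never seen.
 */
struct mock_dev *next_stop_on_error(struct mock_dev *dev)
{
	if (dev && mock_probe(dev))
		return NULL;
	return dev;
}

/* Simple iterator: skip devices that fail to probe, return the next good one. */
struct mock_dev *next_skip_failed(struct mock_dev *dev)
{
	while (dev && mock_probe(dev))
		dev = dev->next;
	return dev;
}

/* Count devices visited by a pointer-only for loop. */
int count_visited(struct mock_dev *head,
		  struct mock_dev *(*step)(struct mock_dev *))
{
	struct mock_dev *dev;
	int n = 0;

	for (dev = step(head); dev; dev = step(dev->next)) {
		printf("visited %s\n", dev->name);
		n++;
	}
	return n;
}
```

With a list whose first device fails to probe, the stop-on-error step
visits nothing, while the skipping step visits every device that probes.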

Also renaming uclass_first_device_check would break the 3 existing users
of it.

> - want to see next device which probes OK - your new function, perhaps
> uclass_first_device_ok() ?

I don't think any amount of renaming is going to solve the problem at
hand: we have a bazillion users of uclass_first_device, and because it
was not documented that it does not in fact iterate uclass devices,
there are users that use it for that purpose. There are also users that
expect a meaningful return value, which is basically bogus - they do get
a return value of something, but not something specific.

What can be done is adding the simple iterator under a new name,
converting the obvious existing users, and marking the old function
deprecated in some way so that any code that uses it generates a warning.
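
A rough sketch of what that could look like, assuming the GCC/Clang
'deprecated' function attribute is acceptable. The mock_* names and
struct mock_dev are hypothetical and only stand in for the real uclass
functions.

```c
#include <stddef.h>

/* Hypothetical stand-in for a uclass device; not U-Boot's struct udevice. */
struct mock_dev {
	int probe_err;		/* 0 if probe succeeds, negative errno otherwise */
	struct mock_dev *next;	/* next device in the mock uclass list */
};

/* New-name simple iterator: return the first device that probes OK. */
struct mock_dev *mock_first_device_ok(struct mock_dev *head)
{
	while (head && head->probe_err)
		head = head->next;
	return head;
}

/*
 * Old entry point kept for callers that are not converted yet. The
 * attribute makes each call site emit a compile-time warning.
 */
__attribute__((deprecated("use mock_first_device_ok() instead")))
int mock_first_device(struct mock_dev *head, struct mock_dev **devp)
{
	*devp = head;
	if (head && head->probe_err) {
		*devp = NULL;
		return head->probe_err;
	}
	return 0;
}
```

Converted callers switch to the new name and build cleanly; anything
still calling the old function gets a warning pointing at the
replacement.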

Thanks

Michal

