[PATCH v4 07/10] watchdog: wdt-uclass.c: handle all DM watchdogs in watchdog_reset()

Simon Glass sjg at chromium.org
Thu Aug 12 16:12:19 CEST 2021


Hi Tom,

On Thu, 12 Aug 2021 at 07:48, Tom Rini <trini at konsulko.com> wrote:
>
> On Thu, Aug 12, 2021 at 08:40:21AM +0200, Wolfgang Denk wrote:
> > Dear Tom,
> >
> > In message <20210811124318.GT858 at bill-the-cat> you wrote:
> > >
> > > > This argument fits on all types or effors: they are supposed to
> > > > never ever happen - at least in theory; in reality they do, and more
> > > > often than we like.
> > > >
> > > > And a proper error message is mandatory for correct error handling.
> > >
> > > Error messages are a fine line to walk.  We've got to have been very
> > > badly corrupted to go down this error path.  There's going to be lots of
> > > other error messages popping out.  Saving a bit of .text is good.  It
> > > makes it easier to justify spending a little .text later.
> >
> > Letting errors slip through unnoticed when there is a trival way to
> > at least inform the user of the problem is simply unacceptable.
> >
> > Please do not let U-Boot degrade into such a crappy piece of code.
> >
> > There are tons of other places where we don't even mention code
> > size, so if you want to save on that, there are many bette4r places
> > to save than error handling.
>
> Alright, lets take a look at what kind of area of the code we're talking
> about.  uclass_get is a pretty fundamental thing.  If that fails, your
> system is on fire.  Things are massively corrupt.  Lets look at other
> existing callers to see what happens.  Most callers check the return
> code, like you need to, and pass it up the chain to deal with.  We have
> a few board specific ones such as
> board/Marvell/octeontx2/board.c::board_quiesce_devices() that is also
> conceptually like the x530 case in the next part of the series.  That
> does print on failure.  The rest of the ones that print an error message
> are about commands and it's somewhat helpful there.
>
> So yes, return codes need to be checked and passed.  But no, not every
> single error path needs to print to the user along every part of an
> error path either.
>
> > > And here I agree, catch an error code, pass the error code back to the
> > > caller.  That's far more important than making sure that every error
> > > code we catch logs a message by default every time.
> >
> > It does not matter where the error is reported - in the called
> > function, or in some caller firther up the call tree.  But it _must_
> > be reportet at least once.
> >
> > So if we don't issue an error message here, we need to check and fix
> > the callers, too.
>
> That would be the next patch in the series where the BSP author isn't
> currently checking the return value, and this series doesn't change
> that.  Perhaps it should, and CC the maintainer.  But I think has been
> said a few times over the course of this series, what exactly is one
> going to do about the failure?  Getting specific for a moment, if you're
> in the case of "shutdown the watchdog" and the watchdog doesn't shutdown
> like you want it to, do you hang and hope the watchdog is alive to kick
> things still?  hang and lock the system?  Figure the system is on fire
> anyhow but add another message to the failure spew?
>
> Again, I think the change that's needed to this patch is to make it
> return the error code to the caller.  Let the caller decide.  And make
> sure to CC the board maintainer on the next go-round so they can chime
> in about that.

I strongly agree with this.I try to encourage people not to report
errors inside drivers, since the caller should be able to deal with
it, particularly if the error number provides useful information. It
bloats the code.

But we do have these strange cases where there is no obvious thing to
do. Where we are probing several devices to fire them up and one
fails, people worry that this means that some of them won't get
probed. In that case I tend to continue on and probe them all, but
then return error if any of them failed. At least the caller knows.

Also I like the new log_ret() and log_msg_ret() functions, which can
log an error as it goes up the stack if LOG_ERROR_RETURN is enabled.
It is nice to be able to see where an error came from. We could
perhaps improve this by logging the error when it is created (the
first time some calls 'return -Exxx').

I'm not a fan of board code which calls a function and doesn't check
the error. The board may not operate correctly, or it may limp along,
but the board author should be able to get all the bugs of it at the
start so that we are not requested invalid clocks, etc.

Regards,
Simon


More information about the U-Boot mailing list