[PATCH v3 0/6] Improved sysreset/watchdog uclass integration

Andre Przywara andre.przywara at arm.com
Sat Nov 6 02:52:30 CET 2021


On Fri, 5 Nov 2021 18:56:34 -0400
Tom Rini <trini at konsulko.com> wrote:

> On Fri, Nov 05, 2021 at 09:38:50PM +0100, Heinrich Schuchardt wrote:
> > On 11/5/21 20:17, Tom Rini wrote:  
> > > On Fri, Nov 05, 2021 at 07:37:02PM +0100, Heinrich Schuchardt wrote:  
> > > > On 11/5/21 17:12, Simon Glass wrote:  
> > > > > Hi,
> > > > > 
> > > > > On Fri, 5 Nov 2021 at 08:21, Tom Rini <trini at konsulko.com> wrote:  
> > > > > > 
> > > > > > On Fri, Nov 05, 2021 at 12:14:47PM +0100, Stefan Roese wrote:  
> > > > > > > Hi Andre,
> > > > > > > 
> > > > > > > Added Tom to Cc.
> > > > > > > 
> > > > > > > On 05.11.21 11:04, Andre Przywara wrote:  
> > > > > > > > On Thu, 4 Nov 2021 20:02:41 -0600
> > > > > > > > Simon Glass <sjg at chromium.org> wrote:
> > > > > > > > 
> > > > > > > > Hi,
> > > > > > > >   
> > > > > > > > > On Thu, 4 Nov 2021 at 19:22, Stefan Roese <sr at denx.de> wrote:  
> > > > > > > > > > 
> > > > > > > > > > Hi Andre,
> > > > > > > > > > 
> > > > > > > > > > On 05.11.21 00:11, Andre Przywara wrote:  
> > > > > > > > > > > On Thu, 4 Nov 2021 11:37:57 +0100
> > > > > > > > > > > Stefan Roese <sr at denx.de> wrote:
> > > > > > > > > > > 
> > > > > > > > > > > Hi Stefan,  
> > > > > > > > > > > > On 04.11.21 04:55, Samuel Holland wrote:  
> > > > > > > > > > > > > This series hooks up the watchdog uclass to automatically register
> > > > > > > > > > > > > watchdog devices for use with sysreset, doing a bit of minor cleanup
> > > > > > > > > > > > > along the way.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > The goal is for this to replace the sunxi board-level non-DM reset_cpu()
> > > > > > > > > > > > > function. I was surprised to find that the wdt_reboot driver requires
> > > > > > > > > > > > > its own undocumented device tree node, which references the watchdog
> > > > > > > > > > > > > device by phandle. This is problematic for us, because sunxi-u-boot.dtsi
> > > > > > > > > > > > > file covers 20 different SoCs with varying watchdog node phandle names.
> > > > > > > > > > > > > So it would have required adding a -u-boot.dtsi file for each board.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Hooking things up automatically makes sense to me; this is what Linux
> > > > > > > > > > > > > does. However, I put the code behind a new option to avoid surprises for
> > > > > > > > > > > > > other platforms.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Changes in v3:
> > > > > > > > > > > > >       - Move condition to wdt-uclass.c to fix build errors.
> > > > > > > > > > > > >       - Include watchdog name in error message.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Changes in v2:
> > > > > > > > > > > > >       - Extend the "if SYSRESET" block to the end of the file.
> > > > > > > > > > > > >       - Also make gpio_reboot_probe function static.
> > > > > > > > > > > > >       - Rebase on top of 492ee6b8d0e7 (now handle all watchdogs).
> > > > > > > > > > > > >       - Added patches 5-6 as an example of how the new option will be used.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Samuel Holland (6):
> > > > > > > > > > > > >        sysreset: Add uclass Kconfig dependency to drivers
> > > > > > > > > > > > >        sysreset: Mark driver probe functions as static
> > > > > > > > > > > > >        sysreset: watchdog: Move watchdog reference to plat data
> > > > > > > > > > > > >        watchdog: Automatically register device with sysreset
> > > > > > > > > > > > >        sunxi: Avoid duplicate reset_cpu with SYSRESET enabled
> > > > > > > > > > > > >        sunxi: Use sysreset framework for poweroff/reset
> > > > > > > > > > > > > 
> > > > > > > > > > > > >       arch/arm/Kconfig                     |  3 +++
> > > > > > > > > > > > >       arch/arm/mach-sunxi/board.c          |  2 ++
> > > > > > > > > > > > >       drivers/sysreset/Kconfig             | 11 ++++++--
> > > > > > > > > > > > >       drivers/sysreset/sysreset_gpio.c     |  2 +-
> > > > > > > > > > > > >       drivers/sysreset/sysreset_resetctl.c |  2 +-
> > > > > > > > > > > > >       drivers/sysreset/sysreset_syscon.c   |  2 +-
> > > > > > > > > > > > >       drivers/sysreset/sysreset_watchdog.c | 40 ++++++++++++++++++++++------
> > > > > > > > > > > > >       drivers/watchdog/wdt-uclass.c        |  8 ++++++
> > > > > > > > > > > > >       include/sysreset.h                   | 10 +++++++
> > > > > > > > > > > > >       9 files changed, 67 insertions(+), 13 deletions(-)  
> > > > > > > > > > > > 
> > > > > > > > > > > > Applied to u-boot-marvell  
> > > > > > > > > > > 
> > > > > > > > > > > Mmmh, why u-boot-marvell,  
> > > > > > > > > > 
> > > > > > > > > > Because I'm handling watchdog related changes since a few years and we
> > > > > > > > > > did not create a specific subsystem repo for this and I'm usually
> > > > > > > > > > using my "marvell" one for this.  
> > > > > > 
> > > > > > And fwiw, there's a few other cases like this.  If it's too confusing,
> > > > > > maybe we should just roll out a few more repositories, I think it's
> > > > > > easier to do that now than pre-gitlab?
> > > > > >   
> > > > > > > > > > > and why did this end up already in master?
> > > > > > > > > > > Isn't that material for the next merge window? After all this changes
> > > > > > > > > > > quite a bit, for a lot of boards, and I did not have a closer look at
> > > > > > > > > > > the sunxi parts yet.  
> > > > > > > > > > 
> > > > > > > > > > I was hesitating also a bit. But since this patchset is on the list in
> > > > > > > > > > v1 since over 2 months now (2021-08-21) I thought it was "ready" for
> > > > > > > > > > inclusion now. We are at -rc1 and I think we still have enough time to
> > > > > > > > > > fix any resulting problems in this release cycle.  
> > > > > > > > 
> > > > > > > > Why do we have the merge window then? This is clearly not a regression or
> > > > > > > > general fix.  
> > > > > > > 
> > > > > > > AFAIU, we are a bit less strict here in U-Boot. Patches that were posted
> > > > > > > before the merge-window and skipped the review process (most likely
> > > > > > > because of lack of time) are often still integrated in the early rcX
> > > > > > > cycles. At least this is how I handle it usually.
> > > > > > > 
> > > > > > > Tom, is my understanding here correct?  
> > > > > > 
> > > > > > Yes.  We are not as strict as the kernel is about what can come in
> > > > > > between rc1 and rc2 (and to a certain degree, post rc2).  I leave things
> > > > > > up to the discretion of the custodians.  People tend to have less time
> > > > > > to handle U-Boot changes than other stuff, so I try and be flexible in
> > > > > > picking things up.
> > > > > >   
> > > > > > > > > Yes I agree, that should be plenty of time for people to review it.  
> > > > > > > > 
> > > > > > > > Well, if there would be people to review the sunxi parts :-(
> > > > > > > > I am totally fine with the generic patches (as they have been reviewed),
> > > > > > > > but the sunxi integration is somewhat risky.
> > > > > > > > I was explicitly deprioritising that in my queue, as it really doesn't
> > > > > > > > change, add or fix anything, it's mere refactoring, from the user's point
> > > > > > > > of view.
> > > > > > > >   
> > > > > > > > > > Do you see any specific issues?  
> > > > > > > > 
> > > > > > > > Patch 6/6 changes the config for all 157 Allwinner boards, so I think that
> > > > > > > > deserves at least some testing, *before* merging it.  
> > > > > > > 
> > > > > > > I expect that Samuel did some testing. But still, I agree that it
> > > > > > > would be much better, if these patches - especially the Allwinner parts -
> > > > > > > got more extensive testing.
> > > > > > >   
> > > > > > > > I will do as much testing now as possible, but I am not happy about that
> > > > > > > > situation.  
> > > > > > > 
> > > > > > > Understood. Should we revert patch 6/6 for now?  
> > > > > > 
> > > > > > FWIW, given Samuel has been doing a number of allwinner changes, I had
> > > > > > also assumed it was sufficiently tested, which is why I didn't raise a
> > > > > > further concern when I saw the widespread nature of the overall changes,
> > > > > > just figured it was a few more ready-to-go cleanups that weren't quite
> > > > > > picked up in time.  Please do speak up if you want me to revert the last
> > > > > > part.  
> > > > > 
> > > > > Also it is often true that people find problems by testing on master
> > > > > so applying it helps to shake the tree a bit.
> > > > > 
> > > > > Regards,
> > > > > Simon
> > > > >   
> > > > 
> > > > We don't actually have a problem with this series but with a previous
> > > > watchdog patch. The culprit according to bisecting is:
> > > > 
> > > > b147bd3607f8 ("sunxi: Enable watchdog timer support by default")
> > > > 
> > > > When booting the OrangePi PC the watchdog triggers while Linux is booting,
> > > > ca. 16 s after leaving the UEFI subsystem. This matches WDT_MAX_TIMEOUT in
> > > > drivers/watchdog/sunxi_wdt.c.
> > > > 
> > > > If I run
> > > >   
> > > > => wdt dev watchdog@1c20ca0
> > > > => wdt stop  
> > > > 
> > > > before the bootefi command booting succeeds.
> > > > 
> > > > We don't disarm the watchdog and Linux does not do it for us in time.
> > > > 
> > > > The UEFI specification requires that the default watchdog reset time is 300
> > > > s. We should never arm the Sunxi hardware watchdog except within the
> > > > watchdog reset driver.
> > > > 
> > > > The solution is to disable CONFIG_WATCHDOG_AUTOSTART on SUNXI. See
> > > > 
> > > > [PATCH 1/1] watchdog: don't autostart watchdog on Sunxi boards
> > > > https://lists.denx.de/pipermail/u-boot/2021-November/466318.html  
> > > 
> > > This means we never did come up with a solution, satisfactory to everyone,
> > > to what UEFI thinks a watchdog should do, and what other types of
> > > deployment think a watchdog should do, yes?
> > >   
> > 
> > Dear Tom,
> > 
> > The issue is *not* UEFI specific.
> > 
> > A watchdog timeout of 16 seconds is too short for Linux to boot no matter
> > whether you use the EFI stub or the legacy entry point.
> > 
> > I only referred to the UEFI specification as it indicates what can be
> > considered as a reasonable timeout interval: 300 seconds.  
> 
> 16 seconds from the last time we pet the watchdog in U-Boot to the
> kernel being able to take over is quite reasonable.

How do we know that the kernel takes over? What if the kernel/EFI
payload doesn't have a watchdog driver? I was assuming that the
watchdog would be disabled as soon as we boot a kernel or an EFI app
calls ExitBootServices (maybe even earlier).
But this sounds like a generic problem, not sunxi specific. So how do
other platforms solve this?
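
To make the question concrete, here is a rough sketch of the two hand-off
policies being discussed. It is a toy model only: struct wdt_model,
wdt_handoff() and wdt_resets_during_boot() are hypothetical names made up
for illustration, not the U-Boot wdt uclass API, and the model assumes the
countdown restarts at hand-off.

```c
#include <stdbool.h>

/* Hypothetical model of a hardware watchdog; NOT the U-Boot wdt uclass API. */
struct wdt_model {
	bool armed;             /* is the countdown running? */
	unsigned int timeout_s; /* seconds until reset after the last service */
};

/* What the boot loader does with the watchdog right before hand-off. */
enum handoff_policy {
	HANDOFF_LEAVE_ARMED, /* keep counting; the OS must take over in time */
	HANDOFF_STOP,        /* disarm before jumping to the kernel or EFI app */
};

/* Apply the chosen policy at the point where the payload is started. */
static void wdt_handoff(struct wdt_model *w, enum handoff_policy policy)
{
	if (policy == HANDOFF_STOP)
		w->armed = false;
}

/*
 * Return true if the board would reset before the OS services the watchdog.
 * os_takeover_s: seconds from hand-off until an OS driver first services it.
 * has_os_driver: false models a payload without any watchdog driver.
 */
static bool wdt_resets_during_boot(const struct wdt_model *w,
				   unsigned int os_takeover_s,
				   bool has_os_driver)
{
	if (!w->armed)
		return false;	/* a stopped watchdog never fires */
	if (!has_os_driver)
		return true;	/* nobody will ever service it */
	return os_takeover_s >= w->timeout_s;
}
```

With timeout_s = 16 (the WDT_MAX_TIMEOUT value mentioned above) and the
watchdog left armed, any payload that needs longer than that to take over,
or that has no driver at all, gets reset. With HANDOFF_STOP neither case
resets, but then nothing guards the boot of the OS itself.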

Cheers,
Andre

> Now, if Andre
> says he's fine just disabling watchdog by default for sunxi, fine.  But
> yes, we never did come up with a reasonable solution to UEFI saying 5
> minute timeout for watchdog servicing vs other platforms using a much
> shorter watchdog period.
> 


