[PATCH v3 0/6] Improved sysreset/watchdog uclass integration

Tom Rini trini at konsulko.com
Tue Nov 9 01:25:19 CET 2021


On Mon, Nov 08, 2021 at 05:09:14PM -0700, Simon Glass wrote:
> Hi Heinrich,
> 
> On Mon, 8 Nov 2021 at 09:17, Heinrich Schuchardt
> <heinrich.schuchardt at canonical.com> wrote:
> >
> > On 11/8/21 17:05, Tom Rini wrote:
> > > On Mon, Nov 08, 2021 at 08:58:33AM -0700, Simon Glass wrote:
> > >> Hi,
> > >>
> > >> On Sun, 7 Nov 2021 at 04:18, Heinrich Schuchardt
> > >> <heinrich.schuchardt at canonical.com> wrote:
> > >>>
> > >>>
> > >>>
> > >>> On 11/6/21 14:53, Tom Rini wrote:
> > >>>> On Sat, Nov 06, 2021 at 04:55:44AM +0100, Heinrich Schuchardt wrote:
> > >>>>>
> > >>>>>
> > >>>>> On 11/6/21 02:52, Andre Przywara wrote:
> > >>>>>> On Fri, 5 Nov 2021 18:56:34 -0400
> > >>>>>> Tom Rini <trini at konsulko.com> wrote:
> > >>>>>>
> > >>>>>>> On Fri, Nov 05, 2021 at 09:38:50PM +0100, Heinrich Schuchardt wrote:
> > >>>>>>>> On 11/5/21 20:17, Tom Rini wrote:
> > >>>>>>>>> On Fri, Nov 05, 2021 at 07:37:02PM +0100, Heinrich Schuchardt wrote:
> > >>>>>>>>>> On 11/5/21 17:12, Simon Glass wrote:
> > >>>>>>>>>>> Hi,
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Fri, 5 Nov 2021 at 08:21, Tom Rini <trini at konsulko.com> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Fri, Nov 05, 2021 at 12:14:47PM +0100, Stefan Roese wrote:
> > >>>>>>>>>>>>> Hi Andre,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Added Tom to Cc.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On 05.11.21 11:04, Andre Przywara wrote:
> > >>>>>>>>>>>>>> On Thu, 4 Nov 2021 20:02:41 -0600
> > >>>>>>>>>>>>>> Simon Glass <sjg at chromium.org> wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Hi,
> > >>>>>>>>>>>>>>> On Thu, 4 Nov 2021 at 19:22, Stefan Roese <sr at denx.de> wrote:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Hi Andre,
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> On 05.11.21 00:11, Andre Przywara wrote:
> > >>>>>>>>>>>>>>>>> On Thu, 4 Nov 2021 11:37:57 +0100
> > >>>>>>>>>>>>>>>>> Stefan Roese <sr at denx.de> wrote:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Hi Stefan,
> > >>>>>>>>>>>>>>>>>> On 04.11.21 04:55, Samuel Holland wrote:
> > >>>>>>>>>>>>>>>>>>> This series hooks up the watchdog uclass to automatically register
> > >>>>>>>>>>>>>>>>>>> watchdog devices for use with sysreset, doing a bit of minor cleanup
> > >>>>>>>>>>>>>>>>>>> along the way.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> The goal is for this to replace the sunxi board-level non-DM reset_cpu()
> > >>>>>>>>>>>>>>>>>>> function. I was surprised to find that the wdt_reboot driver requires
> > >>>>>>>>>>>>>>>>>>> its own undocumented device tree node, which references the watchdog
> > >>>>>>>>>>>>>>>>>>> device by phandle. This is problematic for us, because sunxi-u-boot.dtsi
> > >>>>>>>>>>>>>>>>>>> file covers 20 different SoCs with varying watchdog node phandle names.
> > >>>>>>>>>>>>>>>>>>> So it would have required adding a -u-boot.dtsi file for each board.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Hooking things up automatically makes sense to me; this is what Linux
> > >>>>>>>>>>>>>>>>>>> does. However, I put the code behind a new option to avoid surprises for
> > >>>>>>>>>>>>>>>>>>> other platforms.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Changes in v3:
> > >>>>>>>>>>>>>>>>>>>          - Move condition to wdt-uclass.c to fix build errors.
> > >>>>>>>>>>>>>>>>>>>          - Include watchdog name in error message.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Changes in v2:
> > >>>>>>>>>>>>>>>>>>>          - Extend the "if SYSRESET" block to the end of the file.
> > >>>>>>>>>>>>>>>>>>>          - Also make gpio_reboot_probe function static.
> > >>>>>>>>>>>>>>>>>>>          - Rebase on top of 492ee6b8d0e7 (now handle all watchdogs).
> > >>>>>>>>>>>>>>>>>>>          - Added patches 5-6 as an example of how the new option will be used.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Samuel Holland (6):
> > >>>>>>>>>>>>>>>>>>>           sysreset: Add uclass Kconfig dependency to drivers
> > >>>>>>>>>>>>>>>>>>>           sysreset: Mark driver probe functions as static
> > >>>>>>>>>>>>>>>>>>>           sysreset: watchdog: Move watchdog reference to plat data
> > >>>>>>>>>>>>>>>>>>>           watchdog: Automatically register device with sysreset
> > >>>>>>>>>>>>>>>>>>>           sunxi: Avoid duplicate reset_cpu with SYSRESET enabled
> > >>>>>>>>>>>>>>>>>>>           sunxi: Use sysreset framework for poweroff/reset
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>          arch/arm/Kconfig                     |  3 +++
> > >>>>>>>>>>>>>>>>>>>          arch/arm/mach-sunxi/board.c          |  2 ++
> > >>>>>>>>>>>>>>>>>>>          drivers/sysreset/Kconfig             | 11 ++++++--
> > >>>>>>>>>>>>>>>>>>>          drivers/sysreset/sysreset_gpio.c     |  2 +-
> > >>>>>>>>>>>>>>>>>>>          drivers/sysreset/sysreset_resetctl.c |  2 +-
> > >>>>>>>>>>>>>>>>>>>          drivers/sysreset/sysreset_syscon.c   |  2 +-
> > >>>>>>>>>>>>>>>>>>>          drivers/sysreset/sysreset_watchdog.c | 40 ++++++++++++++++++++++------
> > >>>>>>>>>>>>>>>>>>>          drivers/watchdog/wdt-uclass.c        |  8 ++++++
> > >>>>>>>>>>>>>>>>>>>          include/sysreset.h                   | 10 +++++++
> > >>>>>>>>>>>>>>>>>>>          9 files changed, 67 insertions(+), 13 deletions(-)
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Applied to u-boot-marvell
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Mmmh, why u-boot-marvell,
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Because I'm handling watchdog related changed since a few years and we
> > >>>>>>>>>>>>>>>> did not create a specific subsystem repo for this and I'm usually
> > >>>>>>>>>>>>>>>> using my "marvell" one for this.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> And fwiw, there's a few other cases like this.  If it's too confusing,
> > >>>>>>>>>>>> maybe we should just roll out a few more repositories, I think it's
> > >>>>>>>>>>>> easier to do that now than pre-gitlab?
> > >>>>>>>>>>>>>>>>> and why did this end up already in master?
> > >>>>>>>>>>>>>>>>> Isn't that material for the next merge window? After all this changes
> > >>>>>>>>>>>>>>>>> quite a bit, for a lot of boards, and I did not have a closer look at
> > >>>>>>>>>>>>>>>>> the sunxi parts yet.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> I was hesitating also a bit. But since this patchset is on the list in
> > >>>>>>>>>>>>>>>> v1 since over 2 months now (2021-08-21) I thought it was "ready" for
> > >>>>>>>>>>>>>>>> inclusion now. We are at -rc1 and I think we still have enough time to
> > >>>>>>>>>>>>>>>> fix any resulting problems in this release cycle.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Why do we have the merge window then? This is clearly not a regression or
> > >>>>>>>>>>>>>> general fix.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> AFAIU, we are a bit less strict here in U-Boot. Patches that were posted
> > >>>>>>>>>>>>> before the merge-window and skipped the review process (most likely
> > >>>>>>>>>>>>> because of lack of time) are often still integrated in the early rcX
> > >>>>>>>>>>>>> cycles. At least this is how I handle it usually.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Tom, is my understanding here correct?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Yes.  We are not as strict as the kernel is about what can come in
> > >>>>>>>>>>>> between rc1 and rc2 (and to a certain degree, post rc2).  I leave things
> > >>>>>>>>>>>> up to the discretion of the custodians.  People tend of have less time
> > >>>>>>>>>>>> to handle U-Boot changes than other stuff, so I try and be flexible in
> > >>>>>>>>>>>> picking things up.
> > >>>>>>>>>>>>>>> Yes I agree, that should be plenty of time for people to review it.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Well, if there would be people to review the sunxi parts :-(
> > >>>>>>>>>>>>>> I am totally fine with the generic patches (as they have been reviewed),
> > >>>>>>>>>>>>>> but the sunxi integration is somewhat risky.
> > >>>>>>>>>>>>>> I was explicitly deprioritising that in my queue, as it really doesn't
> > >>>>>>>>>>>>>> change, add or fix anything, it's mere refactoring, from the user's point
> > >>>>>>>>>>>>>> of view.
> > >>>>>>>>>>>>>>>> Do you see any specific issues?
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Patch 6/6 changes the config for all 157 Allwinner boards, so I think that
> > >>>>>>>>>>>>>> deserves at least some testing, *before* merging it.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> I expect that Samuel did some testing. But still, I agree that it
> > >>>>>>>>>>>>> would be much better, if these patches - especially the Allwinner parts
> > >>>>>>>>>>>>> got more extensive testing.
> > >>>>>>>>>>>>>> I will do as much testing now as possible, but I am not happy about that
> > >>>>>>>>>>>>>> situation.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Understood. Should we revert patch 6/6 for now?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> FWIW, given Samuel has been doing a number of allwinner changes, I had
> > >>>>>>>>>>>> also assumed it was sufficiently tested, which is why I didn't raise a
> > >>>>>>>>>>>> further concern when I saw the widespread nature of the overall changes,
> > >>>>>>>>>>>> just figured it was a few more ready-to-go cleanups that weren't quite
> > >>>>>>>>>>>> picked up in time.  Please do speak up if you want me to revert the last
> > >>>>>>>>>>>> part.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Also it is often true that people find problems by testing on master
> > >>>>>>>>>>> so applying it helps to shake the tree a bit.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Regards,
> > >>>>>>>>>>> Simon
> > >>>>>>>>>>
> > >>>>>>>>>> We don't actually have a problem with this series but with a previous
> > >>>>>>>>>> watchdog patch. The culprit according to bisecting is:
> > >>>>>>>>>>
> > >>>>>>>>>> b147bd3607f8 ("sunxi: Enable watchdog timer support by default")
> > >>>>>>>>>>
> > >>>>>>>>>> When booting the OrangePi PC the watchdog triggers while Linux is booting,
> > >>>>>>>>>> ca. 16 s after leaving the UEFI subsystem. This matches WDT_MAX_TIMEOUT in
> > >>>>>>>>>> drivers/watchdog/sunxi_wdt.c.
> > >>>>>>>>>>
> > >>>>>>>>>> If I run
> > >>>>>>>>>> => wdt dev watchdog at 1c20ca0
> > >>>>>>>>>> => wdt stop
> > >>>>>>>>>>
> > >>>>>>>>>> before the bootefi command booting succeeds.
> > >>>>>>>>>>
> > >>>>>>>>>> We don't disarm the watchdog and Linux does not do it for us in time.
> > >>>>>>>>>>
> > >>>>>>>>>> The UEFI specification requires that the default watchdog reset time is 300
> > >>>>>>>>>> s. We should never arm the Sunxi hardware watchdog except within the
> > >>>>>>>>>> watchdog reset driver.
> > >>>>>>>>>>
> > >>>>>>>>>> The solution is to disable CONFIG_WATCHDOG_AUTOSTART on SUNXI. See
> > >>>>>>>>>>
> > >>>>>>>>>> [PATCH 1/1] watchdog: don't autostart watchdog on Sunxi boards
> > >>>>>>>>>> https://lists.denx.de/pipermail/u-boot/2021-November/466318.html
> > >>>>>>>>>
> > >>>>>>>>> This means we never did come up with a satisfactory to everyone solution
> > >>>>>>>>> to what UEFI thinks a watchdog should do, and what other types of
> > >>>>>>>>> deployment think a watchdog should do, yes?
> > >>>>>>>>
> > >>>>>>>> Dear Tom,
> > >>>>>>>>
> > >>>>>>>> The issue is *not* UEFI specific.
> > >>>>>>>>
> > >>>>>>>> A watchdog timeout of 16 seconds is too short for Linux to boot no matter
> > >>>>>>>> whether you use the EFI stub or the legacy entry point.
> > >>>>>>>>
> > >>>>>>>> I only referred to the UEFI specification as it indicates what can be
> > >>>>>>>> considered as a reasonable timeout interval: 300 seconds.
> > >>>>>>>
> > >>>>>>> 16 seconds from the last time we pet the watchdog in U-Boot to the
> > >>>>>>> kernel being able to take over is quite reasonable.
> > >>>>>>
> > >>>>>> How do we know that the kernel takes over? What if the kernel/EFI
> > >>>>>> payload doesn't have a watchdog driver? I was assuming that the
> > >>>>>> watchdog would be disabled as soon as we boot a kernel or an EFI app
> > >>>>>> calls ExitBootServices (maybe even earlier).
> > >>>>>> But this sounds like a generic problem, not sunxi specific. So how do
> > >>>>>> other platforms solve this?
> > >>>>>>
> > >>>>>> Cheers,
> > >>>>>> Andre
> > >>>>>
> > >>>>> The UEFI specification has this requirement in chapter "3.1.2 Load Option
> > >>>>> Processing":
> > >>>>>
> > >>>>> "... the boot manager must enable the watchdog timer for 5 minutes by using
> > >>>>> the EFI_BOOT_SERVICES.SetWatchdogTimer() boot service prior to calling
> > >>>>> EFI_BOOT_SERVICES.StartImage(). If a boot option returns control to the boot
> > >>>>> manager, the boot manager must disable the watchdog timer with an additional
> > >>>>> call to the SetWatchdogTimer() boot service."
> > >>>>>
> > >>>>> This means that having an armed watchdog when starting the kernel is
> > >>>>> correct.
> > >>>>>
> > >>>>> If you start a watchdog in the firmware which is not disabled or reset by
> > >>>>> the operating system, you are out of luck and won't be able to boot.
> > >>>>>
> > >>>>> Current Linux has driver drivers/watchdog/sunxi_wdt.c compatible to
> > >>>>> "allwinner,sun4i-a10-wdt","allwinner,sun6i-a31-wdt" and enabled by
> > >>>>> CONFIG_SUNXI_WATCHDOG. This driver was introduced in Linux v3.12. It
> > >>>>> originally had compatible "allwinner,sun4i-wdt" only.
> > >>>>>
> > >>>>> Debian Bullseye has the driver enabled as a module. In the bootlog of the
> > >>>>> Orange Pi PC I find:
> > >>>>> [   12.321909] sunxi-wdt 1c20ca0.watchdog: Watchdog enabled (timeout=16 sec,
> > >>>>> nowayout=0)
> > >>>>> This message appears approximately *20 seconds* after the EFI stub hands
> > >>>>> over to the main kernel. Adding the driver to initrd shortens this to *18
> > >>>>> seconds*. The message occurs after file system checks which can be a lengthy
> > >>>>> operation. In Debian systemd manages the watchdog.
> > >>>>>
> > >>>>> As I said: 16 seconds is way too short for a hardware watchdog timeout.
> > >>>>
> > >>>> What's the time if you build it in?
> > >>>>
> > >>>
> > >>> For sure you will find some board and configuration that is faster.
> > >>>
> > >>> But why should I care? This series breaks booting Debian on my board. So
> > >>> it needs to be fixed. So, please, apply my patch that is doing so.
> > >>
> > >> Five minutes sounds completely unacceptable for embedded platforms.
> > >> The user will surely have packaged the item up and will be just
> > >> heading out to drop it off for return...
> >
> > I have no problem with people switching on the SUNXI hardware watchdog
> > for their specific embedded solution. But here it was switched on for
> > all SUNXI boards and breaks booting into Linux distributions.
> >
> > If the Linux distribution resets the watchdog *after* file system checks
> > because it is managed by systemd (as is true for Debian), 5 minutes is
> > realistic.
> >
> > >
> > > I'm trying to avoid bringing up the long discussion from the previous
> > > thread about this :)
> > >
> > >> Do we need to add a special case for UEFI here? E.g. bootefi could use
> > >> a hook to lengthen the watchdog?
> >
> > The problem is not UEFI related.
> 
> Oh dear, that is definitely quite slow.
> 
> >
> > >
> > > Well, the problem is that the hardware watchdog has a maximum period of
> > > 16 seconds, I believe.
> > >
> >
> > Yes, Sunxi is limited to 16 seconds.
> 
> I suppose it isn't possible to disable the watchdog once it is enabled?

You can, it's just not a general solution because, well, other platforms
with a more reasonable overall period may still violate the UEFI spec
but not get tripped up.  And globally disabling the watchdog before
"bootefi" or similar is also not a great idea.  It probably is, sadly,
the best option to just disable by default (so not enable until the
kernel does) watchdog on sunxi.

-- 
Tom
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 659 bytes
Desc: not available
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20211108/b43c8e79/attachment.sig>


More information about the U-Boot mailing list