[PATCH 2/2] watchdog: add watchdog behavior configuration

Heinrich Schuchardt xypron.glpk at gmx.de
Sat Sep 26 14:39:39 CEST 2020


On 9/26/20 12:54 PM, Mark Kettenis wrote:
>> From: Wolfgang Denk <wd at denx.de>
>> Date: Sat, 26 Sep 2020 10:45:31 +0200
>
> Hi Wolfgang,
>
>> Dear Heinrich,
>>
>> In message <81e467f1-aff3-a54d-5f7e-3b8f5390f410 at gmx.de> you wrote:
>>>
>>> Chapter 2.3.7 RISC-V Platforms
>>>
>>> p. 40
>>> "The causes of reset could be power-on reset, external hard reset,
>>> brownout detected, watchdog timer elapse, sleep-mode wakeup, etc., which
>>> machine-mode UEFI system firmware has to distinguish."
>>
>> This is possible only if the hardware actually supports such
>> distinction.  In many cases of actual hardware I've seen a watchdog
>> reset is just the same as an external hard reset, and brownout
>> detection does not exist.
>
> This is RISC-V specific, and RISC-V apparently has an architected
> register that communicates the reset cause.  Even then the text
> immediately before the text quoted by Heinrich allows for hardware
> that can't distinguish to simply report 0 in that register implying
> "the most complete reset (e.g., hard reset)".
>
>>> p. 70
>>> "If LoadImage() succeeds, the boot manager must enable the watchdog
>>> timer for 5 minutes by using the EFI_BOOT_SERVICES.SetWatchdogTimer()
>>> boot service prior to calling EFI_BOOT_SERVICES.StartImage(). If a boot
>>> option returns control to the boot manager, the boot manager must
>>> disable the watchdog timer with an additional call to the
>>> SetWatchdogTimer() boot service."
>>>
>>> Chapter 7.4 EFI_BOOT_SERVICES.ExitBootServices()
>>>
>>> p. 222
>>> "On success ... the boot services watchdog timer is disabled."
>>>
>>> Chapter 7.5 EFI_BOOT_SERVICES.SetWatchdogTimer()
>>>
>>> This chapter describes management of the watchdog timer. You can set any
>>> duration with 1 second resolution. A value of 0 will disable the watchdog.
>>
>> To me this makes clear that what they are talking about is not a
>> hardware watchdog, which may not offer such flexibility, but a
>> software layer (driver) that provides such an API to the underlying
>> hardware.  Again depending on the capabilities of the actual
>> hardware.
>>
>>> p. 223
>>> "If the watchdog timer expires, the event is logged by the firmware. The
>>
>> And what if the hardware cannot detect this, for example because the
>> watchdog is an external chip that just pulls the external reset
>> line?
>
> Probably wishful thinking on behalf of the authors of the UEFI spec.
> The existence of a firmware event log isn't mandated by UEFI and on
> x86 typically only implemented on server systems.  On such systems the
> watchdog may very well be implemented by software running on a BMC.
>
>>> ... The watchdog must be set to a
>>> period of 5 minutes. ...
>>
>> Again: here "watchdog" can only mean some software component. Many
>> hardware watchdogs do not allow for such long timeouts at all.
>>
>>> Appendix R - Glossary
>>>
>>> p. 2444
>>> Watchdog Timer
>>> An alarm timer that may be set to go off. This can be used to regain
>>> control in cases where a code path in the boot services environment
>>> fails to or is unable to return control by the expected path.
>>
>> Again this does not sound as if they were talking about a specific
>> hardware watchdog.  All this is about software services only, it
>> seems.
>
> Probably.
>
> That doesn't change the fact that enabling a hardware watchdog timer
> that resets the system is problematic for the EFI boot path in U-Boot.
> The typical EFI boot path is:
>
>   UEFI Firmware ->       EFI OS loader       ->    OS kernel
>     (U-Boot)       (GRUB, OpenBSD's EFIBOOT)    (Linux, OpenBSD)
>
> Here the EFI OS loader is hardware agnostic and only supposed to use
> EFI interfaces to do its job.  As such it cannot reset or disable the
> watchdog hardware.  Maybe it can do its job of loading and starting
> the OS kernel before a hardware watchdog timer expires.  But if there
> is any user interaction required the timer will at some point expire
> and reset the system in the middle of the user interaction.

This is why the UEFI API has the EFI_BOOT_SERVICES.SetWatchdogTimer()
service, which allows setting the timeout or disabling the watchdog
completely.
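
To illustrate, here is a minimal sketch of how an EFI OS loader could use
that service around a user interaction. It is not taken from GRUB or any
real loader: the function-pointer type only mimics the shape of
SetWatchdogTimer() with plain C types, and mock_set_watchdog(),
loader_enter_menu() and loader_leave_menu() are hypothetical names. The
watchdog code 0x10000 is an arbitrary value outside the range I understand
to be reserved for firmware.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for the SetWatchdogTimer() boot service: timeout in seconds
 * (0 disables the timer), a watchdog code, and optional data.
 * Simplified types for illustration; these are not the real UEFI headers.
 */
typedef int (*set_watchdog_fn)(size_t timeout, uint64_t code,
                               size_t data_size, const uint16_t *data);

/* Mock firmware state: assume the boot manager armed 5 minutes
 * before it started the loader. */
static size_t mock_timeout = 300;

/* Mock firmware implementation that just records the requested timeout. */
static int mock_set_watchdog(size_t timeout, uint64_t code,
                             size_t data_size, const uint16_t *data)
{
    (void)code;
    (void)data_size;
    (void)data;
    mock_timeout = timeout;
    return 0; /* stands in for EFI_SUCCESS */
}

/* Hypothetical loader helper: disable the watchdog while waiting for the user. */
static int loader_enter_menu(set_watchdog_fn set_wdt)
{
    return set_wdt(0, 0, 0, NULL);
}

/* Hypothetical loader helper: re-arm for 5 minutes before continuing to boot. */
static int loader_leave_menu(set_watchdog_fn set_wdt)
{
    return set_wdt(300, 0x10000, 0, NULL);
}
```

A loader that shows a menu would call loader_enter_menu() before blocking
on input and loader_leave_menu() once the user has made a choice.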

>
> Even if the EFI OS loader does its job before the hardware watchdog
> timer expires, there is no guarantee that the OS kernel has the driver
> necessary to reset/disable the hardware watchdog.  Or even if it does
> have such a driver, loading/attaching it may take too much time.

The Linux EFI stub calls EFI_BOOT_SERVICES.ExitBootServices(). According
to the UEFI spec, this is the moment when the watchdog must be disabled.

So in a UEFI environment you can only monitor the time spent in U-Boot,
in GRUB, and in part of the Linux EFI stub. You cannot use the watchdog
provided by the UEFI firmware to monitor whether Linux reaches the
command prompt. Linux could set up its own watchdog that takes over.

>
> In my opinion enabling a hardware watchdog timer only makes sense if
> U-Boot and the loaded OS kernel are tightly coupled.  Which is a use
> case where one would probably not use the EFI boot path in the first
> place and use a more traditional U-Boot bootpath such as "bootm"
> instead.
>
> At this point UEFI isn't really targeted at "deeply embedded" systems
> that require a hardware watchdog to implement guaranteed recovery from
> software failures.  If there is a desire to use the EFI boot path in
> this scenario someone should probably lobby the UEFI folks to add an
> EFI runtime service to reset the hardware watchdog that can be called
> by the EFI OS Loader and the OS to prevent the timer from expiring.

The API call already exists as EFI_BOOT_SERVICES.SetWatchdogTimer(),
as mentioned above.

If we want to have a hardware based watchdog in the UEFI context, we
need a U-Boot internal API by which we can enable and disable a hardware
watchdog with an arbitrary duration (>> 5 min) with 1 second resolution.
Then our implementation of SetWatchdogTimer() could call into this API.
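
As a rough sketch of what such an internal API might look like (this is
not existing U-Boot code, and struct soft_wdt, soft_wdt_set() and
soft_wdt_poll() are made-up names): hardware that only supports short
timeouts could be serviced periodically from U-Boot's main loop until a
software deadline passes, after which servicing stops and the hardware
resets the system. Time is passed in as whole seconds to keep the example
independent of any particular timer source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software deadline layered on top of a short hardware watchdog. */
struct soft_wdt {
    bool     armed;     /* true while a deadline is active */
    uint64_t deadline;  /* absolute time in seconds at which servicing stops */
};

/*
 * Arm the software deadline 'timeout_s' seconds from 'now'.
 * A timeout of 0 disables it, mirroring SetWatchdogTimer() semantics.
 */
static void soft_wdt_set(struct soft_wdt *w, uint64_t now, uint64_t timeout_s)
{
    w->armed = (timeout_s != 0);
    w->deadline = now + timeout_s;
}

/*
 * Called periodically, at least once per hardware timeout period.
 * Returns true if the caller should service (kick) the hardware watchdog,
 * false once the software deadline has passed so the hardware may reset.
 */
static bool soft_wdt_poll(const struct soft_wdt *w, uint64_t now)
{
    if (w->armed && now >= w->deadline)
        return false;
    return true;
}
```

With something like this, SetWatchdogTimer(300, ...) would map to
soft_wdt_set(&w, now, 300) and SetWatchdogTimer(0, ...) to
soft_wdt_set(&w, now, 0). The time until reset would overshoot the
requested value by up to one hardware timeout period.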

Best regards

Heinrich

