Broken watchdog in u-boot master branch

Rasmus Villemoes rasmus.villemoes at prevas.dk
Tue Oct 11 09:18:56 CEST 2022


On 10/10/2022 15.55, Tom Rini wrote:
> On Sun, Oct 09, 2022 at 09:12:25PM +0200, Pali Rohár wrote:
> 
>> Hello! Watchdog code seems to be broken in u-boot master branch.
>> On Nokia N900 I'm getting following message in qemu:
>>
>> cyclic function rx51_watchdog took too long: 10000us vs 1000us max, disabling
>>
>> Seems that watchdog core code is not prepared for "slower" watchdogs
>> which communicate over slower i2c bus, like it is the case for N900.
>>
>> Disabling slower watchdog is a bad idea as it would result in reboot
>> loop instead of slower - but working code.

So, a few thoughts.

First, I assume that that board has a very coarse-grained tick, probably
just 1000Hz. Otherwise it would be pretty amazing for cpu_time to come
out as 10ms exactly. That's not the board's fault, of course, just an
observation, but it is something we need to bear in mind. If the
resolution is merely 100Hz, so 10ms is simply the granularity, we cannot
really meaningfully compare the cpu_time to anything less than that,
because every once in a while it _will_ happen that we sample "now" just
before the tick, run the function, then sample again just after, and it
may only have taken 17us, yet the diff comes out as 10ms.

Second, perhaps the threshold should not be a compile-time constant, but
instead a fraction of the requested call frequency (say 1.5%, 1/64).
I.e., if we've registered a function to be called every 10 seconds, we'd
check if its runtime exceeded (10000000 >> 6) us. Preferably per above
that bound is rounded up to a multiple of the timer's granularity (we
can get that, right?)

Third, perhaps we shouldn't disable it, but just print a (one-time)
warning. Adding a "already-warned" field to struct cyclic_info is
certainly simple enough.

Rasmus


More information about the U-Boot mailing list