[PATCH] time: Fix get_ticks being non-monotonic
Sean Anderson
seanga2 at gmail.com
Wed Sep 9 02:02:11 CEST 2020
On 9/8/20 8:01 PM, Simon Glass wrote:
> Hi Sean,
>
> On Tue, 8 Sep 2020 at 17:59, Sean Anderson <seanga2 at gmail.com> wrote:
>>
>> On 9/8/20 7:56 PM, Simon Glass wrote:
>>> Hi Sean,
>>>
>>> On Mon, 7 Sep 2020 at 09:51, Sean Anderson <seanga2 at gmail.com> wrote:
>>>>
>>>> On 9/7/20 9:57 AM, Simon Glass wrote:
>>>>> Hi Sean,
>>>>>
>>>>> On Sun, 6 Sep 2020 at 20:02, Sean Anderson <seanga2 at gmail.com> wrote:
>>>>>>
>>>>>> On 9/6/20 9:43 PM, Simon Glass wrote:
>>>>>>> Hi Sean,
>>>>>>>
>>>>>>> On Tue, 1 Sep 2020 at 13:56, Sean Anderson <seanga2 at gmail.com> wrote:
>>>>>>>>
>>>>>>>> get_ticks does not always succeed. Sometimes it can be called before the
>>>>>>>> timer has been initialized. If it does, it returns a negative errno.
>>>>>>>> This causes the timer to appear non-monotonic, because the value will
>>>>>>>> become much smaller after the timer is initialized.
>>>>>>>>
>>>>>>>> No users of get_ticks which I checked handle errors of this kind. Further,
>>>>>>>> functions like tick_to_time mangle the result of get_ticks, making it very
>>>>>>>> unlikely that one could check for an error without suggesting a patch such
>>>>>>>> as this one.
>>>>>>>>
>>>>>>>> This patch changes get_ticks to always return 0 when there is an error.
>>>>>>>> 0 is the least unsigned integer, ensuring get_ticks appears monotonic. This
>>>>>>>> has the side effect of time apparently not passing until the timer is
>>>>>>>> initialized. However, without this patch, time does not pass anyway,
>>>>>>>> because the error value is likely to be the same.
>>>>>>>>
>>>>>>>> Fixes: c8a7ba9e6a5
>>>>>>>> Signed-off-by: Sean Anderson <seanga2 at gmail.com>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> lib/time.c | 4 ++--
>>>>>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> Would it be better to panic so people can fix the bug?
>>>>>>
>>>>>> I thought this was expected behavior. It's only a bug if you do
>>>>>> something like udelay before any timers are created. We just can't
>>>>>> report errors through get_ticks, because its users assume that it always
>>>>>> returns a time of some kind.
>>>>>
>>>>> I think it indicates a bug. If you use a device before it is ready you
>>>>> don't really know what it will do. I worry that this patch is just
>>>>> going to cause confusion, since the behaviour depends on when you call
>>>>> it. If we panic, people can figure out why the timer is being inited
>>>>> too late, or being used too early.
>>>>
>>>> Hm, maybe. I don't think it's as clear cut as "us[ing] a device before
>>>> it is ready," because get_ticks tries to initialize the timer if it
>>>> isn't already initialized. Unless someone else does it first, the first
>>>> call to get_ticks will always be before the timer is initialized.
>>>>
>>>> The specific problem I ran into was that after relocation, the watchdog
>>>> may be initialized before the timer. This occurs on RISC-V because
>>>> without [1] a timer only exists after arch_early_init_r. So, for the
>>>> first few calls to watchdog_reset there is no timer.
>>>>
>>>> The second return could probably be turned into a panic. I checked, and
>>>> all current timer drivers always succeed in getting the time (except for
>>>> the RISC-V timer, which is fixed in [1]), so the only way for
>>>> timer_get_count to fail is if timer_ops.get_count doesn't exist. That is
>>>> almost certainly an error on the driver author's part, so I think
>>>> panicking there is the only reasonable option.
>>>
>>> OK good, let's do that and update docs in timer.h
>>
>> That being to panic both times, or just panic the second time?
>
> Well I like a panic if the call is invalid, ie. in both cases.
Ok, sounds good to me.
--Sean
More information about the U-Boot
mailing list