[U-Boot] [RFC][Timer API] Revised Specification - Implementation details

Fri May 27 03:51:47 CEST 2011

Hi Bill,

On Fri, May 27, 2011 at 11:26 AM, J. William Campbell
<jwilliamcampbell at comcast.net> wrote:
> On 5/26/2011 4:28 PM, Graeme Russ wrote:

>>
>> Why mess around with bit shifting (which you would then have to cludge
>> into
>> your platform code) when carting around a 64-bit value is relatively
>> cheap,
>> transparent and poratble (all all supported up-to-date tool chains)
>>
> I really STRONGLY disagree with this statement. If you actually needed 64
> bit variables, fine use them. But as I have already shown, you do not need
> them in general.  We are computing a 32 bit result. There is some entropy
> argument that says you shouldn't need 64 bits to do so. Another way to look
> at it is that converting the top 32 bit word and the bottom 32 bit word to
> ms separately can be easier than doing them both together at once.  However,
> as we will see below, I do agree we need two 32 bit words to make this
> process go smoothly. I just don't agree that they should/will constitute a
> 64 bit binary word. See below.
>>>>
>>>>  - void timer_isr()
>>>>     - Optional (particularly if tick counter rollover period is
>>>>       exceptionally log which is usually the case for native 64-bit tick
>>>>       counters)
>>>>     - Simply calls sync_timebase()
>>>>     - For platforms without any tick counter, this can implement one
>>>>       (but accuracy will be harmed due to usage of disable_interrupts()
>>>> and
>>>>       enable_interrupts() in U-Boot
>>>>
>>>> So to get the new API up and running, only two functions are mandatory:
>>>>
>>>> get_ticks() which reads the hardware tick counter and deals with any
>>>> 'funny
>>>> stuff' including rollovers, short timers (12-bit for example), composite
>>>> counters (16-bit 0-999 microsecond + 16-bit millisecond) and maintains a
>>>> 'clean' 64-bit tick counter which rolls over from all 1's to all 0's.
>>>> The
>>>
>>> I think it is the task of get_ticks to return the hardware tick counter
>>> as
>>> an  increasing counter, period.  The counter may wrap at some final count
>>> that is not all ones. That is ok. Sync_timebase deals with the rollovers
>>> if
>>
>> The hardware tick counter may, the 64-bit software tick counter maintained
>> by get_ticks() may not
>
>>> necessary. get_ticks is very lightweight. get_ticks should deal with
>>> decrementing counters by returning the complement of the counter.  The
>>> sc520
>>> case is a bit more complex if you intend to use the 0-999 and 16 bit
>>> millisec registers, in that you do need to add them to the previous value
>>> to
>>
>> As I mentioned in another post, this is a problem for the platform
>> maintainer and is abstracted away throught the platform specific
>> implementation of get_ticks()
>>
>>> make an increasing counter. Sync_timebase "likes" short counters in that
>>> they are easy to convert to millisec and tick remainders.
>>
>> The compiler should handle using 64-bit rather than 32-bit transparently
>
> True enough.  But you don't need 64 bit variables at this point two 32 bit
> ones work just fine, in fact better in most cases.

Remember, we are not dealing with a high performance OS here. The primary
goal is portability - Performance optimisations (which do not break
portability) can be performed later

>>>>
>>>> 64-bit tick counter does not need to be reset to zero ever (even on
>>>> startup
>>>> - sync_timebase tacks care of all the details)
>>>
>>> True, but sync_timebase does have to be initialized (as does the timer
>>> itself in most cases, so this is not a restriction).
>>
>> This can be done in timer_init() via a call to sync_timebase() after the
>> timer has been configured. This should bring everything into line
>>
>>>> ticks_per_millisecond() simply return the number of ticks in a
>>>> millisecond
>>>> - This may as simple as:
>>>>
>>>> inline u64 ticks_per_millisecond(void)
>>>> {
>>>>        return CONFIG_SYS_TICK_PER_MS;
>>>> }
>>>>
>>>> But it may be trickier if you have a programmable tick frequency
>>>
>>> You will have to call the routine that initializes sync_timebase. This
>>> routine should have a name, like void init_sync_timebase(void)?
>>>>
>>>> The optional timer ISR is required if the tick counter has a short roll
>>>> over duration (short is up to you - 1 second is short, 1 hour might be,
>>>> 1
>>>> century is not)
>>>>
>>>> Regards,
>>>>
>>>> Graeme
>>>>
>>> It is probably true that sync_timebase should have a parameter flag. The
>>> reason is that if the timer isr is called only when the timer wraps, then
>>> the calls to sync_timebase may be slightly more than a full timer period
>>> apart. (due to interrupt latency). Therefore, when the timer difference
>>> is
>>> computed, if the current update is due to a wrap AND the previous update
>>> is
>>> due to a wrap, the difference should be approximately 1 wrap. If it comes
>>> up
>>> real short, you must add a wrap. This isn't necessary if the routine is
>>> called more often than once per wrap. Also, when sync_timebase is called
>>> in
>>
>> timer_isr() MUST be called more often than the rollover period of the
>> underlying hardware tick counter - This is a requirement
>
> The equality case can be made to work.  If the extension of the counter is
> done in the interrupt routine, not in get_ticks, get_ticks just needs to
> read the msb of the counter, read the lsb of the counter, then verify that
> the msb has not changed. If you have interrupts that work, that is the
> easiest way to go. If the lsb of the counter has represents 1 ms or less,
> you can just drop it (equivalent to the what the PPC does now). If the
> interrupt is slower than that, you must use at least part of the LSB. If you
> don't have interrupts, the point is moot.

So now we have a complicated ISR and a complicated get_ticks() and you
have to change get_ticks() when you decide to implement an ISR

>>>
>>> get_timer, you must first disable interrupts and then enable interrupts
>>> after sync_timebase returns
>>
>> Why? - get_ticks() provides an atomic read of the hardware tick counter.
>> If get_ticks() needs to disable and enable interrupts to do so, that is a
>> problem for the platform maintainer
>>
>> Admittedly, sync_timebase() will not be re-entrant, but how will it ever
>> be called concurrently? - Ah, I see - a call to get_timer() interrupted
>> by the timer ISR :)
>
> Yes, that is the problem. I have come to the view that  two 32 bit words are
> the best approach. Note that the lsb may actually not fill the full 32 bits.

Urghhh

> The top 32 bits are the rollover count and the bottom 32 bits are the
> current counter. If the counter is a full 32 bits, so much the better.

Ahhhhh - Lets keep it that way

> Again, one could put them together inside the interrupt routine , but it is
> easier to check for a changed value if you don't do this. Otherwise, you
> have to check both words. It also makes the isr faster. It is just an

As I said before - Simple First, Fast Later

> increment of the overflow counter, like the PPC is now. I happen to think it
> is easier to convert the two 32 bit words to milliseconds one at a time, but
> if you feel you must use 64 bit words, that is fine too. Just remember, the
> counter does not always fill the entire bottom word.

Urghhh

> In cases where there are no interrupts, get_ticks has to detect that the
> timer has "backed up" and increment the overflow counter itself, unless the
> counter is 64 bits to begin with and overflow is impossible anyway.
> get_ticks should NOT try to detect overflows if interrupts are available. If
> it got both words before an interrupt happened, the answer is correct. If it
> got an interrupt in between fetching the words, the event will be detected
> and the value re-fetched. All sync_timebase would do now is convert the
> returned value to milliseconds.
>
> So, if you have a 64 bit hardware counter, get_ticks reads and returns it.
> Else if you have interrupts, get_ticks reads the overflow counter into the
> msb. Next, it reads the hardware timer into the lsb. If the counter is a
> down counter, the lsb is = to the counter max - the lsb. The msb is then
> checked to make sure it hasn't changed, if it has, repeat the process. All
> the interrupt routine does is increase the overflow count.
> If you don't have interrupts get_ticks reads the hardware counter into the
> lsb. If the counter is a down counter, the lsb is = to the counter max - the
> lsb. If the lsb is less than it was in the previous call to get ticks, the
> overflow counter is increased. get_ticks then loads the overflow counter
> into the msb.
>
> sync_timebase converts the msb and lsb into millisec. It may do this by a 64
> bit divide, or some shifting to align the lsb with then msb and the a 64 bit
> divide, or a bunch of 32 bit fractional multiplies, or any such approach
> that works.
>
> How does that sound?

The fact that you have described three different implementations of
get_ticks() with two of these differentiated between whether you have
interrupts or not immediately suggests this solution is inherently more
complex and less maintainable.

Lets say you have a platform with a 32-bit tick counter running at a
reasonably long rollover time so you decide not to use interrupts. Then
you create a new platform with the same tick counter, but it runs much
faster and you realise you need to implement the interrupt routine to
make get_timer() work for long enough periods - Fine, you add an ISR
for the new platform that calls sync_timebase - No other changes are
required.

The last thing we want is for the 64-bit tick counter to be conceptually
different across platforms

I just realised - the ISR _does not need to call the sync_timebase at all_
The ISR only needs to call get_ticks(), so it will be fast anyway

Regards,

Graeme