[U-Boot] [RFC][Timer API] Revised Specification - Implementation details

Fri May 27 17:49:52 CEST 2011

On 5/26/2011 11:54 PM, Graeme Russ wrote:
> On Fri, May 27, 2011 at 4:33 PM, J. William Campbell
> <jwilliamcampbell at comcast.net>  wrote:
>> On 5/26/2011 9:33 PM, Graeme Russ wrote:
>>> Hi Bill,
>>>
>> <snip>
> [massive snip]
>
> OK, you have my ears pricked - Can you give me code samples for:
>
>   - get_ticks()
>   - sync_timbase() (no need to implement the whole lot if that is too
>     much effort right now)
>   - timer_isr()
>
> that handle the following hardware tick counter scenarios:
>
> a) 64-bit up counter
> b) 64-bit down counter
Hi Graeme,
> c) 32-bit up counter, wraps at 65000
Do you mean 32 bits or 16 bits? doesn't make much difference, but just 
checking.
> d) 16-bit microsecond up counter 0-999 which wraps and triggers a 16-bit
>     millisecond up counter. Reading milliseconds latched microseconds and
>     clears milliseconds (look in arch/x86/cpu/sc520/timer.c for example)
> e) 24-bit counter occupying bits 2-25 of a 32-bit word (just to be
>     difficult)
> f) Any other option anyone cares to throw ;)
>
> All of these options must be covered using:
>   - Minimal global data (we would like it to work before relocation, but
>     not mandatory - GD footprint would be nice)
>   - All use the same sync_timebase function
>   - Demonstrate using an ISR NOT synced to the hardware tick counter and
>     an ISR that is
>   - Parameters to get_ticks() and sync_timer() are permitted, but not for
>     timer_isr() (naturally)
>

     OK! Once again, I accept the challenge. One Caveat. My real work 
has been sliding due to the time I have been spending on this. I am 
flying from San Francisco to Sydney tonight , (to work, not to play),  
so I will be off-grid for 14 hours+. You will not get this code for a 
few days, like probably 3 days. I have looked at the requirements, and I 
see no real problems that I don't know how to solve.
>> I don't' see any reason to push this down to a lower level. It is just one
>> more thing to get messed up across implementations.
> Agreed
>
>>>> detection in the non-interrupt case to sync_timebase as well.
>>>> Sync_timebase
>>>> can also invert the down-counting counters, removing that from get_ticks.
>>>> The wrap detection code can be #ifdef out if one is using interrupts and
>>> Urghh - Why are you adding unnecessary ugliness - #ifdef in the middle of
>>> code is usually a sign you are doing something wrong
>> As I said, this is an optional optimization. I do not agree that an #ifdef
>> in the middle of code indicates you have a bad design. Lots and Lots of
>> ifdefs certainly indicates a bad design. An ifdef to eliminate code if some
>> option is not selected is hardly such a strange thing, especially only a
>> single #ifdef. However, feel free to not have it if you so desire.
> OK, I'll let this slide for the moment - please include in above example
Will Do.
>>>> offended by it's presence. Thanks for pointing this out and compelling me
>>>> to
>>>> reduce the number of cases! Making get_ticks more lightweight is a good
>>>> idea
>>>> in my opinion.
>>>>> Lets say you have a platform with a 32-bit tick counter running at a
>>>>> reasonably long rollover time so you decide not to use interrupts. Then
>>>>> you create a new platform with the same tick counter, but it runs much
>>>>> faster and you realise you need to implement the interrupt routine to
>>>>> make get_timer() work for long enough periods - Fine, you add an ISR
>>>>> for the new platform that calls sync_timebase - No other changes are
>>>>> required.
>>>>>
>>>>> The last thing we want is for the 64-bit tick counter to be conceptually
>>>>> different across platforms
>>>>>
>>>>> I just realised - the ISR _does not need to call the sync_timebase at
>>>>> all_
>>>>> The ISR only needs to call get_ticks(), so it will be fast anyway
>>>> The problem is that the way we previously detected wrapping does not work
>>>> if
>>>> the interrupt rate is == to the counter wrap time, which it essentially
>>>> always is. If get_ticks is trying to update the wrap count when an
>>>> interrupt
>>> Is it, always, on every platform?
>> Yes, pretty much. You get a terminal count/counter underflow interrupt and
>> that is it.
> Not on sc520 - The micro/millisecond counter cannot be used to driver an
> interrupt - you need to setup a seperate timer. I think the x86 internal
> performance counters are the same
What is true, as you have stated, is that the micro/millisecond counter 
on the sc520 does not interrupt at all. Nor do the x86 performance 
counters. The x86 performance counters are a non-problem because they 
are 64 bits long. We don't need interrupts for them. Now, if you choose 
to use the sc520 micro/millisecond counter, then you need another source 
of interrupts. Due to the fact that reading the sc520 counter resets it, 
we must accumulate the elapsed time in software. That means the 
interrupt routine must do a bit more work, but it also allows reading 
the counters in non-interrupt code (with interrupts disabled) to not 
mess up the accumulated count. We don't detect rollover in the sc520 
counters, we just read and accumulate the value. So no problem there.
>>>> comes in, it will do it wrong. If the interrupt rate was faster, it would
>>>> work, because get_ticks would always know the correct answer. Consider
>>>> the
>>>> following. Get ticks is called by the counter overflowing (which resets
>>>> it
>>>> to 0) and executes with the counter value at 0. Later, at the next
>>>> rollover,
>>>> with no intervening calls to get ticks, the interrupt routine calls get
>>>> ticks. Assuming negligible  interrupt latency, get_ticks just sees the
>>>> counter is still 0, so it will not detect a wrap condition. So now you
>>>> loose
>>> But wait a minute, don't you KNOW that the interrupt gets triggered on a
>>> rollover so therefore there MUST have been a rollover, therefore there has
>>> been 2^32 (or 2^16 or 2^whatever) ticks which you can just subtract from
>>> the
>>> last known tick count (which would be zero if there had been no get_timer
>>> calls in between)
>> Except if this interrupt was delayed because get_ticks turned off
>> interrupts, get_ticks may have already compensated the number. When we then
>> get the (delayed) interrupt, we will do it twice.
> That would mean get_timer() got called EXACTLY on the rollover - A race
> condition that should be avoided. But still, a race can still occur through
> using disable/enable interrupts which could push the timer isr out
>
>>>> a period. I thought by passing in a flag we could force get_ticks to do
>>>> the
>>>> right thing in this case, but since we must disable interrupts when we
>>>> call
>>>> get ticks from the get_timer routine, the outer call to get ticks may
>>>> have
>>>> already detected the rollover and the flag will force an additional
>>>> rollover
>>>> to be detected. It is a classic race condition. If we detect rollover
>>>> only
>>> Aha! - In nearly all situations, a race is better avoided than run!
>> Yes, that is why the new approach does NOT turn off interrupts when
>> get_ticks is called, but rather loops if the overflow count was changed
>> while reading the count lsb. That ensures get_ticks always gets both words
>> either before the counter wrapped or after the counter wrapped, but not
>> halfway in between. Some ISYNCs may be required to make sure there are no
>> outstanding changes pending to memory, depending on your architecture.
> OK, lets have a closer look
>
>>>> in the interrupt routine, we do not need to protect the rollover variable
>>>> from access by the main program, so we can be sure the results of
>>>> get_ticks
>>>> is coherent. If we try do do it in both places, we will have trouble. If
>>>> the
>>>> interrupt occurred at a faster rate, like say twice the rollover, we
>>>> wouldn't have a problem. But it doesn't. In most cases it happens at the
>>>> rollover rate, or just a bit more sometimes due to interrupt latency not
>>> Again does it really - for all arches?
>> Yep, pretty much. Almost all timers I know of only give one interrupt per
>> cycle.
> A bold assumption - and one that is wrong ;)
Ok, except for timers that don't interrupt at all and need an outside 
assist! In those cases, you need to interrupt faster than the counter 
period, which as your original constraint. But in that case, the 
interrupt is coming from elsewhere, not the timer being processed.
>>>> being a constant. It may appear to happen at somewhat less than a
>>>> rollover
>>>> as well, if the interrupt latency was long at time n-1 and short at time
>>>> n.
>>>> I think this problem can be overcome in the old design by keeping track
>>>> of
>>>> whether the last update was by a non-interrupt or interrupt call to
>>>> get_ticks , but I think the new design is better anyway.
>>> I disagree - The whole point of a HAL is to Abstract the Hardware
>> The HAL should abstract what needs abstracting, and no more. We are really
>> talking about literally 4 lines of code here, but I think those four lines
>> certainly belong in sync_timer. All the math in one place is a good thing.
>> In most cases, get_ticks becomes trivial. Only in cases where the timer is
>> wrong endian or embedded in another bit field (both things being possible
>> but not all that common) does get_ticks need to do anything other than just
>> read the registers and loop until the msb is stable. Simple here is good,
>> because this is what new implementers need to add/change. The less here, the
>> better! In fact, the looping until the msb is stable can also be moved up to
>> sync_timer. That is even simpler.
> OK, you got me - "show me the money^H^H^H^H^Hcode"
Wilco! You will get them by Monday Afternoon, California time, or sooner 
if I can stay awake long enough.

Best Regards,
Bill Campbell
> Regards,
>
> Graeme
>
>