[PATCH 2/2] armv8: generic_timer: Use event stream for udelay
Peter Hoyes
peter.hoyes at arm.com
Wed Apr 24 12:25:29 CEST 2024
Hi Quentin,
On 4/23/24 11:55, Quentin Schulz wrote:
> Hi Peter,
>
> On 4/23/24 10:10, Peter Hoyes wrote:
>> From: Peter Hoyes <Peter.Hoyes at arm.com>
>>
>> Polling cntpct_el0 in a tight loop for delays is inefficient.
>> This is particularly apparent on Arm FVPs, which do not simulate
>> real time, meaning that a 1s sleep can take a couple of orders
>> of magnitude longer to execute in wall time.
>>
>> If running at EL2 or above (where CNTHCTL_EL2 is available), enable
>> the cntpct_el0 event stream temporarily and use wfe to implement
>> the delay more efficiently. The event period is chosen as a
>> trade-off between efficiency and the fact that Arm FVPs do not
>> typically simulate real time.
>>
>> This is only implemented for Armv8 boards, where an architectural
>> timer exists.
>>
>> Some mach-socfpga AArch64 boards already override __udelay to make
>> it always inline, so guard the functionality with a new
>> ARMV8_UDELAY_EVENT_STREAM Kconfig, enabled by default.
>>
>> Signed-off-by: Peter Hoyes <Peter.Hoyes at arm.com>
>> ---
>> arch/arm/cpu/armv8/Kconfig | 8 ++++++++
>> arch/arm/cpu/armv8/generic_timer.c | 27 +++++++++++++++++++++++++++
>> arch/arm/include/asm/system.h | 6 ++++--
>> 3 files changed, 39 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/cpu/armv8/Kconfig b/arch/arm/cpu/armv8/Kconfig
>> index 9f0fb369f7..544c5e2d74 100644
>> --- a/arch/arm/cpu/armv8/Kconfig
>> +++ b/arch/arm/cpu/armv8/Kconfig
>> @@ -191,6 +191,14 @@ config ARMV8_EA_EL3_FIRST
>> Exception handling at all exception levels for External Abort
>> and
>> SError interrupt exception are taken in EL3.
>> +config ARMV8_UDELAY_EVENT_STREAM
>> + bool "Use the event stream for udelay"
>> + default y if !ARCH_SOCFPGA
>> + help
>> + Use the event stream provided by the AArch64 architectural
>> timer for
>> + delays. This is more efficient than the default polling
>> + implementation.
>> +
>> menuconfig ARMV8_CRYPTO
>> bool "ARM64 Accelerated Cryptographic Algorithms"
>> diff --git a/arch/arm/cpu/armv8/generic_timer.c
>> b/arch/arm/cpu/armv8/generic_timer.c
>> index 8f83372cbc..e18b5c8187 100644
>> --- a/arch/arm/cpu/armv8/generic_timer.c
>> +++ b/arch/arm/cpu/armv8/generic_timer.c
>> @@ -115,3 +115,30 @@ ulong timer_get_boot_us(void)
>> return val / get_tbclk();
>> }
>> +
>> +#if CONFIG_IS_ENABLED(ARMV8_UDELAY_EVENT_STREAM)
>> +void __udelay(unsigned long usec)
>> +{
>> + u64 target = get_ticks() + usec_to_tick(usec);
>> +
>
> This can theoretically overflow, do we have any guarantee this cannot
> happen in real life, like... we would need U-Boot to be running for
> 100 years without being powered-down/reset or something like that? Can
> we document this assumption? Does this make sense?
>
>> + /* At EL2 or above, use the event stream to avoid polling
>> CNTPCT_EL0 so often */
>> + if (current_el() >= 2) {
>> + u32 cnthctl_val;
>> + const u8 event_period = 0x7;
>> +
>> + asm volatile("mrs %0, cnthctl_el2" : "=r" (cnthctl_val));
>> + asm volatile("msr cnthctl_el2, %0" : : "r"
>> + (cnthctl_val | CNTHCTL_EL2_EVNT_EN |
>> CNTHCTL_EL2_EVNT_I(event_period)));
>> +
>> + while (get_ticks() + (1ULL << event_period) <= target)
>
> This could be an overflow as well.
>
>> + wfe();
>> +
>> + /* Reset the event stream */
>> + asm volatile("msr cnthctl_el2, %0" : : "r" (cnthctl_val));
>> + }
>> +
>> + /* Fall back to polling CNTPCT_EL0 */
>> + while (get_ticks() <= target)
>
> get_ticks() could wrap around here maybe?
I don't see a problem here. It's using u64s and the maximum
COUNTER_FREQUENCY in the U-Boot source is 200MHz, which gives enough
ticks for about 3 millennia.
Additionally, the underlying "while" condition is the same as the
existing weak __udelay implementation
(https://source.denx.de/u-boot/u-boot/-/blob/master/lib/time.c?ref_type=heads#L181),
so this change doesn't affect the overflow limit.
Thanks,
Peter
More information about the U-Boot
mailing list