[U-Boot] [PATCH 050/126] x86: timer: Reduce timer code size in TPL on Intel CPUs

Bin Meng bmeng.cn at gmail.com
Mon Oct 14 02:00:46 UTC 2019


Hi Simon,

On Sun, Oct 13, 2019 at 1:55 AM Simon Glass <sjg at chromium.org> wrote:
>
> Hi Bin,
>
> On Fri, 11 Oct 2019 at 23:18, Bin Meng <bmeng.cn at gmail.com> wrote:
> >
> > Hi Simon,
> >
> > On Sat, Oct 12, 2019 at 11:38 AM Simon Glass <sjg at chromium.org> wrote:
> > >
> > > Hi Bin,
> > >
> > > On Fri, 11 Oct 2019 at 07:19, Bin Meng <bmeng.cn at gmail.com> wrote:
> > > >
> > > > Hi Simon,
> > > >
> > > > On Fri, Oct 11, 2019 at 1:06 AM Simon Glass <sjg at chromium.org> wrote:
> > > > >
> > > > > Hi Bin,
> > > > >
> > > > > On Sat, 5 Oct 2019 at 08:36, Bin Meng <bmeng.cn at gmail.com> wrote:
> > > > > >
> > > > > > Hi Simon,
> > > > > >
> > > > > > On Wed, Sep 25, 2019 at 10:58 PM Simon Glass <sjg at chromium.org> wrote:
> > > > > > >
> > > > > > > Most of the timer-calibration methods are not needed on recent Intel CPUs
> > > > > > > and just increase code size. Add an option to use the known-good way to
> > > > > > > get the clock frequency in TPL. Size reduction is about 700 bytes.
> > > > > > >
> > > > > > > Signed-off-by: Simon Glass <sjg at chromium.org>
> > > > > > > ---
> > > > > > >
> > > > > > >  drivers/timer/Kconfig     | 29 +++++++++++++++++++----------
> > > > > > >  drivers/timer/tsc_timer.c |  7 +++++--
> > > > > > >  2 files changed, 24 insertions(+), 12 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/timer/Kconfig b/drivers/timer/Kconfig
> > > > > > > index 5f4bc6edb67..90bc8ec7c53 100644
> > > > > > > --- a/drivers/timer/Kconfig
> > > > > > > +++ b/drivers/timer/Kconfig
> > > > > > > @@ -117,16 +117,6 @@ config RENESAS_OSTM_TIMER
> > > > > > >           Enables support for the Renesas OSTM Timer driver.
> > > > > > >           This timer is present on Renesas RZ/A1 R7S72100 SoCs.
> > > > > > >
> > > > > > > -config X86_TSC_TIMER_EARLY_FREQ
> > > > > > > -       int "x86 TSC timer frequency in MHz when used as the early timer"
> > > > > > > -       depends on X86_TSC_TIMER
> > > > > > > -       default 1000
> > > > > > > -       help
> > > > > > > -         Sets the estimated CPU frequency in MHz when TSC is used as the
> > > > > > > -         early timer and the frequency can neither be calibrated via some
> > > > > > > -         hardware ways, nor got from device tree at the time when device
> > > > > > > -         tree is not available yet.
> > > > > > > -
> > > > > > >  config OMAP_TIMER
> > > > > > >         bool "Omap timer support"
> > > > > > >         depends on TIMER
> > > > > > > @@ -174,6 +164,25 @@ config X86_TSC_TIMER
> > > > > > >         help
> > > > > > >           Select this to enable Time-Stamp Counter (TSC) timer for x86.
> > > > > > >
> > > > > > > +config X86_TSC_TIMER_EARLY_FREQ
> > > > > > > +       int "x86 TSC timer frequency in MHz when used as the early timer"
> > > > > > > +       depends on X86_TSC_TIMER
> > > > > > > +       default 1000
> > > > > > > +       help
> > > > > > > +         Sets the estimated CPU frequency in MHz when TSC is used as the
> > > > > > > +         early timer and the frequency can neither be calibrated via some
> > > > > > > +         hardware ways, nor got from device tree at the time when device
> > > > > > > +         tree is not available yet.
> > > > > > > +
> > > > > > > +config TPL_X86_TSC_TIMER_NATIVE
> > > > > > > +       bool "x86 TSC timer uses native calibration"
> > > > > > > +       depends on TPL && X86_TSC_TIMER
> > > > > > > +       help
> > > > > > > > +         Selects native timer calibration for TPL and doesn't include the other
> > > > > > > > +         methods in the code. This helps to reduce code size in TPL and works
> > > > > > > > +         on fairly modern Intel chips. Code-size reduction is about 700
> > > > > > > > +         bytes.
> > > > > > > +
> > > > > > >  config MTK_TIMER
> > > > > > >         bool "MediaTek timer support"
> > > > > > >         depends on TIMER
> > > > > > > diff --git a/drivers/timer/tsc_timer.c b/drivers/timer/tsc_timer.c
> > > > > > > index 919caba8a14..9630036bc7f 100644
> > > > > > > --- a/drivers/timer/tsc_timer.c
> > > > > > > +++ b/drivers/timer/tsc_timer.c
> > > > > > > @@ -49,8 +49,7 @@ static unsigned long native_calibrate_tsc(void)
> > > > > > >                 return 0;
> > > > > > >
> > > > > > >         crystal_freq = tsc_info.ecx / 1000;
> > > > > > > -
> > > > > > > -       if (!crystal_freq) {
> > > > > > > +       if (!CONFIG_IS_ENABLED(X86_TSC_TIMER_NATIVE) && !crystal_freq) {
> > > > > > >                 switch (gd->arch.x86_model) {
> > > > > > >                 case INTEL_FAM6_SKYLAKE_MOBILE:
> > > > > > >                 case INTEL_FAM6_SKYLAKE_DESKTOP:
> > > > > > > @@ -405,6 +404,10 @@ static void tsc_timer_ensure_setup(bool early)
> > > > > > >                 if (fast_calibrate)
> > > > > > >                         goto done;
> > > > > > >
> > > > > > > +               /* Reduce code size by dropping other methods */
> > > > > > > +               if (CONFIG_IS_ENABLED(X86_TSC_TIMER_NATIVE))
> > > > > > > +                       panic("no timer");
> > > > > > > +
> > > > > >
> > > > > > I don't get it. How could this reduce the code size? I don't see any
> > > > > > #ifdefs around the other methods we want to drop?
> > > > >
> > > > > The compiler sees that CONFIG_IS_ENABLED(..) is 1, and leaves out the
> > > > > code that follows it.
> > > >
> > > > Why?
> > > >
> > > > if (1)
> > > >     panic("no timer");
> > > >
> > > > then the compiler does not generate any code for the following?
> > > >
> > > > fast_calibrate = cpu_mhz_from_cpuid();
> > > >
> > > > I don't understand.
> > > >
> > >
> > > The panic() function is marked as noreturn, so the compiler assumes it
> > > doesn't return. You can try this if you like. It reduces the size by
> > > 700 bytes which on a 22KB image is a lot.
> >
> > OK, the compiler is smart enough to generate less code :)
> >
> > But the way you added the CONFIG_IS_ENABLED(..) logic check here is
> > obscure unless one digs that deep ..
> >
> > >
> > > > Besides, I think adding some random Kconfig options to exclude some
> > > > specific parts in one C file is a bad idea. It's unclear to me why we
> > > > should exclude one part versus another part. I'm OK to exclude the
> > > > whole C file for TPL/SPL though, but not part of it for size
> > > > limitation purpose.
> > >
> > > My understanding is that most of the code in this function is a
> > > fallback in case an earlier method doesn't work. But on modern CPUs
> >
> > Yes, correct.
> >
> > > the first method always works, so this is a waste of time?
> > >
> >
> > It's not a waste of time, but it does bloat the code size. As you said,
> > these are fallbacks, and methods are prioritized based on the age of
> > the processors, so that native method is tried first, followed by
> > cpuid, MSR, and finally PIT.
> >
> > You also mentioned that "on modern CPUs the first method always
> > works", so today the first method is native_calibrate_tsc(), but say 3
> > years later, this might not be true, and chances are that we may add
> > another method before native_calibrate_tsc() for whatever mechanism is
> > used on the latest processors, and the insertion of the TPL Kconfig
> > option (TPL_X86_TSC_TIMER_NATIVE) check today is not future proof.
>
> That's right, it is not. Perhaps we need to have separate timer
> drivers for different generations? But if not, I am loath to have 700
> bytes of dead code in TPL, which is why I added the option.
>

I asked the TPL question in another thread. I think I will need to
understand why size is a problem that prevents the latest x86 processors
from having just a single U-Boot that boots from the reset vector to the
shell.

> If we later need to adjust it, we can do so, but this cuts off the
> worst of the bloat.

Regards,
Bin

