[U-Boot] [PATCH 14/28] drivers/fsl-mc: Changed MC firmware loading for new boot architecture

Kim Phillips kim.phillips at freescale.com
Wed Mar 25 22:12:48 CET 2015


On Tue, 24 Mar 2015 21:32:56 -0500
Stuart Yoder <b08248 at gmail.com> wrote:

> Kim,

why is this being taken off-list?  Adding it back.  Also, please
don't top-post.

> I think it is premature to start focusing on saving single digit
> milliseconds in u-boot.  The 35 second boot time you are seeing on
> some ls2085 boards is due to the fact that the MC was running with
> it's CPU caches off until a few weeks ago.  DDR is still running at
> the slowest speed, which will affect the MC's performance.  It's still
> early in hardware bringup and things are not even stable yet.
> 
> I'm still convinced that the MC itself probably has significant
> performance work.
> 
> The 1ms delay used polling for MC to boot has nothing to do with DPL
> size.  DPL processing is a separate, later initialization step.  We're
> just waiting for the MC to initialize here.  The MC boot time
> shouldn't vary.  My visual reading watching things boot is that the MC
> boot takes a few hundred milliseconds.  I don't see what's wrong with
> a 1 ms polling delay here.
> 
> On a system with the MC running with caches on, u-boot took 20 seconds
> to boot.   The MC took about 5 seconds of that, most of that in DPL
> processing.
> 
> We would welcome help analyzing u-boot boot time and where the time is
> going.   But, seriously,  saving 1 ms is not going to help at all.

I did a quick experiment and saved ~50ms when setting both udelays
down to 50, which, sure, isn't a big deal given it's out of the
order of 10sec, but it's something, and, like I said, development
time for our users can be seriously helped if MC initialization were
omitted from the main u-boot boot sequence, and occurred only when
necessary, i.e., when users want to use one of the DP net
interfaces.  Most of the time when we boot today, we don't use DP
net interfaces, so MC init - with or without DPL processing - is
just a waste of our time!

Thanks,

Kim

> Thanks,
> Stuart
> 
> 
> 
> On Tue, Mar 24, 2015 at 10:35 AM, Kim Phillips
> <kim.phillips at freescale.com> wrote:
> > On Tue, 24 Mar 2015 10:14:39 -0500
> > Rivera Jose-B46482 <German.Rivera at freescale.com> wrote:
> >
> >> > From: Kim Phillips [mailto:kim.phillips at freescale.com]
> >> > Sent: Monday, March 23, 2015 5:06 PM
> >> >
> >> > On Mon, 23 Mar 2015 16:15:56 -0500
> >> > Rivera Jose-B46482 <German.Rivera at freescale.com> wrote:
> >> >
> >> > > > From: Kim Phillips [mailto:kim.phillips at freescale.com]
> >> > > > Sent: Monday, March 23, 2015 3:34 PM
> >> > > >
> >> > > > On Mon, 23 Mar 2015 15:06:11 -0500
> >> > > > Rivera Jose-B46482 <German.Rivera at freescale.com> wrote:
> >> > > >
> >> > > > > > From: Kim Phillips [mailto:kim.phillips at freescale.com]
> >> > > > > > Sent: Thursday, March 19, 2015 12:53 PM
> >> > > > > >
> >> > > > > > On Thu, 19 Mar 2015 09:45:45 -0700 York Sun
> >> > > > > > <yorksun at freescale.com> wrote:
> >> > > > > >
> >> > > > > > > From: "J. German Rivera" <German.Rivera at freescale.com>
> >> > > > > > >
> >> > > > > > > Changed MC firmware loading to comply with the new MC boot
> >> > > > > > architecture.
> >> > > > > > > Flush D-cache hierarchy after loading MC images. Add
> >> > > > > > > environment variables "mcboottimeout" for MC boot timeout in
> >> > > > > > > milliseconds, "mcmemsize" for MC DRAM block size. Check MC
> >> > > > > > > boot status before calling flib functions.
> >> > > > > >
> >> > > > > > Can we just assign actual and/or optimal values for
> >> > 'mcboottimeout'
> >> > > > > > and 'mcmemsize' instead of making them environment variables?
> >> > > > > >
> >> > > > > Having environment variables gives us the flexibility if these
> >> > > > > values need to be changed for a given customer configuration. The
> >> > > > > actual
> >> > > >
> >> > > > what defines a 'customer configuration,' and how does that manifest
> >> > > > itself at u-boot boot time?
> >> > > A DPL (data path layout - a device-tree-like structure describing The
> >> > > DPAA2 objects created at boot time and their connections)
> >> > >
> >> > > >  Is it the amount of time it takes to load (and execute?) firmare?
> >> > > Yes, bigger DPLs take longer to process by the MC.
> >> > >
> >> > > > Why isn't customer-specific firmware being loaded via linux?  All
> >> > > > u-boot needs is basic networking, pretty static
> >> > > > setup: fixed numbers for both memsize & timeout.
> >> > > This is not customer-specific firmware. What is customer-specific is
> >> > just the DPL.
> >> > > In order to have networking in u-boot, we need to load the MC firmware
> >> > > in u-boot, For cases in which the target system has only DPAA2-based
> >> > network interfaces.
> >> >
> >> > ok, for that case, when time comes for u-boot to do some DPAA2 networking
> >> > arrives (i.e., we shouldn't have to be loading firmware at board boot-
> >> > time), then we should load a minimal DPL for the number of singular, non-
> >> > virtual/switch, etc., interfaces for that board just to tftp: this
> >> > shouldn't be a big DPL at all, and its time complexity is fixed.
> >> >
> >> It is up to the customer to decide what kind of DPL they want to have.
> >
> > true, but in this case the 'customer' is the average upstream u-boot
> > user, presumably whose DPL is the simplest ethernet use-case for
> > tftp'ing kernels.
> >
> >> > > > > boot time of the MC and the amount of memory needed by the MC is
> >> > > > > dependent on how big/complex is the DPL used. Also, the memory
> >> > > > > needed by the MC needs to account for how much memory is needed
> >> > > > > for AIOP programs, which may depend on how big/complex they are.
> >> > > >
> >> > > > ok, that helps (modulo not knowing what 'DPL' is), but still, the
> >> > > > massive customer configurations should be being loaded via linux'
> >> > > > firmware loading infrastructure: u-boot should be using a static
> >> > > > image for u-boot's needs.
> >> > > >
> >> > > > > > > +static int wait_for_mc(bool booting_mc, u32 *final_reg_gsr) {
> >> > > > > > > + u32 reg_gsr;
> >> > > > > > > + u32 mc_fw_boot_status;
> >> > > > > > > + unsigned long timeout_ms = get_mc_boot_timeout_ms();
> >> > > > > > > + struct mc_ccsr_registers __iomem *mc_ccsr_regs =
> >> > > > > > > +MC_CCSR_BASE_ADDR;
> >> > > > > > > +
> >> > > > > > > + dmb();
> >> > > > > > > + debug("Polling mc_ccsr_regs->reg_gsr ...\n");
> >> > > > > > > + assert(timeout_ms > 0);
> >> > > > > > > + for (;;) {
> >> > > > > > > +         udelay(1000);   /* throttle polling */
> >> > > > > >
> >> > > > > > does this really need to be a whole 1ms?
> >> > > > >
> >> > > > > It is unlikely that the MC fw will boot in less than 1 ms.
> >> > > >
> >> > > > is wait_for_mc() only called for the boot command, or all commands?
> >> >
> >> > I see: there's a udelay(500) in mc_send_command(), which is too high,
> >> > too, IMO, but I'm not that familiar with the h/w:  How long does the
> >> > shortest command take?
> >
> > Can you answer this?
> >
> >> > > > > So, checking more frequently than 1 ms is not necessary.
> >> > > >
> >> > > > yes it is, because e.g., if it takes 1.1ms we will have wasted 0.9ms
> >> > > > on this.
> >> > > >
> >> > > How significant is to save 0.9ms of the whole boot time?
> >> >
> >> > Why waste 0.9ms of boot time when there's no need?  It already takes the
> >> > boards *way* too long to boot, and now I'm understanding
> >> > mc_send_command's delay should probably be adjusted, too.
> >> >
> >> I have not heard any complain about RDB/QDS boards taking too long to boot
> >> Due to this "wasteds 0.9ms".
> >
> > I'm complaining now :)
> >
> > And it's not just about this singleton constant; it's about the
> > general approach to not optimizing the code, vis-a-vis the
> > mc_send_command() delay I bring up above.
> >
> >> Can you support your statement about LS2 RDB/QDS boards taking too long to boot
> >> with actual numbers? Otherwise, I will not make any change.
> >
> > A board takes anywhere from 17 to 35 seconds to boot, and, from the
> > prompts, a lot - if not most - of this time is spent during MC
> > processing.  If we take boot time down to say half of that, that
> > would work miracles in how we spend our development time on them.
> >
> >> > > As the comment in the code says, the intent was to throttle down the
> >> > > polling, to reduce traffic on the system bus due to polling. This
> >> > > traffic competes with the MC itself accessing the system bus, as it
> >> > > boots. Having the polling traffic get in the way of the MC traffic may
> >> > > increase the MC boot time. Too small delay between polls may cause the
> >> > > MC boot time to increase more than the .9ms you are concerned of
> >> > wasting in the delay.
> >> > >
> >> > > What value would you suggest to use for the delay instead 1000ms?
> >> >
> >> > I don't know MC h/w:  what's the shortest boot time given a standard
> >> > simple "DPL"?:  A small fraction of that.
> >> >
> >> I will not make any change at this time as there is no evidence that
> >> this optimization is actually needed.
> >
> > this kind of attitude leaves a lot to be desired...:(
> >
> > Kim
> > _______________________________________________
> > U-Boot mailing list
> > U-Boot at lists.denx.de
> > http://lists.denx.de/mailman/listinfo/u-boot


More information about the U-Boot mailing list