[U-Boot] boot-up time optimization. Where to start?
Wolfgang Denk
wd at denx.de
Thu May 5 09:27:49 CEST 2011
Dear Alexander Stein,
In message <201105050906.35834.alexander.stein at systec-electronic.com> you wrote:
>
> > Are you also still using the old environment code in your port, or is
> > the new, hash table based one? When using the old code, there are
> > additional penalties for using a needlessly big environment as each
> > call to setenv() will recalculate the checksum.
>
> I was digging into this problem for a short time. And yes, the CRC
> checksum calculation takes about 25ms each run. So setenv is called for each
> of stdin, stdout and stderr, which sums up to ~75ms.
> So you're right this is the old environment code. Here a dcache will speed up
> the execution of course.
Even more so would reducing the environment size to some reasonable
value. Currently you are using some 128 KiB, so say you set the
environment size to 8 KiB. This would be 1/16 of your current size,
which means the ~75ms would shrink to less than 5 ms. You are wasting
70 ms (only here - there are other places which will add to this
figure) just because of this inappropriate configuration.
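The estimate above can be sketched in a few lines of C. This is only an
illustration, not U-Boot code: the function name is hypothetical, and it
assumes that the CRC time grows linearly with the number of bytes
checksummed and that the current environment is 16 times 8 KiB, i.e.
128 KiB, as the 1/16 ratio implies.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Estimate the checksum time after resizing the environment.
 * Assumption: CRC cost is proportional to the number of bytes covered.
 * Hypothetical helper for illustration only; not part of U-Boot.
 */
double scaled_crc_time_ms(double measured_ms, size_t old_size, size_t new_size)
{
    assert(old_size > 0);   /* avoid division by zero */
    return measured_ms * (double)new_size / (double)old_size;
}
```

Calling scaled_crc_time_ms(75.0, 128 * 1024, 8 * 1024) gives 75 / 16,
roughly 4.7 ms, which matches the "less than 5 ms" figure. It is still
only an estimate; the real number has to be measured on the target.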
> But our standard startup just starts U-Boot and copies the Linux kernel into
> RAM and starts it. There is not much use of dcache during copy here.
You are wrong. There is a huge difference between performing a copy
operation in single write cycles to uncached RAM versus writing to a
cached area where the cache flushes will operate in burst mode. Also,
the U-Boot code will run faster, too, so copying and decompression are
much faster.
You repeat the same mistake again: you make assumptions about what
may or may not be fast or slow on your system without actually
measuring it. Donald Knuth is right again: "Premature optimization is
the root of all evil."
> > > It is using a 32-Bit RAM-Bus. So, no.
> >
> > And your NOR flash?
>
> It is connected 16-bit like most devices only support, but it is setup to use
> page read mode.
Well, many systems use two 16 bit chips in parallel to give a 32 bit
bus.
> > DC off makes things awfully slow. See comments of commits c3330e9,
> > 95c6f6d and 7e4a9e6 - for plain RAM bound operations like
> > copying/uncompressing an image from RAM to RAM switching on the DC can
> > accelerate the system by a factor of up to >15.
>
> Yes, from RAM to RAM, dcache will help a lot. But we neither copy from RAM to
> RAM nor do we do any uncompressing.
There is still a huge difference in memory bandwidth between using plain
single write cycles versus burst mode accesses.
Don't speculate. Measure yourself!
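As a starting point for such a measurement, here is a minimal sketch of
a timing helper in standard C. The function name is hypothetical, and it
uses clock() from <time.h> so that it can be tried on a host system; on
the target you would presumably use U-Boot's own timer (for example
get_timer()) and run the same copy once with the data cache off and once
with it on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/*
 * Copy len bytes from src to dst `repeats` times and return the average
 * time per copy in milliseconds. Returns 0.0 if repeats is not positive.
 * Hypothetical helper for illustration only; not part of U-Boot.
 */
double time_copy_ms(void *dst, const void *src, size_t len, int repeats)
{
    clock_t start, end;
    int i;

    assert(dst != NULL && src != NULL);
    if (repeats <= 0)
        return 0.0;

    start = clock();
    for (i = 0; i < repeats; i++)
        memcpy(dst, src, len);
    end = clock();

    /* clock() counts in units of 1/CLOCKS_PER_SEC seconds */
    return 1000.0 * (double)(end - start) / CLOCKS_PER_SEC / repeats;
}
```

Use a buffer about the size of your kernel image and enough repeats to
get well above the timer resolution. An optimizing compiler may remove
copies whose result is never used, so check the destination afterwards.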
Best regards,
Wolfgang Denk
--
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
"Send lawyers, guns and money..." - Lyrics from a Warren Zevon song