[U-Boot] boot-up time optimization. Where to start?

Alexander Stein alexander.stein at systec-electronic.com
Thu May 5 09:06:35 CEST 2011


Dear Wolfgang,

On Thursday, 5 May 2011, 07:32:20, Wolfgang Denk wrote:
> In message <201105030848.17576.alexander.stein at systec-electronic.com> you wrote:
> > This specific version was selected due to relocation problems on ARM. But
> > I expect the dcache doesn't have that big influence on the named code
> > part as the environment is already in RAM.
> 
> Your expectation is most likely completely wrong.  Reading from /
> writing to uncached RAM is painfully slow compared to a system with
> caches turned on.  And if you - as I speculate - need to checksum a
> huge amount of data, this will delay things without need.
> 
> 
> Are you also still using the old environment code in your port, or is
> it the new, hash table based one?  When using the old code, there are
> additional penalties for using a needlessly big environment as each
> call to setenv() will recalculate the checksum.

I dug into this problem for a short time. And yes, the CRC checksum 
calculation takes about 25 ms per run. setenv is called once each for stdin, 
stdout and stderr, which sums up to ~75 ms.
So you're right, this is the old environment code. A dcache will of course 
speed up the execution here.
But our standard startup just starts U-Boot, copies the Linux kernel into 
RAM and starts it. The dcache is of little use during that copy.
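
To make the cost concrete, here is a minimal sketch (not the actual U-Boot code) of an environment that recomputes a CRC-32 over the whole environment area on every set operation, which is the behaviour described above for the old environment code. All names (demo_env, demo_setenv, demo_crc32) are hypothetical, and the 128 KiB size is only an assumption standing in for "> 100 KiB".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed size of the environment area; stands in for "> 100 KiB". */
#define DEMO_ENV_SIZE (128u * 1024u)

/* Plain bitwise CRC-32 (reflected polynomial 0xEDB88320). */
static uint32_t demo_crc32(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Hypothetical flat environment: "name=value\0" entries plus a checksum. */
struct demo_env {
    uint32_t crc;                    /* checksum over the whole data area */
    size_t used;                     /* bytes occupied by entries */
    unsigned long bytes_checksummed; /* instrumentation for this sketch */
    uint8_t data[DEMO_ENV_SIZE];
};

static void demo_env_init(struct demo_env *env)
{
    assert(env != NULL);
    memset(env, 0, sizeof(*env));
    env->crc = demo_crc32(env->data, DEMO_ENV_SIZE);
}

/* Append one entry, then checksum the ENTIRE area again (the costly part). */
static int demo_setenv(struct demo_env *env, const char *name,
                       const char *value)
{
    assert(env != NULL && name != NULL && value != NULL);

    size_t need = strlen(name) + 1 + strlen(value) + 1;

    /* Keep one spare zero byte so the entry list stays terminated. */
    if (need >= DEMO_ENV_SIZE - env->used)
        return -1;

    (void)snprintf((char *)env->data + env->used, need, "%s=%s", name, value);
    env->used += need;

    env->crc = demo_crc32(env->data, DEMO_ENV_SIZE);
    env->bytes_checksummed += DEMO_ENV_SIZE;
    return 0;
}
```

With these assumptions, calling demo_setenv() for stdin, stdout and stderr walks the full area three times (3 x 128 KiB), although only a few dozen bytes actually change. The work scales with the size of the environment area, not with the size of the variable being set.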

> > > (III) you are running on a narrow
> > > system bus (16 bit) with non-optimal RAM timings;
> > 
> > It is using a 32-Bit RAM-Bus. So, no.
> 
> And your NOR flash?

It is connected via 16 bits, which is all most devices support, but it is set 
up to use page read mode.

> And your memory timings?

Should be pretty good.

> > > (IV) you do all this
> > > with caches turned off;
> > 
> > dcaches should be off, while icaches are on. So yes and no.
> 
> DC off makes things awfully slow.  See comments of commits c3330e9,
> 95c6f6d and 7e4a9e6 - for plain RAM bound operations like
> copying/uncompressing an image from RAM to RAM switching on the DC can
> accelerate the system by a factor of up to >15.

Yes, from RAM to RAM, dcache will help a lot. But we neither copy from RAM to 
RAM nor do we uncompress anything.

> > > (V) you measure some numbers but you don't
> > > understand what they mean.
> > 
> > These numbers show me that this part of code increases the start time by
> > a considerable amount.
> 
> You don't even understand that you have > 100 KiB of environment size
> which gets checksummed without need.

Hm, reducing the environment size might be an option for future ports.

> Fact is, the code that you claim takes 100 (or 500) ms to run has no
> potential for such a long run time unless your system is seriously
> misconfigured.  I guess it runs at least 100 times faster on all
> systems I have access to.

Well, as already said, this is related to the CRC calculation of the 
environment. I did a quick port to v2011.03 and setenv is a lot faster, which 
is due to the new env code base.
But I also noticed that kernel_entry is called about 30 ms later after reset 
than on the old code base. I didn't investigate any further to see what 
caused this. AFAICS the new U-Boot code doesn't enable the dcache on ARM1136 
either.

Regards,
Alexander

