[U-Boot] boot-up time optimization. Where to start?
Alexander Stein
alexander.stein at systec-electronic.com
Thu May 5 09:06:35 CEST 2011
Dear Wolfgang,
On Thursday, 5 May 2011, 07:32:20, Wolfgang Denk wrote:
> In message <201105030848.17576.alexander.stein at systec-electronic.com> you wrote:
> > This specific version was selected due to relocation problems on ARM. But
> > I expect the dcache doesn't have that big influence on the named code
> > part as the environment is already in RAM.
>
> Your expectation is most likely completely wrong. Reading from /
> writing to uncached RAM is painfully slow compared to a system with
> caches turned on. And if you - as I speculate - need to checksum a
> huge amount of data, this will delay things without need.
>
>
> Are you also still using the old environment code in your port, or is
> it the new, hash table based one? When using the old code, there are
> additional penalties for using a needlessly big environment as each
> call to setenv() will recalculate the checksum.
I dug into this problem for a short time. And yes, the CRC checksum
calculation takes about 25 ms per run. setenv is called once each for
stdin, stdout and stderr, which sums up to ~75 ms.
So you're right, this is the old environment code. Here a dcache will speed up
the execution, of course.
But our standard startup just starts U-Boot, copies the Linux kernel into
RAM and starts it. There is not much use for the dcache during the copy here.
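
The setenv cost described above (a full checksum pass on every call, three calls during console setup) can be sketched in plain C. This is only an illustration and not U-Boot's actual environment code: `toy_env`, `toy_setenv` and `crc32_bitwise` are hypothetical names, and the 128 KiB `ENV_SIZE` is an assumed value standing in for the "> 100 KiB" mentioned in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed environment size; the thread only says "> 100 KiB". */
#define ENV_SIZE (128u * 1024u)

/* Hypothetical stand-in for an old-style, flat environment. */
struct toy_env {
    uint32_t crc;          /* checksum over the whole data area */
    size_t used;           /* bytes occupied by "name=value\0" entries */
    size_t bytes_summed;   /* total bytes fed through the CRC so far */
    char data[ENV_SIZE];
};

/* Plain bitwise CRC-32 (reflected polynomial 0xEDB88320). */
static uint32_t crc32_bitwise(const unsigned char *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/*
 * Append one variable, then re-checksum the ENTIRE data area, used or not.
 * The cost per call therefore scales with ENV_SIZE, not with the entry.
 */
static int toy_setenv(struct toy_env *env, const char *name, const char *value)
{
    assert(env != NULL);

    size_t nlen = strlen(name);
    size_t vlen = strlen(value);
    size_t need = nlen + 1 + vlen + 1;   /* name '=' value '\0' */

    if (env->used + need > ENV_SIZE)
        return -1;                       /* environment full */

    char *p = env->data + env->used;
    memcpy(p, name, nlen);
    p[nlen] = '=';
    memcpy(p + nlen + 1, value, vlen + 1);
    env->used += need;

    env->crc = crc32_bitwise((const unsigned char *)env->data, ENV_SIZE);
    env->bytes_summed += ENV_SIZE;
    return 0;
}
```

Calling `toy_setenv` for stdin, stdout and stderr pushes 3 x `ENV_SIZE` bytes through the CRC even though only a few dozen bytes changed. With uncached RAM, that per-byte work is exactly where the ~25 ms per call would go.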
> > > (III) you are running on a narrow
> > > system bus (16 bit) with non-optimal RAM timings;
> >
> > It is using a 32-Bit RAM-Bus. So, no.
>
> And your NOR flash?
It is connected via 16 bits, which is all most devices support, but it is set
up to use page read mode.
> And your memory timings?
Should be pretty good.
> > > (IV) you do all this
> > > with caches turned off;
> >
> > dcaches should be off, while icaches are on. So yes and no.
>
> DC off makes things awfully slow. See comments of commits c3330e9,
> 95c6f6d and 7e4a9e6 - for plain RAM bound operations like
> copying/uncompressing an image from RAM to RAM switching on the DC can
> accelerate the system by a factor of up to >15.
Yes, from RAM to RAM, dcache will help a lot. But we neither copy from RAM to
RAM nor do we uncompress anything.
> > > (V) you measure some numbers but you don't
> > > understand what they mean.
> >
> > These numbers show me that this part of code increases the start time by
> > a considerable amount.
>
> You don't even understand that you have > 100 KiB of environment size
> which gets checksummed without need.
Mh, this might be an option for further ports.
> Fact is, the code that you claim takes 100 (or 500) ms to run has no
> potential for such a long run time unless your system is seriously
> misconfigured. I guess it runs at least 100 times faster on all
> systems I have access to.
Well, as already said, this is related to the CRC calculation of the
environment. I did a quick port to v2011.03 and setenv is a lot faster there,
which is due to the new env code base.
But I also noticed that kernel_entry is called about 30 ms later after reset
than on the old code base. I didn't investigate any further to see what
causes this. AFAICS the new U-Boot code doesn't enable dcache on ARM1136
either.
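
One way to narrow down where such a delay comes from is to record a timestamp at a few points between reset and kernel entry and compare the deltas. The following is a minimal, host-runnable sketch of that bookkeeping only. All names (`boot_trace`, `trace_mark`, `trace_delta`, `trace_slowest`) are hypothetical, and the caller is assumed to supply the current time in milliseconds from whatever free-running timer the board provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MARKS 16

/* One named point in the boot flow and the time it was reached. */
struct boot_mark {
    const char *name;
    uint32_t t_ms;       /* milliseconds since reset, supplied by caller */
};

struct boot_trace {
    struct boot_mark marks[MAX_MARKS];
    size_t count;
};

/* Record a mark; returns -1 once the table is full. */
static int trace_mark(struct boot_trace *t, const char *name, uint32_t now_ms)
{
    if (t->count >= MAX_MARKS)
        return -1;
    t->marks[t->count].name = name;
    t->marks[t->count].t_ms = now_ms;
    t->count++;
    return 0;
}

/* Time spent reaching mark i from the previous mark (or from reset). */
static uint32_t trace_delta(const struct boot_trace *t, size_t i)
{
    assert(i < t->count);
    uint32_t prev = (i == 0) ? 0 : t->marks[i - 1].t_ms;
    return t->marks[i].t_ms - prev;
}

/* Index of the mark with the largest delta; 0 if the table is empty. */
static size_t trace_slowest(const struct boot_trace *t)
{
    size_t worst = 0;

    for (size_t i = 1; i < t->count; i++)
        if (trace_delta(t, i) > trace_delta(t, worst))
            worst = i;
    return worst;
}
```

Running the same marks on the old and the new code base and comparing the per-stage deltas would show which stage accounts for the extra 30 ms, instead of only the total.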
Regards,
Alexander