[U-Boot-Users] [PATCH 1/3] Initial OneNAND support

Chris Morgan chmorgan at gmail.com
Thu Mar 15 02:56:47 CET 2007


> > > > > For better performance I added a 32-byte aligned memcpy32.
> > > > > Please check it.
> > > >
> > > > Did you measure how much performance this gains? Is it
> > > > really worth the effort? In any case, the file needs a
> > > > GPL license header.
> > >
> > > We can feel that it is faster than before. OK, I added the
> > > GPL license.
> >
> > How much? 5%? 20%? 50%?
>
> The architecture-optimized copy is about 3 times faster on the OMAP 16xx
> series (ARM9 core) in U-Boot.
>
> Here are the test results:
>
> memory copy (unit: nsec)          96 MHz                192 MHz
> unsigned long *                   189,440 ~ 192,000     158,720 ~ 160,000
> architecture optimized copy        56,320 ~  58,880      53,760 ~  55,040
>
> where the time increment is 2560 nsec at 96 MHz and 1280 nsec at 192 MHz
>

On the issue of architecture-specific memcpy() routines: there appears
to be a problem with the ppc-optimized memcpy() in that it assumes
alignment of both the source and destination buffers. Our solution was
to disable the architecture-specific memcpy() routines. We did, however,
need the performance of a long-word copy, so we wrote our own memcpy()
routine that we use during ELF load. This isn't very elegant, however.

It would be useful to have generic C versions of memcpy(), memset(),
etc. for 8-, 16- and 32-bit (maybe 64-bit) architectures. Maybe a macro
with a SIZE parameter would be suitable for generating the functions.
The implementation would likely be very similar to your memcpy32(), and
it would be quite a bit easier to add support for unaligned pointers to
a C version of memcpy() than to modify the asm code for each
architecture.
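
To make the macro idea concrete, here is a rough sketch, not a tested
patch. The names DEFINE_MEMCPY and memcpy_w8/16/32/64 are made up for
illustration and do not exist in U-Boot. It copies whole words only when
source and destination share the same misalignment and falls back to a
byte loop otherwise. It also assumes the build tolerates word accesses
through cast pointers (for example with -fno-strict-aliasing).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical generator: DEFINE_MEMCPY(32) expands to memcpy_w32(),
 * which copies in uint32_t units where alignment allows and in bytes
 * everywhere else. SIZE must be a literal 8, 16, 32 or 64.
 *
 * Assumption: the word access through a cast pointer is acceptable to
 * the compiler (e.g. the code is built with -fno-strict-aliasing).
 */
#define DEFINE_MEMCPY(SIZE)                                               \
static inline void *memcpy_w##SIZE(void *dest, const void *src, size_t n) \
{                                                                         \
    typedef uint##SIZE##_t word_t;                                        \
    unsigned char *d = dest;                                              \
    const unsigned char *s = src;                                         \
    const uintptr_t mask = sizeof(word_t) - 1;                            \
                                                                          \
    /* Word copy only if both pointers have the same misalignment. */     \
    if ((((uintptr_t)d ^ (uintptr_t)s) & mask) == 0) {                    \
        /* Head: copy bytes until the pointers are word aligned. */       \
        while (n && ((uintptr_t)d & mask)) {                              \
            *d++ = *s++;                                                  \
            n--;                                                          \
        }                                                                 \
        /* Body: copy one aligned word per iteration. */                  \
        while (n >= sizeof(word_t)) {                                     \
            *(word_t *)(void *)d = *(const word_t *)(const void *)s;      \
            d += sizeof(word_t);                                          \
            s += sizeof(word_t);                                          \
            n -= sizeof(word_t);                                          \
        }                                                                 \
    }                                                                     \
    /* Tail, or the whole buffer if the alignments differ. */             \
    while (n--)                                                           \
        *d++ = *s++;                                                      \
    return dest;                                                          \
}

DEFINE_MEMCPY(8)
DEFINE_MEMCPY(16)
DEFINE_MEMCPY(32)
DEFINE_MEMCPY(64)
```

A caller would pick the width that matches the bus, e.g.
memcpy_w32(dst, src, len). When the two pointers are misaligned relative
to each other this sketch simply degrades to a byte copy; a shift-and-merge
loop would be faster but is left out to keep the example short.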

Chris



