[U-Boot] [PATCH] arm: Use optimized memcpy and memset from linux

Matthias Weißer weisserm at arcor.de
Tue Jan 25 11:55:22 CET 2011


On 24.01.2011 21:07, Wolfgang Denk wrote:
> OK - so which results do you see in real-life use, say when loading and
> booting an OS? How much boot time can be saved?

All tests were done on jadecpu.

                       | HEAD(1)| HEAD(1)| HEAD(2)| HEAD(2)|
                       |        | +patch |        | +patch |
-----------------------+--------+--------+--------+--------+
Reset to prompt        |  438ms |  330ms |  228ms |  120ms |
                       |        |        |        |        |
TFTP a 3MB img         | 4782ms | 3428ms | 3245ms | 2820ms |
                       |        |        |        |        |
FATLOAD USB a 3MB img* | 8515ms | 8510ms | ------ | ------ |
                       |        |        |        |        |
BOOTM LZO img in RAM   | 3473ms | 3168ms |  592ms |  592ms |
  where CRC is         |  615ms |  615ms |   54ms |   54ms |
  uncompress           | 2460ms | 2462ms |  450ms |  451ms |
  final boot_elf       |  376ms |   68ms |   65ms |   65ms |
                       |        |        |        |        |
BOOTM LZO img in FLASH | 3207ms | 2902ms | 1050ms | 1050ms |
  where CRC is         |  600ms |  600ms |  135ms |  135ms |
  uncompress           | 2209ms | 2211ms |  828ms |  828ms |
  final boot_elf       |  376ms |   68ms |   65ms |   65ms |

(1) No dcache
(2) dcache enabled in board_init
*Does not work when dcache is on

I think we can see that these patches have no negative impact when only
execution speed is taken into consideration. The gain is noticeable when
caching is not used or not activated. For pure RAM-to-RAM copies with
caching activated, the patch didn't change anything.

Here are some additional numbers for copying a 1.4MB image from NOR to RAM:

HEAD                  : 134ms
HEAD + patch          : 72ms
HEAD + dcache         : 120ms
HEAD + dcache + patch : 70ms

So, for copies from flash to RAM there is also an improvement. As boot
times are somewhat critical for us, every improvement > 10ms is
interesting for us.

> I guess the speed improvement you see for a few large copy operations
> is just one side - probably there will be slower execution (due to the
> effort to set up the operations) for the (many more frequent) small
> operations.  In addition, there is an increase of the memory footprint
> of nearly 1 kB.
>
> I think additional measurements need to be done - for example, we
> should check how the execution times change for typical operations
> like TFTP download, reading from NAND flash and MMC/SDcard, booting a
> Linux kernel etc.

As the tests above show, there is no negative performance impact in the
test cases I have run. As we don't use Linux here, I can't test booting
a Linux kernel. Maybe someone else can jump in here.

> Also, it should be possible to enable this feature conditionally, so
> users can decide whether speed or size is more important in their
> configurations.

Would it be an option to use the CONFIG entries CONFIG_USE_ARCH_MEMCPY 
and CONFIG_USE_ARCH_MEMSET to enable that feature? If that is OK I can 
send a new version of the patch. The only problem I see with this 
approach is that there are architectures which already have their own 
implementations which are then not affected by these config options.


Regards
Matthias

