[U-Boot] [RFC PATCH] arm: provide a CONFIG flag for disabling relocation
Albert ARIBAUD
albert.u.boot at aribaud.net
Wed Sep 21 16:23:56 CEST 2011
On 21/09/2011 14:31, Andreas Bießmann wrote:
> Dear Albert,
>
> On Wed 21 Sep 2011 14:03:09 CEST, Albert ARIBAUD wrote:
>> On 21/09/2011 13:20, Andreas Bießmann wrote:
>>> Dear "GROYER, Anthony",
>>> Dear Albert,
>>>
>>> On Wed 21 Sep 2011 12:51:33 CEST, Albert ARIBAUD wrote:
>>>> On 21/09/2011 11:29, GROYER, Anthony wrote:
>>>>
>>>
>>> <snip>
>>>
>>>> 3) replace use of r9-r10 with e.g. r10-r11 in the copy loop, to preserve
>>>> r9 during relocation.
>>>
>>> If this code is being changed anyway, I would like to discuss another
>>> point here.
>>> In my last changeset for relocation I found an implementation in
>>> a/a/c/pxa/start.S which saves the registers to the stack before copy_loop,
>>> uses almost all registers (only r8, which is gd_t for arm, is not used, but
>>> I think it could be used here too because it is saved on the stack) and
>>> restores the registers later on.
>>> I guess this could speed up the copy_loop a bit, but that needs to be proven.
>>> Anthony, if you change all start.S files, could you consider this as well?
>>
>> I am not 100% sure I get your point, but I assume that you are asking
>> for *removal* of the saving and restoring, right?
>
> No, that was not the point. I think the 'save registers before copy_loop
> to use more registers for ldmia/stmia instructions' is a good solution
> which could improve the copy_loop for all arm implementations.
I see -- you want to do an ldmia/stmia with a lot of regs.
>> I would tend to
>> agree that saving and restoring registers in relocate_code is moot, as
>> this function does not return in the usual sense.
>
> No, the code saves the registers before copy_loop and restores them right
> afterwards. Therefore even r8 could be used in the copy_loop, because it is
> preserved on the (newly created) stack. Have a look at a/a/c/pxa/start.S
> from line 241 (relocate_code) to 263 (end of copy_loop).
> But I guess the ldmia/stmia instructions could even use r3-r11; only
> r0-r2 need to be preserved for loop counting.
> I wonder if this could improve the copy_loop ... I will try to test it
> in the next few days, if no one else can do it (Anthony?).
Apart from your question (how is the number of registers in ldmia and
stmia related to the speed of the copy loop?) there is another
one: how do we handle the fact that the length to copy may not be a
multiple of the ldmia/stmia 'width'? Even in arm926ejs/start.S, two
registers are used, but the alignment for text+data is 4 bytes, not 8.
This has not bitten us so far, and should not, since we're going to copy
into the space after .text, which *should* be .bss, which we'll zero
right after. But Murphy's law could hit...
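
To make the width question concrete, here is a rough C model of such a
loop. It is only a sketch, not U-Boot code: it assumes the loop copies one
block of `regs` words and then compares the source position against the
end, and the names block_copy_words and overrun_words are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model (not U-Boot code) of a copy loop that moves `regs`
 * 32-bit words per iteration, the way an ldmia/stmia pair with `regs`
 * registers would. Assumption: the loop copies a block first and only
 * then tests whether the end has been reached, so it always copies
 * whole blocks. Both buffers must be large enough for the rounded-up
 * length. Returns the number of words actually written to dst.
 */
size_t block_copy_words(uint32_t *dst, const uint32_t *src,
                        size_t len_words, size_t regs)
{
    size_t copied = 0;

    assert(regs > 0);
    do {
        for (size_t i = 0; i < regs; i++)
            dst[copied + i] = src[copied + i];
        copied += regs;
    } while (copied < len_words);

    return copied;
}

/* Number of words written past the intended end of the destination. */
size_t overrun_words(size_t len_words, size_t regs)
{
    size_t copied = 0;

    assert(regs > 0);
    do {
        copied += regs;
    } while (copied < len_words);

    return copied - len_words;
}
```

Under these assumptions, two registers with a length of 3 words write
4 words, i.e. one word (4 bytes) too many, which matches the 4-byte
alignment versus 8-byte width case above. With nine registers (r3-r11)
the worst case grows to 8 words (32 bytes) past the end, so the length
would have to be padded to the block size or the tail copied separately.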
>> As for r8, it should be preserved as it points to gd, but that is
>> ensured by the C code already IIRC.
>
> We use -ffixed-r8, therefore the compiler takes care of the C part, but
> we need to respect this in asm.
In arm926ejs/start.S we do. If there are other start.S files where r8 is
trashed, they should indeed be fixed.
> Well, if we save r8 before the copy_loop and restore it right
> afterwards, we could use it in the copy_loop for copy purposes, because
> there is no dereferencing of r8 in copy_loop.
>
> best regards
>
> Andreas Bießmann
Best regards,
--
Albert.