[PATCH] arm64: Add support for bigger u-boot when CONFIG_POSITION_INDEPENDENT=y

Michal Simek michal.simek at xilinx.com
Thu Sep 3 16:03:00 CEST 2020



On 03. 09. 20 15:59, Edgar E. Iglesias wrote:
> On Thu, Sep 03, 2020 at 02:52:39PM +0100, André Przywara wrote:
>> On 03/09/2020 14:41, Michal Simek wrote:
>>>
>>>
>>> On 02. 09. 20 20:59, André Przywara wrote:
>>>> On 02/09/2020 16:25, Edgar E. Iglesias wrote:
>>>>> On Wed, Sep 02, 2020 at 04:18:48PM +0100, André Przywara wrote:
>>>>>> On 02/09/2020 15:53, Edgar E. Iglesias wrote:
>>>>>>> On Wed, Sep 02, 2020 at 03:43:08PM +0100, André Przywara wrote:
>>>>>>>> On 02/09/2020 12:15, Michal Simek wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>>>>
>>>>>>>>> From: "Edgar E. Iglesias" <edgar.iglesias at xilinx.com>
>>>>>>>>>
>>>>>>>>> When U-Boot binary exceeds 1MB with CONFIG_POSITION_INDEPENDENT=y
>>>>>>>>> compilation error is shown:
>>>>>>>>> /mnt/disk/u-boot/arch/arm/cpu/armv8/start.S:71:(.text+0x3c): relocation
>>>>>>>>> truncated to fit: R_AARCH64_ADR_PREL_LO21 against symbol `__rel_dyn_end'
>>>>>>>>> defined in .bss_start section in u-boot.
>>>>>>>>>
>>>>>>>>> It is caused by the adr instruction, which permits the calculation of any
>>>>>>>>> byte address within +- 1MB of the current PC.
>>>>>>>>> Because U-Boot is bigger than 1MB, the calculation fails.
>>>>>>>>>
>>>>>>>>> The patch uses adrp/add instructions, where adrp shifts a signed, 21-bit
>>>>>>>>> immediate left by 12 bits (4k page) and adds it to the value of the program
>>>>>>>>> counter with the bottom 12 bits cleared to zero. The add instruction then
>>>>>>>>> provides the lower 12 bits, which is the offset within the 4k page.
>>>>>>>>> These two instructions together compose a full 32bit offset, which should be
>>>>>>>>> more than enough to cover the whole u-boot size.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Edgar E. Iglesias <edgar.iglesias at xilinx.com>
>>>>>>>>> Signed-off-by: Michal Simek <michal.simek at xilinx.com>
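
For readers who want to check the ranges in the commit message above, here
is a minimal Python sketch of the reach of a single adr versus adrp. It is
only a model of the immediate fields, not toolchain code; the function names
and the example addresses are made up for illustration.

```python
# Model of the reach of AArch64 ADR versus ADRP (illustrative only).
# Both instructions carry a signed 21-bit immediate; ADR counts bytes,
# ADRP counts 4K pages.

IMM21_MIN, IMM21_MAX = -(1 << 20), (1 << 20) - 1

def adr_reaches(pc, target):
    """True if a single ADR at `pc` can encode the address `target`."""
    return IMM21_MIN <= target - pc <= IMM21_MAX

def adrp_reaches(pc, target):
    """True if ADRP at `pc` can encode the 4K page that holds `target`."""
    page_delta = (target >> 12) - (pc >> 12)   # difference in 4K pages
    return IMM21_MIN <= page_delta <= IMM21_MAX

# Hypothetical layout: code near the start, symbol 1.5 MiB further on.
pc, target = 0x3c, 0x180000
print(adr_reaches(pc, target))    # False
print(adrp_reaches(pc, target))   # True
```

The low 12 bits of the address are not covered by adrp at all; they come
from the following add.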
>>>>>>>>
>>>>>>>> It's a bit scary that you need more than 1MB, but indeed what you do
>>>>>>>> below is the canonical pattern to get the full range of PC relative
>>>>>>>> addressing (this is used heavily in Trusted Firmware, for instance).
>>>>>>>>
>>>>>>>> The only thing to keep in mind is that this assumes that the load
>>>>>>>> address of the binary is 4K aligned, so that the low 12 bits of the
>>>>>>>> symbol stay the same. I wonder if we should enforce this somehow? But
>>>>>>>> the load address is not controlled by the build process (the whole
>>>>>>>> purpose of PIE), so that's not doable just in the build system?
>>>>>>>
>>>>>>> There shouldn't be any need for 4K alignment. Could you elaborate on
>>>>>>> why you think there is?
>>>>>>
>>>>>> That seems to be slightly tricky, and I tried to get some confirmation,
>>>>>> but here goes my reasoning. Maybe you can confirm this:
>>>>>>
>>>>>> - adrp takes the relative offset, but only of the upper 20 bits (because
>>>>>> that's all we can encode). It clears the lower 12 bits of the register.
>>>>>> - the "add" is not PC relative anymore, so it just takes the lower 12
>>>>>> bits of the "absolute" linker symbol.
>>>>>
>>>>> I was under the impression that this would use a PC-relative lower 12bit
>>>>> relocation but you are correct. I disassembled the result:
>>>>>
>>>>>   40:   91000042        add     x2, x2, #0x0
>>>>>                         40: R_AARCH64_ADD_ABS_LO12_NC   __rel_dyn_start
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> So this assumes that the lower 12 bits of the actual address in memory
>>>>>> and the lower 12 bits of the linker's view match.
>>>>>> An example:
>>>>>> 00024: adrp x0, SYMBOL
>>>>>> 00028: add  x0, x0, :lo12:SYMBOL
>>>>>>
>>>>>> SYMBOL:
>>>>>> 42058: ...
>>>>>>
>>>>>> The toolchain will generate:
>>>>>> 	adrp x0, #0x42; add x0, x0, #0x058
>>>>>>
>>>>>> Now you load the code to 0x8000.0800 (NOT 4K aligned). SYMBOL is now at
>>>>>> 0x80042858.
>>>>>> The adrp will use the PC (0x8000.0824) & ~0xfff + offs => 0x8004.2000.
>>>>>> The add will just add 0x58, so you end up with x0 being 0x80042058,
>>>>>> which is not the right address.
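
The arithmetic in the example above can be replayed with a small Python
model. This is only a sketch: it assumes the image is linked at address 0,
as in the example, and the helper name adrp_add is invented here.

```python
PAGE_MASK = 0xfff

def adrp_add(link_pc, link_sym, load_base):
    """Address produced at run time by
         adrp x0, SYM
         add  x0, x0, :lo12:SYM
    link_pc and link_sym are link-time addresses (image linked at 0);
    load_base is where the image actually sits in memory."""
    # Link time: the page delta is encoded in ADRP, the low 12 bits of
    # the absolute symbol address are encoded in ADD.
    page_delta = (link_sym & ~PAGE_MASK) - (link_pc & ~PAGE_MASK)
    lo12 = link_sym & PAGE_MASK
    # Run time: ADRP starts from the real PC with its low 12 bits cleared.
    run_pc = load_base + link_pc
    return (run_pc & ~PAGE_MASK) + page_delta + lo12

for base in (0x80000000, 0x80000800):
    got = adrp_add(0x24, 0x42058, base)
    want = base + 0x42058          # where SYMBOL really is
    print(hex(base), hex(got), hex(want), got == want)
# 0x80000000 0x80042058 0x80042058 True
# 0x80000800 0x80042058 0x80042858 False
```

With the 4K-aligned base the result matches; with the base at 0x80000800 the
computed address is 0x800 short, as described above.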
>>>>>>
>>>>>> Does this make sense?
>>>>>
>>>>>
>>>>> Yes, it makes sense.
>>>>>
>>>>>>
>>>>>>> Perhaps the commit message is a little confusing. The toolchain will
>>>>>>> compute the pc-relative offset from this particular location to the
>>>>>>> symbol and apply the relocations accordingly.
>>>>>>
>>>>>> Yes, but the PC relative offset applies only to the upper 20 bits,
>>>>>> because it's only adrp that has PC relative semantics.
>>>>>>
>>>>>>
>>>>>>>>
>>>>>>>> Shall we at least document this? I guess typical load addresses are
>>>>>>>> actually quite well aligned, so it might not be an issue in practice.
>>>>>>>>
>>>>>
>>>>> Yes, probably worth documenting and perhaps an early bail-out if it's not
>>>>> the case...
>>>>
>>>> Documenting sounds good, Kconfig might be a good place, as Michal suggested.
>>>>
>>>> Bail out: I thought about that, it's very easy to detect at runtime, but
>>>> what then? This is really early, so you could just enter a WFI loop, and
>>>> hope for someone to connect the dots?
>>>> Or can you think of any other way of communicating with the user?
>>>
>>> Yes, it is very early. It is the first real task that runs after reset.
>>> I am fine with detecting it to make sure that we won't have
>>> unpredictable behavior later.
>>> What detection code do you have in mind?
>>
>> Just "adr"ing the beginning of the image (linker address 0), and
>> checking for all 12 LSBs to be 0. The best I thought of would be a WFI
>> loop if not. That sounds like 4 instructions or so in total to me.
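
The check described above boils down to one mask test. Here is a minimal
Python model of it; the function name is hypothetical, and the assembly in
the comment is an untested sketch of what such a check might look like, not
code from the patch.

```python
def load_address_ok(image_base):
    """Model of the proposed early check: the run-time address of the
    image start (what adr of linker address 0 would return) must have
    its 12 least significant bits clear."""
    return (image_base & 0xfff) == 0

# Untested, hypothetical shape of the same check in start.S:
#       adr   x0, _start
#       tst   x0, #0xfff
#       b.eq  1f
#   0:  wfi
#       b     0b
#   1:

print(load_address_ok(0x80000000))  # True
print(load_address_ok(0x80000800))  # False
```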
> 
> Yeah, that sounds good to me too. With a good comment in the source code, people would be able to connect the dots.
> 
>>
>>> Don't we even have this 4k alignment in place already?
>>
>> Do you mean in linker scripts? I think what counts here is that the
>> actual *load* address is 4K aligned, which I believe is out of control
>> of U-Boot. I would guess that's up to the user (flash address) or
>> previous boot-stages (BootROM or pre-SPL firmware) to set the actual
>> load address. In the best case it's very platform dependent.
>> But it is definitely variable, otherwise we wouldn't need PIE in the
>> first place.
> 
> Right, it's the run-time load address that matters.
> I guess we already have limitations that fail silently (e.g. a user can't load U-Boot at
> address 1) but the 4K one may be more subtle and possible to catch.

Edgar: Do you want to send v2 with it? Or do you want me to send v2 with
Kconfig update and checking alignment?
Also, as you know, you wrote the original patch and I added the
description based on our discussion. If you want me to send v2, it is
a good time to make that commit message more accurate. :-)

Thanks,
Michal


