[PATCH] arm64: Add support for bigger u-boot when CONFIG_POSITION_INDEPENDENT=y

Edgar E. Iglesias edgar.iglesias at xilinx.com
Thu Sep 3 15:59:04 CEST 2020


On Thu, Sep 03, 2020 at 02:52:39PM +0100, André Przywara wrote:
> On 03/09/2020 14:41, Michal Simek wrote:
> > 
> > 
> > On 02. 09. 20 20:59, André Przywara wrote:
> >> On 02/09/2020 16:25, Edgar E. Iglesias wrote:
> >>> On Wed, Sep 02, 2020 at 04:18:48PM +0100, André Przywara wrote:
> >>>> On 02/09/2020 15:53, Edgar E. Iglesias wrote:
> >>>>> On Wed, Sep 02, 2020 at 03:43:08PM +0100, André Przywara wrote:
> >>>>>> On 02/09/2020 12:15, Michal Simek wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>>>>
> >>>>>>> From: "Edgar E. Iglesias" <edgar.iglesias at xilinx.com>
> >>>>>>>
> >>>>>>> When U-Boot binary exceeds 1MB with CONFIG_POSITION_INDEPENDENT=y
> >>>>>>> compilation error is shown:
> >>>>>>> /mnt/disk/u-boot/arch/arm/cpu/armv8/start.S:71:(.text+0x3c): relocation
> >>>>>>> truncated to fit: R_AARCH64_ADR_PREL_LO21 against symbol `__rel_dyn_end'
> >>>>>>> defined in .bss_start section in u-boot.
> >>>>>>>
> >>>>>>> It is caused by the adr instruction, which permits the calculation of any
> >>>>>>> byte address within +- 1MB of the current PC.
> >>>>>>> Because U-Boot is bigger than 1MB, the calculation fails.
> >>>>>>>
> >>>>>>> The patch is using adrp/add instructions where adrp shifts a signed, 21-bit
> >>>>>>> immediate left by 12 bits (4k page), adds it to the value of the program
> >>>>>>> counter with the bottom 12 bits cleared to zero. The add instruction then
> >>>>>>> provides the lower 12 bits, which is the offset within the 4k page.
> >>>>>>> These two instructions together compose a full 32-bit offset, which should
> >>>>>>> be more than enough to cover the whole U-Boot size.
> >>>>>>>
> >>>>>>> Signed-off-by: Edgar E. Iglesias <edgar.iglesias at xilinx.com>
> >>>>>>> Signed-off-by: Michal Simek <michal.simek at xilinx.com>
> >>>>>>
> >>>>>> It's a bit scary that you need more than 1MB, but indeed what you do
> >>>>>> below is the canonical pattern to get the full range of PC relative
> >>>>>> addressing (this is used heavily in Trusted Firmware, for instance).
> >>>>>>
> >>>>>> The only thing to keep in mind is that this assumes that the load
> >>>>>> address of the binary is 4K aligned, so that the low 12 bits of the
> >>>>>> symbol stay the same. I wonder if we should enforce this somehow? But
> >>>>>> the load address is not controlled by the build process (the whole
> >>>>>> purpose of PIE), so that's not doable just in the build system?
> >>>>>
> >>>>> There shouldn't be any need for 4K alignment. Could you elaborate on
> >>>>> why you think there is?
> >>>>
> >>>> That seems to be slightly tricky, and I tried to get some confirmation,
> >>>> but here goes my reasoning. Maybe you can confirm this:
> >>>>
> >>>> - adrp takes the relative offset, but only of the upper 20 bits (because
> >>>> that's all we can encode). It clears the lower 12 bits of the register.
> >>>> - the "add" is not PC relative anymore, so it just takes the lower 12
> >>>> bits of the "absolute" linker symbol.
> >>>
> >>> I was under the impression that this would use a PC-relative lower 12-bit
> >>> relocation, but you are correct. I disassembled the result:
> >>>
> >>>   40:   91000042        add     x2, x2, #0x0
> >>>                         40: R_AARCH64_ADD_ABS_LO12_NC   __rel_dyn_start
> >>>
> >>>> So this assumes that the lower 12 bits of the actual address in memory
> >>>> and the lower 12 bits of the linker's view match.
> >>>> An example:
> >>>> 00024: adrp x0, SYMBOL
> >>>> 00028: add  x0, x0, :lo12:SYMBOL
> >>>>
> >>>> SYMBOL:
> >>>> 42058: ...
> >>>>
> >>>> The toolchain will generate:
> >>>> 	adrp x0, #0x42; add x0, x0, #0x058
> >>>>
> >>>> Now you load the code to 0x8000.0800 (NOT 4K aligned). SYMBOL is now at
> >>>> 0x80042858.
> >>>> The adrp will use the PC (0x8000.0824) & ~0xfff + offs => 0x8004.2000.
> >>>> The add will just add 0x58, so you end up with x0 being 0x80042058,
> >>>> which is not the right address.
> >>>>
> >>>> Does this make sense?
> >>>
> >>>
> >>> Yes, it makes sense.
> >>>
> >>>>
> >>>>> Perhaps the commit message is a little confusing. The toolchain will
> >>>>> compute the pc-relative offset from this particular location to the
> >>>>> symbol and apply the relocations accordingly.
> >>>>
> >>>> Yes, but the PC relative offset applies only to the upper 20 bits,
> >>>> because it's only adrp that has PC relative semantics.
> >>>>
> >>>>
> >>>>>>
> >>>>>> Shall we at least document this? I guess typical load address are
> >>>>>> actually quite well aligned, so it might not be an issue in practice.
> >>>>>>
> >>>
> >>> Yes, probably worth documenting and perhaps an early bail-out if it's not
> >>> the case...
> >>
> >> Documenting sounds good, Kconfig might be a good place, as Michal suggested.
> >>
> >> Bail out: I thought about that, it's very easy to detect at runtime, but
> >> what then? This is really early, so you could just enter a WFI loop, and
> >> hope for someone to connect the dots?
> >> Or can you think of any other way of communicating with the user?
> > 
> > Yes, it is very early. It is the first real task that runs after reset.
> > I am fine with detecting it to make sure that we won't have
> > unpredictable behavior later.
> > What detection code do you have in mind?
> 
> Just "adr"ing the beginning of the image (linker address 0), and
> checking for all 12 LSBs to be 0. The best I thought of would be a WFI
> loop if not. That sounds like 4 instructions or so in total to me.

Yeah, that sounds good to me too. With a good comment in the source code,
people would be able to connect the dots.
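
For illustration, the check described above comes down to testing the low
12 bits of the run-time image base. The following is only a minimal C sketch
of that logic, not the actual start.S code: the helper name and the idea of
passing the base in as an integer are assumptions made for this example. In
the real early-boot path this would be a handful of assembly instructions,
with the failure case parking the CPU in a WFI loop as suggested above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of U-Boot. image_base stands for the
 * value that "adr" of the image start (link address 0) would yield at
 * run time, i.e. the actual load address.
 */
static bool image_base_is_4k_aligned(uint64_t image_base)
{
	/* adrp/add is only correct if the 12 LSBs of the load address are 0. */
	return (image_base & UINT64_C(0xfff)) == 0;
}

/* Usage: the two load addresses discussed in this thread. */
static void example_checks(void)
{
	assert(image_base_is_4k_aligned(UINT64_C(0x80000000)));
	assert(!image_base_is_4k_aligned(UINT64_C(0x80000800)));
}
```
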

> 
> > Don't we even have this 4k alignment in place already?
> 
> Do you mean in linker scripts? I think what counts here is that the
> actual *load* address is 4K aligned, which I believe is out of control
> of U-Boot. I would guess that's up to the user (flash address) or
> previous boot-stages (BootROM or pre-SPL firmware) to set the actual
> load address. In the best case it's very platform dependent.
> But it is definitely variable, otherwise we wouldn't need PIE in the
> first place.

Right, it's the run-time load address that matters.
I guess we already have limitations that fail silently (e.g. a user can't
load U-Boot at address 1), but the 4K one may be more subtle and is possible
to catch.
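
To make the dependence on the run-time load address concrete, the adrp/add
arithmetic from André's example quoted above can be modelled in plain C.
This is a sketch under stated assumptions, not toolchain or U-Boot code: the
function and macro names are invented here, the image is assumed to be
linked at address 0, and only the address computation is modelled, not
instruction encodings.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_MASK_4K UINT64_C(0xfff)	/* hypothetical name, low 12 bits */

/*
 * Model of "adrp xN, sym; add xN, xN, :lo12:sym".
 * pc_link and sym_link are link-time addresses; load_base is the
 * run-time load address of an image linked at 0.
 */
static uint64_t adrp_add_result(uint64_t pc_link, uint64_t sym_link,
				uint64_t load_base)
{
	/* Fixed at link time: page delta (adrp) and absolute low bits (add). */
	int64_t page_delta = (int64_t)((sym_link & ~PAGE_MASK_4K) -
				       (pc_link & ~PAGE_MASK_4K));
	uint64_t lo12 = sym_link & PAGE_MASK_4K;

	/* At run time adrp uses the real PC with its low 12 bits cleared. */
	uint64_t pc_run = load_base + pc_link;
	uint64_t page = (pc_run & ~PAGE_MASK_4K) + (uint64_t)page_delta;

	return page + lo12;
}

/* Usage: the numbers from the example in this thread. */
static void thread_example_checks(void)
{
	/* Loaded at 0x80000800: result is 0x80042058, symbol is at 0x80042858. */
	assert(adrp_add_result(0x24, 0x42058, UINT64_C(0x80000800)) ==
	       UINT64_C(0x80042058));
	/* Loaded 4K aligned at 0x80000000: result matches the real address. */
	assert(adrp_add_result(0x24, 0x42058, UINT64_C(0x80000000)) ==
	       UINT64_C(0x80000000) + 0x42058);
}
```

With the unaligned base the computed value is off by exactly the low 12
bits of the load address (0x800 here), which is why checking those bits
early is sufficient.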

Cheers,
Edgar

