[U-Boot] Relocation size penalty calculation

Graeme Russ graeme.russ at gmail.com
Sat Oct 17 07:17:04 CEST 2009


On Thu, Oct 15, 2009 at 3:45 AM, J. William Campbell
<jwilliamcampbell at comcast.net> wrote:
> Joakim Tjernlund wrote:
>>
>> Graeme Russ <graeme.russ at gmail.com> wrote on 14/10/2009 13:48:27:
>>
>>>
>>> On Wed, Oct 14, 2009 at 6:25 PM, Joakim Tjernlund
>>> <joakim.tjernlund at transmode.se> wrote:
>>>
>>>>
>>>> "J. William Campbell" <jwilliamcampbell at comcast.net> wrote on 14/10/2009
>>>> 01:48:52:
>>>>
>>>>>
>>>>> Joakim Tjernlund wrote:
>>>>>
>>>>>>
>>>>>> Graeme Russ <graeme.russ at gmail.com> wrote on 13/10/2009 22:06:56:
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund
>>>>>>> <joakim.tjernlund at transmode.se> wrote:
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Graeme Russ <graeme.russ at gmail.com> wrote on 13/10/2009 13:21:05:
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund
>>>>>>>>> <joakim.tjernlund at transmode.se> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Graeme Russ <graeme.russ at gmail.com> wrote on 11/10/2009 12:47:19:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [Massive Snip :)]
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> So, all that is left are .dynsym and .dynamic ...
>>>>>>>>>>>  .dynsym
>>>>>>>>>>>    - Contains 70 entries (16 bytes each, 1120 bytes)
>>>>>>>>>>>    - 44 entries mimic those entries in .got which are not
>>>>>>>>>>> relocated
>>>>>>>>>>>    - 21 entries are the remaining symbols exported from the
>>>>>>>>>>> linker
>>>>>>>>>>>      script
>>>>>>>>>>>    - 4 entries are labels defined in inline asm and used in C
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Try adding proper asm declarations. Look at what gcc
>>>>>>>>>> generates for a function/variable and mimic these.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks - Now .dynsym contains only exports from the linker script
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> :)
>>>>>>>>
>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>    - 1 entry is a NULL entry
>>>>>>>>>>>
>>>>>>>>>>>  .dynamic
>>>>>>>>>>>    - 88 bytes
>>>>>>>>>>>    - Array of Elf32_Dyn
>>>>>>>>>>>    - typedef struct {
>>>>>>>>>>>          Elf32_Sword     d_tag;
>>>>>>>>>>>          union {
>>>>>>>>>>>              Elf32_Word  d_val;
>>>>>>>>>>>              Elf32_Addr  d_ptr;
>>>>>>>>>>>          } d_un;
>>>>>>>>>>>      } Elf32_Dyn;
>>>>>>>>>>>    - 0x11 entries
>>>>>>>>>>>      [00] 0x00000010, 0x00000000 DT_SYMBOLIC, (ignored)
>>>>>>>>>>>      [01] 0x00000004, 0x38059994 DT_HASH, points to .hash
>>>>>>>>>>>      [02] 0x00000005, 0x380595AB DT_STRTAB, points to .dynstr
>>>>>>>>>>>      [03] 0x00000006, 0x3805BDCC DT_SYMTAB, points to .dynsym
>>>>>>>>>>>      [04] 0x0000000A, 0x000003E6 DT_STRSZ, size of .dynstr
>>>>>>>>>>>      [05] 0x0000000B, 0x00000010 DT_SYMENT, ???
>>>>>>>>>>>      [06] 0x00000015, 0x00000000 DT_DEBUG, ???
>>>>>>>>>>>      [07] 0x00000011, 0x3805A8F4 DT_REL, points to .rel.text
>>>>>>>>>>>      [08] 0x00000012, 0x000014D8 DT_RELSZ, ???
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> How big DT_REL is
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>      [09] 0x00000013, 0x00000008 DT_RELENT, ???
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> hmm, cannot remember :)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> How big an entry in DT_REL is
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> Right, how could I forget :)
>>>>>>>>
>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>      [0a] 0x00000016, 0x00000000 DT_TEXTREL, ???
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Oops, you got text relocations. This is generally a bad thing.
>>>>>>>>>> TEXTREL is commonly caused by asm code that isn't truly PIC, so it
>>>>>>>>>> needs to modify the .text segment to adjust for relocation.
>>>>>>>>>> You should get rid of this one. Look for DT_TEXTREL in .o files to
>>>>>>>>>> find the culprit.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Alas I cannot - The relocations are a result of loading a register
>>>>>>>>> with a return address when calling show_boot_progress in the very
>>>>>>>>> early stages of initialisation prior to the stack becoming available.
>>>>>>>>> The x86 does not allow direct access to the IP so the only way to find
>>>>>>>>> the 'current execution address' is to 'call' to the next instruction
>>>>>>>>> and pop the return address off the stack.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> hmm, same as ppc, but that in itself should not cause a TEXTREL,
>>>>>>>> should it?
>>>>>>>> Ahh, the 'call' is absolute, not relative? I guess there is some way
>>>>>>>> around it, but it is not important ATM I guess.
>>>>>>>>
>>>>>>>> Evil idea, skip -fpic et al. and add the full reloc procedure
>>>>>>>> to relocate by rewriting directly in the TEXT segment. Then you save
>>>>>>>> space but you need more relocation code. Something like dl_do_reloc
>>>>>>>> from uClibc. Wonder how much extra code that would be? Not too much I
>>>>>>>> think.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> With the following flags
>>>>>>>
>>>>>>> PLATFORM_RELFLAGS += -fvisibility=hidden
>>>>>>> PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
>>>>>>> PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic
>>>>>>> -Bsymbolic-functions
>>>>>>>
>>>>>>> I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I
>>>>>>> think
>>>>>>> this might mean I need the symbol table in the binary in order to
>>>>>>> resolve
>>>>>>> them
>>>>>>>
>>>>>>>
>>>>
>>>> BTW, how many relocs do you get compared with -fPIC? I suspect you get
>>>> more now but hopefully not that many more.
>>>>
>>>>
>>>>>>
>>>>>> Possibly, but I think you only need to add an offset to all those
>>>>>> relocs.
>>>>>>
>>>>>>
>>>>>
>>>>> Almost right. The relocations specify a symbol value that needs to be
>>>>> added to the data in memory to relocate the reference. The symbol
>>>>> values
>>>>> involved should be the start of the text section for program
>>>>> references,
>>>>> the start of the uninitialized data section for bss references, and the
>>>>> start of the data section for initialized data and constants. So there
>>>>> are about four symbols whose value you need to keep. Take a look at
>>>>> http://refspecs.freestandards.org/elf/elf.pdf (which you have probably
>>>>> already looked at) and it tells you what to do with R_386_PC32 and
>>>>> R_386_32 relocations. Hopefully the objcopy with the --strip-unneeded
>>>>> will remove all the symbols you don't actually need, but I don't know
>>>>> that for sure. Note also that you can change the section flags of a
>>>>> section marked noload  to load.
>>>>>
>>>>
>>>> Still think you can get away with just ADDING an offset. The image is
>>>> linked to a specific address and then you move the whole image to a new
>>>> address. Therefore you should be able to read the current address, add
>>>> offset, write back the new address.
>>>>
>>>
>>> OK, I don't really get this at all....
>>>
>>> This code:
>>>
>>>    printf ("\n\n%s\n\n", version_string);
>>>
>>> gets compiled into:
>>>
>>>    380403e7:   68 a4 18 05 38   push   $0x380518a4
>>>    380403ec:   68 de 2c 05 38   push   $0x38052cde
>>>    380403f1:   e8 4f 84 00 00   call   38048845 <printf>
>>>
>>> With relocation entries in .rel.text of:
>>>
>>>  Offset     Info    Type            Sym.Value  Sym. Name
>>>   380403e8  00016201 R_386_32          380519f0   version_string
>>>   380403ed  00000201 R_386_32          380519f0   .rodata
>>>   380403f2  00016b02 R_386_PC32        38048991   printf
>>>
>>> Now I get the first two (R_386_32) entries - Relocation involves a simple
>>> addition of an offset to the values at addresses 0x380403e8 and
>>> 0x380403ed
>>> (of course, these addresses will be offset)
>>>
>>> However, the R_386_PC32 is an enigma - The call is already relative -
>>> there is no need to relocate it at all (call is a position independent
>>> opcode because it is a relative jump!)
>>>
>>
>> Yes, but printf is defined in glibc so the app needs to relocate the call
>> to glibc.
>
> Actually, the reason the call is relocatable is that the compiler DOESN'T
>  KNOW where printf is at all. If it is in a library, it will not be in the
> text segment and must be relocated accordingly. It may be in  a different
> segment for some reason. In any case, the compiler doesn't know the address
> in the image where printf resides, so it needs a relocation entry to get the
> value filled in at link time. After the value is filled in, if the
> referenced symbol is in the same segment (probably .text) as the point of
> reference, the relocation reference is probably of no more use. However,
> there is no rule that says the linker must delete the reference from the
> relocation list.
>>
>>  U-boot has all it needs so there you should not have PC32 I think.
>> Try defining a local static function. For non static functions
>> you may need to define visibility=hidden and/or -Bsymbolic too.
>>
>
> Won't help. Any symbols referenced but not defined locally are relocatable.
> After linking, they MAY, but need not, go away.
>>
>> You also need to look at the img after final linking.
>>
>
> After linking, if the symbol is defined, the R_386_PC32 is no longer
> important UNLESS the symbol referenced is in a different segment AND the
> segments are relocated with different offsets from each other than
> originally linked. For this reason, I think the linker will not discard
> these relocations. If we are not relocating the segments with different
> relative offsets, we can ignore these relocations as the change in offset
> will come out to be zero anyway. However, if you process them normally, you
> will just add 0 and nothing will change.
>>
>>
>>>
>>> Will all R_386_PC32 be like this? Can I simply ignore them all? If so,
>>> why
>>> do they even need to be generated?
>>>
>>
>> Hopefully you won't have any.
>
> I think they may still be there, because we ask the linker to preserve
> relocation information. However, if the entire image is being relocated, not
> changing the order or relative offset of any segments, they can be ignored,
> because the relative values will not change. It will be interesting to know
> if they remain or if the linker drops them out. For references in the same
> segment, we can hope that they get dropped. For references across segments
> (if any), or any undefined symbols, they will remain.
>>
>>  Not sure about weak functions though. These might
>> need PC32 relocs in some cases.
>>
>
> There can be PC32 relocs referencing the weak symbol, but that symbol may be
> undefined.
>>
>> Also, if you look at _dl_do_reloc() in uClibc/ldso/ldso/i386/elfinterp.c I
>> think
>> you can replace symbol_addr with relocation offset.
>>
>
> I agree, in the case you are moving the entire image and ignoring PC32 relocs.
>
> Best Regards,
> Bill Campbell
>>
>>    Jocke
>>

Apologies if this is getting way off-topic for a simple boot loader, but
this is information I have gathered from far and wide over the net. I am
surprised that there isn't a web site out there on 'How to create a
relocatable boot loader'...

OK, it's all starting to come together now - It helps when you look at the
right files ;)

Firstly, u-boot.map

                0x380589a0                __rel_dyn_start = .

.rel.dyn        0x380589a0     0x42b0
 *(.rel.dyn)
 .rel.got       0x00000000        0x0 cpu/i386/start.o
 .rel.plt       0x00000000        0x0 cpu/i386/start.o
 .rel.text      0x380589a0     0x2e28 cpu/i386/start.o
 .rel.start16   0x3805b7c8       0x10 cpu/i386/start.o
 .rel.data      0x3805b7d8      0xc18 cpu/i386/start.o
 .rel.rodata    0x3805c3f0      0x360 cpu/i386/start.o
 .rel.u_boot_cmd
                0x3805c750      0x500 cpu/i386/start.o
                0x3805cc50                __rel_dyn_end = .


And the output of readelf...

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        38040000 001000 0118a4 00  AX  0   0  4
  [ 2] .rel.text         REL             00000000 066c68 005d00 08     40   1  4
  [ 3] .rodata           PROGBITS        380518a4 0128a4 005da5 00   A  0   0  4
  [ 4] .rel.rodata       REL             00000000 06c968 000360 08     40   3  4
  [ 5] .interp           PROGBITS        38057649 018649 000013 00   A  0   0  1
  [ 6] .dynstr           STRTAB          3805765c 01865c 0001ee 00   A  0   0  1
  [ 7] .hash             HASH            3805784c 01884c 0000cc 04   A 11   0  4
  [ 8] .data             PROGBITS        38057918 018918 000a3c 00  WA  0   0  4
  [ 9] .rel.data         REL             00000000 06ccc8 000c18 08     40   8  4
  [10] .got.plt          PROGBITS        38058354 019354 00000c 04  WA  0   0  4
  [11] .dynsym           DYNSYM          38058360 019360 000200 10   A  6   1  4
  [12] .dynamic          DYNAMIC         38058560 019560 000080 08  WA  6   0  4
  [13] .u_boot_cmd       PROGBITS        380585e0 0195e0 0003c0 00  WA  0   0  4
  [14] .rel.u_boot_cmd   REL             00000000 06d8e0 000500 08     40  13  4
  [15] .bss              NOBITS          3805cc50 01ec50 001a34 00  WA  0   0  4
  [16] .bios             PROGBITS        00000000 01e000 00053e 00  AX  0   0  1
  [17] .rel.bios         REL             00000000 06dde0 0000c0 08     40  16  4
  [18] .rel.dyn          REL             380589a0 0199a0 0042b0 08   A 11   0  4
  [19] .start16          PROGBITS        0000f800 01e800 000110 00  AX  0   0  1
  [20] .rel.start16      REL             00000000 06dea0 000038 08     40  19  4
  [21] .resetvec         PROGBITS        0000fff0 01eff0 000010 00  AX  0   0  1
  [22] .rel.resetvec     REL             00000000 06ded8 000008 08     40  21  4

...

Relocation section '.rel.text' at offset 0x66c68 contains 2976 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
38040010  00000101 R_386_32          38040000   .text
3804001e  00000101 R_386_32          38040000   .text
38040028  00000101 R_386_32          38040000   .text
3804003f  00000101 R_386_32          38040000   .text
38040051  00000101 R_386_32          38040000   .text
38040075  00000101 R_386_32          38040000   .text
38040085  00000101 R_386_32          38040000   .text
3804009d  0003e602 R_386_PC32        380403fa   load_uboot
380400a6  00000101 R_386_32          38040000   .text
38040015  00029f02 R_386_PC32        3804bdd8   early_board_init
38040023  0003f702 R_386_PC32        3804bdda   show_boot_progress_asm

...

Relocation section '.rel.rodata' at offset 0x6c968 contains 108 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
38051908  00000201 R_386_32          380518a4   .rodata
38051938  00000201 R_386_32          380518a4   .rodata
38051968  00000201 R_386_32          380518a4   .rodata
38051998  00000201 R_386_32          380518a4   .rodata
380519c8  00000201 R_386_32          380518a4   .rodata
380519f8  00000201 R_386_32          380518a4   .rodata

...

Relocation section '.rel.dyn' at offset 0x199a0 contains 2134 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0000f838  00000008 R_386_RELATIVE
0000f846  00000008 R_386_RELATIVE
38040010  00000008 R_386_RELATIVE
3804001e  00000008 R_386_RELATIVE
38040028  00000008 R_386_RELATIVE
3804003f  00000008 R_386_RELATIVE
38040051  00000008 R_386_RELATIVE
38040075  00000008 R_386_RELATIVE
38040085  00000008 R_386_RELATIVE

Notice that, apart from .rel.dyn, none of the .rel.* sections have the
A (Allocated) flag set, so they do not end up in the stripped binary image.
.rel.dyn is allocated in the binary image: the R_386_PC32 entries from the
other .rel.* sections have been discarded and the R_386_32 entries have been
'converted' to R_386_RELATIVE, which are simple to adjust (locate in memory
and adjust by the relocation offset).
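
For reference, a minimal sketch of how the Info column in the readelf
listings above breaks down, assuming the standard ELF32 encoding (relocation
type in the low byte, symbol index in the upper 24 bits). The helper names
are made up for illustration and are not part of U-Boot:

```c
#include <stdint.h>

/* Relocation type numbers from the i386 ELF ABI */
enum { R_386_32 = 1, R_386_PC32 = 2, R_386_RELATIVE = 8 };

/* Low byte of r_info is the relocation type (what ELF32_R_TYPE extracts) */
static unsigned reloc_type(uint32_t r_info)
{
    return r_info & 0xff;
}

/* Upper 24 bits of r_info are the symbol index (what ELF32_R_SYM extracts) */
static unsigned reloc_sym(uint32_t r_info)
{
    return r_info >> 8;
}

/*
 * Hypothetical helper: does an entry need patching when the whole image
 * is moved by a single offset?  1 = yes, 0 = no, -1 = not covered here.
 */
static int needs_fixup(uint32_t r_info)
{
    switch (reloc_type(r_info)) {
    case R_386_32:
    case R_386_RELATIVE:
        return 1;   /* an absolute address is stored, so it must move */
    case R_386_PC32:
        return 0;   /* a relative displacement does not change */
    default:
        return -1;  /* other types are outside this sketch */
    }
}
```

With the values from the listings, 0x00016b02 (printf) decodes as type 2
(R_386_PC32), while 0x00000008 decodes as type 8 (R_386_RELATIVE) with symbol
index 0, so processing .rel.dyn needs no symbol lookup.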

The relocation fixup is really easy:

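	/* __rel_dyn_start and __rel_dyn_end are set in the linker script */
	Elf32_Rel *rel_dyn_start = (Elf32_Rel *)&__rel_dyn_start;
	Elf32_Rel *rel_dyn_end = (Elf32_Rel *)&__rel_dyn_end;
	Elf32_Rel *re;

	for (re = rel_dyn_start; re < rel_dyn_end; re++)
	{
		/* Skip entries below TEXT_BASE (e.g. those in .start16) */
		if (re->r_offset >= TEXT_BASE)
			/* Only adjust values that point into the image */
			if (*(ulong *)re->r_offset >= TEXT_BASE)
				/* Patch the copy at (r_offset - rel_offset) */
				*(ulong *)(re->r_offset - rel_offset) -= (Elf32_Addr)rel_offset;
	}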
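
The same idea can be tried out on a host machine. The following is only a
sketch under assumptions: it patches a byte buffer holding a copy of the
image, and 'delta' is defined as new base minus link base (the opposite sign
to rel_offset above). The struct and function names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define R_386_RELATIVE 8    /* relocation type number from the i386 ELF ABI */

/* Same layout as Elf32_Rel, redeclared so the sketch does not need <elf.h> */
struct rel_entry {
    uint32_t r_offset;  /* link-time address of the word to patch */
    uint32_t r_info;    /* (symbol index << 8) | relocation type */
};

/*
 * Patch 'image', a copy of 'size' bytes that was linked to run at
 * 'link_base', so that it can run at link_base + delta.
 * Returns the number of words patched.
 */
static size_t apply_relative(uint8_t *image, size_t size, uint32_t link_base,
                             uint32_t delta, const struct rel_entry *rel,
                             size_t count)
{
    size_t i, patched = 0;

    for (i = 0; i < count; i++) {
        uint32_t pos, word;

        if ((rel[i].r_info & 0xff) != R_386_RELATIVE)
            continue;   /* other types are not handled here */
        if (rel[i].r_offset < link_base)
            continue;   /* target lies outside the copied image */

        pos = rel[i].r_offset - link_base;
        if (size < sizeof(word) || pos > size - sizeof(word))
            continue;   /* would touch bytes past the end of the buffer */

        /* memcpy because operands inside code are rarely 4-byte aligned */
        memcpy(&word, image + pos, sizeof(word));
        word += delta;  /* unsigned arithmetic, wraps modulo 2^32 */
        memcpy(image + pos, &word, sizeof(word));
        patched++;
    }
    return patched;
}
```

Unlike the loop above, this sketch does not check that the stored value is
itself at or above the link base before adjusting it.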

The size penalty is ~17kB of extra data (which is not copied to RAM) and
a tiny amount of relocation code (easily offset by removal of other fixups
such as the command table fixup).
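
The ~17kB figure can be checked against the .rel.dyn size in u-boot.map.
A small sketch of the arithmetic, assuming 8-byte Elf32_Rel entries (the ES
column in the readelf output); the function names are hypothetical:

```c
#include <stdint.h>

enum { REL_ENTRY_BYTES = 8 };   /* an Elf32_Rel is two 32-bit words */

/* Number of relocation entries in a section of the given size */
static uint32_t rel_entry_count(uint32_t section_bytes)
{
    return section_bytes / REL_ENTRY_BYTES;
}

/* Section size in tenths of a KiB, rounded down, to avoid floating point */
static uint32_t tenths_of_kib(uint32_t section_bytes)
{
    return (section_bytes * 10u) / 1024u;
}
```

For the 0x42b0 (17072) bytes of .rel.dyn this gives 2134 entries, matching
the count readelf reports, and a little under 16.7 KiB.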

And without using the pic flag in gcc, there is no GOT and no associated
performance penalty.

Thanks for everyone's help (especially Jocke and Bill)

Regards,

Graeme

