[U-Boot] Query on CONFIG_SYS_THUMB_BUILD

Stefan Agner stefan at agner.ch
Tue Nov 18 19:37:18 CET 2014


On 2014-11-18 17:07, Stefan Agner wrote:
> On 2014-11-14 15:01, Simon Glass wrote:
>> Hi Victor,
>>
>> On 13 November 2014 09:29, Victor Ascroft <victorascroft at gmail.com> wrote:
>>> Hello,
>>>
>>> I am working with a Cortex A5 Freescale Vybrid Processor. Since a thumb build leads to a saving of almost 1 MB for my u-boot image and consequently to faster serial downloads I have been looking at this. Currently enabling this option leads to a hang.
>>>
>>> After some debugging I have narrowed the place of the hang to "ldr pc, =board_init_r" in arch/arm/lib/crt0.S. My debugging procedure was to put a branch to a small function, which just prints a short message with puts, right before the ldr instruction, and then to print another short message with puts right at the start of board_init_r in common/board_r.c. For a non-thumb build, both messages get printed and I can boot to the u-boot prompt. For a thumb build, only the first message, the one before the ldr instruction, gets printed.
>>>
>>> In crt0.S
>>> bl debug_print
>>> ldr pc, =board_init_r
>>>
>>> In board_init_r
>>> puts("In board_init_r\n"); // Right at start
>>>
>>> void debug_print(void)
>>> {
>>>     // Defined in board file
>>>     puts("Debug print\n");
>>> }
>>>
>>> My assembly knowledge is limited and after some consultation with a senior colleague, he told me things to check.
>>>
>>> An object dump of the crt0.o shows a branch to an even address. For thumb, this is expected to be odd. To just try out, I did a change as below
>>> ldr r3, =board_init_r
>>> add r3, #1
>>> bx r3
>>>
>>> No change with this. My expectation was that the compiler/linker/assembler would take care of the requirements when CONFIG_SYS_THUMB_BUILD is set. Frankly speaking, I am not sure if this is the complete issue or only a part of it. I have seen patches for OMAP sent in by Aneesh V, which added annotations of the form .type fn_name, %function to all the low-level assembly functions, but I couldn't dig up much more, or variants thereof. From what I understand, this basically takes care of specifying .thumb_func for a thumb function, so to speak.
>>>
>>> Any pointers?
>>
>> I tried this on a peach_pi (Samsung Chromebook 2 13") and it worked OK
>> for me. The code sequence you refer to came out as below for me.
>>
>> 23e01e10 <clbss_l>:
>> 23e01e10:       e1500001        cmp     r0, r1
>> 23e01e14:       35802000        strcc   r2, [r0]
>> 23e01e18:       32800004        addcc   r0, r0, #4
>> 23e01e1c:       3afffffb        bcc     23e01e10 <clbss_l>
>> 23e01e20:       fa000dec        blx     23e055d8 <coloured_LED_init>
>> 23e01e24:       fb000deb        blx     23e055da <red_led_on>
>> 23e01e28:       e1a00009        mov     r0, r9
>> 23e01e2c:       e5991030        ldr     r1, [r9, #48]   ; 0x30
>> 23e01e30:       e59ff008        ldr     pc, [pc, #8]    ; 23e01e40 <clbss_l+0x30>
>> 23e01e34:       02073800        .word   0x02073800
>> 23e01e38:       23e41eb0        .word   0x23e41eb0
>> 23e01e3c:       23e77bf0        .word   0x23e77bf0
>> 23e01e40:       23e057a9        .word   0x23e057a9
>>
>> The 'ldr pc' line is loading from 23e01e40 which does have an odd address.
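
As a side note on the odd address: for an interworking branch, bit 0 of
the target selects the instruction set (1 = Thumb, 0 = ARM) and is not
part of the fetch address. Below is a minimal C sketch of that
arithmetic, using the literal-pool word 0x23e057a9 from the dump above.
The function names are made up for illustration and are not U-Boot code.
It also shows why an "add r3, #1" experiment only does what is intended
when the symbol value is even; an "orr r3, r3, #1" would be idempotent.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only (not part of U-Boot). */

/* Bit 0 of an interworking branch target selects the instruction set:
 * 1 = Thumb, 0 = ARM. */
bool target_is_thumb(uint32_t target)
{
	return (target & 1u) != 0;
}

/* Address the core fetches from: the target with bit 0 cleared. */
uint32_t target_fetch_addr(uint32_t target)
{
	return target & ~UINT32_C(1);
}

/* Mark a target as Thumb with OR: safe even if bit 0 is already set. */
uint32_t mark_thumb_or(uint32_t target)
{
	return target | 1u;
}

/* What "add r3, #1" computes: correct only for an even input. For an
 * odd input the result is even again and two bytes past the function. */
uint32_t mark_thumb_add(uint32_t target)
{
	return target + 1u;
}
```

With the value from the dump, target_is_thumb(0x23e057a9) is true and
target_fetch_addr() gives 0x23e057a8, while mark_thumb_add() would turn
it into 0x23e057aa, an even (ARM state) target at the wrong address.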
>>
>> What toolchain are you using? I tried with gcc 4.8.2 - including
>> linaro's 2013.10 release.
>>
>> In arch/arm/cpu/armv7/config.mk there is a fallback to armv5 from
>> armv7-a, and this may cause it to generate Thumb code instead of Thumb
>> 2. But you should get errors if that happens.
>>
>> It's hard to debug with such limited visibility. But if I put a puts()
>> at the start of board_init_r(), I see it on the serial console.
> 
> OK, it turns out the Thumb2 problem only appears when
> CONFIG_USE_ARCH_MEMCPY is enabled (hence I added Matthias to CC). Does
> peach_pi still work with that config enabled? On the Vybrid board we use
> it to speed up NAND access, since with the current NAND driver the data
> gets copied by the CPU.
> 
> In setup_reloc, common/board_f.c, which is probably the first use of
> memcpy, things start to go wrong. The code after the memcpy call never
> gets executed. I think something goes wrong with the stack, but I am not
> really sure what happens.

It seems that this memcpy implementation, which is assembled as ARM
code, does not work when called from Thumb2 code. I checked the Linux
kernel, since that is where the file comes from: when the kernel is
built in Thumb2 mode, this file is built as Thumb2 as well. With some
changes I also managed to build the file as Thumb2 in U-Boot:

diff --git a/arch/arm/config.mk b/arch/arm/config.mk
index f0eafd6..ddbc8dc 100644
--- a/arch/arm/config.mk
+++ b/arch/arm/config.mk
@@ -30,6 +30,8 @@ PF_CPPFLAGS_ARM := $(call cc-option, -mthumb -mthumb-interwork,\
 			$(call cc-option,-marm,)\
 			$(call cc-option,-mno-thumb-interwork,)\
 		)
+AFLAGS_AUTOIT	:=$(call as-option,-Wa$(comma)-mimplicit-it=always,-Wa$(comma)-mauto-it)
+PF_CPPFLAGS_ARM += $(AFLAGS_AUTOIT)
 else
 PF_CPPFLAGS_ARM := $(call cc-option,-marm,) \
 		$(call cc-option,-mno-thumb-interwork,)
diff --git a/arch/arm/lib/memcpy.S b/arch/arm/lib/memcpy.S
index f655256..fcf028c 100644
--- a/arch/arm/lib/memcpy.S
+++ b/arch/arm/lib/memcpy.S
@@ -12,11 +12,14 @@
 
 #include <asm/assembler.h>
 
-#define W(instr)	instr
+#define W(instr)	instr.w
 
 #define LDR1W_SHIFT	0
 #define STR1W_SHIFT	0
 
+#define CALGN(code...)
+
+
 	.macro ldr1w ptr reg abort
 	W(ldr) \reg, [\ptr], #4
 	.endm
@@ -57,12 +60,15 @@
 
 /* Prototype: void *memcpy(void *dest, const void *src, size_t n); */
 
+	.syntax unified
+	.thumb
+	.thumb_func
 .globl memcpy
 memcpy:
-
+/*
 		cmp	r0, r1
 		moveq	pc, lr
-
+*/
 		enter	r4, lr
 
 		subs	r2, r2, #4
diff --git a/include/configs/colibri_vf.h b/include/configs/colibri_vf.h
index 76564ac..41a0dac 100644
--- a/include/configs/colibri_vf.h
+++ b/include/configs/colibri_vf.h
@@ -53,7 +53,10 @@
 #define CONFIG_CMD_NAND
 #define CONFIG_NAND_VF610_NFC
 #define CONFIG_SYS_NAND_SELF_INIT
+
+#define CONFIG_SYS_THUMB_BUILD
 #define CONFIG_USE_ARCH_MEMCPY
+
 #define CONFIG_SYS_MAX_NAND_DEVICE	1
 #define CONFIG_SYS_NAND_BASE		NFC_BASE_ADDR
 

The main change here is the implicit-it/auto-it assembler option. It
works for me when that option is enabled globally. Is it harmful to
enable it globally? The other changes need a proper ifdef, but should
be OK, I guess.
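
For the W() macro, such an ifdef could look roughly like the sketch
below. It is only an illustration, not a tested patch: keying it on
CONFIG_SYS_THUMB_BUILD is my assumption, and the STR()/w_ldr_mnemonic()
helpers are hypothetical, added only so that the expansion can be
inspected from plain C. In memcpy.S only the #ifdef block itself would
be needed.

```c
/* Assumption for this sketch: defined here to emulate a Thumb build;
 * in U-Boot it would come from the board config header. */
#define CONFIG_SYS_THUMB_BUILD

#ifdef CONFIG_SYS_THUMB_BUILD
#define W(instr)	instr.w	/* request the 32-bit Thumb-2 encoding */
#else
#define W(instr)	instr	/* ARM build: leave the mnemonic untouched */
#endif

/* Hypothetical helpers: stringify the expanded mnemonic for inspection. */
#define STR_(x)	#x
#define STR(x)	STR_(x)

const char *w_ldr_mnemonic(void)
{
	return STR(W(ldr));
}
```

With CONFIG_SYS_THUMB_BUILD defined as above, w_ldr_mnemonic() returns
"ldr.w"; without it, the macro leaves the mnemonic as plain "ldr", which
matches the current ARM-only definition in memcpy.S.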

--
Stefan

> 
> --
> Stefan
> 
>> _______________________________________________
>> U-Boot mailing list
>> U-Boot at lists.denx.de
>> http://lists.denx.de/mailman/listinfo/u-boot