[U-Boot] [PATCH 4/6] ARM: add SMP support for non-secure switch

Sat Jun 1 01:51:10 CEST 2013

On Fri, May 31, 2013 at 11:32:40AM +0200, Andre Przywara wrote:
> On 05/31/2013 07:32 AM, Christoffer Dall wrote:
> > On Mon, May 06, 2013 at 03:17:48PM +0200, Andre Przywara wrote:
> >> Currently the non-secure switch is only done for the boot processor.
> >> To later allow full SMP support, we have to switch all secondary
> >> cores into non-secure state also.
> >>
> >> So we add an entry point for secondary CPUs coming out of low-power
> >> state and make sure we put them into WFI again after having switched
> >> to non-secure state.
> >> For this we acknowledge and EOI the wake-up IPI, then go into WFI.
> >> Once being kicked out of it later, we sanity check that the start
> >> address has actually been changed (since another attempt to switch
> >> to non-secure would block the core) and jump to the new address.
> >>
> >> The actual CPU kick is done by sending an inter-processor interrupt
> >> via the GIC to all CPU interfaces except the requesting processor.
> >> The secondary cores will then setup their respective GIC CPU
> >> interface.
> >>
> >> The address secondary cores jump to is board specific, we provide
> >> the value here for the Versatile Express board.
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara at linaro.org>
> >> ---
> >>   arch/arm/cpu/armv7/start.S          | 27 ++++++++++++++++++++++++++-
> >>   arch/arm/include/asm/armv7.h        |  1 +
> >>   arch/arm/lib/virt-v7.c              |  9 ++++++++-
> >>   include/configs/vexpress_ca15_tc2.h |  1 +
> >>   4 files changed, 36 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/arch/arm/cpu/armv7/start.S b/arch/arm/cpu/armv7/start.S
> >> index e63e892..02234c7 100644
> >> --- a/arch/arm/cpu/armv7/start.S
> >> +++ b/arch/arm/cpu/armv7/start.S
> >> @@ -575,8 +575,19 @@ fiq:
> >>
> >>   #ifdef CONFIG_ARMV7_VIRT
> >>   /* Routine to initialize GIC CPU interface and switch to nonsecure state.
> >> + * Will be executed directly by secondary CPUs after coming out of
> >> + * WFI, or can be called directly by C code for CPU 0.
> >> + * Those two paths mandate to not use any stack and to only use registers
> >> + * r0-r3 to comply with both the C ABI and the requirement of SMP startup
> >> + * code.
> >>    */
> >>   .globl _nonsec_gic_switch
> >> +.globl _smp_pen
> >> +_smp_pen:
> >> +	mrs	r0, cpsr
> >> +	orr	r0, r0, #0xc0
> >> +	msr	cpsr, r0			@ disable interrupts
> >> +	mov	lr, #0				@ clear LR to mark secondary
> >
> > instead of this subtle abuse of lr, why not make this routine simply
> > take a parameter?
> 
> How would this work if this is called out of SMP pen? Shall I rely on 
> the registers being zero, then? Not very stable, I guess.
> I think this whole routine is special anyways, so I felt this "subtle 
> abuse" is OK.
> An option would be to set r0 to 1 in the smp_pen path and pass 0 as the 
> first parameter when calling from C. But then I'd need to save this 
> value - possibly in the LR register ;-)

see my example in the previous patch comments, I don't see why using the
lr like this makes anything anymore clear or more stable?

> 
> > I also slightly object against wrapping the _smp_pen around the
> > _nonsec_gic_switch, I really think these are separate routines, where
> > one can just call the other...?
> 
> The actual routine and the purpose are the same, just the entry and exit 
> code is different. So this fitted nicely in here. I can add a more 
> specific comment on the different entry points.
> 

eh, to me it reads kind of like this C function (but maybe I just choked
on this somehow):

void do_magic(bool mode)
{
	if (mode) {
		pull_rabit_out_of_hat();
		do_card_trick();
	}

	saw_a_woman_in_half(); /* we always do this */
	linking_rings();

	if (!mode)
		return;

	while (1) {
		vanish_statue_of_liberty();
		walk_through_great_wall_of_china();
	}
}

> >>   _nonsec_gic_switch:
> >>   	mrc	p15, 4, r2, c15, c0, 0		@ r2 = PERIPHBASE
> >>   	add	r3, r2, #0x1000			@ GIC dist i/f offset
> >> @@ -617,5 +628,19 @@ _nonsec_gic_switch:
> >>   	add	r2, r2, #0x1000			@ GIC dist i/f offset
> >>   	str	r1, [r2]			@ allow private interrupts
> >>
> >> -	mov	pc, lr
> >> +	cmp	lr, #0
> >> +	movne	pc, lr				@ CPU 0 to return
> >> +						@ all others: go to sleep
> >> +_ack_int:
> >> +	ldr	r1, [r3, #0x0c]			@ read GICD acknowledge
> >> +	str	r1, [r3, #0x10]			@ write GICD EOI
> >> +
> >> +	adr	r1, _smp_pen
> >> +waitloop:
> >> +	wfi
> >> +	ldr	r0, =CONFIG_SYSFLAGS_ADDR	@ load start address
> >> +	ldr	r0, [r0]
> >> +	cmp	r0, r1			@ make sure we dont execute this code
> >
> > I think I raised this issue previously, but this code is in a core
> > u-boot file, but I could imagine a board with a different crazy boot
> > protocol that required you to check two different fields and jump
> > through other hoops to wake up from the smp pen, so I really think the
> > whole smp pen belongs in a board specific place.
> 
> Right, but I didn't want to do this prematurely without knowing what is 
> really needed. So my plan is to refactor this when adding Arndale 
> support. I think this is special anyways because of the SPL/non-SPL split.
> And although you are right about it being a core u-boot file, this whole 
> code is protected by CONFIG_ARMV7_VIRT. For non-compliant boards one 
> would either not enable this or add support for it.
> 

But I do think another important point is that when someone tries to
understand U-boot (which was for example me not long ago) you start
looking in start.S to get the basic gist of things, and this whole code
is pretty complicated, and called only in a specific use case, so I
think it should properly go in a separate file, why not?

We generally don't just keep giant files around filled with ifdefs,
because it's simply harder to read than having a separate file.

The only real benefit I see is reusing the vector, but I think it's a
small benefit, saving you 32 bytes.

> > Also, the boot-wrapper code used wfe instead, not sure if there are any
> > users that just send an event and doesn't send an IPI?
> 
> Good point. Is WFE a complete superset of WFI? Then I could just change 
> this.
> 
Don't think so, maybe we could get Catalin's input on this one, as he
wrote the wfe in the boot-wrapper code.

> >> +	beq	waitloop		@ again (due to a spurious wakeup)
> >> +	mov	pc, r0
> >>   #endif /* CONFIG_ARMV7_VIRT */
> >> diff --git a/arch/arm/include/asm/armv7.h b/arch/arm/include/asm/armv7.h
> >> index 25afffe..296dc92 100644
> >> --- a/arch/arm/include/asm/armv7.h
> >> +++ b/arch/arm/include/asm/armv7.h
> >> @@ -78,6 +78,7 @@ void v7_outer_cache_inval_range(u32 start, u32 end);
> >>   int armv7_switch_nonsec(void);
> >>
> >>   /* defined in cpu/armv7/start.S */
> >> +void _smp_pen(void);
> >>   void _nonsec_gic_switch(void);
> >>   #endif /* CONFIG_ARMV7_VIRT */
> >>
> >> diff --git a/arch/arm/lib/virt-v7.c b/arch/arm/lib/virt-v7.c
> >> index 3a48aee..0248010 100644
> >> --- a/arch/arm/lib/virt-v7.c
> >> +++ b/arch/arm/lib/virt-v7.c
> >> @@ -48,6 +48,7 @@ int armv7_switch_nonsec(void)
> >>   	unsigned int reg;
> >>   	volatile unsigned int *gicdptr;
> >>   	unsigned itlinesnr, i;
> >> +	unsigned int *sysflags;
> >>
> >>   	/* check whether the CPU supports the security extensions */
> >>   	asm("mrc p15, 0, %0, c0, c1, 1\n" : "=r"(reg));
> >> @@ -106,7 +107,13 @@ int armv7_switch_nonsec(void)
> >>   	for (i = 0; i <= itlinesnr; i++)
> >>   		gicdptr[GICD_IGROUPR0 / 4 + i] = (unsigned)-1;
> >>
> >> -	/* call the non-sec switching code on this CPU */
> >> +	/* now kick all CPUs (expect this one) by writing to GICD_SIGR */
> >> +	sysflags = (void *)CONFIG_SYSFLAGS_ADDR;
> >> +	sysflags[1] = (unsigned)-1;
> >> +	sysflags[0] = (uintptr_t)_smp_pen;
> >> +	gicdptr[GICD_SGIR / 4] = 1U << 24;
> >
> > here you definitely want a barrier to make sure you don't kick the cpus
> > before the sysflags addresses have been written.
> 
> Right.
> 

(which the writel should give you)

> > What does the
> > (unsigned)-1 write to sysflags[1] do?
> 
> This is either a ARM-TC or a VExpress "goodie": sysflags+0 is bit-set 
> only, sysflags+4 is bit-clear only. So to write arbitrary addresses in 
> here you better clear all bits first and then write in the value.
> Cost me at least one day to find this out ;-)

ouch, that wasn't trivial.

> So this should be guarded by VExpress specific defines, I guess. But 
> again didn't want to do this before support for a board with a different 
> behavior is actually implemented.
> 

I think that's pretty important actually, because it suggests how other
people should do it, and makes it easier for SoC vendors to intigrate
such support nicely in U-boot instead of placing a large refactoring
burden on them dealing with code that they may not understand...

Thanks,
-Christoffer