[U-Boot] [PATCH] powerpc/fsl: support low power boot for e500 and later

Wed Jan 21 09:23:38 CET 2015

On Wed, 2015-01-21 at 01:23 -0600, Wang Dongsheng-B40534 wrote:
> 
> 
> > -----Original Message-----
> > From: Wood Scott-B07421
> > Sent: Tuesday, January 20, 2015 8:33 AM
> > To: Wang Dongsheng-B40534
> > Cc: Sun York-R58495; Li Yang-Leo-R58472; Jin Zhengxiong-R64188;
> > u-boot at lists.denx.de
> > Subject: Re: [PATCH] powerpc/fsl: support low power boot for e500 and later
> > 
> > On Thu, 2015-01-15 at 14:04 +0800, Dongsheng Wang wrote:
> > > From: Wang Dongsheng <dongsheng.wang at freescale.com>
> > >
> > > low power boot means u-boot will put non-boot cpus into a low power
> > > status. Non-boot cpus don't need any more spin wait. e500, e500v2 will
> > > going to DOZE status. e500mc, e5500, e6500rev1 will going to PW10 state.
> > > e6500rev2 will going to PW20 state.
> > >
> > > e500/e500v2 will be kicked up by MPIC-IPI, e500mc later will be kicked up
> > > by doorbell.
> > 
> > This will break compatibility with existing kernels (and violate ePAPR
> > since you haven't changed the release-method string).  It must be
> > optional and (for now) off by default.
> > 
> 
> Agree.
> 
> > What benefits are we getting for this churn and complexity?  Do we
> > really need to optimize for the case where not all CPUs are used?
> > 
> 
> Save the power is our purpose no matter at any stage. If we can save the power
> that we need to do it.

No, we don't "need to do it" if the savings are negligible compared to
the complexity, or if it's for an obscure use case.

Worse, this could end up increasing power consumption due to quickly
entering and then leaving low power mode (unlikely with PW10, but if you
end up in PW20 during the normal boot sequence, it could).

> > Where is this new mechanism documented?
> >
> 
> I don't understand. Do you mean Hardware datasheet? 

No, I mean software documentation for the new bootloader/kernel
interface you're creating.  The existing interface is documented in
ePAPR.

> > > diff --git a/arch/powerpc/cpu/mpc85xx/release.S b/arch/powerpc/cpu/mpc85xx/release.S
> > > index a2c0ad4..a97a1b6 100644
> > > --- a/arch/powerpc/cpu/mpc85xx/release.S
> > > +++ b/arch/powerpc/cpu/mpc85xx/release.S
> > > @@ -297,10 +297,15 @@ __secondary_start_page:
> > >  	mtspr	SPRN_MAS7,r11
> > >  	tlbwe
> > >
> > > +	li	r6, 0
> > > +
> > >  	/*
> > >  	 * __bootpg_addr has the address of __second_half_boot_page
> > >  	 * jump there in AS=1 space with cache enabled
> > >  	 */
> > > +	.align 6
> > > +	.global jump_half_boot_page
> > > +jump_half_boot_page:
> > 
> > "half_boot_page" is confusing.
> > 
> 
> Why there are confused? 

I suspect you mean "jump to __second_half_boot_page" but it just doesn't
read that way when you omit the "second" (and it doesn't help that the
meaning of "second half" wasn't immediately obvious from context).  It
makes me ask questions like, "Is there a first_half_boot_page?" or "Is
there a full_boot_page?"

> The address of __bootpg_addr is not a physical address.
> Boot page will jump to a virtual address.

What does that have to do with the label name?

> > >  	lis	r13,toreset(__bootpg_addr)@h
> > >  	ori	r13,r13,toreset(__bootpg_addr)@l
> > >  	lwz	r11,0(r13)
> > > @@ -371,6 +376,9 @@ __second_half_boot_page:
> > >  	 * };
> > >  	 * we pad this struct to 64 bytes so each entry is in its own cacheline
> > >  	 */
> > > +	cmpwi   r6, 1
> > > +	beq	3f
> > 
> > What does a value of 1 mean for r6?  What about 0?  Could you use
> > symbolic constants?
> > 
> 
> Ok, replace the magic number with macro.
> 
> 1 means the current process is come from kicked up flow.
> 0 means the current process is first boot up flow.
> 
> > Why not just set the IVOR to where you really want to enter, rather than
> > conditionally branching there based on a value that happens to
> > correspond to whether you're entering via an interrupt?
> > 
> 
> Now the value of IVOR is jump_half_boot_page. In __bootpg_addr I can check the waken
> up cpu that is kernel want to kick. If not the cpu will loop in low power
> state again.

Whatever it is that you want to check for, the first thing you do when
you enter the interrupt is branch to 3f.  How does setting IVOR to
jump_half_boot_page rather than 3: help with what you're trying to do?

> > >  	li	r3,0
> > >  	li	r8,1
> > >  	mfspr	r4,SPRN_PIR
> > > @@ -402,7 +410,132 @@ __second_half_boot_page:
> > >  #endif
> > >  	lwz	r4,ENTRY_ADDR_LOWER(r10)
> > >  	andi.	r11,r4,1
> > > -	bne	3b
> > > +	beq	6f
> > > +
> > > +	li	r6, 0
> > > +	addi	r6, r6, 1
> > 
> > Why not just "li r6, 1"?
> > 
> 
> Mask the cpu that cpu fall into low power state.

Forget the reason -- just tell me how the resulting state of "li r6,0;
addi r6, r6, 1" will be different from "li r6, 1".

> > > +	/* External Interrupt exception. */
> > > +	lis	r7, toreset(jump_half_boot_page)@h
> > > +	mtspr	SPRN_IVPR, r7
> > > +	li	r7, toreset(jump_half_boot_page)@l
> > > +
> > > +#ifdef	CONFIG_E500MC
> > > +	/* e500MC, e5500, e6500 will use doorbell to send ipi signal */
> > > +	mtspr	SPRN_IVOR36, r7
> > > +#endif
> > > +
> > > +	/*
> > > +	 * For e500mc later:
> > > +	 * EE will open in low power state, IVOR4 make sure we can ACK
> > > +	 * trash interrupt and keep we can loop in wait state again until
> > > +	 * the desired interrupt coming.
> > 
> > I don't understand "EE will open".  Do you mean "EE will be set"?
> 
> Thanks... Yes, EE will be set.
> 
> > Why would we get unexpected interrupts?
> > 
> 
> When EE be set, cpu can receive external interrupt.

But what else would be sending an interrupt to this CPU at this point?

> If a cpu is not kicked in kernel that be triggered by external interrupt, we
> need to make sure the cpu can be loop in low power again.

No, you can just say that the first interrupt will wake the CPU and it's
up to the OS to program the MPIC properly.

> > > +	/* Fix erratum, e6500 rev1 does not support PW20 & AltiVec idle */
> > > +	rlwinm  r11, r0, 0, 16, 31
> > > +	cmpwi   r11, 0x20
> > > +	blt     5f
> > 
> > PW10 isn't enough here?
> > 
> 
> e6500rev2 support PW20, the PW20 lower than PW10, Why not?

See the beginning of the mail.

> > > +#define PW20_WAIT_IDLE_BIT	50 /* 1ms, TB frequency is 41.66MHZ */
> > 
> > If you must use PW20 please set the timeout long enough that it's
> > obvious we won't hit it during normal bootup.
> > 
> 
> Sorry, I don’t understand "obvious we won't hit it"...

What I mean is that we do not want to enter PW20 during normal boot.
The precise time it takes before the OS releases the secondary CPUs can
vary, so the timeout should be large enough that we are highly confident
that the CPUs will be released before the PW20 timer expires.
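
As a rough illustration of the arithmetic (a sketch only: timeout_ticks()
and ceil_log2_u64() are made-up names, and how the resulting exponent maps
onto the PW20 entry field is hardware-specific and not shown), this
computes how many timebase ticks a given timeout needs and the smallest
power of two that covers it:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: timebase ticks that elapse in timeout_ms. */
static uint64_t timeout_ticks(uint64_t tb_hz, uint64_t timeout_ms)
{
	assert(tb_hz > 0);
	return tb_hz * timeout_ms / 1000;
}

/* Hypothetical helper: smallest n (capped at 63) with 2^n >= ticks. */
static unsigned int ceil_log2_u64(uint64_t ticks)
{
	unsigned int n = 0;

	while (n < 63 && (UINT64_C(1) << n) < ticks)
		n++;
	return n;
}
```

At the 41.66 MHz timebase mentioned in the patch, 1 ms is 41660 ticks
(covered by 2^16), whereas a timeout of, say, 10 s is 416600000 ticks
(covered by 2^29).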

> > > +	/*
> > > +	 * Set all of cpu PIR-ID is 0, wait kernel send doorbell or MPIC-IPI
> > > +	 * signal.
> > > +	 *
> > > +	 * When kernel kick one of cpus, all cpus will be wakenup. To make
> > > +	 * sure that only the target cpu is effected, other cpus (by checking
> > > +	 * spin_table->addr_l) should go back to low power state.
> > > +	 *
> > > +	 * U-boot has renumber the cpu PIR Why we need to set all of PIR to
> > > +	 * the same value?
> > > +	 * A: Before kernel kicking cpu, the doorbell message was not configured
> > > +	 * for target cpu(cpu_messages->data). If we try to send a
> > > +	 * non-configured message to target cpu, it cannot correctly receive
> > > +	 * doorbell interrput. So SET ALL OF CPU'S PIR to the same value to
> > > +	 * let all cpus catch the interrupt.
> > > +	 *
> > > +	 * Why set PIR to zero?
> > > +	 * A: U-boot cannot know how many cpus will be kicked up(Kernel allow us
> > > +	 * to configure NR_CPUS) and IPI is a per_cpu variable, u-boot cannot
> > > +	 * set a appropriate PIR for every cpu, but the boot cpu(CPU0) always be
> > > +	 * there. So set PIR is zero as a default PIR ID for each CPUs.
> > 
> > What does NR_CPUS have to do with the appropriate PIR for each CPU?
> > 
> 
> PIR is for each CPU. 

I am aware of this.

> The NR_CPUS just a number of kernel control to start the CPU.

It's the maximum number of CPUs that the kernel will support.

> We need to set the same value for each CPU PIR register.

No, we don't.

>  Please see my comments for Why we need to do it.

If you think you've answered something and I ask again, it generally
means I'm having a hard time understanding and it would be helpful to
reword rather than just copying and pasting or telling me to read them
again (I think I finally understand what you meant, but it's not clear).

> * U-boot has renumber the cpu PIR Why we need to set all of PIR to
> * the same value?
> * A: Before kernel kicking cpu, the doorbell message was not configured
> * for target cpu(cpu_messages->data). If we try to send a
> * non-configured message to target cpu, it cannot correctly receive
> * doorbell interrput. So SET ALL OF CPU'S PIR to the same value to
> * let all cpus catch the interrupt.

So the problem is that you're using a Linux function to send the msgsnd
before it was meant to be called.  Don't do that.  Either set up the
required data earlier, or use a lower level function that doesn't have
that prerequisite.
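
For instance, the message word for msgsnd can be composed directly from
the target's PIR, with no dependence on per-cpu data.  A minimal sketch,
assuming the usual Book E msgsnd layout (type in the top five bits, a
broadcast flag, PIR tag in the low 14 bits); the macro and function names
are made up for illustration, and the field widths should be checked
against the core manual:

```c
#include <stdint.h>

/* Assumed field layout of the 32-bit msgsnd message word. */
#define DBELL_TYPE_SHIFT	27		/* message type, top 5 bits */
#define DBELL_TYPE_DBELL	0u		/* ordinary processor doorbell */
#define DBELL_BRDCAST		0x04000000u	/* accept regardless of PIR tag */
#define DBELL_PIRTAG_MASK	0x00003fffu	/* PIR of the target */

/* Hypothetical helper: build the message word for one target PIR. */
static uint32_t dbell_msg(uint32_t type, int broadcast, uint32_t pir)
{
	uint32_t msg = (type & 0x1fu) << DBELL_TYPE_SHIFT;

	if (broadcast)
		msg |= DBELL_BRDCAST;
	msg |= pir & DBELL_PIRTAG_MASK;
	return msg;
}
```

With something like this, the sender only needs to know the PIR that
U-Boot assigned to the target, e.g. dbell_msg(DBELL_TYPE_DBELL, 0, 3) for
the CPU whose PIR is 3.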

> > > +	/*
> > > +	 * If proxy mode enable in MPIC, Read EPR to ACK INTERRUPT
> > > +	 * Or proxy mode disable, Kernel will read MPIC to ACK INTERRUPT.
> > > +	 */
> > > +	mfspr	r7, SPRN_EPR
> > 
> > Where do you limit this to cases when external proxy is enabled?  Why
> > would we care about external proxy at all?  The only chips we use IPIs
> > on don't have external proxy.
> > 
> 
> In kernel the MPCI IPI message not be supported on CORENET platform and later.

Yes, it is.  We just don't use it.

> After e500v2, proxy mode always enable in MPIC.

No, it's not.  It has to be manually enabled by the kernel.  U-Boot
cannot know if the kernel has done this (without reading the relevant
MPIC register).

Neither of the above two sentences answers my questions.  What stops you
from trying to read EPR on e500v2?

-Scott