[U-Boot] [Patch v1 2/4] armv8/fsl-lsch3: Release secondary cores from boot hold off with Boot Page

Mon Jul 14 15:28:11 CEST 2014

> > >> @@ -119,3 +107,94 @@ ENTRY(lowlevel_init)
> > >>         mov     lr, x29                 /* Restore LR */
> > >>         ret
> > >>  ENDPROC(lowlevel_init)
> > >> +
> > >> +       /* Keep literals not used by the secondary boot page outside
> > it */
> > >> +       .ltorg
> > >> +
> > >> +       .align 4
> > >
> > > That looks like a small alignment for a page.
> > >
> > > Should this be larger? Or is the "page" a misnomer here?
> >
> > I think as far as it is aligned to instruction size and keep "ldr" happy,
> > it is OK. The code will be copied to the beginning of DDR to run. Any
> > concern here?
> >
> 
> "page" is definitely a misnomer here, the comment (and maybe the label
> below) should probably be altered.
> The align directive is fine I guess.

I think it can be dropped to .align 2 if it's just to keep the
instruction stream aligned as per York's comment, but it's not harmful
to have greater alignment (just slightly confusing when trying to figure
out why the .align is there).

[...]

> > >> +#if defined(CONFIG_GICV3)
> > >> +       gic_wait_for_interrupt_m x0
> > >> +#endif
> > >> +
> > >> +       bl secondary_switch_to_el2
> > >> +#ifdef CONFIG_ARMV8_SWITCH_TO_EL1
> > >> +       secondary_switch_to_el1
> > >> +#endif
> > >> +
> > >> +slave_cpu:
> > >> +       wfe
> > >> +#ifdef CONFIG_FSL_SMP_RELEASE_ALL
> > >> +       ldr     x1, =CPU_RELEASE_ADDR
> > >> +       ldr     x0, [x1]
> > >> +#else
> > >> +       ldr     x0, [x11]
> > >> +       tbnz    x0, #0, slave_cpu
> > >> +#endif
> > >> +       cbz     x0, slave_cpu
> > >> +       br      x0                      /* branch to the given address
> > */
> > >
> > > Just to check, I take it CPUs won't ever be in a big-endian mode at
> > > this point?
> >
> > Don't know yet. Any concern if big-endian here?
> 
> I think we missed something here. Mark please correct me if I am
> wrong.  If the CPU is big-endian then we will need to convert the
> address contained at "CPU_RELEASE_ADDR", since it will always be
> written as a "single 64-bit little-endian value" (quoting from
> Documentation/arm64/booting.txt")

It sounds like you figured it out.

As far as I am aware, all you need to do is byte-swap the value if CPUs
are big-endian at this point. Linux will configure the CPUs to the
endianness it desires before it makes any explicit memory accesses.

> 
> >
> > >
> > >> +ENDPROC(secondary_boot_func)
> > >> +
> > >> +ENTRY(secondary_switch_to_el2)
> > >> +       switch_el x0, 1f, 0f, 0f
> > >> +0:     ret
> > >> +1:     armv8_switch_to_el2_m x0
> > >> +ENDPROC(secondary_switch_to_el2)
> > >> +
> > >> +ENTRY(secondary_switch_to_el1)
> > >> +       switch_el x0, 0f, 1f, 0f
> > >> +0:     ret
> > >> +1:     armv8_switch_to_el1_m x0, x1
> > >> +ENDPROC(secondary_switch_to_el1)
> > >> +
> > >> +       /* Ensure that the literals used by the secondary boot page
> > are
> > >> +        * assembled within it
> > >> +        */
> > >> +       .ltorg
> > >> +
> > >> +       .align 4
> > >
> > > Similarly to above, this looks like a small alignment for a page.
> >
> > Please suggest a proper alignment.
> >
> 
> Think this is confusion caused by our use of the term "secondary boot
> page". I don't think it needs to be sized or aligned as a page. We
> should probably change our terminology.

If there's some better terminology we could use, it would certainly make
things clearer. I must admit that the alternatives I came up with
weren't much better. "Secondary boot region", perhaps?

[...]

> > >> +       /* Initialize SCTLR_EL2 */
> > >> +       msr     sctlr_el2, xzr
> > >
> > > What about the RES1 bits (e.g. bits 29 & 28)?
> > >
> > > We don't seem to initialise them before the eret.
> >
> > I can't answer this question and below. Adding Arnab as the original
> > author for these changes.
> >
> > York
> >
> 
> You are right, will fix this. According to the ARMv8 ARM, RES1 bits
> "should be one or preserved".  What should the preferred approach be
> here? Write one or read modify and update the required bits?

As this seems to be the first initialisation of sctlr_el2, I believe the
correct thing to do is to write one for those bits, as their value may
be UNKNOWN (if not hardwired).

Per my reading of the ARM ARM's description of SBOP, after
initialization read-modify-write preserving the value of those bits is
the preferred way of modifying the register.

> 
> > >
> > >> +
> > >> +       /* Return to the EL2_SP2 mode from EL3 */
> > >> +       mov     \xreg1, sp
> > >> +       msr     sp_el2, \xreg1  /* Migrate SP */
> > >> +       mrs     \xreg1, vbar_el3
> > >> +       msr     vbar_el2, \xreg1        /* Migrate VBAR */
> > >> +       mov     x0, #0x3c9
> 
> Just noticed a bug here, x0 should in fact be \xreg1. My bad!!
> 
> > >> +       msr     spsr_el3, \xreg1        /* EL2_SP2 | D | A | I | F */
> > >> +       msr     elr_el3, lr
> > >> +       eret
> > >> +.endm
> > >> +
> > >> +.macro armv8_switch_to_el1_m, xreg1, xreg2
> > >> +       /* Initialize Generic Timers */
> > >> +       mrs     \xreg1, cnthctl_el2
> > >> +       orr     \xreg1, \xreg1, #0x3    /* Enable EL1 access to timers
> > */
> > >> +       msr     cnthctl_el2, \xreg1
> > >
> > > Is there any reason this can't be set to a precise known value? This
> > > currently leaves EVNTDIR and EVNTEN in UNKNOWN states (which could
> > > differ across CPUs).
> > >
> 
> You mean EVENTDIR and EVNTI? EVNTEN resets to 0 according to my copy
> of the documentation.

Sorry, my bad, I mixed up EVENTEN and EVENTI when reading the ARM ARM.

The problem I thought I'd spotted does not exist, so feel free to ignore
the above.

> Since events are disabled will the value of these bits matter?

I don't think so. I think EVENTEN being zero is sufficient to prevent
anything unexpected.

> I would expect any code that sets EVNTEN to first set these bits to
> sane values, right? That said, there is no harm in setting them here I
> guess, does 0 seem like a reasonable value to go with?

I think leaving them as-is should be fine given EVENTEN resets to 0. As
you say, we should expect anything wanting to make use of the event
stream to configure EVENTI and EVENTDIR.

> > > An EL1 OS can enable the event stream if it wants in CNTKCTL_EL1, so
> > > is there any reason to enable it at EL2?
> > >
> 
> Are we enabling it at EL2? We are only setting EL1PCEN and EL1PCTEN.
> EVNTEN resets to 0 so we are leaving events disabled, right?

Yes. Sorry for the noise.

> > >> +       msr     cntvoff_el2, \xreg1
> > >
> > > Please initialize cntvoff_el2 using xzr. Due to the aforementioned
> > > UNKNOWN bits, this could leave CPUs with differing view of time, and
> > > there's no point differing in the first place.
> > >
> > > An EL1 OS will not be able to fix this.
> > >
> > > I fixed this elsewhere in  b924d586d70b (arm64: zero cntvoff_el2).
> > >
> 
> Will do

Cheers.

> > >> +       mrs     \xreg1, cntkctl_el1
> > >> +       orr     \xreg1, \xreg1, #0x3    /* Enable EL0 access to timers
> > */
> > >> +       msr     cntkctl_el1, \xreg1
> > >
> > > Likewise this leaves many bits UNKNOWN and potentially differing
> > > across CPUs, though the OS at EL1 should be able to fix this up (and
> > > Linux will).
> > >
> 
> As you said, Linux sets this register and the value used (0x2) is
> different from what we use here (0x3).  So, should we get rid of this
> section of code?

I U-Boot isn't doing anything at EL0 and doesn't need an event stream,
then I'd recommend dropping this and leaving it up to the OS to
configure.

> > >> +
> > >> +       /* Initilize MPID/MPIDR registers */
> > >> +       mrs     \xreg1, midr_el1
> > >> +       mrs     \xreg2, mpidr_el1
> > >> +       msr     vpidr_el2, \xreg1
> > >> +       msr     vmpidr_el2, \xreg2
> > >> +
> > >> +       /* Disable coprocessor traps */
> > >> +       mov     \xreg1, #0x33ff
> > >> +       msr     cptr_el2, \xreg1        /* Disable coprocessor traps
> > to EL2 */
> > >> +       msr     hstr_el2, xzr           /* Disable coprocessor traps
> > to EL2 */
> > >> +       mov     \xreg1, #3 << 20
> > >> +       msr     cpacr_el1, \xreg1       /* Enable FP/SIMD at EL1 */
> > >> +
> > >> +       /* Initialize HCR_EL2 */
> > >> +       mov     \xreg1, #(1 << 31)              /* 64bit EL1 */
> > >> +       orr     \xreg1, \xreg1, #(1 << 29)      /* Disable HVC */
> > >> +       msr     hcr_el2, \xreg1
> > >> +
> > >> +       /* SCTLR_EL1 initialization */
> > >> +       mov     \xreg1, #0x0800
> > >> +       movk    \xreg1, #0x30d0, lsl #16
> > >> +       msr     sctlr_el1, \xreg1
> > >
> > > That doesn't seem to set up all the RES1 bits (e.g. bit 29).
> > >
> 
> Will fix.

Cheers.

> > >> +
> > >> +       /* Return to the EL1_SP1 mode from EL2 */
> > >> +       mov     \xreg1, sp
> > >> +       msr     sp_el1, \xreg1          /* Migrate SP */
> > >> +       mrs     \xreg1, vbar_el2
> > >> +       msr     vbar_el1, \xreg1        /* Migrate VBAR */
> > >> +       mov     \xreg1, #0x3c5
> > >> +       msr     spsr_el2, \xreg1        /* EL1_SP1 | D | A | I | F */
> > >> +       msr     elr_el2, lr
> > >> +       eret
> > >> +.endm
> > >> +
> > >> +#if defined(CONFIG_GICV3)
> > >> +.macro gic_wait_for_interrupt_m xreg1
> > >> +0 :    wfi
> > >> +       mrs     \xreg1, ICC_IAR1_EL1
> > >> +       msr     ICC_EOIR1_EL1, \xreg1
> > >> +       cbnz    \xreg1, 0b
> > >> +.endm
> > >> +#elif defined(CONFIG_GICV2)
> > >> +.macro gic_wait_for_interrupt_m xreg1, wreg2
> > >> +0 :    wfi
> > >> +       ldr     \wreg2, [\xreg1, GICC_AIAR]
> > >> +       str     \wreg2, [\xreg1, GICC_AEOIR]
> > >> +       cbnz    \wreg2, 0b
> > >> +.endm
> > >
> > > Assuming I've understood correctly, here we block until we receive SGI
> > > 0 from the CPU with GIC ID 0? Do we have a guarantee that the boot CPU
> > > will have GIC ID 0?
> 
> You are right, for GICv2 there is an assumption that the boot CPU will have
> CPU interface ID 0. As far as I know there is no such guarantee. It is probably ok to
> check bits 9:0 and ignore 12:10 here right? The assumption being that whoever is sending
> SGI 0 is the boot CPU.

That sounds fine to me.

A assume that by the time we get to an OS no CPUs should be waiting on
an interrupt?

For 32-bit Linux we broadcast SGIs until each CPU has read its GIC CPU
ID from the GIC (necessary on platforms where SGIs form part of the
secondary boot protocol). On 64-bit we don't do that currently but it
would be nice to know that if we did happen to broadcast an SGI it
wouldn't cause bad things to happen inside U-Boot for some secondary
CPUs.

Cheers,
Mark.