ARM A53 and initial MMU mapping for EL0/1/2/3 ?
Joakim Tjernlund
Joakim.Tjernlund at infinera.com
Thu Feb 17 17:05:04 CET 2022
On Thu, 2022-02-17 at 15:13 +0000, Andre Przywara wrote:
> On Fri, 11 Feb 2022 17:00:48 +0000
> Joakim Tjernlund <Joakim.Tjernlund at infinera.com> wrote:
>
> Hi,
>
> > On Fri, 2022-02-11 at 15:00 +0100, Joakim Tjernlund wrote:
> > > On Fri, 2022-02-11 at 01:26 +0000, Andre Przywara wrote:
> > > > On Fri, 11 Feb 2022 00:22:25 +0000
> > > > Joakim Tjernlund <Joakim.Tjernlund at infinera.com> wrote:
> > > >
> > > > > On Thu, 2022-02-10 at 22:43 +0000, Andre Przywara wrote:
> > > > > > On Thu, 10 Feb 2022 21:58:30 +0000
> > > > > > Joakim Tjernlund <Joakim.Tjernlund at infinera.com> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > > On Thu, 2022-02-10 at 10:22 +0000, Andre Przywara wrote:
> > > > > > > > On Wed, 9 Feb 2022 12:03:47 +0000
> > > > > > > > Joakim Tjernlund <Joakim.Tjernlund at infinera.com> wrote:
> > > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > > On Wed, 2022-02-09 at 10:45 +0000, Andre Przywara wrote:
> > > > > > > > > > On Wed, 9 Feb 2022 08:35:04 +0000
> > > > > > > > > > Joakim Tjernlund <Joakim.Tjernlund at infinera.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > > On Wed, 2022-02-09 at 00:33 +0000, Andre Przywara wrote:
> > > > > > > > > > > > On Tue, 8 Feb 2022 22:05:00 +0000
> > > > > > > > > > > > Joakim Tjernlund <Joakim.Tjernlund at infinera.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi Joakim,
> > > > > > > > > > > >
> > > > > > > > > > > > > > Trying to figure out how I should map the MMU for normal RAM so that it is accessible
> > > > > > > > > > > > > > from all ELx security states.
> > > > > > > > > > > >
> > > > > > > > > > > > ^^^^^^^
> > > > > > > > > > > >
> > > > > > > > > > > > This does not make much sense. U-Boot is typically running in one
> > > > > > > > > > > > exception level only, and sets up the page table for exactly that EL.
> > > > > > > > > > > > Each EL uses a separate translation regime (with some twists for stage
> > > > > > > > > > > > 2 EL2 and combined EL1/0, plus VHE). If you map your memory in EL3, then
> > > > > > > > > > > > drop to EL2, the EL3 page tables become irrelevant.
> > > > > > > > > > > >
> > > > > > > > > > > > So in U-Boot we just set up the page tables for the EL we are running
> > > > > > > > > > > > in, and leave the paging for the lower exception levels to be set up at
> > > > > > > > > > > > the discretion of our payloads (kernels, hypervisors).
> > > > > > > > > > > >
> > > > > > > > > > > > > Please note that *secure* memory is a separate concept, and handled by
> > > > > > > > > > > > external hardware, typically using regions, not page tables.
> > > > > > > > > > >
> > > > > > > > > > > I am a beginner w.r.t ARM and Secure/Non secure so thank you for above.
> > > > > > > > > > >
> > > > > > > > > > > The problem I have is that I boot a custom SOC into u-boot and when u-boot tries
> > > > > > > > > > > to boot linux I get an error exception when u-boot calls armv8_switch_to_el2 to enter linux.
> > > > > > > > > >
> > > > > > > > > > So that means that U-Boot runs in EL3, is that the first and only firmware
> > > > > > > > > > that you run? I think the EL3 part of U-Boot is not widely used and tested
> > > > > > > > > > beyond the very few platforms that use it.
> > > > > > > > >
> > > > > > > > > Yes, u-boot is first firmware and runs in EL3(ATM, may change once initial bringup is complete)
> > > > > > > > > Maybe u-boot then lacks some critical init? Do you have an example of a board in u-boot
> > > > > > > > > that starts in EL3(from reset) using an A53 cpu?
> > > > > > > >
> > > > > > > > As you have probably figured out by now, the whole Layerscape family uses
> > > > > > > > that approach. However most other platforms go with Trusted-Firmware as the
> > > > > > > > EL3 setup and secure runtime service provider, so the U-Boot EL3 code in
> > > > > > > > here is not well tested or looked after. For initial bringup it might be
> > > > > > > > OK, but maybe the problems you run into are due to issues in this code.
> > > > > > > >
> > > > > > > > > > Do you have the exact address that fails? That should be in ELR, it would
> > > > > > > > > > be great if you can pinpoint the exact instruction in macro.h that fails.
> > > > > > > > >
> > > > > > > > > Yes, the address is the first address where kernel is loaded and you can branch there without problems.
> > > > > > > >
> > > > > > > > You mean if you load the kernel and branch to the entry point, it starts
> > > > > > > > > running, but crashes as soon as it realises that it runs in EL3?
> > > > > > > >
> > > > > > > > > It is the eret instruction(last insn in macro armv8_switch_to_el2_m) that fails.
> > > > > > > >
> > > > > > > > Interesting. Maybe there is something missing in the EL2 setup, but my
> > > > > > > > understanding is that this is the part that is actually used by
> > > > > > > > Layerscape, for instance.
> > > > > > > >
> > > > > > > > > > > I think the exception means "Instruction Abort taken without a change in Exception level."
> > > > > > > > > > > I was thinking it could be some privilege missing in MMU map.
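For reference, the exception class quoted above is read from the EC field of ESR_ELx (bits [31:26]). Below is a minimal sketch of that decode in C, assuming the syndrome value has already been read out. The helper names esr_ec and esr_ec_name are made up for illustration and are not U-Boot functions; only the abort classes relevant to this thread are listed.

```c
#include <assert.h>
#include <stdint.h>

/* ESR_ELx.EC occupies bits [31:26] of the syndrome register. */
static unsigned int esr_ec(uint64_t esr)
{
    return (unsigned int)((esr >> 26) & 0x3f);
}

/* Names for the abort classes discussed here (Armv8-A encoding).
 * Hypothetical helper, not part of U-Boot. */
static const char *esr_ec_name(unsigned int ec)
{
    assert(ec <= 0x3f); /* EC is a 6-bit field */

    switch (ec) {
    case 0x20: return "Instruction Abort from a lower Exception level";
    case 0x21: return "Instruction Abort taken without a change in Exception level";
    case 0x24: return "Data Abort from a lower Exception level";
    case 0x25: return "Data Abort taken without a change in Exception level";
    default:   return "other (see the Arm ARM)";
    }
}
```

An EC of 0x21 reported at EL3 would match the "Instruction Abort taken without a change in Exception level" reading above.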
> > > > > > > > > >
> > > > > > > > > > Could be. One thing that made me wonder is your rather miserly mapping of
> > > > > > > > > > only 32MB, which sounds a bit on the small side. Typically we just map the
> > > > > > > > >
> > > > > > > > > We only have 32 MB ATM :( a bit small but it may increase to 64MB
> > > > > > > >
> > > > > > > > That sounds very miserly. Can you actually run an arm64 Linux kernel with
> > > > > > > > that little RAM? IIRC for QEMU we need at least 128 MB, and I haven't seen
> > > > > > > > an ARMv8 hardware platform with less than 512MB (maybe 256MB) DRAM yet.
> > > > > > > >
> > > > > > > > > > whole first DRAM bank, regardless of whether you actually have memory
> > > > > > > > > > there or not. U-Boot should know how much DRAM you have, so will not go
> > > > > > > > > > beyond that. Having page tables covering more address space does not
> > > > > > > > > > really hurt, but avoids all kind of problems.
> > > > > > > > > > And please note that U-Boot loves to move things around: itself from the
> > > > > > > > > > load address to the end of DRAM (that it knows of); possibly the kernel,
> > > > > > > > > > when the alignment is not right, or the DT and initrd if it sees fit.
> > > > > > > > > > So there is little point in mapping just portions of the memory.
> > > > > > > > >
> > > > > > > > > U-boot moves around a lot, I know :) In this case u-boot lives
> > > > > > > > > > in its own 4MB SRAM but kernel lives in a 32MB HyperRAM.
> > > > > > > >
> > > > > > > > Interesting. I wonder if this works well with U-Boot's memory management,
> > > > > > > > which assumes it has quite some DRAM to play with.
> > > > > > >
> > > > > > > Found it, all memory spaces were set to secure mode, the req. spec does not agree :(
> > > > > >
> > > > > > Ah, yes, if the DRAM is configured as secure only, running in EL2
> > > > > > (always non-secure on the A53) will not end well.
> > > > > >
> > > > > > > Anyhow, now kernel enters into EL2 then EL1 to EL0, all is well until kernel tries
> > > > > > > to do simple cache ops like dc ivac, x0 or mrs x3,ctr_el0 when I again just get an error exception:
> > > > > > > EXC [0x400] Synchronous Lower EL using AArch64
> > > > > >
> > > > > > Was this with Linux, or some other kernel? IIRC cache maintenance
> > > > >
> > > > > Yes, 5.14.x
> > > >
> > > > Ah, I see. And that really runs with 32MB? I think we need at least
> > > > 64MB. Maybe the issues you see are related to that? IIRC the effects can
> > > > look rather random.
> > > >
> > > > > > instructions in EL0 need to be enabled in SCTLR_EL1 (.UCI and .DZE, for
> > > > > > instance, plus maybe more registers), and those and other operations
> > > > > > should not be trapped to EL2 as well.
> > > > >
> > > > > > SCTLR_EL1 is 0x30500800 and does not seem to match the above. It looks like it is the kernel that sets this reg?
> > > > > > How can the kernel get that wrong?
> > > >
> > > > That can't be really the kernel value, because the MMU needs to be on
> > > > (bit 0). Is this the reset value, read in U-Boot? The kernel sets those
> > > > bits, check the definition of INIT_SCTLR_EL1_MMU_ON in the kernel
> > > > source.
> > > > Maybe (the generic EL3) U-Boot code fails to set some EL3 registers,
> > > > so some stuff is blocked already there, and the kernel is helpless?
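To make the register discussion above concrete, here is a small sketch in C that checks the SCTLR_EL1 bits mentioned in this thread against the value 0x30500800. The bit positions (M = bit 0, DZE = bit 14, UCT = bit 15, UCI = bit 26) come from the Armv8-A architecture; the helper name sctlr_has is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SCTLR_EL1 bit positions (Armv8-A) */
#define SCTLR_EL1_M    (UINT32_C(1) << 0)   /* stage 1 MMU enable */
#define SCTLR_EL1_DZE  (UINT32_C(1) << 14)  /* EL0 may execute DC ZVA */
#define SCTLR_EL1_UCT  (UINT32_C(1) << 15)  /* EL0 may read CTR_EL0 */
#define SCTLR_EL1_UCI  (UINT32_C(1) << 26)  /* EL0 may execute cache maintenance by VA */

/* Compile-time sanity check of the two masks named in the thread. */
static_assert((SCTLR_EL1_DZE | SCTLR_EL1_UCI) == UINT32_C(0x04004000),
              "unexpected SCTLR_EL1 bit positions");

/* Hypothetical helper: true if every bit in mask is set in sctlr. */
static bool sctlr_has(uint32_t sctlr, uint32_t mask)
{
    return (sctlr & mask) == mask;
}
```

With 0x30500800, sctlr_has() is false for M, DZE, UCT and UCI, which is consistent with this being a value from before the MMU is turned on.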
> > >
> > > This is before MMU is on, kernel has forced SCTLR_EL1 to ENDIAN_SET_EL1 | SCTLR_EL1_RES1 via INIT_SCTLR_EL1_MMU_OFF
> > > I hacked the define to:
> > > #define INIT_SCTLR_EL1_MMU_OFF \
> > > - (ENDIAN_SET_EL1 | SCTLR_EL1_RES1)
> > > + (ENDIAN_SET_EL1 | SCTLR_EL1_RES1 | SCTLR_EL1_DZE | SCTLR_EL1_UCI)
> > >
> > > but that didn't change anything. The only thing I can think of is some prep
> > > u-boot must do while in EL3, or maybe the A53 core has been oddly wired into the ASIC (own custom ASIC)
> > > and some default setting in HW has changed?
> > >
> >
> > Found it! A kernel bug actually:
> > diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
> > index 3198acb2aad8..7f3c87f7a0ce 100644
> > --- a/arch/arm64/include/asm/el2_setup.h
> > +++ b/arch/arm64/include/asm/el2_setup.h
> > @@ -106,7 +106,7 @@
> > msr_s SYS_ICC_SRE_EL2, x0
> > isb // Make sure SRE is now set
> > mrs_s x0, SYS_ICC_SRE_EL2 // Read SRE back,
> > - tbz x0, #0, 1f // and check that it sticks
> > + tbz x0, #0, .Lskip_gicv3_\@ // and check that it sticks
> > msr_s SYS_ICH_HCR_EL2, xzr // Reset ICC_HCR_EL2 to defaults
> > .Lskip_gicv3_\@:
> > .endm
> >
> > branching to 1f got you way off and into el0 when you were supposed to still be in el2/el1.
>
> Well, as the list confirmed, that is indeed a bug, but the more
> important question is why. The bug wasn't noticed because this is some
> kind of error path only anyway, so any sane setup wouldn't trigger this.
Right, my setup was not sane ...
>
> So do you actually enable the EL3 GICv3 setup in your U-Boot build?
> That would be CONFIG_GICV3, and you need to define the GICD base
> address, I believe.
I had not set up GICv3. I found some define for that; it was not possible to select this from Kconfig in u-boot.
I only had to define GICD and GICR and u-boot did the rest.
It boots OK now (it gets as far as mounting the rootfs, which remains to be done).
The one thing I haven't figured out w.r.t. GIC DTS config is Redistributor SPI/PPI init.
We have 4 cores but only one runs linux, and linux boots first.
Should the other cores set up their own Redistributor space, or should the kernel
do it for them (can it?)?
> But as Marc hinted already, this code is not well tested. Not sure it
> covers the WAKER setup that the GIC500 requires.
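As a rough illustration of the WAKER setup mentioned above: in GICv3 each core has its own Redistributor frame, and before that core can receive interrupts software clears GICR_WAKER.ProcessorSleep and waits for ChildrenAsleep to read as zero. The sketch below assumes rd_base points at the RD_base frame of the calling core (a platform-specific address, not given in this thread). The function name gicr_wake and the bounded poll count are made up for illustration and are not U-Boot's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GICR_WAKER                  0x0014u              /* offset within RD_base */
#define GICR_WAKER_PROCESSOR_SLEEP  (UINT32_C(1) << 1)
#define GICR_WAKER_CHILDREN_ASLEEP  (UINT32_C(1) << 2)

/*
 * Mark this core's redistributor as awake.
 * rd_base:   RD_base frame of the calling core (assumed, platform-specific).
 * max_polls: upper bound on status reads, so a dead GIC cannot hang us.
 * Returns 0 on success, -1 if ChildrenAsleep never cleared.
 */
static int gicr_wake(volatile uint32_t *rd_base, unsigned int max_polls)
{
    assert(rd_base != NULL);

    volatile uint32_t *waker = rd_base + GICR_WAKER / sizeof(uint32_t);

    /* Tell the redistributor that the core is no longer asleep. */
    *waker &= ~GICR_WAKER_PROCESSOR_SLEEP;

    /* Wait until the redistributor reports its interface as awake. */
    while (max_polls--) {
        if ((*waker & GICR_WAKER_CHILDREN_ASLEEP) == 0)
            return 0;
    }
    return -1;
}
```

On real hardware this would be called once per core with that core's own RD_base; whether firmware or the kernel is expected to do it on a given platform is not settled by this sketch.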
>
> > Not sure why GIC init fails there; we have a GIC-500v4, but I think it should still pass this test?
> > If so, I guess we need to do something with the GIC in u-boot before booting Linux?
> > Any idea what I might be missing?
>
> There is a list of requirements that Linux expects to be fulfilled by
> the firmware, check the section "system registers" under
> https://www.kernel.org/doc/html/v5.15/arm64/booting.html#call-the-kernel-image
Nice! I must check that to see if I have missed something
> In general the GIC resets to be accessible from secure state only, and
> needs to be setup to be usable from non-secure lower ELs. This affects
> some GICD registers and some GICv3 system registers. U-Boot *should* do
> the basics, but either it's not enabled, or it's missing something.
U-boot does the required setup in EL3 before booting kernel into EL2 I think.
>
> Cheers,
> Andre