QEMU NUMA and U-Boot

François Ozog francois.ozog at linaro.org
Wed Jul 7 09:40:43 CEST 2021


On Wed, 7 Jul 2021 at 05:59, Heinrich Schuchardt <xypron.glpk at gmx.de> wrote:
>
> Am 7. Juli 2021 05:18:20 MESZ schrieb Heinrich Schuchardt <xypron.glpk at gmx.de>:
> >Am 7. Juli 2021 03:44:35 MESZ schrieb AKASHI Takahiro
> ><takahiro.akashi at linaro.org>:
> >>François,
> >>
> >>On Tue, Jul 06, 2021 at 08:10:08PM +0200, Heinrich Schuchardt wrote:
> >>> On 7/6/21 6:13 PM, François Ozog wrote:
> >>> > Hi Heinrich, U-Boot 2021-07rc5 does not take into account memory
> >>> > description when using Qemu 5.2 NUMA configuration to adapt memory
> >>map
> >>> > (kernel_addr_r...):
> >>> >
> >>> >         -smp 4 \
> >>> >           -m 8G,slots=2,maxmem=16G \
> >>> >          -object memory-backend-ram,size=4G,id=m0 \
> >>> >          -object memory-backend-ram,size=4G,id=m1 \
> >>> >          -numa node,cpus=0-1,nodeid=0,memdev=m0 \
> >>> >          -numa node,cpus=2-3,nodeid=1,memdev=m1
> >>> >
> >>> > kernel_addr_r is still 0x4040000 and thus you can't use it to
> >>bootefi.
> >>> >
> >>> > fdt addr 0x13ede6de0; fdt print
> >>> >
> >>> > Displays fdt while I think it should not.
> >>> >
> >>> > If I load the kernel at dram.start, the load works but not boot
> >>> >
> >>> > U-Boot 2021.07 (Jul 06 2021 - 13:26:43 +0000)
> >>> >
> >>> >
> >>> > DRAM:4 GiB
> >>> >
> >>> > Flash: 64 MiB
> >>> >
> >>> > Loading Environment from Flash... OK
> >>> >
> >>> > In:pl011 at 9000000
> >>> >
> >>> > Out: pl011 at 9000000
> >>> >
> >>> > Err: pl011 at 9000000
> >>> >
> >>> > Net: eth0: virtio-net#32
> >>> >
> >>> > Hit any key to stop autoboot:0
> >>> >
> >>> > =>
> >>> >
> >>> > => bdinfo
> >>> >
> >>> > boot_params = 0x0000000000000000
> >>> >
> >>> > DRAM bank = 0x0000000000000000
> >>> >
> >>> > -> start= 0x0000000140000000
> >>> >
> >>> > -> size = 0x0000000100000000
> >>> >
> >>> > flashstart= 0x0000000000000000
> >>> >
> >>> > flashsize = 0x0000000004000000
> >>> >
> >>> > flashoffset = 0x00000000000bc990
> >>> >
> >>> > baudrate= 115200 bps
> >>> >
> >>> > relocaddr = 0x000000013ff27000
> >>> >
> >>> > reloc off = 0x000000013ff27000
> >>> >
> >>> > Build = 64-bit
> >>> >
> >>> > current eth = virtio-net#32
> >>> >
> >>> > ethaddr = 52:52:52:52:52:52
> >>> >
> >>> > IP addr = <NULL>
> >>> >
> >>> > fdt_blob= 0x000000013ede6de0
> >>> >
> >>> > new_fdt = 0x000000013ede6de0
> >>> >
> >>> > fdt_size= 0x0000000000100000
> >>> >
> >>> > lmb_dump_all:
> >>> >
> >>> > memory.cnt= 0x1
> >>> >
> >>> > memory.reg[0x0].base = 0x140000000
> >>> >
> >>> > .size = 0x100000000
> >>> >
> >>> >
> >>> > reserved.cnt= 0x0
> >>> >
> >>> > arch_number = 0x0000000000000000
> >>> >
> >>> > TLB addr= 0x000000013fff0000
> >>> >
> >>> > irq_sp= 0x000000013ede6dd0
> >>> >
> >>> > sp start= 0x000000013ede6dd0
> >>> >
> >>> > Early malloc usage: 3a8 / 2000
> >>> >
> >>> > => load virtio 0:1 0x140000000 /oskit.efi
> >>> >
> >>> > 853424 bytes read in 1 ms (813.9 MiB/s)
> >>> >
> >>> > => bootefi0x140000000 0x13ede6dd0
> >>> >
> >>> > ERROR: Failed to register WaitForKey event
> >>> >
> >>> > Setting OsIndications failed
> >>> >
> >>> > Error: Cannot initialize UEFI sub-system, r = 9
> >>> >
> >>> >
> >>> > I think there is a need to calculate memory map based on previous
> >>> > firmware (TFA, QEMU can be considered as previous frimware)
> >>information
> >>> > (DT or blob_list).
> >>> >
> >>> > What do you think ?
> >>> >
> >>> > Cheers
> >>> >
> >>> > FF
> >>> >
> >>> > --
> >>> >
> >>> > François-Frédéric Ozog | /Director Business Development/
> >>> > T: +33.67221.6485
> >>> > francois.ozog at linaro.org <mailto:francois.ozog at linaro.org>
> >>| Skype: ffozog
> >>> >
> >>> >
> >>>
> >>> The kernel load address is hard coded here:
> >>> include/configs/qemu-arm.h:41:  "kernel_addr_r=0x40400000\0" \
> >>>
> >>> bdinfo shows:
> >>> DRAM start = 0x140000000
> >>> DRAM size  = 0x100000000
> >>>
> >>> fdt addr $fdt_addr
> >>> fdt printf
> >>>
> >>> shows two memory areas. One at 40000000, one at 140000000.
> >>
> >>(This shows that U-Boot receives a correct memory map via dtb.)
> >>
> >>Is this a NUMA machine, isn't it? Why should we care of which
> >>memory region be used here? Please note that this is a virtual
> >machine,
> >>there is no practical difference between two regions.
> >>
> >>The root problem is that U-Boot did not recognize there were two
> >>memory regions. We can fix this issue in either way:
> >>
> >>1)
> >>diff --git a/configs/qemu_arm64_defconfig
> >>b/configs/qemu_arm64_defconfig
> >>index f6e586627a8e..b70ffae8bf6e 100644
> >>--- a/configs/qemu_arm64_defconfig
> >>+++ b/configs/qemu_arm64_defconfig
> >>@@ -1,7 +1,7 @@
> >> CONFIG_ARM=y
> >> CONFIG_POSITION_INDEPENDENT=y
> >> CONFIG_ARCH_QEMU=y
> >>-CONFIG_NR_DRAM_BANKS=1
> >>+CONFIG_NR_DRAM_BANKS=2
> >> CONFIG_ENV_SIZE=0x40000
> >> CONFIG_ENV_SECT_SIZE=0x40000
> >> CONFIG_AHCI=y
> >>
> >>2)
> >>diff --git a/lib/fdtdec.c b/lib/fdtdec.c
> >>index 4b097fb588ed..4067ea2dead6 100644
> >>--- a/lib/fdtdec.c
> >>+++ b/lib/fdtdec.c
> >>@@ -1111,7 +1111,7 @@ int fdtdec_setup_memory_banksize(void)
> >>                return -EINVAL;
> >>        }
> >>
> >>-       for (bank = 0; bank < CONFIG_NR_DRAM_BANKS; bank++) {
> >>+       for (bank = 0; ; bank++) {
> >>                ret = ofnode_read_resource(mem, reg++, &res);
> >>                if (ret < 0) {
> >>                        reg = 0;
> >>
> >>   (fdtdec_setup_memory_banksize() is called in dram_init_banksize().)
> >>
> >>
> >>(2) seems much better, but I don't know why we had to use
> >>CONFIG_NR_DRAM_BANKS here.
> >>

2) alone does not work as other places in the code refer to
CONFIG_NR_DRAM_BANKS. Setting ...BANKS to 32 makes my code work and
bdinfo seems now correct:

=> bdinfo
boot_params = 0x0000000000000000
DRAM bank   = 0x0000000000000000
-> start    = 0x0000000140000000
-> size     = 0x0000000100000000
DRAM bank   = 0x0000000000000001
-> start    = 0x0000000040000000
-> size     = 0x0000000100000000
flashstart  = 0x0000000000000000
flashsize   = 0x0000000004000000
flashoffset = 0x00000000000bcb88
baudrate    = 115200 bps
relocaddr   = 0x000000013ff27000
reloc off   = 0x000000013ff27000
Build       = 64-bit
current eth = virtio-net#32
ethaddr     = 52:52:52:52:52:52
IP addr     = <NULL>
fdt_blob    = 0x000000013ede6cf0
new_fdt     = 0x000000013ede6cf0
fdt_size    = 0x0000000000100000
lmb_dump_all:
    memory.cnt   = 0x1
    memory.reg[0x0].base   = 0x40000000
  .size   = 0x200000000
    reserved.cnt   = 0x1
    reserved.reg[0x0].base = 0x13ede58f0
    .size = 0x121a710
arch_number = 0x0000000000000000
TLB addr    = 0x000000013fff0000
irq_sp      = 0x000000013ede6ce0
sp start    = 0x000000013ede6ce0
Early malloc usage: 3a8 / 2000

May I suggest you propose a combined patch Akashi-san? If we assume
NUMA systems to be tested up to 8 nodes to mimic real existing
enterprise hardware and up to 4 memory slots (say for memory hot
plugging tests) what about a default value of 32? Alternatively, we
could set this value to a much higher one if the costs are negligible.


> >>In this case, other occurrences of CONFIG_NR_DRAM_BANKS in this file
> >>should be replaced with a variable for it.
> >>
> >>> Your use case is well beyond the typical U-Boot usage. So I guess it
> >>> will be up to Linaro to provide the necessary patches:
> >>>
> >>> * determine the active CPU
> >>> * determine the RAM assigned to the active CPU according
> >>>   to the numa-node-id in the device-tree
> >>> * make sure that U-Boot only uses the memory of the active CPU
> >>>   internally
> >>> * make sure that the UEFI memory map contains a compliant
> >description
> >>> * possibly, dynamically set up the environment variables
> >>>
> >>> +CC Tuomas Tynkkynen (maintainer for qemu_arm64_defconfig)
> >>
> >>For (1), we'd better have a different config, or increase
> >>the value of CONFIG_NR_DRAM_BANKS to a bigger number?
> >
> >Is the system configured such that each CPU can access the others CPU's
> >RAM when entering U-Boot?
> >
> >Best regards
> >
> >Heinrich
> >
>
> At least the comments for this patch sound as if on a physical system cross NUMA node memory access is only available after full SMP initialization:
>
> https://patchwork.kernel.org/project/linux-acpi/patch/20180625130552.5636-1-lorenzo.pieralisi@arm.com/
>
> QEMU may be less restrictive.
>
> QEMU allows the node distance to be 255 indicating that cross node access is infeasible.
>
> Best regards
>
> Heinrich
>
> >>
> >>-Takahiro Akashi
> >>
> >>
> >>> Best regards
> >>>
> >>> Heinrich
>


-- 
François-Frédéric Ozog | Director Business Development
T: +33.67221.6485
francois.ozog at linaro.org | Skype: ffozog


More information about the U-Boot mailing list