[U-Boot] [PATCH 2/3] ARM: rmobile: Add basic PSCI support for r8a7790 SoC

Oleksandr olekstysh at gmail.com
Mon Feb 11 20:10:58 UTC 2019


On 09.02.19 18:32, Marek Vasut wrote:
> On 2/8/19 11:52 AM, Oleksandr wrote:
>> On 05.02.19 20:55, Marek Vasut wrote:
>>
>> Hi Marek
> Hi,

Hi Marek

>>> On 1/31/19 6:38 PM, Oleksandr Tyshchenko wrote:
>>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko at epam.com>
>>>>
>>>> Also enable PSCI support for Stout and Lager boards where
>>>> actually the r8a7790 SoC is installed.
>>>>
>>>> All secondary CPUs will be switched to a non-secure HYP mode
>>>> after booting.
>>>>
>>>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko at epam.com>
>>>> ---
>>>>    arch/arm/mach-rmobile/Kconfig.32   |   2 +
>>>>    arch/arm/mach-rmobile/Makefile     |   6 +
>>>>    arch/arm/mach-rmobile/pm-r8a7790.c | 408
>>>> +++++++++++++++++++++++++++++++++++++
>>>>    arch/arm/mach-rmobile/pm-r8a7790.h |  72 +++++++
>>>>    arch/arm/mach-rmobile/psci.c       | 193 ++++++++++++++++++
>>>>    include/configs/lager.h            |   2 +
>>>>    include/configs/stout.h            |   2 +
>>>>    7 files changed, 685 insertions(+)
>>>>    create mode 100644 arch/arm/mach-rmobile/pm-r8a7790.c
>>>>    create mode 100644 arch/arm/mach-rmobile/pm-r8a7790.h
>>>>    create mode 100644 arch/arm/mach-rmobile/psci.c
>>>>
>>>> diff --git a/arch/arm/mach-rmobile/Kconfig.32
>>>> b/arch/arm/mach-rmobile/Kconfig.32
>>>> index a2e9e3d..728c323 100644
>>>> --- a/arch/arm/mach-rmobile/Kconfig.32
>>>> +++ b/arch/arm/mach-rmobile/Kconfig.32
>>>> @@ -78,6 +78,7 @@ config TARGET_LAGER
>>>>        imply CMD_DM
>>>>        select CPU_V7_HAS_NONSEC
>>>>        select CPU_V7_HAS_VIRT
>>>> +    select ARCH_SUPPORT_PSCI
>>>>      config TARGET_KZM9G
>>>>        bool "KZM9D board"
>>>> @@ -119,6 +120,7 @@ config TARGET_STOUT
>>>>        imply CMD_DM
>>>>        select CPU_V7_HAS_NONSEC
>>>>        select CPU_V7_HAS_VIRT
>>>> +    select ARCH_SUPPORT_PSCI
>> To myself: Move this option under "config R8A7790".
>>
>>>>      endchoice
>>>>    diff --git a/arch/arm/mach-rmobile/Makefile
>>>> b/arch/arm/mach-rmobile/Makefile
>>>> index 1f26ada..6f4c513 100644
>>>> --- a/arch/arm/mach-rmobile/Makefile
>>>> +++ b/arch/arm/mach-rmobile/Makefile
>>>> @@ -13,3 +13,9 @@ obj-$(CONFIG_SH73A0) += lowlevel_init.o
>>>> cpu_info-sh73a0.o pfc-sh73a0.o
>>>>    obj-$(CONFIG_R8A7740) += lowlevel_init.o cpu_info-r8a7740.o
>>>> pfc-r8a7740.o
>>>>    obj-$(CONFIG_RCAR_GEN2) += lowlevel_init_ca15.o cpu_info-rcar.o
>>>>    obj-$(CONFIG_RCAR_GEN3) += lowlevel_init_gen3.o cpu_info-rcar.o
>>>> memmap-gen3.o
>>>> +
>>>> +ifndef CONFIG_SPL_BUILD
>>>> +ifdef CONFIG_R8A7790
>>>> +obj-$(CONFIG_ARMV7_PSCI) += psci.o pm-r8a7790.o
>>>> +endif
>>>> +endif
>>>> diff --git a/arch/arm/mach-rmobile/pm-r8a7790.c
>>>> b/arch/arm/mach-rmobile/pm-r8a7790.c
>>>> new file mode 100644
>>>> index 0000000..c635cf6
>>>> --- /dev/null
>>>> +++ b/arch/arm/mach-rmobile/pm-r8a7790.c
>>>> @@ -0,0 +1,408 @@
>>>> +// SPDX-License-Identifier: GPL-2.0+
>>>> +/*
>>>> + * CPU power management support for Renesas r8a7790 SoC
>>>> + *
>>>> + * Contains functions to control ARM Cortex A15/A7 cores and
>>>> + * related peripherals basically used for PSCI.
>>>> + *
>>>> + * Copyright (C) 2018 EPAM Systems Inc.
>>>> + *
>>>> + * Mainly based on Renesas R-Car Gen2 platform code from Linux:
>>>> + *    arch/arm/mach-shmobile/...
>>>> + */
>>>> +
>>>> +#include <common.h>
>>>> +#include <asm/secure.h>
>>>> +#include <asm/io.h>
>>>> +
>>>> +#include "pm-r8a7790.h"
>>>> +
>>>> +/*****************************************************************************
>>>>
>>> I'd expect checkpatch to complain about these long lines of asterisks.
>> No, there was no complain about it. I have checked. Anyway, I can remove
>> them if required.
> Yes please, keep the comment style consistent with the rest of the
> codebase, which is also the kernel comment style.

OK, will remove


>
>>>> + * APMU definitions
>>>> +
>>>> *****************************************************************************/
>>>>
>>>> +#define CA15_APMU_BASE    0xe6152000
>>>> +#define CA7_APMU_BASE    0xe6151000
>>> All these addresses should come from DT. And the driver should be DM
>>> capable and live in drivers/
>>>
>>> [...]
>> All PSCI backends for ARMV7 in U-Boot which I was able to found, are
>> located either in arch/arm/cpu/armv7/
>>
>> or in arch/arm/mach-X. As well as other PSCI backends, this one will be
>> located in a separate secure section and acts as secure monitor,
>>
>> so it will be still alive, when U-Boot is gone away. Do we really want
>> this one to go into drivers?
> I'd much prefer it if we stopped adding more stuff to arch/arm/mach-* ,
> but I think we cannot avoid that in this case, can we ?

I am afraid, we can't avoid.


>
>>>> +/*****************************************************************************
>>>>
>>>> + * Functions which intended to be called from PSCI handlers. These
>>>> functions
>>>> + * marked as __secure and are placed in .secure section.
>>>> +
>>>> *****************************************************************************/
>>>>
>>>> +void __secure r8a7790_apmu_power_on(int cpu)
>>>> +{
>>>> +    int cluster = r8a7790_cluster_id(cpu);
>>>> +    u32 apmu_base;
>>>> +
>>>> +    apmu_base = cluster == 0 ? CA15_APMU_BASE : CA7_APMU_BASE;
>>>> +
>>>> +    /* Request power on */
>>>> +    writel(BIT(r8a7790_core_id(cpu)), apmu_base + WUPCR_OFFS);
>>> wait_for_bit of some sorts ?
>> probably yes, will re-use
>>>> +    /* Wait for APMU to finish */
>>>> +    while (readl(apmu_base + WUPCR_OFFS))
>>>> +        ;
>>> Can this spin forever ?
>> I didn't find anything in TRM which describes how long it may take.
>> Linux also doesn't use timeout.
>>
>> https://elixir.bootlin.com/linux/v5.0-rc5/source/arch/arm/mach-shmobile/platsmp-apmu.c#L46
>>
>>
>> Shall I add some timeout (probably 1s) in order to be completely safe?
> I think so, because if you start spinning in here forever, the system
> will just completely freeze without any way of recovering.

Yes, the CPU executing that PSCI request (SMC) will block here.


> I don't think
> that's what you want to happen while using virtualization, right ?

Absolutely, will use timeout.


>
>>>> +}
>>>> +
>>>> +void __secure r8a7790_apmu_power_off(int cpu)
>>>> +{
>>>> +    int cluster = r8a7790_cluster_id(cpu);
>>>> +    u32 apmu_base;
>>>> +
>>>> +    apmu_base = cluster == 0 ? CA15_APMU_BASE : CA7_APMU_BASE;
>>>> +
>>>> +    /* Request Core Standby for next WFI */
>>>> +    writel(CPUPWR_STANDBY, apmu_base +
>>>> CPUNCR_OFFS(r8a7790_core_id(cpu)));
>>>> +}
>>>> +
>>>> +void __secure r8a7790_assert_reset(int cpu)
>>>> +{
>>>> +    int cluster = r8a7790_cluster_id(cpu);
>>>> +    u32 mask, magic, rescnt;
>>>> +
>>>> +    mask = BIT(3 - r8a7790_core_id(cpu));
>>>> +    magic = cluster == 0 ? CA15RESCNT_CODE : CA7RESCNT_CODE;
>>>> +    rescnt = RST_BASE + (cluster == 0 ? CA15RESCNT : CA7RESCNT);
>>>> +    writel((readl(rescnt) | mask) | magic, rescnt);
>>>> +}
>>>> +
>>>> +void __secure r8a7790_deassert_reset(int cpu)
>>>> +{
>>>> +    int cluster = r8a7790_cluster_id(cpu);
>>>> +    u32 mask, magic, rescnt;
>>>> +
>>>> +    mask = BIT(3 - r8a7790_core_id(cpu));
>>>> +    magic = cluster == 0 ? CA15RESCNT_CODE : CA7RESCNT_CODE;
>>>> +    rescnt = RST_BASE + (cluster == 0 ? CA15RESCNT : CA7RESCNT);
>>>> +    writel((readl(rescnt) & ~mask) | magic, rescnt);
>>>> +}
>>>> +
>>>> +void __secure r8a7790_system_reset(void)
>>>> +{
>>>> +    u32 bar;
>>>> +
>>>> +    /*
>>>> +     * Before configuring internal watchdog timer (RWDT) to reboot
>>>> system
>>>> +     * we need to re-program BAR registers for the boot CPU to jump to
>>>> +     * bootrom code. Without taking care of, the boot CPU will jump to
>>>> +     * the reset vector previously installed in ICRAM1, since BAR
>>>> registers
>>>> +     * keep their values after watchdog triggered reset.
>>>> +     */
>>>> +    bar = (BOOTROM >> 8) & 0xfffffc00;
>>>> +    writel(bar, RST_BASE + CA15BAR);
>>>> +    writel(bar | SBAR_BAREN, RST_BASE + CA15BAR);
>>>> +    writel(bar, RST_BASE + CA7BAR);
>>>> +    writel(bar | SBAR_BAREN, RST_BASE + CA7BAR);
>>>> +    dsb();
>>>> +
>>>> +    /* Now, configure watchdog timer to reboot the system */
>>>> +
>>>> +    /* Trigger reset when counter overflows */
>>>> +    writel(WDTRSTCR_CODE | 0x2, RST_BASE + WDTRSTCR);
>>>> +    dsb();
>>>> +
>>>> +    /* Stop counter */
>>>> +    writel(RWTCSRA_CODE, RWDT_BASE + RWTCSRA);
>>>> +
>>>> +    /* Initialize counter with the highest value  */
>>>> +    writel(RWTCNT_CODE | 0xffff, RWDT_BASE + RWTCNT);
>>>> +
>>>> +    while (readb(RWDT_BASE + RWTCSRA) & RWTCSRA_WRFLG)
>>>> +        ;
>>>> +
>>>> +    /* Start counter */
>>>> +    writel(RWTCSRA_CODE | RWTCSRA_TME, RWDT_BASE + RWTCSRA);
>>> This seems to reimplement the reset code that's in U-Boot already. Why ?
>> Yes. I had to re-implement. Let me describe why.
>>
>>  From my understanding (I may mistake), the PSCI backend code which lives
>> in secure section should be as lightweight as possible
>>
>> and shouldn't call any U-Boot routines not marked as __secure, except
>> simple static inline functions.
>>
>> You can see PSCI implementation for any platform in U-Boot,  and only
>> simple "macroses/inlines" are used across all of them.
>>
>> Even for realizing some delay in code, they have specially implemented
>> something simple. As an example __mdelay() realization here:
>>
>> https://elixir.bootlin.com/u-boot/v2019.01/source/arch/arm/cpu/armv7/sunxi/psci.c#L61
> Can the U-Boot code be refactored somehow to avoid the duplication ?

Sorry, what duplication you are speaking about?


>
>> I know that PMIC (for Lager) and CPLD (for Stout) assist SoC to perform
>> reset. But, I couldn't use that functional here in PSCI backend, as it
>> pulls a lot of code (i2c for PMIC, gpio manipulation for CPLD, etc),
> How can the reset work properly if the CPLD/PMIC isn't even used then ?


We ask WDT to perform a CPU reset, although this is not the same reset 
as an external reset from CPLD/PMIC,

but, why not use it, if we don't have alternative? This is certainly 
better than nothing, I think.

Actually, we ask WDT to do what it is intended to do, and it seems to 
work properly as the system up and running after WDT reset in 100% cases.

What is more, this PSCI reset implementation could be common for Gen2 
SoCs where WDT is present...


>> so for that reason (AFAIK the PSCI system reset is a mandatory option) I
>> chose WDT as a entity for doing a reset. This is quite simple and can be
>> used on both boards, what is more that it can be used on other Gen2 SoC
>> if required.
>>
>>>> +}
>>>> +
>>>> +/*****************************************************************************
>>>>
>>>> + * Functions which intended to be called from PSCI board
>>>> initialization.
>>>> +
>>>> *****************************************************************************/
>>>>
>>>> +static int sysc_power_up(int scu)
>>>> +{
>>>> +    u32 status, chan_offs, isr_bit;
>>>> +    int i, j, ret = 0;
>>>> +
>>>> +    if (scu == CA15_SCU) {
>>>> +        chan_offs = CA15_SCU_CHAN_OFFS;
>>>> +        isr_bit = CA15_SCU_ISR_BIT;
>>>> +    } else {
>>>> +        chan_offs = CA7_SCU_CHAN_OFFS;
>>>> +        isr_bit = CA7_SCU_ISR_BIT;
>>>> +    }
>>>> +
>>>> +    writel(BIT(isr_bit), SYSC_BASE + SYSCISCR);
>>>> +
>>>> +    /* Submit power resume request until it was accepted */
>>>> +    for (i = 0; i < PWRER_RETRIES; i++) {
>>>> +        /* Wait until SYSC is ready to accept a power resume request */
>>>> +        for (j = 0; j < SYSCSR_RETRIES; j++) {
>>>> +            if (readl(SYSC_BASE + SYSCSR) & BIT(1))
>>>> +                break;
>>>> +
>>>> +            udelay(SYSCSR_DELAY_US);
>>>> +        }
>>>> +
>>>> +        if (j == SYSCSR_RETRIES)
>>>> +            return -EAGAIN;
>>>> +
>>>> +        /* Submit power resume request */
>>>> +        writel(BIT(0), SYSC_BASE + chan_offs + PWRONCR_OFFS);
>>>> +
>>>> +        /* Check if power resume request was accepted */
>>>> +        status = readl(SYSC_BASE + chan_offs + PWRER_OFFS);
>>>> +        if (!(status & BIT(0)))
>>>> +            break;
>>>> +
>>>> +        udelay(PWRER_DELAY_US);
>>>> +    }
>>>> +
>>>> +    if (i == PWRER_RETRIES)
>>>> +        return -EIO;
>>>> +
>>>> +    /* Wait until the power resume request has completed */
>>>> +    for (i = 0; i < SYSCISR_RETRIES; i++) {
>>>> +        if (readl(SYSC_BASE + SYSCISR) & BIT(isr_bit))
>>>> +            break;
>>>> +        udelay(SYSCISR_DELAY_US);
>>>> +    }
>>>> +
>>>> +    if (i == SYSCISR_RETRIES)
>>>> +        ret = -EIO;
>>>> +
>>>> +    writel(BIT(isr_bit), SYSC_BASE + SYSCISCR);
>>>> +
>>>> +    return ret;
>>>> +}
>>>> +
>>>> +static bool sysc_power_is_off(int scu)
>>>> +{
>>>> +    u32 status, chan_offs;
>>>> +
>>>> +    chan_offs = scu == CA15_SCU ? CA15_SCU_CHAN_OFFS :
>>>> CA7_SCU_CHAN_OFFS;
>>>> +
>>>> +    /* Check if SCU is in power shutoff state */
>>>> +    status = readl(SYSC_BASE + chan_offs + PWRSR_OFFS);
>>>> +    if (status & BIT(0))
>>>> +        return true;
>>>> +
>>>> +    return false;
>>>> +}
>>>> +
>>>> +static void apmu_setup_debug_mode(int cpu)
>>>> +{
>>>> +    int cluster = r8a7790_cluster_id(cpu);
>>>> +    u32 apmu_base, reg;
>>>> +
>>>> +    apmu_base = cluster == 0 ? CA15_APMU_BASE : CA7_APMU_BASE;
>>>> +
>>>> +    /*
>>>> +     * Enable reset requests derived from power shutoff to the
>>>> AP-system
>>>> +     * CPU cores in debug mode. Without taking care of, they may
>>>> fail to
>>>> +     * resume from the power shutoff state.
>>>> +     */
>>>> +    reg = readl(apmu_base + DBGRCR_OFFS);
>>>> +    reg |= DBGCPUREN | DBGCPUNREN(r8a7790_core_id(cpu)) | DBGCPUPREN;
>>>> +    writel(reg, apmu_base + DBGRCR_OFFS);
>>> setbits_le32()
>> Agree, will re-use
>>
>>>> +}
>>>> +
>>>> +/*
>>>> + * Reset vector for secondary CPUs.
>>>> + * This will be mapped at address 0 by SBAR register.
>>>> + * We need _long_ jump to the physical address.
>>>> + */
>>>> +asm("    .arm\n"
>>>> +    "    .align 12\n"
>>>> +    "    .globl shmobile_boot_vector\n"
>>>> +    "shmobile_boot_vector:\n"
>>>> +    "    ldr r1, 1f\n"
>>>> +    "    bx    r1\n"
>>>> +    "    .type shmobile_boot_vector, %function\n"
>>>> +    "    .size shmobile_boot_vector, .-shmobile_boot_vector\n"
>>>> +    "    .align    2\n"
>>>> +    "    .globl    shmobile_boot_fn\n"
>>>> +    "shmobile_boot_fn:\n"
>>>> +    "1:    .space    4\n"
>>>> +    "    .globl    shmobile_boot_size\n"
>>>> +    "shmobile_boot_size:\n"
>>>> +    "    .long    .-shmobile_boot_vector\n");
>>> Why can't this be implemented in C ?
>> This "reset vector" code was ported from Linux:
>>
>> https://elixir.bootlin.com/linux/v5.0-rc5/source/arch/arm/mach-shmobile/headsmp.S#L21
>>
>>
>> Really don't know whether it can be implemented in C.
>>
>> I was thinking of moving this code to a separate ASM file in order not
>> to mix C and ASM. What do you think about it?
> U-Boot already has a reset vector code, can't that be reused ?

I don't think. Being honest, I couldn't find an obvious way how to reuse 
(I assume you meant arch/arm/cpu/armv7/start.S).

The newly turned on secondary CPU entry should be common 
"psci_cpu_entry", which does proper things.

And this reset vector is just "a small piece of code" to be located in 
on-chip RAM (with limited size) and used for the jump stub...


>
>>>> +extern void shmobile_boot_vector(void);
>>>> +extern unsigned long shmobile_boot_fn;
>>>> +extern unsigned long shmobile_boot_size;
>>> Are the externs necessary ?
>> Yes, otherwise, compiler will complain.
>>
>> I can hide them in a header file. Shall I do that with moving ASM code
>> to a separate file?
> Only if you cannot reuse the U-Boot reset vector ones , which I think
> you can ?

I tried to explain above, why I can't reuse. So, if you and other 
reviewers OK with it,

I will move it to a separate file in V2.


>
>>>> +void r8a7790_prepare_cpus(unsigned long addr, u32 nr_cpus)
>>>> +{
>>>> +    int cpu = get_current_cpu();
>>>> +    int i, ret = 0;
>>>> +    u32 bar;
>>>> +
>>>> +    shmobile_boot_fn = addr;
>>>> +    dsb();
>>>> +
>>>> +    /* Install reset vector */
>>>> +    memcpy_toio((void __iomem *)ICRAM1, shmobile_boot_vector,
>>>> +            shmobile_boot_size);
>>>> +    dmb();
>>> Does this take into consideration caches ?
>> Good question... Probably, I should have added flush_dcache_range()
>> after copy.
>>
>>>> +    /* Setup reset vectors */
>>>> +    bar = (ICRAM1 >> 8) & 0xfffffc00;
>>>> +    writel(bar, RST_BASE + CA15BAR);
>>>> +    writel(bar | SBAR_BAREN, RST_BASE + CA15BAR);
>>>> +    writel(bar, RST_BASE + CA7BAR);
>>>> +    writel(bar | SBAR_BAREN, RST_BASE + CA7BAR);
>>>> +    dsb();
>>>> +
>>>> +    /* Perform setup for debug mode for all CPUs */
>>>> +    for (i = 0; i < nr_cpus; i++)
>>>> +        apmu_setup_debug_mode(i);
>>>> +
>>>> +    /*
>>>> +     * Indicate the completion status of power shutoff/resume procedure
>>>> +     * for CA15/CA7 SCU.
>>>> +     */
>>>> +    writel(BIT(CA15_SCU_ISR_BIT) | BIT(CA7_SCU_ISR_BIT),
>>>> +           SYSC_BASE + SYSCIER);
>>>> +
>>>> +    /* Power on CA15/CA7 SCU */
>>>> +    if (sysc_power_is_off(CA15_SCU))
>>>> +        ret += sysc_power_up(CA15_SCU);
>>>> +    if (sysc_power_is_off(CA7_SCU))
>>>> +        ret += sysc_power_up(CA7_SCU);
>>>> +    if (ret)
>>>> +        printf("warning: some of secondary CPUs may not boot\n");
>>> "some" is not very useful for debugging,.
>> OK, will provide more concrete output.
>>
>>>> +    /* Keep secondary CPUs in reset */
>>>> +    for (i = 0; i < nr_cpus; i++) {
>>>> +        /* Make sure that we don't reset boot CPU */
>>>> +        if (i == cpu)
>>>> +            continue;
>>>> +
>>>> +        r8a7790_assert_reset(i);
>>>> +    }
>>>> +
>>>> +    /*
>>>> +     * Enable snoop requests and DVM message requests for slave
>>>> interfaces
>>>> +     * S4 (CA7) and S3 (CA15).
>>>> +     */
>>>> +    writel(readl(CCI_BASE + CCI_SLAVE3 + CCI_SNOOP) | CCI_ENABLE_REQ,
>>>> +           CCI_BASE + CCI_SLAVE3 + CCI_SNOOP);
>>>> +    writel(readl(CCI_BASE + CCI_SLAVE4 + CCI_SNOOP) | CCI_ENABLE_REQ,
>>>> +           CCI_BASE + CCI_SLAVE4 + CCI_SNOOP);
>>>> +    /* Wait for pending bit low */
>>>> +    while (readl(CCI_BASE + CCI_STATUS))
>>>> +        ;
>>> can this spin forever ?
>> Paragraph "3.3.4 Status Register" in "ARM CoreLink™CCI-400 Cache
>> Coherent Interconnect" document says nothing
>>
>> about possibility to spin forever. Just next:
>>
>> "After writing to the snoop or DVM enable bits, the controller must wait
>> for the register write to
>> complete, then test that the change_pending bit is LOW before it turns
>> an attached device on or off."
> So what if this code spins forever, will this also completely hang Linux
> which calls this code ?
> [...]


In this case, no. This is PSCI board initialization code, which is being 
executed while we are still in U-Boot.

Unlike r8a7790_apmu_power_on()/r8a7790_apmu_power_off() functions, which 
are indented to be called from PSCI handlers (at runtime), when the 
"main" U-Boot is gone away.

If this code spins forever, U-Boot will get stuck before even starting 
Linux/Hypervisor.

Probably, it would be better to add safe timeout (~1s) here as well.


>
-- 
Regards,

Oleksandr Tyshchenko



More information about the U-Boot mailing list