[PATCH v2] riscv: cancel the limitation that NR_CPUS is less than or equal to 32

Mon Jan 3 12:11:54 CET 2022

On Thu, Dec 30, 2021 at 01:55:15AM +0800, Xiang W wrote:
> On Wed, 2021-12-29 at 17:23 +0800, Leo Liang wrote:
> > Hi Xiang,
> > On Wed, Dec 22, 2021 at 07:32:53AM +0800, Xiang W wrote:
> > > Various RISC-V specifications allow more than 32 harts. The
> > > current limit of 32 comes from gd->arch.available_harts. We can
> > > remove this limitation by using a bitmap. The number of harts is
> > > now limited to 4095, which is the limit set by the RISC-V Advanced
> > > Core Local Interruptor Specification.
> > > 
> > > Tested on SiFive Unmatched.
> > > 
> > > Signed-off-by: Xiang W <wxjstz at 126.com>
> > > ---
> > > Changes since v1:
> > > 
> > > * When NR_CPUS is very large, the value of GD_AVAILABLE_HARTS
> > >   overflows the immediate range of ld/lw. This version fixes that
> > >   problem.
> > > 
> > >  arch/riscv/Kconfig                   |  4 ++--
> > >  arch/riscv/cpu/start.S               | 21 ++++++++++++++++-----
> > >  arch/riscv/include/asm/global_data.h |  4 +++-
> > >  arch/riscv/lib/smp.c                 |  2 +-
> > >  4 files changed, 22 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S
> > > index 76850ec9be..92f3b78f29 100644
> > > --- a/arch/riscv/cpu/start.S
> > > +++ b/arch/riscv/cpu/start.S
> > > @@ -166,11 +166,22 @@ wait_for_gd_init:
> > >         mv      gp, s0
> > >  
> > >         /* register available harts in the available_harts mask */
> > > -       li      t1, 1
> > > -       sll     t1, t1, tp
> > > -       LREG    t2, GD_AVAILABLE_HARTS(gp)
> > > -       or      t2, t2, t1
> > > -       SREG    t2, GD_AVAILABLE_HARTS(gp)
> > > +       li      t1, GD_AVAILABLE_HARTS
> > > +       add     t1, t1, gp
> > > +       LREG    t1, 0(t1)
> > > +#if defined(CONFIG_ARCH_RV64I)
> > > +       srli    t2, tp, 6
> > > +       slli    t2, t2, 3
> > > +#elif defined(CONFIG_ARCH_RV32I)
> > > +       srli    t2, tp, 5
> > > +       slli    t2, t2, 2
> > > +#endif
> > > +       add     t1, t1, t2
> > > +       LREG    t2, 0(t1)
> > > +       li      t3, 1
> > > +       sll     t3, t3, tp
> > This seems incorrect.
> > Shouldn't we have "$tp % sizeof(ulong)" instead of "$tp /
> > sizeof(ulong)" ?
> 
> Do you mean "$tp % sizeof(ulong)" instead of "$tp"?
> 
> The RISC-V specification describes the shift instructions as follows
> (first for RV32I, then for RV64I):
> 
> SLL, SRL, and SRA perform logical left, logical right, and arithmetic
> right shifts on the value in register rs1 by the shift amount held in
> the lower 5 bits of register rs2.
> 
> SLL, SRL, and SRA perform logical left, logical right, and arithmetic
> right shifts on the value in register rs1 by the shift amount held in
> register rs2. In RV64I, only the low 6 bits of rs2 are considered for
> the shift amount.
> 
> So the hardware already ignores the upper bits of the shift amount,
> and we don't need to perform the remainder operation.

Got it! Thanks for the explanation.

LGTM,
Reviewed-by: Leo Yu-Chi Liang <ycliang at andestech.com>

Best regards,
Leo
> 
> Regards,
> Xiang W
> > > +       or      t2, t2, t3
> > > +       SREG    t2, 0(t1)
> > >  
> > >         amoswap.w.rl zero, zero, 0(t0)
> > Best regards,
> > Leo
> 
>
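
For readers following the index arithmetic discussed above, here is a
minimal C sketch of the same bitmap update. It is not code from the
patch: the names BITS_PER_ULONG, MAX_HARTS, BITMAP_WORDS, hart_word,
hart_mask, hart_set_available and hart_is_available are hypothetical
and chosen only for illustration. Unlike the RISC-V sll instruction, C
does not mask the shift amount (an oversized shift is undefined
behaviour), so the sketch applies the mask explicitly.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical names for illustration; none of these come from the patch. */
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)
#define MAX_HARTS      4095 /* assumed bound, mirroring the limit in the commit message */
#define BITMAP_WORDS   ((MAX_HARTS + BITS_PER_ULONG - 1) / BITS_PER_ULONG)

/* The masking trick below only equals "% BITS_PER_ULONG" for a power of two. */
static_assert((BITS_PER_ULONG & (BITS_PER_ULONG - 1)) == 0,
              "word width must be a power of two");

/*
 * Index of the bitmap word that holds this hart's bit.
 * The assembly computes the byte offset of that word instead:
 * (tp >> 6) << 3 on RV64, (tp >> 5) << 2 on RV32.
 */
static size_t hart_word(unsigned long hartid)
{
	return hartid / BITS_PER_ULONG;
}

/*
 * Bit mask within that word. The sll instruction only looks at the low
 * 5 (RV32) or 6 (RV64) bits of the shift amount; C gives no such
 * guarantee, so the mask is written out here.
 */
static unsigned long hart_mask(unsigned long hartid)
{
	return 1UL << (hartid & (BITS_PER_ULONG - 1));
}

/* Mark a hart as available; the caller must keep hartid below MAX_HARTS. */
static void hart_set_available(unsigned long *bitmap, unsigned long hartid)
{
	bitmap[hart_word(hartid)] |= hart_mask(hartid);
}

/* Return non-zero if the hart's bit is set. */
static int hart_is_available(const unsigned long *bitmap, unsigned long hartid)
{
	return (bitmap[hart_word(hartid)] & hart_mask(hartid)) != 0;
}
```

The explicit "& (BITS_PER_ULONG - 1)" plays the role of the remainder
operation asked about in the review; on RISC-V hardware the shift
instruction performs that masking implicitly.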