[PATCH 2/2] sunxi: enable dual rank memory on R40
Andre Przywara
andre.przywara at arm.com
Tue Mar 2 16:19:04 CET 2021
On Tue, 02 Mar 2021 21:50:49 +0800
Icenowy Zheng <icenowy at aosc.io> wrote:
Hi Icenowy,
> 于 2021年3月2日 GMT+08:00 下午9:40:44, Andre Przywara <andre.przywara at arm.com> 写到:
> >On Fri, 26 Feb 2021 00:13:25 +0800
> >Icenowy Zheng <icenowy at aosc.io> wrote:
> >
> >> Previously we do not have proper dual rank memory detection on R40
> >> (because we omitted PIR_QSGATE, which does not work on R40 with our
> >> configuration), and dual rank memory is just simply disabled as early
> >> R40 boards available (Banana Pi M2 Ultra and Berry) have single rank
> >> memory.
> >>
> >> As a board with dual rank memory (Forlinx OKA40i-C) is now known to
> >us,
> >> we need to have a way to do memory rank detection to support that
> >board.
> >>
> >> Add some routine to detect memory rank by trying to access the memory
> >> in rank 1 and check for error status of the memory controller, and
> >then
> >> enable dual rank memory on R40.
> >>
> >> Similar routine can be used to detect half DQ width (which is also
> >> detected by PIR_QSGATE on other SoCs), but it's left unimplemented
> >> because there's no known R40 board with half DQ width now.
> >
> >So when looking at the SPL size I realised that this patch breaks the
> >socid constant parameter propagation, especially for mctl_set_cr(). I
> >don't see immediately why, this whole code here should be compiled out
> >on any A64 builds, for instance. Instead GCC turns
> >"<mctl_set_cr.constprop.0>:" into "<mctl_set_cr>:", and passes 0x1689
> >around everytime. I tried GCC 10.2.0 and 9.2.0, also tried adding const
> >everywhere, to amplify the constant nature of this value. Patch
> >1/2 added to the code size, but kept the const propagation (only one
> >instance of 0x1689 in the disassembly). This patch here should be for
> >free on A64, but adds 104 bytes.
> >
> >Does anyone have a clue why this is happening? Is this a compiler
> >issue?
>
> Maybe the issue is that I assume this codepath is only for R40 and
> I didn't add socid to it?
But that's clearly visible by this function only being called inside "if
(socid == SOCID_R40)". And that works very well for the H3 ZQ
calibration quirk, for instance.
> Could you try to add a socid parameter to mctl_r40_detect_rank_count()?
I just tried that, and apart from looking weird it didn't do anything.
To be clear: Your code is absolutely fine, exactly the way it should be
written. It's just that the compiler is playing stupid suddenly. I was
thinking that your dummy readl might have upset it, but even with that
commented out it's still happening.
> Or maybe it makes mctl_calc_rank_size() not inlined?
So the assembly looks like everything apart from mctl_set_cr() and
mctl_auto_detect_dram_size_rank() is inlined. Those are extra because
they are called multiple times and we are using -Os.
> (Well I think the code looks noop on non-R40)
Exactly that was my thinking: when compiling for the A64, it should be
totally compiled out, and the object file should be the same before and
after.
> BTW, original rank/width detection part won't get called on R40. But
> R40 is not where we are tight on code size, so I didn't bother to ignore
> it on R40.
I see. Yeah, I am not concerned about R40 either, but I want to avoid
the A64 bloating up.
> Or should we switch back to #if's and do not rely on compiler behavior any longer?
I'd rather not do that. We are doing the right thing, and it works
marvellously so far.
To be clear: it's not a show stopper, I was just curious why this
happens.
The code size is not really critical on the A64 at the moment, so I'd
merge it as it, even if we don't find a solution. Maybe just leave a
hint about it in the code should people need to come back to this.
I also asked some compiler buffs about it, but it's not exactly the
simple reproducer case they like to deal with ;-)
Cheers,
Andre
>
> >
> >Cheers,
> >Andre
> >
> >> Signed-off-by: Icenowy Zheng <icenowy at aosc.io>
> >> ---
> >> arch/arm/mach-sunxi/dram_sunxi_dw.c | 55
> >+++++++++++++++++++++++++----
> >> 1 file changed, 49 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/arch/arm/mach-sunxi/dram_sunxi_dw.c
> >b/arch/arm/mach-sunxi/dram_sunxi_dw.c
> >> index 2b9d631d49..b86ae7cdf3 100644
> >> --- a/arch/arm/mach-sunxi/dram_sunxi_dw.c
> >> +++ b/arch/arm/mach-sunxi/dram_sunxi_dw.c
> >> @@ -414,11 +414,9 @@ static void mctl_set_cr(uint16_t socid, struct
> >dram_para *para)
> >> }
> >>
> >> if (socid == SOCID_R40) {
> >> - if (para->dual_rank)
> >> - panic("Dual rank memory not supported\n");
> >> -
> >> /* Mux pin to A15 address line for single rank memory. */
> >> - setbits_le32(&mctl_com->cr_r1, MCTL_CR_R1_MUX_A15);
> >> + if (!para->dual_rank)
> >> + setbits_le32(&mctl_com->cr_r1, MCTL_CR_R1_MUX_A15);
> >> }
> >> }
> >>
> >> @@ -702,8 +700,55 @@ static unsigned long mctl_calc_rank_size(struct
> >rank_para *rank)
> >> return (1UL << (rank->row_bits + rank->bank_bits)) *
> >rank->page_size;
> >> }
> >>
> >> +/*
> >> + * Because we cannot do mctl_phy_init(PIR_QSGATE) on R40 now (which
> >leads
> >> + * to failure), it's needed to detect the rank count of R40 in
> >another way.
> >> + *
> >> + * The code here is modelled after time_out_detect() in BSP, which
> >tries to
> >> + * access the memory and check for error code.
> >> + *
> >> + * TODO: auto detect half DQ width here
> >> + */
> >> +static void mctl_r40_detect_rank_count(struct dram_para *para)
> >> +{
> >> + ulong rank1_base = (ulong) CONFIG_SYS_SDRAM_BASE +
> >> + mctl_calc_rank_size(¶->ranks[0]);
> >> + struct sunxi_mctl_ctl_reg * const mctl_ctl =
> >> + (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE;
> >> +
> >> + /* Enable read time out */
> >> + setbits_le32(&mctl_ctl->pgcr[0], 0x1 << 25);
> >> +
> >> + (void) readl((void *) rank1_base);
> >> + udelay(10);
> >> +
> >> + if (readl(&mctl_ctl->pgsr[0]) & (0x1 << 13)) {
> >> + clrsetbits_le32(&mctl_ctl->dtcr, 0xf << 24, 0x1 << 24);
> >> + para->dual_rank = 0;
> >> + }
> >> +
> >> + /* Reset PHY FIFO to clear it */
> >> + clrbits_le32(&mctl_ctl->pgcr[0], 0x1 << 26);
> >> + udelay(100);
> >> + setbits_le32(&mctl_ctl->pgcr[0], 0x1 << 26);
> >> +
> >> + /* Clear error status */
> >> + setbits_le32(&mctl_ctl->pgcr[0], 0x1 << 24);
> >> +
> >> + /* Clear time out flag */
> >> + clrbits_le32(&mctl_ctl->pgsr[0], 0x1 << 13);
> >> +
> >> + /* Disable read time out */
> >> + clrbits_le32(&mctl_ctl->pgcr[0], 0x1 << 25);
> >> +}
> >> +
> >> static void mctl_auto_detect_dram_size(uint16_t socid, struct
> >dram_para *para)
> >> {
> >> + if (socid == SOCID_R40) {
> >> + mctl_r40_detect_rank_count(para);
> >> + mctl_set_cr(socid, para);
> >> + }
> >> +
> >> mctl_auto_detect_dram_size_rank(socid, para,
> >(ulong)CONFIG_SYS_SDRAM_BASE, ¶->ranks[0]);
> >>
> >> if ((socid == SOCID_A64 || socid == SOCID_R40) && para->dual_rank)
> >{
> >> @@ -854,8 +899,6 @@ unsigned long sunxi_dram_init(void)
> >> uint16_t socid = SOCID_H3;
> >> #elif defined(CONFIG_MACH_SUN8I_R40)
> >> uint16_t socid = SOCID_R40;
> >> - /* Currently we cannot support R40 with dual rank memory */
> >> - para.dual_rank = 0;
> >> #elif defined(CONFIG_MACH_SUN8I_V3S)
> >> uint16_t socid = SOCID_V3S;
> >> #elif defined(CONFIG_MACH_SUN50I)
More information about the U-Boot
mailing list