[U-Boot] [PATCH v3 3/7] MSCC: add support for Ocelot SoCs

Thu Dec 13 14:05:21 UTC 2018

Hi Daniel,

 On lun., déc. 10 2018, Daniel Schwierzeck <daniel.schwierzeck at gmail.com> wrote:

>> diff --git a/arch/mips/mach-mscc/include/ioremap.h b/arch/mips/mach-mscc/include/ioremap.h
>> new file mode 100644
>> index 0000000000..8ea5c65ce3
>> --- /dev/null
>> +++ b/arch/mips/mach-mscc/include/ioremap.h
>> @@ -0,0 +1,51 @@
>> +/* SPDX-License-Identifier: (GPL-2.0+ OR MIT) */
>
> this line should start with a //. There are more files in this patch
> which need to be fixed.

Actually, according to the documentation (Licenses/README):
  The SPDX license identifier is added in form of a comment.  The comment
   style depends on the file type::

      C source:	// SPDX-License-Identifier: <SPDX License Expression>
      C header:	/* SPDX-License-Identifier: <SPDX License Expression> */

So for a C header file, /* comment */ is correct.

>
>> +/*
>> + * Copyright (c) 2018 Microsemi Corporation
>> + */
>> +
>> +#ifndef __ASM_MACH_MSCC_IOREMAP_H
>> +#define __ASM_MACH_MSCC_IOREMAP_H
>> +
>> +#include <linux/types.h>
>> +#include <mach/common.h>
>> +
>> +/*
>> + * Allow physical addresses to be fixed up to help peripherals located
>> + * outside the low 32-bit range -- generic pass-through version.
>> + */
>> +static inline phys_addr_t fixup_bigphys_addr(phys_addr_t phys_addr,
>> +					     phys_addr_t size)
>> +{
>> +	return phys_addr;
>> +}
>> +
>> +static inline int is_vcoreiii_internal_registers(phys_addr_t offset)
>> +{
>> +#if defined(CONFIG_ARCH_MSCC)
>
> this define is superfluous because this directory is only added to the
> include paths when CONFIG_ARCH_MSCC is selected

OK

>
>> +	if ((offset >= MSCC_IO_ORIGIN1_OFFSET &&
>> +	     offset < (MSCC_IO_ORIGIN1_OFFSET + MSCC_IO_ORIGIN1_SIZE)) ||
>> +	    (offset >= MSCC_IO_ORIGIN2_OFFSET &&
>> +	     offset < (MSCC_IO_ORIGIN2_OFFSET + MSCC_IO_ORIGIN2_SIZE)))
>> +		return 1;
>> +#endif
>> +
>> +	return 0;
>> +}

[...]

>> +/*
>> + * DDR memory sanity checking failed, tally and do hard reset
>> + *
>> + * NB: Assumes inlining as no stack is available!
>> + */
>> +static inline void hal_vcoreiii_ddr_failed(void)
>> +{
>> +	register u32 reset;
>> +
>> +	writel(readl(BASE_CFG + ICPU_GPR(6)) + 1, BASE_CFG + ICPU_GPR(6));
>> +
>> +	clrbits_le32(BASE_DEVCPU_GCB + PERF_GPIO_OE, BIT(19));
>> +
>> +	/* Jump to reset - does not return */
>> +	reset = KSEG0ADDR(_machine_restart);
>> +	/* Reset while running from cache */
>> +	icache_lock((void *)reset, 128);
>> +	asm volatile ("jr %0"::"r" (reset));
>
> could you briefly describe the reason for this in a comment? It's not
> clear why this code is necessary without knowing the SoC. AFAIU from
> your last mail the boot SPI flash is mapped to KUSEG and you need to
> establish a TLB mapping in lowlevel_init() to be able to move to
> KSEG0.

The reboot workaround in _machine_restart() will change the SPI NOR
into SW bitbang.

This will render the CPU unable to execute directly from the NOR, which
is why the reset instructions are prefetched into the I-cache.

When failing the DDR initialization we are executing from NOR.

The last instruction in _machine_restart() will reset the MIPS CPU
(and the cache), and the CPU will start executing from the reset vactor.

I will add this explanation as comment.

>
>> +
>> +	panic("DDR init failed\n");
>> +}
>> +
>> +/*
>> + * DDR memory sanity checking done, possibly enable ECC.
>> + *
>> + * NB: Assumes inlining as no stack is available!
>> + */
>> +static inline void hal_vcoreiii_ddr_verified(void)
>> +{
>> +#ifdef MIPS_VCOREIII_MEMORY_ECC
>> +	/* Finally, enable ECC */
>> +	register u32 val = readl(BASE_CFG + ICPU_MEMCTRL_CFG);
>> +
>> +	val |= ICPU_MEMCTRL_CFG_DDR_ECC_ERR_ENA;
>> +	val &= ~ICPU_MEMCTRL_CFG_BURST_SIZE;
>> +
>> +	writel(val, BASE_CFG + ICPU_MEMCTRL_CFG);
>> +#endif
>> +
>> +	/* Reset Status register - sticky bits */
>> +	writel(readl(BASE_CFG + ICPU_MEMCTRL_STAT), BASE_CFG + ICPU_MEMCTRL_STAT);
>> +}
>> +
>> +/* NB: Assumes inlining as no stack is available! */
>> +static inline int look_for(u32 bytelane)
>> +{
>> +	register u32 i;
>> +
>> +	/* Reset FIFO in case any previous access failed */
>> +	for (i = 0; i < sizeof(training_data); i++) {
>> +		register u32 byte;
>> +
>> +		memphy_soft_reset();
>> +		/* Reset sticky bits */
>> +		writel(readl(BASE_CFG + ICPU_MEMCTRL_STAT),
>> +		       BASE_CFG + ICPU_MEMCTRL_STAT);
>> +		/* Read data */
>> +		byte = ((volatile u8 *)MSCC_DDR_TO)[bytelane + (i * 4)];
>
> __raw_readl()?

I had tried it before but without luck, but after trying harder this
time I managed to use read(b|l)/write(b|l) everywhere and get ride of
the volatile variable.

>
>> +		/*
>> +		 * Prevent the compiler reordering the instruction so
>> +		 * the read of RAM happens after the check of the
>> +		 * errors.
>> +		 */
>> +		asm volatile("" : : : "memory");
>
> this is available as barrier(). But according to context you could use
> rmb(). Anyway with the current volatile pointer or the suggested
> __raw_readl() the compiler shouldn't reorder at all

I had a close look on the code generating the __raw_readl and there is
nothing there to guaranty the ordering. Actually in our case (32 bits)
__read_readl is just:
static inline u32 __raw_readl(const volatile void __iomem *mem)
{
	u32 __val;

	__val = *mem;
	return __val;
}

initial code is here:
https://elixir.bootlin.com/u-boot/v2018.11-rc3/source/arch/mips/include/asm/io.h#L265
but __swizzle_addr_l() did nothing
https://elixir.bootlin.com/u-boot/v2018.11-rc3/source/arch/mips/include/asm/mach-generic/mangle-port.h#L10
same for __raw_ioswabl():
https://elixir.bootlin.com/u-boot/v2018.11-rc3/source/arch/mips/include/asm/io.h#L35

So the code is the same that we have written. I agree it is cleaner
to use __raw_readl but it doesn't add anything about the ordering.

It is the same for the use of the volatile, it ensures that the compiler
will always produce a operation to read the data in memory, but it is
not about ordering.

As you suggested I will use rmb();

>> +static inline int hal_vcoreiii_train_bytelane(u32 bytelane)
>> +{
>> +	register int res;
>> +	register u32 dqs_s;
>> +
>> +	set_dly(bytelane, 0);	// Start training at DQS=0
>
> no C++ style comments

OK

>
>> +	while ((res = look_for(bytelane)) == DDR_TRAIN_CONTINUE)
>> +		;
>> +	if (res != DDR_TRAIN_OK)
>> +		return res;
>> +
>> +	dqs_s = readl(BASE_CFG + ICPU_MEMCTRL_DQS_DLY(bytelane));
>> +	while ((res = look_past(bytelane)) == DDR_TRAIN_CONTINUE)
>> +		;
>> +	if (res != DDR_TRAIN_OK)
>> +		return res;
>> +	/* Reset FIFO - for good measure */
>> +	memphy_soft_reset();
>> +	/* Adjust to center [dqs_s;cur] */
>> +	center_dly(bytelane, dqs_s);
>> +	return DDR_TRAIN_OK;
>> +}
>> +
>> +/* This algorithm is converted from the TCL training algorithm used
>> + * during silicon simulation.
>> + * NB: Assumes inlining as no stack is available!
>> + */
>> +static inline int hal_vcoreiii_init_dqs(void)
>> +{
>> +#define MAX_DQS 32
>> +	register u32 i, j;
>> +
>> +	for (i = 0; i < MAX_DQS; i++) {
>> +		set_dly(0, i);	// Byte-lane 0
>
> no C++ style comments

OK

>
>> +		for (j = 0; j < MAX_DQS; j++) {
>> +			register u32 __attribute__ ((unused)) byte;
>
> why unused? If you really need it, you could use __maybe_unused

Because the purpose of this variable is just to access the memory, we
don't do nothing of the value read, and gcc complain about it. But as
you suggest I will use __maybe_unused.

>
>> +			set_dly(1, j);	// Byte-lane 1
>> +			/* Reset FIFO in case any previous access failed */
>> +			memphy_soft_reset();
>> +			writel(readl(BASE_CFG + ICPU_MEMCTRL_STAT),
>> +			       BASE_CFG + ICPU_MEMCTRL_STAT);
>> +			byte = ((volatile u8 *)MSCC_DDR_TO)[0];
>> +			byte = ((volatile u8 *)MSCC_DDR_TO)[1];
>> +			if (!(readl(BASE_CFG + ICPU_MEMCTRL_STAT) &
>> +			    (ICPU_MEMCTRL_STAT_RDATA_MASKED |
>> +			     ICPU_MEMCTRL_STAT_RDATA_DUMMY)))
>> +				return 0;
>> +		}
>> +	}
>> +	return -1;
>> +}
>> +
>> +static inline int dram_check(void)
>> +{
>> +#define DDR ((volatile u32 *) MSCC_DDR_TO)
>> +	register u32 i;
>> +
>> +	for (i = 0; i < 8; i++) {
>> +		DDR[i] = ~i;
>> +		if (DDR[i] != ~i)
>
> __raw_readl(), __raw_writel() and drop the explicit volatile?

Yes, as explain above, it s done now.

>> +
>> +/*
>> + * Target offset base(s)
>> + */
>> +#define MSCC_IO_ORIGIN1_OFFSET 0x70000000
>> +#define MSCC_IO_ORIGIN1_SIZE   0x00200000
>> +#define MSCC_IO_ORIGIN2_OFFSET 0x71000000
>> +#define MSCC_IO_ORIGIN2_SIZE   0x01000000
>> +#define BASE_CFG        ((void __iomem *)0x70000000)
>> +#define BASE_DEVCPU_GCB ((void __iomem *)0x71070000)
>
> Would it be possible on that SoC to define those register offsets as
> simple physical address and create the mapping when needed?
> For example:
>
> void foo()
> {
>     void __iomem *base_cfg = ioremap(BASE_CFG, ...);
>     writel(base_cfg + XXX, 0);
> }

Actually creating the mapping is just casting the physical address in an
(void __iomem *), see our plat_ioremap.

Calling ioremap in every function will just grow them with little
benefit.

If you really want it, what I could is sharing void __iomem *base_cfg
and void __iomem *base_devcpu_gcb at platform level, and initialize them
only once very early during the boot.

>> +LEAF(lowlevel_init)
>> +	/*
>> +	 * As we have no stack yet, we can assume the restricted
>> +	 * luxury of the sX-registers without saving them
>> +	 */
>> +	move	s0,ra
>> +
>> +	jal	vcoreiii_tlb_init
>> +	nop
>
> we use the same style as Linux MIPS where instructions in the delay slot
> should be indented by an extra space.

OK

Thanks,

Gregory

-- 
Gregory Clement, Bootlin
Embedded Linux and Kernel engineering
http://bootlin.com