[U-Boot] [PATCH 1/2] arm: imx6: Add DDR3 calibration code for MX6 Q/D/DL

Tue Dec 22 02:52:02 CET 2015

On Sunday, December 20, 2015 at 08:31:46 PM, Eric Nelson wrote:
> Hi Marek,
> 
> On 12/16/2015 07:40 AM, Marek Vasut wrote:
> > Add DDR3 calibration code for i.MX6Q, i.MX6D and i.MX6DL. This code
> > fine-tunes the behavior of the MMDC controller in order to improve
> > the signal integrity and memory stability.
> 
> It would be good to have a reference to AN4467 in the commit message,
> since that document is the best reference to understanding this code:
> 
> 	http://cache.freescale.com/files/32bit/doc/app_note/AN4467.pdf

OK

> > Signed-off-by: Marek Vasut <marex at denx.de>
> > Cc: Stefano Babic <sbabic at denx.de>
> > ---
> > 
> >  arch/arm/cpu/armv7/mx6/ddr.c            | 559
> >  ++++++++++++++++++++++++++++++++
> >  arch/arm/include/asm/arch-mx6/mx6-ddr.h |   5 +
> >  2 files changed, 564 insertions(+)
> 
> Note that there's another implementation of this in the barebox
> project:
> 
> http://git.pengutronix.de/?p=barebox.git;a=blob;f=arch/arm/mach-imx/imx6-mm
> dc.c;h=146df573e4eeed7132e93899b4b539e842bb6ba2;hb=HEAD

Yeah, I sent them a patch for it too ;-)

> > diff --git a/arch/arm/cpu/armv7/mx6/ddr.c b/arch/arm/cpu/armv7/mx6/ddr.c
> > index 6b039e4..194411f 100644
> > --- a/arch/arm/cpu/armv7/mx6/ddr.c
> > +++ b/arch/arm/cpu/armv7/mx6/ddr.c
> > @@ -13,6 +13,565 @@
> > 
> >  #include <asm/io.h>
> >  #include <asm/types.h>
> > 
> > +#if defined(CONFIG_MX6QDL) || defined(CONFIG_MX6Q) ||
> > defined(CONFIG_MX6D) +
> 
> Should this use the common wait_for_bit?
> 	http://lists.denx.de/pipermail/u-boot/2015-December/thread.html#237952
> 
> I'm not certain that the <ctrl-c> handling is appropriate when
> used here.

Yes, that's why I called it exactly the same, so once the patches above hit
mainline, I could just nuke this function and replace it with the common one.

> > +static int wait_for_bit(void *reg, const uint32_t mask, bool set)
> > +{
> > +	unsigned int timeout = 1000;
> > +	u32 val;
> > +
> > +	while (--timeout) {
> > +		val = readl(reg);
> > +		if (!set)
> > +			val = ~val;
> > +
> > +		if ((val & mask) == mask)
> > +			return 0;
> > +
> > +		udelay(1);
> > +	}
> > +
> > +	printf("%s: Timeout (reg=%p mask=%08x wait_set=%i)\n",
> > +	      __func__, reg, mask, set);
> > +	hang();	/* DRAM couldn't be calibrated, game over :-( */
> > +}
> > +
> > +static void reset_read_data_fifos(void)
> > +{
> > +	struct mmdc_p_regs *mmdc0 = (struct mmdc_p_regs *)MMDC_P0_BASE_ADDR;
> > +
> > +	/* Reset data FIFOs twice. */
> > +	setbits_le32(&mmdc0->mpdgctrl0, 1 << 31);
> > +	wait_for_bit(&mmdc0->mpdgctrl0, 1 << 31, 0);
> > +
> > +	setbits_le32(&mmdc0->mpdgctrl0, 1 << 31);
> > +	wait_for_bit(&mmdc0->mpdgctrl0, 1 << 31, 0);
> > +}
> > +
> 
> See comments below about chip-selects during calibration.
> 
> > +static void precharge_all(const bool cs0_enable, const bool cs1_enable)
> > +{
> > +	struct mmdc_p_regs *mmdc0 = (struct mmdc_p_regs *)MMDC_P0_BASE_ADDR;
> > +
> > +	/*
> > +	 * Issue the Precharge-All command to the DDR device for both
> > +	 * chip selects. Note, CON_REQ bit should also remain set. If
> > +	 * only using one chip select, then precharge only the desired
> > +	 * chip select.
> > +	 */
> 
> AN4467 says that a wait for tRP is needed here, though none of the
> other implementations explicitly do that.
> 
> The wait for CON_ACK seems like a good idea, though I wonder if
> the waits shouldn't be performed **before** the commands are issued.
> 
> I'd also like to see a named constant for MDSCR_CON_ACK

That'd be great. Do you know of some header where the MMDC register bits
are defined ? I am definitelly not gonna rewrite the the i.MX6 MMDC bits
from the datasheet by hand.

> , and since
> waiting for it is done in multiple places, perhaps wait_for_con_ack().

Implementing a function for the sake of just wrapping another function
seems wasteful to me.

> > +	if (cs0_enable) { /* CS0 */
> > +		writel(0x04008050, &mmdc0->mdscr);
> > +		wait_for_bit(&mmdc0->mdscr, 1 << 14, 1);
> > +	}
> > +
> > +	if (cs1_enable) { /* CS1 */
> > +		writel(0x04008058, &mmdc0->mdscr);
> > +		wait_for_bit(&mmdc0->mdscr, 1 << 14, 1);
> > +	}
> > +}
> > +
> > +static void force_delay_measurement(int bus_size)
> > +{
> > +	struct mmdc_p_regs *mmdc0 = (struct mmdc_p_regs *)MMDC_P0_BASE_ADDR;
> > +	struct mmdc_p_regs *mmdc1 = (struct mmdc_p_regs *)MMDC_P1_BASE_ADDR;
> > +
> > +	writel(0x800, &mmdc0->mpmur0);
> > +	if (bus_size == 0x2)
> > +		writel(0x800, &mmdc1->mpmur0);
> > +}
> > +
> > +static void modify_dg_result(u32 *reg_st0, u32 *reg_st1, u32 *reg_ctrl)
> > +{
> > +	u32 dg_tmp_val, dg_dl_abs_offset, dg_hc_del, val_ctrl;
> > +
> > +	/*
> > +	 * DQS gating absolute offset should be modified from reflecting
> > +	 * (HW_DG_LOWx + HW_DG_UPx)/2 to reflecting (HW_DG_UPx - 0x80)
> > +	 */
> > +
> > +	val_ctrl = readl(reg_ctrl);
> > +	val_ctrl &= 0xf0000000;
> > +
> 
> Though this matches the app note and other implementations, I think
> this would be easier to understand by shifting first and anding with
> 0x7ff.

I think it is better to stick with the appnote, so it's easy to map the
code and the appnote between one another.

> > +	dg_tmp_val = ((readl(reg_st0) & 0x07ff0000) >> 16) - 0xc0;
> 
> Especially since these calculations operate on the shifted values:
> > +	dg_dl_abs_offset = dg_tmp_val & 0x7f;
> > +	dg_hc_del = (dg_tmp_val & 0x780) << 1;
> > +
> > +	val_ctrl |= dg_dl_abs_offset + dg_hc_del;
> > +
> > +	dg_tmp_val = ((readl(reg_st1) & 0x07ff0000) >> 16) - 0xc0;
> > +	dg_dl_abs_offset = dg_tmp_val & 0x7f;
> > +	dg_hc_del = (dg_tmp_val & 0x780) << 1;
> > +
> > +	val_ctrl |= (dg_dl_abs_offset + dg_hc_del) << 16;
> > +
> > +	writel(val_ctrl, reg_ctrl);
> > +}
> > +
> 
> I'd recommend passing parameters of mx6_ddr_sysinfo (input) and
> mx6_mmdc_calibration (output) to this routine.

Why exactly ?

> > +int mmdc_do_write_level_calibration(void)
> > +{
> > +	struct mmdc_p_regs *mmdc0 = (struct mmdc_p_regs *)MMDC_P0_BASE_ADDR;
> > +	struct mmdc_p_regs *mmdc1 = (struct mmdc_p_regs *)MMDC_P1_BASE_ADDR;
> > +	u32 esdmisc_val, zq_val;
> > +	u32 errors = 0;
> > +	u32 ldectrl[4];
> 
> I believe the 4 here comes from this calculation:
> 
> 	ddr_mr1 = ((sysinfo->rtt_nom & 1) ? 1 : 0) << 2 |
> 	          ((sysinfo->rtt_nom & 2) ? 1 : 0) << 6;
> 
> > +	u32 ddr_mr1 = 0x4;
> > +
> 
> I'm not sure how this is useful (trying to proceed if things fail).
> 
> Have you seen this triggered?

What ? I see an assignment , what would trigger here ?

> > +	/*
> > +	 * Stash old values in case calibration fails,
> > +	 * we need to restore them
> > +	 */
> > +	ldectrl[0] = readl(&mmdc0->mpwldectrl0);
> > +	ldectrl[1] = readl(&mmdc0->mpwldectrl1);
> > +	ldectrl[2] = readl(&mmdc1->mpwldectrl0);
> > +	ldectrl[3] = readl(&mmdc1->mpwldectrl1);
> > +
> > +	/* disable DDR logic power down timer */
> > +	clrbits_le32(&mmdc0->mdpdc, 0xff00);
> > +
> 
> Either s/Adopt/Automatic/ or just get rid of Adopt...
> 
> This is a typo in the app note.

OK

> > +	/* disable Adopt power down timer */
> > +	setbits_le32(&mmdc0->mapsr, 0x1);
> > +
> > +	debug("Starting write leveling calibration.\n");
> > +
> > +	/*
> > +	 * 2. disable auto refresh and ZQ calibration
> > +	 * before proceeding with Write Leveling calibration
> > +	 */
> 
> Trusting the esdmisc value read from here seems like a bad idea.
> It would be clearer if the code below that restores things just
> (re)configures mdref directly.

Can you elaborate some more on why is it a bad idea?

> > +	esdmisc_val = readl(&mmdc0->mdref);
> > +	writel(0x0000C000, &mmdc0->mdref);
> > +	zq_val = readl(&mmdc0->mpzqhwctrl);
> 
> clrbits_le32?

The zq_val is used further in the code, so no.

> > +	writel(zq_val & ~0x3, &mmdc0->mpzqhwctrl);
> > +
> > +	/* 3. increase walat and ralat to maximum */
> 
> (3 << 16) | (7 << 6) would be clearer, since these are numeric
> values in tbe bitfields.
> 
> > +	setbits_le32(&mmdc0->mdmisc,
> > +		     (1 << 6) | (1 << 7) | (1 << 8) | (1 << 16) | (1 << 17));
> > +	setbits_le32(&mmdc1->mdmisc,
> > +		     (1 << 6) | (1 << 7) | (1 << 8) | (1 << 16) | (1 << 17));
> 
> I had a concern about this bit of code (setting RALAT and WALAT to
> maximum values). It seems odd that we calibrate with one set of delays
> and then choose a different set later.
> 
> Experimentally, I've found using min or max values seems to have
> no effect on the result.

OK

> > +	/*
> > +	 * 4 & 5. Configure the external DDR device to enter write-leveling
> > +	 * mode through Load Mode Register command.
> > +	 * Register setting:
> > +	 * Bits[31:16] MR1 value (0x0080 write leveling enable)
> > +	 * Bit[9] set WL_EN to enable MMDC DQS output
> > +	 * Bits[6:4] set CMD bits for Load Mode Register programming
> > +	 * Bits[2:0] set CMD_BA to 0x1 for DDR MR1 programming
> > +	 */
> 
> wait_for_con_ack() seems like a good idea here:

Why ?

> > +	writel(0x00808231, &mmdc0->mdscr);
> > +
> > +	/* 6. Activate automatic calibration by setting MPWLGCR[HW_WL_EN] */
> > +	writel(0x00000001, &mmdc0->mpwlgcr);
> > +
> > +	/*
> > +	 * 7. Upon completion of this process the MMDC de-asserts
> > +	 * the MPWLGCR[HW_WL_EN]
> > +	 */
> 
> Why 1 << 0?

It waits for the autocalibration to finish, see 6. above.

> > +	wait_for_bit(&mmdc0->mpwlgcr, 1 << 0, 0);
> > +
> > +	/*
> > +	 * 8. check for any errors: check both PHYs for x64 configuration,
> > +	 * if x32, check only PHY0
> > +	 */
> > +	if (readl(&mmdc0->mpwlgcr) & 0x00000F00)
> > +		errors |= 1;
> 
> This should be conditional on (sysinfo->dsize == 2) to
> 
> allow use with a 32-bit bus:

I believe mmdc1 will contain 0x0 << 8 if you use 32bit bus, right ?

> > +	if (readl(&mmdc1->mpwlgcr) & 0x00000F00)
> > +		errors |= 2;
> > +
> > +	debug("Ending write leveling calibration. Error mask: 0x%x\n", errors);
> > +
> 
> Where are you getting this test from?
> 
> I have code that tests each byte individually against a max
> value of 0x2f, but I'm having trouble finding the source of
> the test at the moment.

This one came I think from the freescale code from novena repo.

> > +	/* check to see if cal failed */
> > +	if ((readl(&mmdc0->mpwldectrl0) == 0x001F001F) &&
> > +	    (readl(&mmdc0->mpwldectrl1) == 0x001F001F) &&
> 
> Ditto (sysinfo->dsize == 2):
> > +	    (readl(&mmdc1->mpwldectrl0) == 0x001F001F) &&
> > +	    (readl(&mmdc1->mpwldectrl1) == 0x001F001F)) {
> > +		debug("Cal seems to have soft-failed due to memory not 
supporting
> > write leveling on all channels. Restoring original write leveling
> > values.\n"); +		writel(ldectrl[0], &mmdc0->mpwldectrl0);
> > +		writel(ldectrl[1], &mmdc0->mpwldectrl1);
> > +		writel(ldectrl[2], &mmdc1->mpwldectrl0);
> > +		writel(ldectrl[3], &mmdc1->mpwldectrl1);
> > +		errors |= 4;
> > +	}
> > +
> > +	/*
> > +	 * User should issue MRS command to exit write leveling mode
> > +	 * through Load Mode Register command
> > +	 * Register setting:
> > +	 * Bits[31:16] MR1 value "ddr_mr1" value from initialization
> > +	 * Bit[9] clear WL_EN to disable MMDC DQS output
> > +	 * Bits[6:4] set CMD bits for Load Mode Register programming
> > +	 * Bits[2:0] set CMD_BA to 0x1 for DDR MR1 programming
> > +	 */
> > +	writel((ddr_mr1 << 16) + 0x8031, &mmdc0->mdscr);
> > +
> 
> As mentioned earlier, you can just set mdref here, instead of
> using a saved value. The only assignment is currently this:
> 
> 	mmdc0->mdref = (0 << 14) |
> 		       (3 << 11);
> 
> > +	/* re-enable auto refresh and zq cal */
> > +	writel(esdmisc_val, &mmdc0->mdref);
> 
> I think you can just use clrbits32_le here and get rid of zq_val.
> 
> > +	writel(zq_val, &mmdc0->mpzqhwctrl);
> > +
> > +	debug("\tMMDC_MPWLDECTRL0 after write level cal: 0x%08X\n",
> > +	      readl(&mmdc0->mpwldectrl0));
> > +	debug("\tMMDC_MPWLDECTRL1 after write level cal: 0x%08X\n",
> > +	      readl(&mmdc0->mpwldectrl1));
> > +	debug("\tMMDC_MPWLDECTRL0 after write level cal: 0x%08X\n",
> > +	      readl(&mmdc1->mpwldectrl0));
> > +	debug("\tMMDC_MPWLDECTRL1 after write level cal: 0x%08X\n",
> > +	      readl(&mmdc1->mpwldectrl1));
> > +
> 
> Why the dummy reads below?
> 
> I'd like to see the values fed back in an output parameter:
> 	calib->p0_mpwldectrl0 = readl(&mmdc0->mpwldectrl0);
> 	calib->p0_mpwldectrl1 = readl(&mmdc0->mpwldectrl1);
> 
> Absent that, I don't think a dummy read is needed.

You do not need those values anymore, since they are already loaded into the 
hardware ... or why would you want them to be adjusted ?

> > +	/* We must force a readback of these values, to get them to stick */
> > +	readl(&mmdc0->mpwldectrl0);
> > +	readl(&mmdc0->mpwldectrl1);
> > +	readl(&mmdc1->mpwldectrl0);
> > +	readl(&mmdc1->mpwldectrl1);
> > +
> > +	/* enable DDR logic power down timer: */
> > +	setbits_le32(&mmdc0->mdpdc, 0x00005500);
> > +
> 
> Again, s/Adopt/Automatic/
> 
> > +	/* Enable Adopt power down timer: */
> > +	clrbits_le32(&mmdc0->mapsr, 0x1);
> > +
> > +	/* Clear CON_REQ */
> > +	writel(0, &mmdc0->mdscr);
> > +
> > +	return errors;
> > +}
> > +
> 
> This should also have parameters of mx6_ddr_sysinfo (input) and
> mx6_mmdc_calibration (output), at least for sysinfo->dsize

Would it be possible for you to send a subsequent patch(set)? I would like to 
have this code as a working base , since I tested this thoroughly. If I apply 
all of your changes, it would basically mean almost complete rewrite of the code 
and that would disallow me bisect possible bugs introduced by these changes.

[...]