[U-Boot] [PATCH v2 14/23] sunxi: H3: add DRAM controller single bit delay support

Andre Przywara andre.przywara at arm.com
Mon Dec 5 12:28:33 CET 2016


Hi,

On 05/12/16 07:58, Chen-Yu Tsai wrote:
> On Mon, Dec 5, 2016 at 2:26 PM, Simon Glass <sjg at chromium.org> wrote:
>> Hi Andre,
>>
>> On 4 December 2016 at 18:52, Andre Przywara <andre.przywara at arm.com> wrote:
>>> From: Jens Kuske <jenskuske at gmail.com>
>>>
>>> Instead of setting the delay for whole bytes allow setting
>>> it for each individual bit. Also add support for
>>> address/command lane delays.
>>>
>>> Signed-off-by: Jens Kuske <jenskuske at gmail.com>
>>> Signed-off-by: Andre Przywara <andre.przywara at arm.com>
>>> ---
>>>  arch/arm/mach-sunxi/dram_sun8i_h3.c | 54 ++++++++++++++++++-------------------
>>>  1 file changed, 27 insertions(+), 27 deletions(-)
>>
>> ACBDLR_WRITE_DELAY_SHIFT
>>
>>>
>>> diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c
>>> index 3dd6803..1647d76 100644
>>> --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c
>>> +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c
>>> @@ -16,12 +16,13 @@
>>>  #include <linux/kconfig.h>
>>>
>>>  struct dram_para {
>>> -       u32 read_delays;
>>> -       u32 write_delays;
>>>         u16 page_size;
>>>         u8 bus_width;
>>>         u8 dual_rank;
>>>         u8 row_bits;
>>> +       const u8 dx_read_delays[4][11];
>>
>> Can we have #defines for 4 and 11?
>>
>>> +       const u8 dx_write_delays[4][11];
>>> +       const u8 ac_delays[31];
>>>  };
>>>
>>>  static inline int ns_to_t(int nanoseconds)
>>> @@ -64,34 +65,25 @@ static void mctl_phy_init(u32 val)
>>>         mctl_await_completion(&mctl_ctl->pgsr[0], PGSR_INIT_DONE, 0x1);
>>>  }
>>>
>>> -static void mctl_dq_delay(u32 read, u32 write)
>>> +static void mctl_set_bit_delays(struct dram_para *para)
>>>  {
>>>         struct sunxi_mctl_ctl_reg * const mctl_ctl =
>>>                         (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE;
>>>         int i, j;
>>> -       u32 val;
>>> -
>>> -       for (i = 0; i < 4; i++) {
>>> -               val = DXBDLR_WRITE_DELAY((write >> (i * 4)) & 0xf) |
>>> -                     DXBDLR_READ_DELAY(((read >> (i * 4)) & 0xf) * 2);
>>> -
>>> -               for (j = DXBDLR_DQ(0); j <= DXBDLR_DM; j++)
>>> -                       writel(val, &mctl_ctl->dx[i].bdlr[j]);
>>> -       }
>>>
>>>         clrbits_le32(&mctl_ctl->pgcr[0], 1 << 26);
>>>
>>> -       for (i = 0; i < 4; i++) {
>>> -               val = DXBDLR_WRITE_DELAY((write >> (16 + i * 4)) & 0xf) |
>>> -                     DXBDLR_READ_DELAY((read >> (16 + i * 4)) & 0xf);
>>> +       for (i = 0; i < 4; i++)
>>> +               for (j = 0; j < 11; j++)
>>> +                       writel(DXBDLR_WRITE_DELAY(para->dx_write_delays[i][j]) |
>>> +                              DXBDLR_READ_DELAY(para->dx_read_delays[i][j]),
>>> +                              &mctl_ctl->dx[i].bdlr[j]);
>>>
>>> -               writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQS]);
>>> -               writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQSN]);
>>> -       }
>>> +       for (i = 0; i < 31; i++)
>>> +               writel(ACBDLR_WRITE_DELAY(para->ac_delays[i]),
>>> +                      &mctl_ctl->acbdlr[i]);
>>>
>>>         setbits_le32(&mctl_ctl->pgcr[0], 1 << 26);
>>> -
>>> -       udelay(1);
>>>  }
>>>
>>>  static void mctl_set_master_priority(void)
>>> @@ -372,11 +364,8 @@ static int mctl_channel_init(struct dram_para *para)
>>>         clrsetbits_le32(&mctl_ctl->dtcr, 0xf << 24,
>>>                         (para->dual_rank ? 0x3 : 0x1) << 24);
>>>
>>> -
>>> -       if (para->read_delays || para->write_delays) {
>>> -               mctl_dq_delay(para->read_delays, para->write_delays);
>>> -               udelay(50);
>>> -       }
>>> +       mctl_set_bit_delays(para);
>>> +       udelay(50);
>>>
>>>         mctl_zq_calibration(para);
>>>
>>> @@ -458,12 +447,23 @@ unsigned long sunxi_dram_init(void)
>>>                         (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE;
>>>
>>>         struct dram_para para = {
>>> -               .read_delays = 0x00007979,      /* dram_tpr12 */
>>> -               .write_delays = 0x6aaa0000,     /* dram_tpr11 */
>>>                 .dual_rank = 0,
>>>                 .bus_width = 32,
>>>                 .row_bits = 15,
>>>                 .page_size = 4096,
>>> +
>>> +               .dx_read_delays =  {{ 18, 18, 18, 18, 18, 18, 18, 18, 18,  0,  0 },
>>> +                                   { 14, 14, 14, 14, 14, 14, 14, 14, 14,  0,  0 },
>>> +                                   { 18, 18, 18, 18, 18, 18, 18, 18, 18,  0,  0 },
>>> +                                   { 14, 14, 14, 14, 14, 14, 14, 14, 14,  0,  0 }},
>>> +               .dx_write_delays = {{  0,  0,  0,  0,  0,  0,  0,  0,  0, 10, 10 },
>>> +                                   {  0,  0,  0,  0,  0,  0,  0,  0,  0, 10, 10 },
>>> +                                   {  0,  0,  0,  0,  0,  0,  0,  0,  0, 10, 10 },
>>> +                                   {  0,  0,  0,  0,  0,  0,  0,  0,  0,  6,  6 }},
>>> +               .ac_delays = {  0,  0,  0,  0,  0,  0,  0,  0,
>>> +                               0,  0,  0,  0,  0,  0,  0,  0,
>>> +                               0,  0,  0,  0,  0,  0,  0,  0,
>>> +                               0,  0,  0,  0,  0,  0,  0      },
>>>         };
>>>
>>>         mctl_sys_init(&para);
>>> --
>>> 2.8.2
>>>
>>
>> I wonder if there is value in moving this to device tree with of-platdata?

While I kind of like the idea of using the DT for this, there are some
issues:

1) There is no binding so far for representing the DRAM data. Given the
lack of documentation for the DRAM controller, it sounds very hard to
come up with a good binding anyway. Also, we can't push this through the
Linux DT binding review, since it is of no interest to the kernel. And
I'd rather avoid making up some dodgy binding just for this.

There is work underway to improve the DRAM init code and make it more
robust and flexible. Ideally we can use the autodetection and
calibration features the controller offers to get rid of arbitrary magic
numbers. But this is quite some work ahead and shouldn't block the much
sought-after A64 SPL support for now.

2) If there is a need, we can easily detect the SoC by reading the ID
register and differentiate at runtime. This is probably less code than
pulling in DT bits, and also more robust.
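
A minimal sketch of what such a runtime selection could look like,
assuming the ID has already been read from the register. The ID values,
the struct layout and the function name are hypothetical. The H3 numbers
are the per-lane values from this patch, while the A64 entry is a
placeholder and not real tuning data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ID values; the real register encoding is not given here. */
#define SOC_ID_H3	0x1
#define SOC_ID_A64	0x2

struct soc_dram_cfg {
	uint32_t soc_id;
	uint8_t dq_read_delay[4];	/* per byte lane, DQ0-7 and DM */
	uint8_t dqs_write_delay[4];	/* per byte lane, DQS and DQSN */
};

static const struct soc_dram_cfg dram_cfgs[] = {
	/* H3 values taken from the tables in this patch */
	{ SOC_ID_H3,  { 18, 14, 18, 14 }, { 10, 10, 10, 6 } },
	/* placeholder only, NOT real A64 tuning data */
	{ SOC_ID_A64, {  0,  0,  0,  0 }, {  0,  0,  0, 0 } },
};

#define NR_DRAM_CFGS	(sizeof(dram_cfgs) / sizeof(dram_cfgs[0]))

/* Return the entry matching soc_id, or NULL for an unknown SoC. */
const struct soc_dram_cfg *find_dram_cfg(const struct soc_dram_cfg *tbl,
					 size_t n, uint32_t soc_id)
{
	assert(tbl != NULL);

	for (size_t i = 0; i < n; i++)
		if (tbl[i].soc_id == soc_id)
			return &tbl[i];

	return NULL;
}
```

The caller would pass the value read from the ID register and fall back
to some safe default (or refuse to continue) when NULL comes back.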

> I think device tree support is unlikely to fit in SPL for sunxi.
> IIRC Andre already mentions the space constraints in his cover letter.

3) Yes, adding DT support for the SPL makes it rather big. I think it
breaks the 28K limit that the mksunxiboot tool currently has. This can
(and will) be fixed later, but just for this exercise I'd rather keep it
small, especially as we would use it only for the DRAM code and not for
the device drivers.

Actually, I have a plan to make better use of DT, but not for the SPL.
To a good degree the SPL code mimics the on-SoC boot ROM operation
(accessing storage devices to load code), which already has to work
with every board and thus does not need a board-specific DT.
I can elaborate on that if there is interest.

Cheers,
Andre.

