[U-Boot] [PATCHv2] fsl_ddr: Don't use full 64-bit divides on32-bit PowerPC

Moffett, Kyle D Kyle.D.Moffett at boeing.com
Wed Aug 3 21:07:33 CEST 2011


On Jul 29, 2011, at 17:46, York Sun wrote:
> On Fri, 2011-07-29 at 16:26 -0500, Moffett, Kyle D wrote:
>> On Jul 29, 2011, at 14:46, York Sun wrote:
>>> On Fri, 2011-07-29 at 11:35 -0700, York Sun wrote:
>>>> On Fri, 2011-07-29 at 12:20 -0500, Moffett, Kyle D wrote:
>>>>> On Jul 28, 2011, at 17:41, York Sun wrote:
>>>>>> I found a problem with the round up. Please try
>>>>>> 
>>>>>> picos_to_mclk(mclk_to_picos(3))
>>>>>> 
>>>>>> You will notice the result is round up. I am sure it is unexpected.
>>>>> 
>>>>> Hmm... what does fsl_ddr_get_mem_data_rate() return on your system?
>>>>> I cannot reproduce this here with my memory freq of 800MHz.
>>>>> 
>>>>> [...snip...]
>>>>> 
>>>>> Obviously you could also get rid of the rounding alltogether, but I
>>>>> don't know why it was put in the code in the first place...
>>>> 
>>>> I think the problem comes from the round up. But your calculation of
>>>> remainder is not right.
>> 
>> Can you tell me what your value of get_ddr_freq() is?  I may be able to
>> reproduce the issue here.
> 
> 264000000

Ok, so that's a 264MHz DDR frequency, right?  264 * 10^6?

If I plug that number in to the code I can reproduce your result:
  picos_to_mclk(mclk_to_picos(3)) == 4

But that happens with both the new code *AND* the old code.  I ran them
side-by-side in the same test-harness:
  OLD:
  get_memory_clk_period_ps(): 7580
  picos_to_mclk(mclk_to_picos(3)): 4
  ----------------------
  NEW:
  get_memory_clk_period_ps(): 7580
  picos_to_mclk(mclk_to_picos(3)): 4

So there's a bug, but it's always been there...

Just to show the floating-point math really quick:

  2 * (10^12 picosec/sec) / (264 * 10^6 cyc/sec)
     == 7575.757575..... picoseconds

Since the code rounds to the nearest 10ps, we get 7580ps.  Then 3 mclk is
22740ps, but it should be 22727.272727.....

The picos_to_mclk() function always rounds up, because we must avoid
undercutting the SPD timing margin.  If your SPD specifies a minimum of a
7590ps timing window, you MUST use a 2 mclk delay (even though it's
painfully close to 1 mclk), because otherwise you will get RAM errors.

Therefore mclk_to_picos() and get_memory_clk_period_ps() should always
round down, even though they currently round to the closest 10ps.

Continuing the example:  If we later we do picos_to_mclk(22740):
  22740ps * (264 * 10^5 cyc/sec) * 0.5 / (10^12 picosec/sec)
    == 3.00168

When we do the integer math with remainder, we get:
  3 rem 3360000000

In both cases, we correctly automatically round up to 4.

I believe I have a fix for the problem; here's my fixed test case log:
  get_memory_clk_period_ps(): 7575
  picos_to_mclk(mclk_to_picos(3)): 3

And here is the new code (the picos_to_mclk() function is unchanged):

  unsigned int mclk_to_picos(unsigned int mclk)
  {
          unsigned int data_rate = get_ddr_freq(0);

          /* Be careful to avoid a 64-bit full-divide (result fits in 32) */
          unsigned long long mclk_ps = ULL_2E12 * mclk;
          do_div(mclk_ps, data_rate);
          return mclk_ps;
  }

  unsigned int get_memory_clk_period_ps(void)
  {
          return mclk_to_picos(1);
  }

Cheers,
Kyle Moffett


More information about the U-Boot mailing list