[U-Boot] [PATCHv2] fsl_ddr: Don't use full 64-bit divides on32-bit PowerPC
Moffett, Kyle D
Kyle.D.Moffett at boeing.com
Wed Aug 3 21:07:33 CEST 2011
On Jul 29, 2011, at 17:46, York Sun wrote:
> On Fri, 2011-07-29 at 16:26 -0500, Moffett, Kyle D wrote:
>> On Jul 29, 2011, at 14:46, York Sun wrote:
>>> On Fri, 2011-07-29 at 11:35 -0700, York Sun wrote:
>>>> On Fri, 2011-07-29 at 12:20 -0500, Moffett, Kyle D wrote:
>>>>> On Jul 28, 2011, at 17:41, York Sun wrote:
>>>>>> I found a problem with the round up. Please try
>>>>>>
>>>>>> picos_to_mclk(mclk_to_picos(3))
>>>>>>
>>>>>> You will notice the result is round up. I am sure it is unexpected.
>>>>>
>>>>> Hmm... what does fsl_ddr_get_mem_data_rate() return on your system?
>>>>> I cannot reproduce this here with my memory freq of 800MHz.
>>>>>
>>>>> [...snip...]
>>>>>
>>>>> Obviously you could also get rid of the rounding alltogether, but I
>>>>> don't know why it was put in the code in the first place...
>>>>
>>>> I think the problem comes from the round up. But your calculation of
>>>> remainder is not right.
>>
>> Can you tell me what your value of get_ddr_freq() is? I may be able to
>> reproduce the issue here.
>
> 264000000
Ok, so that's a 264MHz DDR frequency, right? 264 * 10^6?
If I plug that number in to the code I can reproduce your result:
picos_to_mclk(mclk_to_picos(3)) == 4
But that happens with both the new code *AND* the old code. I ran them
side-by-side in the same test-harness:
OLD:
get_memory_clk_period_ps(): 7580
picos_to_mclk(mclk_to_picos(3)): 4
----------------------
NEW:
get_memory_clk_period_ps(): 7580
picos_to_mclk(mclk_to_picos(3)): 4
So there's a bug, but it's always been there...
Just to show the floating-point math really quick:
2 * (10^12 picosec/sec) / (264 * 10^6 cyc/sec)
== 7575.757575..... picoseconds
Since the code rounds to the nearest 10ps, we get 7580ps. Then 3 mclk is
22740ps, but it should be 22727.272727.....
The picos_to_mclk() function always rounds up, because we must avoid
undercutting the SPD timing margin. If your SPD specifies a minimum of a
7590ps timing window, you MUST use a 2 mclk delay (even though it's
painfully close to 1 mclk), because otherwise you will get RAM errors.
Therefore mclk_to_picos() and get_memory_clk_period_ps() should always
round down, even though they currently round to the closest 10ps.
Continuing the example: If we later we do picos_to_mclk(22740):
22740ps * (264 * 10^5 cyc/sec) * 0.5 / (10^12 picosec/sec)
== 3.00168
When we do the integer math with remainder, we get:
3 rem 3360000000
In both cases, we correctly automatically round up to 4.
I believe I have a fix for the problem; here's my fixed test case log:
get_memory_clk_period_ps(): 7575
picos_to_mclk(mclk_to_picos(3)): 3
And here is the new code (the picos_to_mclk() function is unchanged):
unsigned int mclk_to_picos(unsigned int mclk)
{
unsigned int data_rate = get_ddr_freq(0);
/* Be careful to avoid a 64-bit full-divide (result fits in 32) */
unsigned long long mclk_ps = ULL_2E12 * mclk;
do_div(mclk_ps, data_rate);
return mclk_ps;
}
unsigned int get_memory_clk_period_ps(void)
{
return mclk_to_picos(1);
}
Cheers,
Kyle Moffett
More information about the U-Boot
mailing list