[PATCH 2/6] clk: actions: Add SD/MMC clocks

Mon Dec 14 16:29:16 CET 2020

On 14/12/2020 14:12, Amit Tomer wrote:
> Hi,
> 
> Thanks for having the detailed look and providing comments:
> 
>> According to the datasheet the clock source could also be NAND_PLL,
>> depending on bit 9.
>> Both PLLs use the same rate calculation, so it's just matter of the PLL
>> address offset to use for covering both.
> 
> Ok, should I change the comment as Clock output of DEV/NAND_PLL ?

I meant you should read the parent rate either from the NAND or the DEV
PLL, depending on this bit.
Since you force the parent to be DEV that's not strictly needed for this
current use case, but it's much cleaner to do so in the *clock* driver,
and as I said, is not hard to do.

> 
>>> +
>>> +     /* Clock output of DEV_PLL
>>> +      * Range: 48M ~ 756M
>>> +      * Frequency= DEVPLLCLK * 6
>>> +      */
>>> +     parent_rate = readl(priv->base + CMU_DEVPLL) & 0x7f;
>>> +     parent_rate *= 6000000;
>>> +
>>> +     rate *= 2;
>>
>> Where does this come from?
> 
> While experimenting about it, I found that doubling input rate doubles the
> speed , and BSP code does the same.

Sure doubling the input frequency improves the performance, but is also
wrong. If the card can do twice as much *reliably*, then it would
advertise more advanced speed modes, and we could use them. Otherwise
it's just overclocking.
You should check what the card reports (SDR50?), then check which clock
value you set, and what the performance is (for reading a bigger number
of sectors). If that matches up (SDR50 => 50 MHz => ~25MB/s), then it's
correct. Everything else should be investigated.

> Not sure if it is a good enough explanation or maybe its side effect of not
> programming divider value well when its >=128. Would check it again.

This explanation sounds very dodgy. There might be a hidden divider
somewhere, but this should be discoverable by the test I described above.
And since the "divide-by-128" is only needed for the initial 400KHz
communication, I don't think it's related here.

> 
>>> +     div = (parent_rate / rate) - 1;
>>
>> Please check rate being not zero before you dividing by it. And I would
>> keep the real divider value for now, so drop the -1 here.
> 
> Ok. I would make the check.
> 
>>> +
>>> +     if (div >= 128)
>>> +             div |= 0x100;
>>
>> This does not seem right. You would need to divide the value by 128,
>> once you decided to use the "divide-by-128" bit (which is bit 8).
>> What about moving this below, after you know the register value?
>> Then you can divide by 128 and set bit 8.
> 
> Sure, Thanks for explaining it in detail. It helped understanding
> ACTIONS not so usual
> clock design.
> 
>>> +     val = readl(priv->base + (CMU_SD0CLK + sd_index*0x4));
>>> +     /* Bits 4..0 is used to program div value */
>>> +     val &= ~0x1f;
>>> +     val |= div;
>>
>> Please mask div before applying.
> 
> But I am masking the first 4.0 bit above , ~0x1f;

No, you are masking val, not div. If div is bigger than 31, you set
other bits in this register.

>>> +     /* As per Manuals Bits 31..10 are reserved but Bits 11 and
>>> +      * 10 needed to be set for proper operation
>>> +      */
>>> +     val |= 0xc00;
>>
>> Where does this come from? I don't see the Linux driver doing that?
> 
> Actually if I don't set 11th and 10th bit , I see SD speed of 2KiB/s
> and unstable behaviour
> while loading files from the card.
> BSP U-boot driver chooses a value 0xfffffce0 for it.

I wouldn't rely on downstream BSP U-Boot for *anything*.
I guess you see normal speed in (almost) mainline Linux? Check what we
do with the clock register there (can dump it once the driver has
initialised).
I would rather mimic the mainline kernel here than anything else.

Cheers,
Andre