[U-Boot] [PATCH v5 15/26] clk: sunxi: Add ccu clock tree support
Andre Przywara
andre.przywara at arm.com
Tue Jan 8 11:39:35 UTC 2019
On Tue, 8 Jan 2019 16:27:14 +0530
Jagan Teki <jagan at amarulasolutions.com> wrote:
Hi,
> On Mon, Jan 7, 2019 at 6:35 AM André Przywara
> <andre.przywara at arm.com> wrote:
> >
> > On 31/12/2018 16:59, Jagan Teki wrote:
> > > Clock control unit comprises of parent clocks, gates,
> > > multiplexers, dividers, multipliers, pre/post dividers and flags
> > > etc.
> > >
> > > So, the U-Boot implementation of ccu has divided into gates and
> > > tree. gates are generic clock configuration of enable/disable bit
> > > management which can be handle via ccu_clock_gate.
> >
> > So if I understand this correctly, you implement the gate
> > functionality separately from the complex clock code, even if they
> > are the same clock from the DT point of view? So if one wants to
> > enable the MMC0 clock, which is a mux/divider clock, one needs to
> > specify an extra entry in the gate array to describe the enable bit
> > in this special clock register? Sounds a bit surprising, but is
> > probably a neat trick to keep things simple. There should be a
> > comment in the code to this regard then.
>
> Exactly. Idea is to keep the macro's as simple as possible.
>
> Adding gates clocks separately make easy and reasonable way to
> enable/disable clock operations. We even operate with single macro
> with all clock attributes along with gate(either another member or
> common structure like Linux does), but that seems not simple as per as
> my experince since there are many IP's like USB's just need
> enable/disable.
>
> >
> > > Tree clocks are parent clock type, fixed clocks, mp, nk, nkm,
> > > nkmp, pre/post div, flags etc. which were managed via
> > > ccu_clock_tree.
> >
> > For a start, can we use more descriptive names than those very
> > specific MP/NK names? DIV_MUX and PLL sound more descriptive to me.
> > I understand that Linux uses those terms, but it would be great if
> > uninitiated people have a chance to understand this as well.
> >
> > > This patch add support for MP, NK, MISC, FIXED clock types as
> > > part of ccu clock tree with get_rate functionality this
> > > eventually used by uart driver. and rest of the infrastructure
> > > will try to add while CLK is being used on respective peripherals.
> > >
> > > Note that few of the tree type clock would require to enable
> > > gates on their specific clock, in that case we need to add the
> > > gate details via ccu_clock_gate, example: MP with gate so the
> > > gate offset, bit value should add as part of ccu_clock_gate.
> > >
> > > Signed-off-by: Jagan Teki <jagan at amarulasolutions.com>
> > > ---
> > > arch/arm/include/asm/arch-sunxi/ccu.h | 192
> > > +++++++++++++++++++++++++- drivers/clk/sunxi/clk_a64.c
> > > | 40 ++++++ drivers/clk/sunxi/clk_sunxi.c | 182
> > > ++++++++++++++++++++++++ 3 files changed, 413 insertions(+), 1
> > > deletion(-)
> > >
> > > diff --git a/arch/arm/include/asm/arch-sunxi/ccu.h
> > > b/arch/arm/include/asm/arch-sunxi/ccu.h index
> > > 3fdc26978d..61b8c36b3b 100644 ---
> > > a/arch/arm/include/asm/arch-sunxi/ccu.h +++
> > > b/arch/arm/include/asm/arch-sunxi/ccu.h @@ -7,15 +7,204 @@
> > > #ifndef _ASM_ARCH_CCU_H
> > > #define _ASM_ARCH_CCU_H
> > >
> > > +#define OSC_32K_ULL 32000ULL
> >
> > 32768
> >
> > And why ULL? The whole Allwinner clock system works with 32-bit
> > values, so just U would be totally sufficient. This avoid blowing
> > this up to 64 bit unnecessarily, which sounds painful for those
> > poor ARMv7 parts.
> > > +#define OSC_24M_ULL 24000000ULL
> > > +
> > > +/**
> > > + * enum ccu_clk_type - ccu clock types
> > > + *
> > > + * @CCU_CLK_TYPE_MISC: misc clock type
> >
> > What is MISC, exactly? Seems like an artefact clock to me, some
> > placeholder you need because gate clocks are handled separately in
> > the gates struct. Should this be called something with SIMPLE
> > instead, or GATE?
>
> Unlike MP type in MMC, UART doesn't need any clock attributes like
> dividers, mux, etc, just have attached parent. I don't think UART
> clock is simple one It has parent, that indeed have another parents
> and so...on, ie reason I named as MISC.
Not really, as far as I can see the UART clock is a just a gate clock as
many others, with one parent (APB2).
The fact that APB2 in turn can have multiple parents doesn't affect the
UART clock itself, as you model this via the clock tree.
In fact we could have similar clocks in the tree structure for the
other gate clocks (USB, for instance), it's just that the UART is the
only user so far which actually queries the clock rate.
So MISC is way too generic, I would still prefer CCU_CLK_TYPE_GATE.
> > > + * @CCU_CLK_TYPE_FIXED: fixed clock type
> > > + * @CCU_CLK_TYPE_MP: mp clock type
> > > + * @CCU_CLK_TYPE_NK: nk clock type
> >
> > What is the point of those comments, as you are basically repeating
> > the enum name? What about:
> > * @CCU_CLK_TYPE_PLL: PLL clock with two multiplier fields
> > * @CCU_CLK_TYPE_MUX_DIV: multiple parents, two divider fields
> >
> > This gives more speaking names, plus some explanation.
> >
> > > + */
> > > +enum ccu_clk_type {
> > > + CCU_CLK_TYPE_MISC = 0,
> > > + CCU_CLK_TYPE_FIXED = 1,
> > > + CCU_CLK_TYPE_MP = 2,
> > > + CCU_CLK_TYPE_NK = 3,
> > > +};
> > > +
> > > /**
> > > * enum ccu_clk_flags - ccu clock flags
> > > *
> > > - * @CCU_CLK_F_INIT_DONE: clock gate init done check
> > > + * @CCU_CLK_F_INIT_DONE: clock tree/gate init done
> > > check
> >
> > Is this flag to tell implemented clocks apart from unimplemented
> > ones, which would be reset to all zeroes by the compiler? Then it
> > should be called something with VALID in it.
>
> Since the flags attached with real numeric initialized by macro itself
> rather than some other source like complier or any, so marked
> INIT_DONE seems to meaningful.
When I read INIT_DONE I understand some code has initialised this
clock at some point, which isn't true. I don't fight the flag itself,
just the name.
> and eventually the same verified in code whether the init done or not.
>
> >
> > > + * @CCU_CLK_F_POSTDIV: clock post divider
> > > */
> > > enum ccu_clk_flags {
> > > CCU_CLK_F_INIT_DONE = BIT(0),
> > > + CCU_CLK_F_POSTDIV = BIT(1),
> > > };
> > >
> > > +/**
> > > + * struct ccu_mult - ccu clock multiplier
> > > + *
> > > + * @shift: multiplier shift value
> > > + * @width: multiplier width value
> > > + * @offset: multiplier offset
> > > + * @min: minimum multiplier
> > > + * @max: maximum multiplier
> > > + */
> > > +struct ccu_mult {
> > > + u8 shift;
> > > + u8 width;
> > > + u8 offset;
> > > + u8 min;
> > > + u8 max;
> > > +};
> > > +
> > > +#define _CCU_MULT_OFF_MIN_MAX(_shift, _width,
> > > _offset, \
> > > + _min, _max) { \
> > > + .shift = _shift, \
> > > + .width = _width, \
> > > + .offset = _offset, \
> > > + .min = _min, \
> > > + .max = _max, \
> > > +}
> > > +
> > > +#define _CCU_MULT_MIN(_shift, _width, _min) \
> > > + _CCU_MULT_OFF_MIN_MAX(_shift, _width, 1, _min, 0)
> > > +
> > > +#define _CCU_MULT(_shift, _width) \
> > > + _CCU_MULT_OFF_MIN_MAX(_shift, _width, 1, 1, 0)
> > > +
> > > +/**
> > > + * struct ccu_mux - ccu clock multiplexer
> > > + *
> > > + * @shift: multiplexer shift value
> > > + * @width: multiplexer width value
> > > + */
> > > +struct ccu_mux {
> > > + u8 shift;
> > > + u8 width;
> > > +};
> > > +
> > > +#define _CCU_MUX(_shift, _width) { \
> > > + .shift = _shift, \
> > > + .width = _width, \
> > > +}
> > > +
> > > +/**
> > > + * struct ccu_div - ccu clock divider
> > > + *
> > > + * @shift: divider shift value
> > > + * @width: divider width value
> > > + * @offset: divider offset
> > > + * @max: maximum divider value
> > > + */
> > > +struct ccu_div {
> > > + u8 shift;
> > > + u8 width;
> > > + u32 offset;
> > > + u32 max;
> > > +};
> > > +
> > > +#define _CCU_DIV(_shift, _width) { \
> > > + .shift = _shift, \
> > > + .width = _width, \
> > > + .offset = 1, \
> > > + .max = 0, \
> > > +}
> > > +
> > > +/**
> > > + * struct ccu_clk_tree - ccu clock tree
> > > + *
> > > + * @parent: parent clock tree
> > > + * @type: clock type
> > > + * @off: clock tree offset
> > > + * @m: divider m
> > > + * @p: divider p
> > > + * @mux: multiplexer mux
> > > + * @post: post divider value
> > > + * @n: multiplier n
> > > + * @k: multiplier k
> > > + * @fixed_rate: fixed rate
> > > + * @flags: clock tree flags
> > > + */
> > > +struct ccu_clk_tree {
> > > + const unsigned long *parent;
> >
> > Shouldn't that be called "parents" instead?
> >
> > > + enum ccu_clk_type type;
> > > + u16 off;
> > > +
> > > + struct ccu_div m;
> > > + struct ccu_div p;
> > > + struct ccu_mux mux;
> > > + unsigned int postdiv;
> > > +
> > > + struct ccu_mult n;
> > > + struct ccu_mult k;
> > > +
> > > + ulong fixed_rate;
> > > + enum ccu_clk_flags flags;
> > > +};
> > > +
> > > +#define TREE(_parent, _type, _off, \
> >
> > Just a nit, but TREE is somewhat confusing here, as this just
> > constructs a single entry in an array. The tree property is
> > realised through the parent array, if I get this correctly.
> > So should we name this ENTRY or CLK_ENTRY instead?
> >
> > > + _m, _p, \
> > > + _mux, \
> > > + _postdiv, \
> > > + _n, _k, \
> > > + _fixed_rate, \
> > > + _flags) { \
> > > + .parent = _parent, \
> > > + .type = _type, \
> > > + .off = _off, \
> > > + .m = _m, \
> > > + .p = _p, \
> > > + .mux = _mux, \
> > > + .postdiv = _postdiv, \
> > > + .n = _n, \
> > > + .k = _k, \
> > > + .fixed_rate = _fixed_rate, \
> > > + .flags = _flags, \
> > > +}
> > > +
> > > +#define
> > > MISC(_parent) \
> > > + TREE(_parent, CCU_CLK_TYPE_MISC, 0, \
> > > + {0}, {0}, \
> > > + {0}, \
> > > + 0, \
> > > + {0}, {0}, \
> > > + 0, \
> > > + CCU_CLK_F_INIT_DONE)
> >
> > If MISC is really something like GATE or SIMPLE, would it be
> > possible to construct the single element array here, so that the
> > macro just takes the one parent clock ID here, instead of referring
> > to an extra array?
> > > +
> > > +#define FIXED(_fixed_rate) \
> > > + TREE(NULL, CCU_CLK_TYPE_FIXED, 0, \
> > > + {0}, {0}, \
> > > + {0}, \
> > > + 0, \
> > > + {0}, {0}, \
> > > + _fixed_rate, \
> > > + CCU_CLK_F_INIT_DONE)
> > > +
> > > +#define NK(_parent, _off, \
> > > + _nshift, _nwidth, \
> > > + _kshift, _kwidth, _kmin, \
> > > + _postdiv, \
> > > + _flags) \
> > > + TREE(_parent, CCU_CLK_TYPE_NK, _off, \
> > > + {0}, {0}, \
> > > + {0}, \
> > > + _postdiv, \
> > > + _CCU_MULT(_nshift, _nwidth), \
> > > + _CCU_MULT_MIN(_kshift, _kwidth, _kmin), \
> > > + 0, \
> > > + CCU_CLK_F_INIT_DONE | _flags)
> > > +
> > > +#define MP(_parent, _off, \
> > > + _mshift, _mwidth, \
> > > + _pshift, _pwidth, \
> > > + _muxshift, _muxwidth, \
> > > + _postdiv, \
> > > + _flags) \
> > > + TREE(_parent, CCU_CLK_TYPE_MP, _off, \
> > > + _CCU_DIV(_mshift, _mwidth), \
> > > + _CCU_DIV(_pshift, _pwidth), \
> > > + _CCU_MUX(_muxshift, _muxwidth), \
> > > + _postdiv, \
> > > + {0}, {0}, \
> > > + 0, \
> > > + CCU_CLK_F_INIT_DONE | _flags)
> > > +
> > > /**
> > > * struct ccu_clk_gate - ccu clock gate
> > > * @off: gate offset
> > > @@ -59,6 +248,7 @@ struct ccu_reset {
> > > * @resets: reset unit
> > > */
> > > struct ccu_desc {
> > > + const struct ccu_clk_tree *tree;
> > > const struct ccu_clk_gate *gates;
> > > const struct ccu_reset *resets;
> > > };
> > > diff --git a/drivers/clk/sunxi/clk_a64.c
> > > b/drivers/clk/sunxi/clk_a64.c index 162ec769d6..1d0cd98183 100644
> > > --- a/drivers/clk/sunxi/clk_a64.c
> > > +++ b/drivers/clk/sunxi/clk_a64.c
> > > @@ -12,6 +12,45 @@
> > > #include <dt-bindings/clock/sun50i-a64-ccu.h>
> > > #include <dt-bindings/reset/sun50i-a64-ccu.h>
> > >
> > > +#define CLK_APB2 26
> > > +#define CLK_OSC_32K (CLK_GPU + 1)
> > > +#define CLK_OSC_24M (CLK_OSC_32K + 1)
> > > +
> > > +static const unsigned long periph0_parents[] = {
> >
> > Why is this long? That's the DT clock index, which is 32 bit wide,
> > right? Just unsigned int or uint32_t should be sufficient. long is
> > quite a treacherous type to use, especially as we share code
> > between 32 and 64-bit architectures.
> >
> > > + CLK_OSC_24M,
> > > +};
> > > +
> > > +static const unsigned long apb2_parents[] = {
> > > + CLK_OSC_32K,
> > > + CLK_OSC_24M,
> > > + CLK_PLL_PERIPH0,
> > > + CLK_PLL_PERIPH0,
> > > +};
> > > +
> > > +static const unsigned long uart_parents[] = {
> > > + CLK_APB2,
> > > +};
> > > +
> > > +static const struct ccu_clk_tree a64_tree[] = {
> > > + [CLK_OSC_32K] = FIXED(OSC_32K_ULL),
> > > + [CLK_OSC_24M] = FIXED(OSC_24M_ULL),
> > > +
> > > + [CLK_PLL_PERIPH0] = NK(periph0_parents, 0x028,
> > > + 8, 5, /* N */
> > > + 4, 2, 2, /* K */
> > > + 2, /* post-div */
> > > + CCU_CLK_F_POSTDIV),
> > > +
> > > + [CLK_APB2] = MP(apb2_parents, 0x058,
> > > + 0, 5, /* M */
> > > + 16, 2, /* P */
> > > + 24, 2, /* mux */
> > > + 0,
> > > + 0),
> > > +
> > > + [CLK_BUS_UART0] = MISC(uart_parents),
> > > +};
> > > +
> > > static const struct ccu_clk_gate a64_gates[] = {
> > > [CLK_BUS_OTG] = GATE(0x060, BIT(23)),
> > > [CLK_BUS_EHCI0] = GATE(0x060, BIT(24)),
> > > @@ -52,6 +91,7 @@ static const struct ccu_reset a64_resets[] = {
> > > };
> > >
> > > static const struct ccu_desc a64_ccu_desc = {
> > > + .tree = a64_tree,
> > > .gates = a64_gates,
> > > .resets = a64_resets,
> > > };
> > > diff --git a/drivers/clk/sunxi/clk_sunxi.c
> > > b/drivers/clk/sunxi/clk_sunxi.c index 345d706c2a..2aebd257d1
> > > 100644 --- a/drivers/clk/sunxi/clk_sunxi.c
> > > +++ b/drivers/clk/sunxi/clk_sunxi.c
> > > @@ -18,6 +18,187 @@ static const struct ccu_clk_gate
> > > *priv_to_gate(struct ccu_priv *priv, return
> > > &priv->desc->gates[id]; }
> > >
> > > +static const struct ccu_clk_tree *priv_to_tree(struct ccu_priv
> > > *priv,
> > > + unsigned long id)
> >
> > Again, why long here? Especially as it's u32 in the function below.
> >
> > > +{
> > > + return &priv->desc->tree[id];
> > > +}
> > > +
> > > +static int sunxi_get_parent_idx(const struct ccu_clk_tree *tree,
> > > void *base) +{
> > > + u32 reg, idx;
> > > +
> > > + reg = readl(base + tree->off);
> > > + idx = reg >> tree->mux.shift;
> > > + idx &= (1 << tree->mux.width) - 1;
> > > +
> > > + return idx;
> > > +}
> > > +
> > > +static ulong sunxi_fixed_get_rate(struct clk *clk, unsigned long
> > > id)
> >
> > Same "long" question for both the return type and the id.
> > And for everything below.
> >
> > > +{
> > > + struct ccu_priv *priv = dev_get_priv(clk->dev);
> > > + const struct ccu_clk_tree *tree = priv_to_tree(priv, id);
> > > +
> > > + if (!(tree->flags & CCU_CLK_F_INIT_DONE)) {
> > > + printf("%s: (CLK#%ld) unhandled\n", __func__,
> > > clk->id);
> > > + return 0;
> > > + }
> > > +
> > > + return tree->fixed_rate;
> > > +}
> > > +
> >
> > So why are there all those separate functions? Isn't that all the
> > same algorithm: adjust the parent rate based on the clock type?
> > I reworked this with recursive calls and it's MUCH less code:
> >
> > (fill the gaps, using your types for your convenience ;-)
> >
> > static ulong sunxi_apply_pll(struct ccu_priv *priv, ulong id, ulong
> > parent_rate) { NK algorithm of sunxi_nk_get_rate }
> > static ulong sunxi_apply_div(struct ccu_priv *priv, ulong id, ulong
> > parent_rate) { MP algorithm of sunxi_mp_get_rate }
> > static ulong sunxi_calc_clk_rate(struct ccu_priv *priv, ulong clkid)
> > {
> > ....
> > switch (tree->type) {
> > case CCU_CLK_TYPE_MISC:
> > return sunxi_calc_clk_rate(priv, tree->parent[0]);
> > case CCU_CLK_TYPE_FIXED:
> > return tree->fixed_rate;
> > case CCU_CLK_TYPE_NK:
> > rate = sunxi_calc_clk_rate(priv,
> > sunxi_get_parent_id(tree, priv->base));
> > return sunxi_apply_pll(priv, clkid, rate);
> > (similar for _MP)
> > ...
> > }
>
> Initially I would tried the recursive and yes code can reduce but
> using recursive can leed more disadvantage in-terms of code tracing
> during long run. Due to all these factors I used simple function
> calls.
But I find those extra functions much more confusing, due to the
similar names and their very similar functionality. Also it seems that
you just implemented what we need so far, so you will probably need to
extend those functions, making them even more similar and duplicating
more code. Basically you try to roll out the tree structure.
Since the clocks are organised in a tree-like structure, I believe this
recursive definition is a much better fit: A clock takes one of
possibly multiple input clocks and adjusts this rate. Full stop. The
rest is then just connecting them to other clocks.
The code looks much simpler and is much smaller this way:
https://gist.github.com/apritzel/db93dd06b4defb46504bccbfe4fc2c20#file-sunxi_clk-c-L86-L112
Typically the recursion depth is just two or three levels, so I don't
buy the argument of code tracing.
Cheers,
Andre.
More information about the U-Boot
mailing list