[PATCH] pinctrl: single: parse gpio-range as a raw array (O(N^2) -> O(N))
Anshul Dalal
anshuld at ti.com
Mon Jun 22 12:04:41 CEST 2026
On Sun, 14 Jun 2026 12:32:32 +0200, Jordi Trepat Mur <yordy1902 at gmail.com> wrote:
> single_add_gpio_func() calls ofnode_parse_phandle_with_args() once per
> gpio-range entry. With a flat device tree, every call re-iterates the
> property from index 0 and, because cellname is set, performs an
> fdt_node_offset_by_phandle() (a full-FDT scan) at every step. The total
> cost is therefore quadratic in the number of entries and proportional
> to the size of the device tree.
>
> On a TI J722S board this makes the pinctrl probe take 723 ms
I'm unable to reproduce the delays you have observed with the J722s EVM. Are you
using a custom DT perhaps?
>
>
> diff --git a/drivers/pinctrl/pinctrl-single.c b/drivers/pinctrl/pinctrl-single.c
> index 42980e097e0..442979519d4 100644
> --- a/drivers/pinctrl/pinctrl-single.c
> +++ b/drivers/pinctrl/pinctrl-single.c
> @@ -511,27 +511,61 @@ static int single_add_gpio_func(struct udevice *dev)
> const char *cellname = "#pinctrl-single,gpio-range-cells";
> struct single_gpiofunc_range *range;
> struct ofnode_phandle_args gpiospec;
> - int ret, i;
> -
> - for (i = 0; ; i++) {
> - ret = ofnode_parse_phandle_with_args(dev_ofnode(dev), propname,
> - cellname, 0, i, &gpiospec);
> - /* Do not treat it as error. Only treat it as end condition. */
> - if (ret) {
> - ret = 0;
> - break;
> + const __be32 *list;
> + int size, total_cells, cells_per_entry, entries, ret, i;
> + int ret_total = 0;
Why rename 'ret' to 'ret_total' here? From what I can see, it's still being used
for the same purpose.
> +
> + /*
> + * Read the property once and iterate over the raw cell array in
> + * memory. Calling ofnode_parse_phandle_with_args() once per index
> + * re-walks the property from the start on every call and resolves
> + * the phandle (a full-FDT scan) at each step, which is O(N^2) in
> + * the number of entries and very slow pre-relocation with caches
> + * off. Resolve only the first entry's phandle to validate the
> + * cell layout; the target node itself is never used here.
> + */
> + list = ofnode_get_property(dev_ofnode(dev), propname, &size);
> + if (!list || size <= 0)
The size check is redundant here, the size is only negative with error code if
list is NULL.
> [ ... skip 20 lines ... ]
> +
> + for (i = 0; i < entries; i++) {
> + const __be32 *entry = list + i * cells_per_entry;
> + u32 phandle_i = be32_to_cpup(entry + 0);
> +
> + if (phandle_i == 0) {
You could just do `if (!phandle_i)` here. Also, is the validation for a null
phandle required here? Wouldn't a properly configured DT never hit this case?
> + /* empty entry - match original: skip without creating range */
> + continue;
> }
> range = devm_kzalloc(dev, sizeof(*range), GFP_KERNEL);
I wonder if we can just allocate a pool of ranges[entries] outside the loop
since we already know how many entries we would need. This would avoid the
repeated calls to malloc and the subsequent null check.
--
Anshul Dalal <anshuld at ti.com>
More information about the U-Boot
mailing list