imx8mp: pci enumeration fails with current HEAD
Miquel Raynal
miquel.raynal at bootlin.com
Thu Apr 17 16:54:08 CEST 2025
Hello Heiko,
On 17/04/2025 at 16:37:39 +02, Heiko Schocher <hs at denx.de> wrote:
> Hi Miquel,
>
> I have here an imx8mp based board for which I just prepare mainlining.
>
> pci enumeration works fine with U-Boot 2025.04
>
> u-boot=> pci enum
> PCIE-0: Link up (Gen1-x1, Bus0)
> u-boot=>
>
> and tftp through the ethernet interface works fine.
>
> Using current HEAD
>
> * 5b4ae0f3f04 - (origin/master, origin/HEAD) mailmap: update my name and
> email (vor 2 Tagen) <Casey Connolly>
>
> It breaks with:
>
> u-boot=> pci enum
> nxp_imx8_pcie_phy pcie-phy at 32f00000: PHY: Failed to power on pcie-phy at 32f00000: -110.
> pcie_dw_imx pcie at 33800000: failed to power on PHY
> u-boot=>
>
> git bisect dropped your patch:
>
> 197376fbf300e92afa0a1583815d9c9eb52d613a is the first bad commit
> commit 197376fbf300e92afa0a1583815d9c9eb52d613a
> Author: Miquel Raynal <miquel.raynal at bootlin.com>
> Date: Thu Apr 3 09:39:05 2025 +0200
>
> power-domain: Add refcounting
>
> It is very surprising that such an uclass, specifically designed to
> handle resources that may be shared by different devices, is not keeping
> the count of the number of times a power domain has been
> enabled/disabled to avoid shutting it down unexpectedly or disabling it
> several times.
>
> Doing this causes troubles on eg. i.MX8MP because disabling power
> domains can be done in recursive loops were the same power domain
> disabled up to 4 times in a row. PGCs seem to have tight FSM internal
> timings to respect and it is easy to produce a race condition that puts
> the power domains in an unstable state, leading to ADB400 errors and
> later crashes in Linux.
>
> CI tests using power domains are slightly updated to make sure the count
> of on/off calls is even and the results match what we *now* expect.
>
> As we do not want to break existing users while stile getting
> interesting error codes, the implementation is split between:
> - a low-level helper reporting error codes if the requested transition
> could not be operated,
> - a higher-level helper ignoring the "non error" codes, like EALREADY and
> EBUSY.
>
> Signed-off-by: Miquel Raynal <miquel.raynal at bootlin.com>
>
>
> reverting this patch, and it works again fine for me!
That's right, this patch made assumptions that are wrong, I am very
sorry about that. In my understanding there was a power_domain structure
per ID, but in practice there is only one per node, so when two consumer
devices point to the same node that has #power-domain-cells = <1> it
fails to power up the second power domain. I tested this patch using the
imx8mp video pipeline (which works fine), and I wasn't using PCI on it.
It's been reported:
https://lore.kernel.org/u-boot/20250403-ge-mainline-display-support-v6-0-478b5e3dd872@bootlin.com/T/#m68e9d16204a61450084324b99fd571d95932ece0
https://lore.kernel.org/u-boot/20250403-ge-mainline-display-support-v6-0-478b5e3dd872@bootlin.com/T/#me570681ae3cf6c42cd45912de15cb968af55be28
And the patch is being reverted:
https://lore.kernel.org/u-boot/20250417115311.1905411-1-w.egorov@phytec.de/T/#u
> Do you have a imx8mp based hardware with pci?
I do not have PCI on the board I am using for testing, but now that my
attention has been focused on the power domain counting issue I managed
to observe the problem by alternately enabling lcdif1 and lcdif2 which
share the same power domain node.
> Can you (or anyone else with an imx8mp based hardware with pci) approve
> this on a similiar HW?
>
> I try to dig into it, but may you have a fast idea!
>
> Okay, thought about it ... it tries to power on "blk-ctrl at 32f10000"
> driver:/drivers/power/domain/imx8mp-hsiomix.c
>
> imx8mp.dtsi:
> hsio_blk_ctrl: blk-ctrl at 32f10000 {
> compatible = "fsl,imx8mp-hsio-blk-ctrl", "syscon";
>
> which has several different power domains ... and your patch prevents to
> enable more than one of them ... adding some debugs shows:
>
> u-boot=> pci enum
> [...]
> power_domain_on_lowlevel(power_domain=00000000fe5942f8) blk-ctrl at 32f10000 priv->on_count: 0
> power_domain_on_lowlevel(power_domain=00000000fe5942f8) blk-ctrl at 32f10000 power_domain->id: 4
>
> Here pwoer domain id
>
> #define IMX8MP_HSIOBLK_PD_PCIE_PHY 4
>
> gets enabled and on_count increases...
>
> imx8mp_hsiomix_set: ------------ power_domain->id: 4 IMX8MP_HSIOBLK_PD_PCIE: 3 4
> power_domain_on_lowlevel(power_domain=00000000fe5d2408) power-domain at 17 priv->on_count: 0
> power_domain_on_lowlevel(power_domain=00000000fe5d2408) power-domain at 17 power_domain->id: 0
> power_domain_on_lowlevel: ret: 0
> power_domain_on_lowlevel(power_domain=00000000fe5d2480) power-domain at 1 priv->on_count: 0
> power_domain_on_lowlevel(power_domain=00000000fe5d2480) power-domain at 1 power_domain->id: 0
> power_domain_on_lowlevel: ret: 0
> imx8mp_hsiomix_set: ------------ power_domain->id: 4 IMX8MP_HSIOBLK_PD_PCIE: 3 4 END OKAY
> power_domain_on_lowlevel: ret: 0
> power_domain_get_by_index(dev=00000000fe5b2410, power_domain=00000000fe594458)
> power_domain_on_lowlevel(power_domain=00000000fe594458) blk-ctrl at 32f10000 priv->on_count: 1
> power_domain_on_lowlevel(power_domain=00000000fe594458) blk-ctrl at 32f10000 power_domain->id: 3
>
> Now power domain
>
> #define IMX8MP_HSIOBLK_PD_PCIE 3
>
> should be enabled, but since your patch this gets prevented as "priv->on_count: 1"
>
> imx8_pcie_phy_power_on: ------------ pad_mode: 1
> imx8_pcie_phy_power_on: ------------ pad_mode: 1 ret: -110
> nxp_imx8_pcie_phy pcie-phy at 32f00000: PHY: Failed to power on pcie-phy at 32f00000: -110.
> pcie_dw_imx pcie at 33800000: failed to power on PHY
> u-boot=>
>
> I have not a fast solution, may you have an idea? I try to look
> after the easter holidays into it. May we need for each power_domain-> id such
> an on counter...
I fully agree with your observations and conclusion, I was hoping for an
easy fix but the fact that the power domain uclass would not retain any
useful data nor know how many sub-domains there are is a bit difficult
to workaround.
This needs to be thought deeper, I will report whenever I have an idea
how to do it, for now you can revert the patch locally (or wait for the
upstream proposal to get in), and do not hesitate to share your findings
if you have an idea on how to handle that properly.
Sorry for the breakage,
Miquèl
More information about the U-Boot
mailing list