[PATCH v3] pci: Work around PCIe link training failures
Tom Rini
trini at konsulko.com
Sat Jan 15 13:37:38 CET 2022
On Sat, Nov 20, 2021 at 11:03:30PM +0000, Maciej W. Rozycki wrote:
> Attempt to handle cases with a downstream port of a PCIe switch where
> link training never completes and the link continues switching between
> speeds indefinitely with the data link layer never reaching the active
> state.
>
> It has been observed with a downstream port of the ASMedia ASM2824 Gen 3
> switch wired to the upstream port of the Pericom PI7C9X2G304 Gen 2
> switch, using a Delock Riser Card PCI Express x1 > 2 x PCIe x1 device,
> P/N 41433, wired to a SiFive HiFive Unmatched board. In this setup the
> switches are supposed to negotiate the link speed of preferably 5.0GT/s,
> falling back to 2.5GT/s.
>
> However the link continues oscillating between the two speeds, at the
> rate of 34-35 times per second, with link training reported repeatedly
> active ~84% of the time, e.g.:
>
> 02:03.0 PCI bridge [0604]: ASMedia Technology Inc. ASM2824 PCIe Gen3 Packet Switch [1b21:2824] (rev 01) (prog-if 00 [Normal decode])
> [...]
> Bus: primary=02, secondary=05, subordinate=05, sec-latency=0
> [...]
> Capabilities: [80] Express (v2) Downstream Port (Slot+), MSI 00
> [...]
> LnkSta: Speed 5GT/s (downgraded), Width x1 (ok)
> TrErr- Train+ SlotClk+ DLActive- BWMgmt+ ABWMgmt-
> [...]
> LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis+, Selectable De-emphasis: -3.5dB
> Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> Compliance De-emphasis: -6dB
> [...]
>
> However, forcibly limiting the target link speed to 2.5GT/s on the
> upstream ASM2824 device makes the two switches communicate correctly:
>
> 02:03.0 PCI bridge [0604]: ASMedia Technology Inc. ASM2824 PCIe Gen3 Packet Switch [1b21:2824] (rev 01) (prog-if 00 [Normal decode])
> [...]
> Bus: primary=02, secondary=05, subordinate=09, sec-latency=0
> [...]
> Capabilities: [80] Express (v2) Downstream Port (Slot+), MSI 00
> [...]
> LnkSta: Speed 2.5GT/s (downgraded), Width x1 (ok)
> TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
> [...]
> LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis+, Selectable De-emphasis: -3.5dB
> Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> Compliance De-emphasis: -6dB
> [...]
>
> and then:
>
> 05:00.0 PCI bridge [0604]: Pericom Semiconductor PI7C9X2G304 EL/SL PCIe2 3-Port/4-Lane Packet Switch [12d8:2304] (rev 05) (prog-if 00 [Normal decode])
> [...]
> Bus: primary=05, secondary=06, subordinate=09, sec-latency=0
> [...]
> Capabilities: [c0] Express (v2) Upstream Port, MSI 00
> [...]
> LnkSta: Speed 2.5GT/s (downgraded), Width x1 (downgraded)
> TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> [...]
> LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
> Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> Compliance De-emphasis: -6dB
> [...]
>
> Make use of this observation then and attempt to detect the inability to
> negotiate the link speed automatically, and then handle it by hand. Use
> the Data Link Layer Link Active status flag as the primary indicator of
> successful link speed negotiation, but given that the flag is optional
> for hardware to implement (the ASM2824 does have it though), resort to
> checking for the mandatory Link Bandwidth Management Status flag showing
> that the link speed or width has been changed in an attempt to correct
> unreliable link operation (the ASM2824 does set it too).
>
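The detection rule described above can be sketched as a small predicate. This is an illustration only, not the code from the patch: the function name link_may_be_broken() is hypothetical, and treating the Link Capabilities "Data Link Layer Link Active Reporting Capable" bit as the way to tell whether the flag is implemented is an assumption. The bit values follow the PCIe Link Capabilities and Link Status register layouts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Link Capabilities: Data Link Layer Link Active Reporting Capable. */
#define LNKCAP_DLLLARC 0x00100000u
/* Link Status: Data Link Layer Link Active. */
#define LNKSTA_DLLLA   0x2000u
/* Link Status: Link Bandwidth Management Status. */
#define LNKSTA_LBMS    0x4000u

/*
 * Hypothetical helper: returns true if the link looks suspicious and
 * deserves the closer, time-based check.
 */
static bool link_may_be_broken(uint32_t lnkcap, uint16_t lnksta)
{
	/* Primary indicator: the link should be active if the port can say so. */
	if (lnkcap & LNKCAP_DLLLARC)
		return (lnksta & LNKSTA_DLLLA) == 0;

	/* Fallback: speed or width was changed to fix unreliable operation. */
	return (lnksta & LNKSTA_LBMS) != 0;
}
```

With the failing ASM2824 port quoted earlier (DLActive-, BWMgmt+) either branch would flag the link; with the working one (DLActive+, BWMgmt-) neither would.
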
> If these checks indicate that the link may not operate correctly, then poll
> the Data Link Layer Link Active status flag along with the Link Training
> flag for the duration of 200ms to see if the link has stabilised, that
> is either that the Data Link Layer Link Active status flag has been set
> or that Link Training has been inactive during at least the second half
> of the interval.
>
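A minimal sketch of the 200ms stability poll described above, under stated assumptions: the status register is sampled once per millisecond, and the accessor callbacks, the link_stabilised() name and the fake_link test double are all hypothetical, not taken from the patch. The link counts as stable as soon as Data Link Layer Link Active is seen, or if Link Training stayed clear for every sample in the final 100ms.

```c
#include <stdbool.h>
#include <stdint.h>

#define LNKSTA_LT    0x0800u	/* Link Status: Link Training */
#define LNKSTA_DLLLA 0x2000u	/* Link Status: Data Link Layer Link Active */

#define POLL_MS 200u

typedef uint16_t (*read_lnksta_fn)(void *ctx);
typedef void (*delay_ms_fn)(void *ctx, unsigned int ms);

/* Hypothetical helper: poll Link Status for POLL_MS milliseconds. */
static bool link_stabilised(read_lnksta_fn rd, delay_ms_fn delay, void *ctx)
{
	unsigned int quiet_ms = 0;	/* consecutive samples with LT clear */

	for (unsigned int t = 0; t < POLL_MS; t++) {
		uint16_t sta = rd(ctx);

		if (sta & LNKSTA_DLLLA)
			return true;
		quiet_ms = (sta & LNKSTA_LT) ? 0 : quiet_ms + 1;
		delay(ctx, 1);
	}
	/* No training seen during at least the second half of the interval. */
	return quiet_ms >= POLL_MS / 2;
}

/* Test double standing in for a real port (purely illustrative). */
struct fake_link {
	unsigned int reads;		/* status reads performed so far */
	unsigned int train_reads;	/* LT is set for this many initial reads */
	bool toggling;			/* LT flips on every read, forever */
	unsigned int dllla_at;		/* DLLLA set from this read on; 0 = never */
};

static uint16_t fake_read(void *ctx)
{
	struct fake_link *l = ctx;

	l->reads++;
	if (l->dllla_at && l->reads >= l->dllla_at)
		return LNKSTA_DLLLA;
	if (l->toggling)
		return (l->reads & 1) ? LNKSTA_LT : 0;
	return l->reads <= l->train_reads ? LNKSTA_LT : 0;
}

static void fake_delay(void *ctx, unsigned int ms)
{
	(void)ctx;
	(void)ms;	/* no real waiting in the simulation */
}
```

A link that keeps oscillating, as in the ASM2824 report, never accumulates enough quiet samples and is reported as not stabilised.
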
> If that has indicated failure, restrict the target speed to 2.5GT/s,
> request a link retrain and check again if the link has stabilised. If
> that does not work either, then restore the original speed setting and
> claim defeat, otherwise we are done.
>
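The fallback step above might look roughly like the following. It is a sketch against a simulated port, not the patch itself: struct port, request_retrain(), force_2_5gt() and the two stability callbacks are hypothetical names, and the registers are plain struct fields rather than real configuration space accesses. The field and bit values follow the PCIe Link Control and Link Control 2 register layouts.

```c
#include <stdbool.h>
#include <stdint.h>

#define LNKCTL_RL          0x0020u	/* Link Control: Retrain Link */
#define LNKCTL2_TLS        0x000fu	/* Link Control 2: Target Link Speed */
#define LNKCTL2_TLS_2_5GT  0x0001u	/* Target Link Speed value for 2.5GT/s */

/* Simulated downstream port (illustrative stand-in for config space). */
struct port {
	uint16_t lnkctl2;			/* Link Control 2 contents */
	unsigned int retrains;			/* retrain requests issued */
	bool (*stabilised)(struct port *p);	/* stability check to use */
};

/* On hardware this would write LNKCTL_RL to Link Control; here we count. */
static void request_retrain(struct port *p)
{
	p->retrains++;
}

/* Hypothetical helper: returns true if the 2.5GT/s restriction helped. */
static bool force_2_5gt(struct port *p)
{
	uint16_t saved = p->lnkctl2;

	p->lnkctl2 = (uint16_t)((saved & ~LNKCTL2_TLS) | LNKCTL2_TLS_2_5GT);
	request_retrain(p);
	if (p->stabilised(p))
		return true;	/* keep the restriction in place */

	p->lnkctl2 = saved;	/* restore the original setting, claim defeat */
	return false;
}

/* Simulated behaviours: a link that only works at 2.5GT/s, and a dead one. */
static bool stable_at_2_5(struct port *p)
{
	return (p->lnkctl2 & LNKCTL2_TLS) == LNKCTL2_TLS_2_5GT && p->retrains > 0;
}

static bool never_stable(struct port *p)
{
	(void)p;
	return false;
}
```

Leaving the 2.5GT/s value in place on success matches the conservative choice the commit message goes on to explain.
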
> NB interestingly enough with the ASM2824 vs PI7C9X2G304 configuration
> referred to above, asking the ASM2824 to retrain with a higher target link
> speed once the 2.5GT/s speed has been negotiated makes the two devices
> successfully negotiate 5.0GT/s. Lifting the 2.5GT/s speed restriction
> would however prevent our workaround from working with an OS that issues
> a reset and that is unaware of the problem. This is because the devices
> would then try to negotiate a higher link speed from scratch and fail,
> while the sticky property of the Target Link Speed setting will keep the
> 2.5GT/s speed restriction across a reset.
>
> Keep the 2.5GT/s speed restriction then, conservatively, if functional
> once applied.
>
> Signed-off-by: Maciej W. Rozycki <macro at orcam.me.uk>
> Reviewed-by: Stefan Roese <sr at denx.de>
Applied to u-boot/master, thanks!
--
Tom