[PATCH v2] pci: Work around PCIe link training failures
Maciej W. Rozycki
macro at orcam.me.uk
Thu Nov 18 01:03:58 CET 2021
Hi Stefan,
> > Make use of this observation then and attempt to detect the inability to
> > negotiate the link speed automatically, and then handle it by hand. Use
> > the Data Link Layer Link Active status flag as the primary indicator of
> > successful link speed negotiation, but given that the flag is optional
> > by hardware to implement (the ASM2824 does have it though), resort to
> > checking for the mandatory Link Bandwidth Management Status flag showing
> > that the link speed or width has been changed in an attempt to correct
> > unreliable link operation (the ASM2824 does set it too).
> >
> > If these checks indicate that link may not operate correctly, then poll
> > the Data Link Layer Link Active status flag along with the Link Training
> > flag for the duration of 200ms to see if the link has stabilised, that
> > is either that the Data Link Layer Link Active status flag has been set
> > or that Link Training has been inactive during at least the second half
> > of the interval.
> >
> > If that has indicated failure, reduce the target speed, request a link
> > retrain and check again if the link has stabilised. Repeat until either
> > successful or the link speeds supported by the downstream port have been
> > exhausted.
>
> So in such cases, the link speed will be downgraded? I would expect at
> least a big warning in such cases.
I had mixed feelings about such extra clutter and chose not to include
it, but perhaps it's worth adding after all, especially with the most
recent findings, noted below.
> Did you try to change some other configuration options for the link
> establishment? I remember that on some hardware we were able to get
> better "link-up results" by setting the de-emphasis level to -3.5dB
> instead of -6dB (in the link control status register 2), before trying
> to re-establish the link. Did you also test with "tuning" such
> parameters? There might be others which I'm missing right now.
Thank you for the suggestion. I've never been too familiar with analogue
electronics engineering, so I didn't consider working at that level.
So as it has turned out the ASM2824 has the de-emphasis level already set
to -3.5dB by default (at power-up or reset; the power-up default is 0063,
and some bits are sticky, so may not change at reset). Interestingly, the
bit is defined as HwInit, and as such it is meant to be "read-only after
initialization"; however with the ASM2824 it appears freely writable at any
time, and the state reported in other registers and at the other end of the
link indicates that these changes do take effect. They do not fix the issue
with link training though; I have tried both settings to no avail.
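As an illustration of the register contents discussed above, here is a
minimal sketch that decodes a Link Control 2 value. It assumes the 0063
quoted above is a hexadecimal Link Control 2 value; the macro, structure
and function names are local to this sketch and not taken from U-Boot or
Linux. The bit positions follow the PCIe Base Specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Link Control 2 fields; names are local to this sketch. */
#define LNKCTL2_TLS_MASK   0x000f /* Target Link Speed, bits 3:0 */
#define LNKCTL2_HASD       0x0020 /* Hardware Autonomous Speed Disable */
#define LNKCTL2_SEL_DEEMPH 0x0040 /* Selectable De-emphasis: 1 = -3.5dB, 0 = -6dB */

struct lnkctl2_fields {
	unsigned int target_speed;  /* 1 = 2.5GT/s, 2 = 5GT/s, 3 = 8GT/s */
	bool hw_auto_speed_disable;
	bool deemph_3_5db;          /* true: -3.5dB, false: -6dB */
};

/* Split a raw Link Control 2 value into the fields of interest here. */
static void lnkctl2_decode(uint16_t val, struct lnkctl2_fields *out)
{
	assert(out != NULL);
	out->target_speed = val & LNKCTL2_TLS_MASK;
	out->hw_auto_speed_disable = (val & LNKCTL2_HASD) != 0;
	out->deemph_3_5db = (val & LNKCTL2_SEL_DEEMPH) != 0;
}
```

Under that assumption, decoding 0x0063 gives a target link speed of 8GT/s
and the -3.5dB de-emphasis setting, which matches the default described
above.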
However while fiddling with the register I have discovered an interesting
phenomenon in that the link will actually switch to 5GT/s and then work
reliably, provided that it is done in two steps: first clamping the target
link speed to 2.5GT/s and letting link training succeed at it, and only
then switching the target link speed to 5GT/s (or for that matter 8GT/s).
At that point the link changes to 5GT/s instantaneously (there's no Link
Training reported active, not even momentarily, or Data Link Layer Link
Active reported inactive), as shown by the Link Status Register at both
ends (and the de-emphasis level does not matter; it works at either value,
as reported in the Link Status 2 register, again at both ends).
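The two-step sequence described above can be sketched as a pure function
that picks the next Link Control 2 value from the current Link Status.
This is only an illustration, not code from the patch: the names are made
up for the sketch, it assumes Data Link Layer Link Active is implemented
(as it is on the ASM2824), and actually writing the value back to the
port, and whether a retrain request is needed, is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Register fields per the PCIe Base Specification; names are local. */
#define LNKCTL2_TLS_MASK 0x000f /* Link Control 2: Target Link Speed */
#define LNKSTA_LT        0x0800 /* Link Status: Link Training */
#define LNKSTA_DLLLA     0x2000 /* Link Status: Data Link Layer Link Active */

enum { TLS_2_5GT = 1, TLS_5GT = 2, TLS_8GT = 3 };

/* Replace only the Target Link Speed field, keeping all other bits. */
static uint16_t lnkctl2_with_tls(uint16_t lnkctl2, unsigned int speed)
{
	assert(speed >= 1 && speed <= LNKCTL2_TLS_MASK);
	return (uint16_t)((lnkctl2 & ~LNKCTL2_TLS_MASK) | speed);
}

/* The link counts as up once DLLLA is set and training is not active. */
static bool link_trained(uint16_t lnksta)
{
	return (lnksta & LNKSTA_DLLLA) && !(lnksta & LNKSTA_LT);
}

/*
 * Step 1: while the link is not up, clamp the target speed to 2.5GT/s.
 * Step 2: once the link is up, allow the final (higher) target speed.
 */
static uint16_t two_step_next_lnkctl2(uint16_t lnksta, uint16_t lnkctl2,
				      unsigned int final_speed)
{
	if (!link_trained(lnksta))
		return lnkctl2_with_tls(lnkctl2, TLS_2_5GT);
	return lnkctl2_with_tls(lnkctl2, final_speed);
}
```

For example, with Link Training still active and Link Control 2 at 0x0063
the function returns 0x0061 (clamped to 2.5GT/s); once Data Link Layer
Link Active is reported it returns 0x0062 for a final speed of 5GT/s.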
It makes me suspect that the problem with link negotiation is at the data
link layer, rather than at the physical layer as I originally thought.
IOW the two devices disagree at the protocol rather than electrical level,
and only allowing a higher link speed once the data link layer has gone up
somehow avoids the incompatibility.
It works the same regardless of whether I change the target link speed in
U-Boot (whether by firmware code itself or by poking with commands entered
at the prompt by hand) or in Linux (with `setpci'). However changing the
target link speed back to anything beyond 2.5GT/s in U-Boot has an unfortunate
side effect of devices behind the problematic link being only accessible
until Linux boots. This is because Linux issues a reset to the PCIe tree,
which causes the link to be reinitialised with the target link speed set
beyond 2.5GT/s, and that brings the problem back. The reset however does
not cause an issue and lets devices behind the problematic link continue
working if the target link speed has been set by U-Boot to 2.5GT/s. This
is because the Target Link Speed field in the Link Control 2 register is
sticky, so the clamp continues to be applied.
So I think my observations above have implications as follows:
1. We don't need to try lower and lower target link speeds as a workaround
in U-Boot. It is enough if we force any link found problematic just to
2.5GT/s, as a minimal requirement to make such a link work and also
the speed all PCIe devices must support.
2. We don't want to try to switch to any higher speed afterwards in U-Boot
as it will prevent an OS that issues a PCIe reset, but does not have a
workaround in place, from working with devices behind such a problematic
link. I think we ought to do our best to prevent that from happening,
i.e. have the most robust workaround possible.
3. We do want to have a more sophisticated workaround in Linux (and other
OSes, as someone steps in to implement one) that will ensure correct
hot-plug operation and also give better performance. With hot-plug
events possible at any time an OS driver cannot do aggressive polling
however, so I think that unlike the workaround in U-Boot it will have to
be structured differently, e.g. if it is vendor:device-specific, then
it can rely on the ASM2824's Data Link Layer Link Active reporting
facility and sleep instead, as it doesn't have to do the link training
detection dance.
4. This all means that the changes for U-Boot and Linux will definitely
have to differ.
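To make point 1 concrete, here is a rough sketch of how the simplified
U-Boot logic might look, combining the stabilisation criterion from the
quoted patch description with a single clamp to 2.5GT/s. It is not the
actual patch: the function names are hypothetical, the Link Status
samples are assumed to have been collected by polling at even intervals
over the 200ms window, and "stabilised" is interpreted as either Data
Link Layer Link Active seen at any point or Link Training inactive in
every sample from the second half.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Register fields per the PCIe Base Specification; names are local. */
#define LNKCTL2_TLS_MASK 0x000f /* Link Control 2: Target Link Speed */
#define LNKSTA_LT        0x0800 /* Link Status: Link Training */
#define LNKSTA_DLLLA     0x2000 /* Link Status: Data Link Layer Link Active */

enum { TLS_2_5GT = 1 };

/* Evaluate Link Status samples taken over the polling window. */
static bool link_stabilised(const uint16_t *lnksta, size_t n)
{
	size_t i;

	assert(lnksta != NULL && n > 0);

	/* DLLLA seen at any point means the link has come up. */
	for (i = 0; i < n; i++)
		if (lnksta[i] & LNKSTA_DLLLA)
			return true;

	/* Otherwise require no training activity in the second half. */
	for (i = n / 2; i < n; i++)
		if (lnksta[i] & LNKSTA_LT)
			return false;

	return true;
}

/* Return the Link Control 2 value to use: clamp only unstable links. */
static uint16_t clamp_if_unstable(const uint16_t *lnksta, size_t n,
				  uint16_t lnkctl2)
{
	if (link_stabilised(lnksta, n))
		return lnkctl2;
	return (uint16_t)((lnkctl2 & ~LNKCTL2_TLS_MASK) | TLS_2_5GT);
}
```

A caller would write the returned value back to Link Control 2 and
request a retrain only if it differs from the value read. No attempt is
made to raise the speed again afterwards, in line with point 2.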
I'll work on simplifying the U-Boot change along the lines outlined above
then, and also look into a corresponding workaround for Linux.
Thank you very much indeed for your feedback, I think I have now made
good progress here.
Maciej