[PATCH v2] pci: Work around PCIe link training failures

Maciej W. Rozycki macro at orcam.me.uk
Thu Nov 18 01:03:58 CET 2021


Hi Stefan,

> > Make use of this observation then and attempt to detect the inability to
> > negotiate the link speed automatically, and then handle it by hand.  Use
> > the Data Link Layer Link Active status flag as the primary indicator of
> > successful link speed negotiation, but given that the flag is optional
> > by hardware to implement (the ASM2824 does have it though), resort to
> > checking for the mandatory Link Bandwidth Management Status flag showing
> > that the link speed or width has been changed in an attempt to correct
> > unreliable link operation (the ASM2824 does set it too).
> > 
> > If these checks indicate that link may not operate correctly, then poll
> > the Data Link Layer Link Active status flag along with the Link Training
> > flag for the duration of 200ms to see if the link has stabilised, that
> > is either that the Data Link Layer Link Active status flag has been set
> > or that Link Training has been inactive during at least the second half
> > of the interval.
> > 
> > If that has indicated failure, reduce the target speed, request a link
> > retrain and check again if the link has stabilised.  Repeat until either
> > successful or the link speeds supported by the downstream port have been
> > exhausted.
> 
> So in such cases, the link speed will be downgraded? I would expect at
> least a big warning in such cases.

 I had mixed feelings about such extra clutter and chose not to include 
it, but perhaps it's worth adding after all, especially with the most 
recent findings, noted below.

> Did you try to change some other configuration options for the link
> establishment? I remember that on some hardware we were able to get
> better "link-up results" by setting the de-emphasis level to -3.5dB
> instead of -6dB (in the link control status register 2), before trying
> to re-establish the link. Did you also test with "tuning" such
> parameters? There might be others, which I'm missing right now.

 Thank you for the suggestion.  I've never been too familiar with analogue 
electronics engineering, so I didn't consider working at that level.

 So as it has turned out the ASM2824 has the de-emphasis level already set 
to -3.5dB by default (at power-up or reset; the power-up default is 0063, 
and some bits are sticky, so may not change at reset).  Interestingly, the 
bit is defined as HwInit, and as such it is meant to be "read-only after 
initialization", however with the ASM2824 it appears freely writable at any 
time, and the state reported in other registers and at the other end of the 
link indicates these changes do take effect.  They do not fix the issue 
with link training though; I have tried both settings to no avail.
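
 For reference, the fields in question can be picked apart as sketched 
below.  This assumes the 0063 value quoted above is the 16-bit Link 
Control 2 register; the macro and function names are made up for this 
illustration and follow the bit layout from the PCIe Base Specification 
rather than any U-Boot or Linux API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Link Control 2 bits; the names are local to this sketch. */
#define LNKCTL2_TLS        0x000fu /* Target Link Speed, bits 3:0 */
#define LNKCTL2_ENTER_COMP 0x0010u /* Enter Compliance */
#define LNKCTL2_HASD       0x0020u /* Hardware Autonomous Speed Disable */
#define LNKCTL2_SEL_DEEMPH 0x0040u /* Selectable De-emphasis: 1 = -3.5dB, 0 = -6dB */

struct lnkctl2_fields {
	unsigned int target_speed;	/* 1 = 2.5GT/s, 2 = 5GT/s, 3 = 8GT/s */
	bool enter_compliance;
	bool hw_auto_speed_disable;
	bool deemph_3_5db;		/* true = -3.5dB, false = -6dB */
};

/* Split a raw Link Control 2 value into its low-order fields. */
struct lnkctl2_fields lnkctl2_decode(uint16_t val)
{
	struct lnkctl2_fields f;

	f.target_speed = val & LNKCTL2_TLS;
	f.enter_compliance = (val & LNKCTL2_ENTER_COMP) != 0;
	f.hw_auto_speed_disable = (val & LNKCTL2_HASD) != 0;
	f.deemph_3_5db = (val & LNKCTL2_SEL_DEEMPH) != 0;
	return f;
}
```

 Under that assumption 0063 decodes to a target link speed of 8GT/s with 
the de-emphasis bit set, i.e. -3.5dB, which matches the default noted 
above.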

 However while fiddling with the register I have discovered an interesting 
phenomenon in that the link will actually switch to 5GT/s and then work 
reliably, provided that it is done in two steps: first clamping the target 
link speed to 2.5GT/s and letting link training succeed at it, and only 
then switching the target link speed to 5GT/s (or for that matter 8GT/s).  
At that point the link changes to 5GT/s instantaneously (there's no Link 
Training reported active, not even momentarily, or Data Link Layer Link 
Active reported inactive), as shown by the Link Status Register at both 
ends (and the de-emphasis level does not matter; it works at either value, 
as reported in the Link Status 2 register, again at both ends).
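
 In terms of register updates the two-step sequence amounts to roughly 
the sketch below.  The helper names are hypothetical and work on plain 
register values only; the actual configuration space accesses and the 
wait for link training to complete are left out, and whether an explicit 
Retrain Link request is needed in the first step is an assumption.

```c
#include <stdint.h>

/* Register bits per the PCIe Base Specification; names are local here. */
#define LNKCTL_RL   0x0020u /* Link Control: Retrain Link */
#define LNKCTL2_TLS 0x000fu /* Link Control 2: Target Link Speed */

#define TLS_2_5GT 1u
#define TLS_5_0GT 2u
#define TLS_8_0GT 3u

/* Return lnkctl2 with the Target Link Speed field replaced by tls. */
uint16_t lnkctl2_with_target_speed(uint16_t lnkctl2, unsigned int tls)
{
	return (uint16_t)((lnkctl2 & ~LNKCTL2_TLS) | (tls & LNKCTL2_TLS));
}

/* Return lnkctl with the Retrain Link bit set. */
uint16_t lnkctl_with_retrain(uint16_t lnkctl)
{
	return (uint16_t)(lnkctl | LNKCTL_RL);
}

/*
 * Step 1: write lnkctl2_with_target_speed(old, TLS_2_5GT) to Link
 *         Control 2, request a retrain with lnkctl_with_retrain() if
 *         needed, and let link training succeed at 2.5GT/s.
 * Step 2: only then write lnkctl2_with_target_speed(old, TLS_5_0GT)
 *         (or TLS_8_0GT) to Link Control 2.
 */
```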

 It makes me suspect that the problem with link negotiation is at the data 
link layer, rather than at the physical layer as I originally thought.  
IOW the two devices disagree at the protocol rather than electrical level, 
and only allowing a higher link speed once the data link layer has gone up 
somehow avoids the incompatibility.

 It works the same regardless of whether I change the target link speed in 
U-Boot (whether by firmware code itself or by poking with commands entered 
at the prompt by hand) or in Linux (with `setpci').  However changing the 
target link speed back to anything beyond 2.5GT/s in U-Boot has the 
unfortunate side effect of devices behind the problematic link being 
accessible only until Linux boots.  This is because Linux issues a reset to the PCIe tree, 
which causes the link to be reinitialised with the target link speed set 
beyond 2.5GT/s, and that brings the problem back.  The reset however does 
not cause an issue and lets devices behind the problematic link continue 
working if the target link speed has been set by U-Boot to 2.5GT/s.  This 
is because the Target Link Speed field in the Link Control 2 register is 
sticky, so the clamp continues to be applied.

 So I think my observations above have implications as follows:

1. We don't need to try lower and lower target link speeds as a workaround 
   in U-Boot.  It is enough if we force any link found problematic just to 
   2.5GT/s, as a minimal requirement to make such a link work and also 
   the speed all PCIe devices must support.

2. We don't want to try to switch to any higher speed afterwards in U-Boot,
   as it will prevent an OS that issues a PCIe reset, but does not have a 
   workaround in place, from working with devices behind such a problematic 
   link.  I think we ought to do our best to prevent that from happening,
   i.e. have the most robust workaround possible.

3. We do want to have a more sophisticated workaround in Linux (and other 
   OSes, as someone steps in to implement one) that will ensure correct 
   hot-plug operation and also give better performance.  With hot-plug 
   events possible at any time an OS driver cannot do aggressive polling 
   however, so I think unlike with the workaround in U-Boot it'll have to 
   be structured differently, e.g. if it is vendor:device-specific, then 
   it can rely on the ASM2824's Data Link Layer Link Active reporting 
   facility and sleep instead, as it then doesn't have to do the link 
   training detection dance.

4. This all means that the changes for U-Boot and Linux will definitely 
   have to differ from each other.
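
 As a rough illustration of item 1, the simplified U-Boot logic could be 
along the lines of the sketch below.  The failure test used here (Link 
Bandwidth Management Status set while Data Link Layer Link Active is 
clear) is only one possible reading of the checks described in the patch 
description, not the final code, and all the names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Register bits per the PCIe Base Specification; names are local here. */
#define LNKSTA_DLLLA 0x2000u /* Link Status: Data Link Layer Link Active */
#define LNKSTA_LBMS  0x4000u /* Link Status: Link Bandwidth Management Status */
#define LNKCTL2_TLS  0x000fu /* Link Control 2: Target Link Speed */
#define TLS_2_5GT    1u

/*
 * Assumed heuristic: the link has been retrained to correct unreliable
 * operation (LBMS set), yet the data link layer is not up (DLLLA clear).
 */
bool link_looks_broken(uint16_t lnksta)
{
	return (lnksta & (LNKSTA_DLLLA | LNKSTA_LBMS)) == LNKSTA_LBMS;
}

/*
 * Return the Link Control 2 value to write back: clamped to 2.5GT/s if
 * the link looks broken, unchanged otherwise.  No lower speeds need to
 * be tried, as 2.5GT/s is the speed every PCIe device must support.
 */
uint16_t lnkctl2_clamp_if_broken(uint16_t lnksta, uint16_t lnkctl2)
{
	if (!link_looks_broken(lnksta))
		return lnkctl2;
	return (uint16_t)((lnkctl2 & ~LNKCTL2_TLS) | TLS_2_5GT);
}
```

 The caller would then write the returned value to Link Control 2 and 
request a link retrain, and would not raise the target speed again, per 
item 2.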

I'll work on simplifying the U-Boot change along the lines outlined above 
then, and also look into a corresponding workaround for Linux.

 Thank you very much indeed for your feedback, I think I have now made 
good progress here.

  Maciej

