[PATCH] rockchip: Add delay after link-training

Kurt Miller kurt at intricatesoftware.com
Tue Jun 30 19:57:08 CEST 2020


On Sat, 2020-06-27 at 20:57 +0800, Kever Yang wrote:
> Hi Kurt,
> 
> 
> On 2020/6/4 上午5:17, Peter Geis wrote:
> 
> > 
> > On Tue, Jun 2, 2020 at 11:12 AM Kurt Miller <kurt at intricatesoftware.com> wrote:
> > > 
> > > On Tue, 2020-06-02 at 10:23 +0800, Shawn Lin wrote:
> > > > 
> > > > 在 2020/6/2 9:59, Kever Yang 写道:
> > > > > 
> > > > > Hi Kurt,
> > > > > 
> > > > > On 2020/6/2 上午4:30, Kurt Miller wrote:
> > > > > > 
> > > > > > On at least the RockPro64, many cards will trip a
> > > > > > synchronous abort when first accessing PCIe config space
> > > > > > during bus scanning. A delay after link training allows
> > > > > > some of these cards to function.
> > > > > > 
> > > > > > Signed-off-by: Kurt Miller <kurt at intricatesoftware.com>
> > > > > > ---
> > > > > > On the RockPro64, some pci cards trip a synchronous abort when
> > > > > > scanning the
> > > > > > pci bus. For example with HighPoint Rocket Raid 640L which is based on
> > > > > > Marvell 88SE9230 I see this:
> > > > > > 
> > > > > > => pci
> > > > > > "Synchronous Abort" handler, esr 0x96000210
> > > > > > elr: 000000000022d034 lr : 000000000022cfd0 (reloc)
> > > > > > elr: 00000000f4568034 lr : 00000000f4567fd0
> > > > > > x0 : 0000000000100000 x1 : 00000000f8000000
> > > > > > x2 : 0000000000000000 x3 : 0000000000100000
> > > > > > x4 : 00000000f2559290 x5 : 0000000000000000
> > > > > > x6 : 0000000000000001 x7 : 00000000f2559860
> > > > > > x8 : 0000000000000030 x9 : 0000000000000008
> > > > > > x10: 0000000000000010 x11: 00000000f251fd1c
> > > > > > x12: 0000000000001421 x13: 0000000000001468
> > > > > > x14: 00000000f251fd4c x15: 00000000ffffffff
> > > > > > x16: 0000000000060001 x17: 000000000000001f
> > > > > > x18: 00000000f2532dc0 x19: 00000000f251fcd0
> > > > > > x20: 0000000000000001 x21: 0000000000000000
> > > > > > x22: 0000000000010000 x23: 00000000f45d4000
> > > > > > x24: 0000000000000000 x25: 00000000f45bc000
> > > > > > x26: 0000000000000000 x27: 0000000000000000
> > > > > > x28: 00000000f2541440 x29: 00000000f251fc20
> > > > > > 
> > > > > > Code: 540000c1 350000a5 93407c00 f9400081 (b8616800)
> > > > > > Resetting CPU ...
> > > > > > 
> > > > > > Adding a delay after link training works-around the problem. I added this
> > > > > > delay to the OpenBSD rkpcie driver as well:
> > > > > > 
> > > > > > https://github.com/openbsd/src/commit/9857dee3520d8ca5bec68538f4b0708d7e64fc87
> > > > > > 
> > > > > > 
> > > > > > HighPoint Rocket Raid 640L needs a 1.75 sec delay and Crossfield
> > > > > > SAS9211-4i
> > > > > > needs a 1 second delay, so I arbitrarily decided on 2 seconds.
> > > > > > 
> > > > > > The delay work-around was originally discovered by nuumio:
> > > > > > https://github.com/nuumio/linux-kernel/commit/5a65b17686002dc84d461bffa324a2cb68e67aee
> > > > > > 
> > > > > > 
> > > > > >    drivers/pci/pcie_rockchip.c | 8 ++++++++
> > > > > >    1 file changed, 8 insertions(+)
> > > > > > 
> > > > > > diff --git a/drivers/pci/pcie_rockchip.c b/drivers/pci/pcie_rockchip.c
> > > > > > index 0edc2464a8..51cfbf6b18 100644
> > > > > > --- a/drivers/pci/pcie_rockchip.c
> > > > > > +++ b/drivers/pci/pcie_rockchip.c
> > > > > > @@ -288,6 +288,14 @@ static int rockchip_pcie_init_port(struct udevice
> > > > > > *dev)
> > > > > >            goto err_power_off_phy;
> > > > > >        }
> > > > > > +    /*
> > > > > > +     * XXX: On at least the RockPro64, many cards will trip a
> > > > > > +     * synchronous abort when first accessing PCIe config space
> > > > > > +     * during bus scanning. A delay after link training allows
> > > > > > +     * some of these cards to function.
> > > > > > +     */
> > > > > > +    mdelay(2000);
> > > > > I don't understand what kind of delay for init needs 2 Seconds, root
> > > > > cause will preferred.
> > > Hi Kever,
> > > 
> > > While working on this issue for the OpenBSD PCIe driver I was not
> > > able to determine the root cause. I tested the following adapters:
> > > 
> > > ROCKPro64 2 Port SATA
> > > StarTech PEXSAT32 2 Port SATA
> > > Samsung 970 Evo NVMe w/m.2 adapter
> > > IO CREST SI-PEX40148 2 Port SATA
> > > IO CREST SI-PEX40057 4 port
> > > HighPoint Rocket Raid 640L
> > > Crossfield SAS9211-4i
> > > Del PERC H200
> > > Dell PERC 6/i
> > > Intel Gigabit VT Quad Port Server
> > > 
> > > All of the above adapters successfully link trained, however
> > > three of them would panic upon the first read of the PCI config
> > > space. nuumio's work-around of placing a delay after link-training
> > > allows these cards to work.
> > > 
> > > HighPoint Rocket Raid 640L ~1.75 sec delay
> > > Crossfield SAS9211-4i ~1 sec delay
> > > Dell PERC H200 ~1 sec delay
> > > 
> > > In attempt to determine if there was a way to detect how long
> > > to wait, I compared every status and debug register documented
> > > in the rk3399 TRM part 2 for the PCI controller. I compared the
> > > values pre-delay and post-delay. I was not able to find a value
> > > that would indicate it was safe to proceed.
> > > 
> > > > 
> > > > Strictly speaking, how long should we need for this had been provided,
> > > > see this:
> > > > 
> > > > https://patchwork.kernel.org/patch/11561977/
> I can accept the same fix like kernel which is 100ms, but 2 Second is 
> really too much for most cases.
> 

Hi Kever,

Thank you for following up on this. The 100ms delay before link-training
appears to be solving a different issue where some cards fail to link
train. From the comment in the referenced patch "Otherwise we do see some
failures for link training."

The delay I am proposing is after successful link-training to avoid the
synchronous abort that happens when scanning the pci bus and attempting
read the pci config space for some cards.

I performed a test where I added the 100ms delay before link-training
and it did not correct the synchronous abort on the HighPoint Rocket
Raid 640L.

Shawn Lin suggested I check if the power supply is stable before
enabling training. How would I go about doing this? 

Regards,
-Kurt

> 
> Thanks,
> 
> - Kever
> 
> > 
> > > 
> > > > 
> > > > 
> > > > If you need more delay, I  highly suspect you should check if
> > > > the power supply is stable before enabling training.
> > > Hi Shawn,
> > > 
> > > Thank you for pointing out the need for the 100ms delay before
> > > beginning link-training. I believe this is to correct link
> > > training failures, while the delay after link training is
> > > to work-around post link training reading of PCI config
> > > space on certain cards.
> > There are similar issues on the Linux driver with various cards
> > randomly throwing an abort.
> > If it is power, a 2 second wait to allow cards to stabilize after
> > power on might be wise for the Linux driver as well.
> > 
> > I imagine some cards take longer to complete power on reset than
> > others, and attempting to read them immediately after powering them up
> > could be the issue here.
> > 
> > > 
> > > In my testing I discussed above, I suspected that power
> > > was a likely cause. The HighPoint Rocket Raid 640L is
> > > a good test card because it documents its 3.3v power at
> > > 0.7 watts:
> > > 
> > > https://highpoint-tech.com/USA_new/series_r600-specifications.htm
> > > 
> > > How can I check if the power supply is stable?
> > > 
> > > Regards,
> > > -Kurt
> > > 
> > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > Thanks,
> > > > > 
> > > > > - Kever
> > > > > 
> > > > > > 
> > > > > > +
> > > > > >        /* Initialize Root Complex registers. */
> > > > > >        writel(PCIE_LM_VENDOR_ROCKCHIP, priv->apb_base +
> > > > > > PCIE_LM_VENDOR_ID);
> > > > > >        writel(PCI_CLASS_BRIDGE_PCI << 16,
> 



More information about the U-Boot mailing list