[PATCH] mtd: spi-nor: fix dummy buswidth to use data bus width

Begari, Padmarao Padmarao.Begari at amd.com
Fri Jun 19 16:03:18 CEST 2026


Hi Takahiro,

We observed a data integrity issue with Quad SPI flashes on our AMD (Xilinx) FPGA
platforms during testing with mainline U-Boot, and this patch was submitted to
address it. However, the patch is not required; my detailed response is below.

> From: Takahiro.Kuwano at infineon.com <Takahiro.Kuwano at infineon.com>
> Sent: Friday, June 12, 2026 3:04 PM
> To: Begari, Padmarao <Padmarao.Begari at amd.com>; u-boot at lists.denx.de;
> Simek, Michal <michal.simek at amd.com>
> Cc: git (AMD-Xilinx) <git at amd.com>; vigneshr at ti.com; trini at konsulko.com
> Subject: RE: [PATCH] mtd: spi-nor: fix dummy buswidth to use data bus width
>
> Hi,
>
> > In the current implementation, the dummy buswidth is set equal to the
> > address buswidth. In case of quad SPI (mode 1-1-4), where the address
> > width is 1, the dummy buswidth is also set to 1. Due to this, the
> > controller driver introduces 8 dummy cycles on the data line (D0) only
> > during a read operation.
>
> Doesn't the controller driver introduce 8 dummy 'clock' cycles that makes all data
> lines (D3..D0) idle (typically High-Z) before reading?

You are correct; the commit message description is inaccurate. The flash
device drives all data lines (D3..D0) to High-Z during the dummy cycles.

The actual issue lies on the controller side. When the dummy GENFIFO entry
is configured in SPI mode (buswidth = 1), the Xilinx ZynqMP GQSPI controller
actively drives the IO lines with fixed values during the dummy phase
(IO0 = 0, IO2 = 1, IO3 = 1).

When the data phase begins in QUAD mode, the controller must switch IO0, IO2,
and IO3 from output (controller-driven) to input (flash-driven). This direction
transition incurs a one-clock pipeline delay in the hardware. As a result,
during the first data clock, the stale controller-driven values (IO0 = 0,
IO2 = 1, IO3 = 1) are sampled instead of the actual flash data, corrupting
the first received byte.
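
To make the effect concrete, here is a minimal sketch of which bits of the
first byte would be affected. It is a model only, not driver code. It assumes
the usual quad output ordering (high nibble on the first clock, IO3 as the most
significant bit) and assumes IO1 already carries flash data, since only IO0,
IO2 and IO3 are driven by the controller. The function name is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical model of the first byte seen by the controller when the
 * first data clock samples the stale values IO0 = 0, IO2 = 1, IO3 = 1.
 *
 * Assumptions (not taken from the hardware documentation):
 *  - the first quad clock carries bits 7..4 on IO3..IO0
 *  - IO1 (bit 5) is already driven by the flash
 *  - the second clock (bits 3..0) is sampled correctly
 */
static uint8_t sampled_first_byte(uint8_t flash_byte)
{
	uint8_t stale_hi = 0xC0;               /* IO3 = 1, IO2 = 1, IO0 = 0 */
	uint8_t io1_bit  = flash_byte & 0x20;  /* IO1 comes from the flash  */
	uint8_t low_nib  = flash_byte & 0x0F;  /* second clock is correct   */

	return (uint8_t)(stale_hi | io1_bit | low_nib);
}
```

Under these assumptions a flash byte of 0x00 would be read as 0xC0, while a
byte such as 0xC5 would happen to survive unchanged, which may explain why the
corruption depends on the data.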

When the dummy bus width is set equal to the data bus width (i.e., 4), the
GQSPI controller uses QUAD mode for the dummy GENFIFO entry. In this mode,
all four IO lines are released to High-Z during the dummy phase. This allows
the direction switch to complete before the first data clock, ensuring correct
data capture.

However, instead of changing the dummy bus width to match the data bus width,
the proper fix is to address the one-clock delay in the ZynqMP GQSPI controller
driver.

Currently, in the zynqmp_gqspi driver, the function zynqmp_qspi_fill_gen_fifo()
starts execution and waits for completion after each GENFIFO entry. This causes
the GENFIFO to become empty after the dummy entry. When the GENFIFO is empty,
the hardware returns to an idle state with SPI-mode IO pin values (IO0 = 0,
IO2 = 1, IO3 = 1). When the data entry is processed next, the first data clock
samples these stale values instead of the flash data.

Fix: Use zynqmp_qspi_write_gen_fifo() (write-only, without starting execution)
for the command, address, and dummy entries so that they are queued together in
the GENFIFO. The GENFIFO execution should be started only once when the data
entry is added. This ensures that all entries execute back-to-back without
interruption. As a result, the hardware pipeline prefetches the data entry
during the final dummy clock, and the IO direction switch completes cleanly
before the first data clock.
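
The difference between the two sequences can be sketched with a toy model. This
is not the zynqmp_gqspi driver code; every name below is hypothetical, and the
only behaviour modelled is "the FIFO drains, so the controller goes idle before
the data entry starts".

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_phase { TOY_CMD, TOY_ADDR, TOY_DUMMY, TOY_DATA };

struct toy_genfifo {
	enum toy_phase entries[4];
	size_t count;       /* entries queued but not yet executed */
	bool idle;          /* FIFO drained since the last executed entry */
	bool stale_sample;  /* data entry started from the idle state */
};

static void toy_init(struct toy_genfifo *f)
{
	f->count = 0;
	f->idle = true;
	f->stale_sample = false;
}

/* Queue an entry without starting execution. */
static void toy_write(struct toy_genfifo *f, enum toy_phase p)
{
	if (f->count < 4)
		f->entries[f->count++] = p;
}

/* Execute everything queued back-to-back, then fall back to idle. */
static void toy_start_and_wait(struct toy_genfifo *f)
{
	for (size_t i = 0; i < f->count; i++) {
		if (f->entries[i] == TOY_DATA && f->idle)
			f->stale_sample = true;
		f->idle = false;
	}
	f->count = 0;
	f->idle = true;
}

/* Current behaviour: start and wait after every single entry. */
static bool toy_read_per_entry(void)
{
	static const enum toy_phase seq[] = {
		TOY_CMD, TOY_ADDR, TOY_DUMMY, TOY_DATA
	};
	struct toy_genfifo f;

	toy_init(&f);
	for (size_t i = 0; i < 4; i++) {
		toy_write(&f, seq[i]);
		toy_start_and_wait(&f);
	}
	return f.stale_sample;
}

/* Proposed behaviour: queue all entries, start once with the data entry. */
static bool toy_read_queued(void)
{
	struct toy_genfifo f;

	toy_init(&f);
	toy_write(&f, TOY_CMD);
	toy_write(&f, TOY_ADDR);
	toy_write(&f, TOY_DUMMY);
	toy_write(&f, TOY_DATA);
	toy_start_and_wait(&f);
	return f.stale_sample;
}
```

In this model toy_read_per_entry() reports a stale sample because the FIFO is
empty when the data entry arrives, while toy_read_queued() does not.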

I have validated this approach and will send a patch for the ZynqMP GQSPI driver.
This patch can be ignored.

>
> >
> > Since 4 data lines are used in quad SPI mode, the dummy buswidth
> > should be set to 4. The controller driver computes the number of dummy
> > clock cycles as:
> >
> >   dummy_cycles = op->dummy.nbytes * 8 / op->dummy.buswidth;
> >
> > With dummy.buswidth corrected to 4, the controller produces 2 clock
> > cycles of dummy spread across all 4 data lines (D0-D3), which is
> > equivalent to 8 dummy bit-times, the same as before. This fix applies
> > to all bus width configurations (single, dual, quad and octal).
>
> MTD driver converts dummy cycles to dummy bytes by:
>
>     op.dummy.nbytes = (nor->read_dummy * op.dummy.buswidth) / 8;
>
> So, if you change the dummy.buswidth, the controller driver will get same number of
> dummy_cycles.

Yes, you're correct.
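
The round trip can be checked with a small sketch that combines the two
formulas quoted above. The helper names are illustrative and do not exist in
the U-Boot sources.

```c
#include <assert.h>

/* Dummy clock cycles -> dummy bytes, as in the MTD conversion quoted above. */
static unsigned int dummy_cycles_to_nbytes(unsigned int read_dummy,
					   unsigned int buswidth)
{
	return (read_dummy * buswidth) / 8;
}

/* Dummy bytes -> dummy clock cycles, as in the controller-side formula. */
static unsigned int dummy_nbytes_to_cycles(unsigned int nbytes,
					   unsigned int buswidth)
{
	assert(buswidth != 0);
	return (nbytes * 8) / buswidth;
}

/* Cycles seen by the controller for a given flash dummy-cycle count. */
static unsigned int dummy_round_trip(unsigned int read_dummy,
				     unsigned int buswidth)
{
	unsigned int nbytes = dummy_cycles_to_nbytes(read_dummy, buswidth);

	return dummy_nbytes_to_cycles(nbytes, buswidth);
}
```

For read_dummy = 8, a buswidth of 1 gives 1 dummy byte and a buswidth of 4
gives 4 dummy bytes, and both come back as 8 clock cycles, so changing
dummy.buswidth alone does not change the number of dummy clocks.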

Regards
Padmarao

>
> >
> > Signed-off-by: Padmarao Begari <padmarao.begari at amd.com>
> > ---
> >  drivers/mtd/spi/spi-nor-core.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/mtd/spi/spi-nor-core.c
> > b/drivers/mtd/spi/spi-nor-core.c index 937d79af64e..19446265cd5 100644
> > --- a/drivers/mtd/spi/spi-nor-core.c
> > +++ b/drivers/mtd/spi/spi-nor-core.c
> > @@ -264,7 +264,7 @@ void spi_nor_setup_op(const struct spi_nor *nor,
> >                 op->addr.buswidth =
> > spi_nor_get_protocol_addr_nbits(proto);
> >
> >         if (op->dummy.nbytes)
> > -               op->dummy.buswidth = spi_nor_get_protocol_addr_nbits(proto);
> > +               op->dummy.buswidth =
> > + spi_nor_get_protocol_data_nbits(proto);
> >
> >         if (op->data.nbytes)
> >                 op->data.buswidth =
> > spi_nor_get_protocol_data_nbits(proto);
> > --
> > 2.34.1
>
> Thanks,
> Takahiro


