[U-Boot] [PATCH] nvme: add more cache flushes
Bin Meng
bmeng.cn at gmail.com
Thu Oct 17 02:55:11 UTC 2019
Hi Patrick,
On Wed, Oct 16, 2019 at 11:35 PM Patrick Wildt <patrick at blueri.se> wrote:
>
> On Wed, Oct 16, 2019 at 06:11:23PM +0800, Bin Meng wrote:
> > On Mon, Oct 14, 2019 at 7:11 PM Patrick Wildt <patrick at blueri.se> wrote:
> > >
> > > On an i.MX8MQ our nvme driver doesn't work completely, since we
> > > are missing a few cache flushes. One is for the prp list, which is
> > > an extra buffer that we need to flush before handing it to the
> > > hardware. The block read/write operations also need more cache
> > > flushes on this SoC.
> > >
> > > Signed-off-by: Patrick Wildt <patrick at blueri.se>
> > > ---
> > > drivers/nvme/nvme.c | 15 +++++++++------
> > > 1 file changed, 9 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/nvme/nvme.c b/drivers/nvme/nvme.c
> > > index 2444e0270f..69d5e3eedc 100644
> > > --- a/drivers/nvme/nvme.c
> > > +++ b/drivers/nvme/nvme.c
> > > @@ -123,6 +123,9 @@ static int nvme_setup_prps(struct nvme_dev *dev, u64 *prp2,
> > > }
> > > *prp2 = (ulong)dev->prp_pool;
> > >
> > > + flush_dcache_range((ulong)dev->prp_pool, (ulong)dev->prp_pool +
> > > + dev->prp_entry_num * sizeof(u64));
> > > +
> > > return 0;
> > > }
> > >
> > > @@ -717,9 +720,10 @@ static ulong nvme_blk_rw(struct udevice *udev, lbaint_t blknr,
> > > u16 lbas = 1 << (dev->max_transfer_shift - ns->lba_shift);
> > > u64 total_lbas = blkcnt;
> > >
> > > - if (!read)
> > > - flush_dcache_range((unsigned long)buffer,
> > > - (unsigned long)buffer + total_len);
> > > + flush_dcache_range((unsigned long)buffer,
> > > + (unsigned long)buffer + total_len);
> >
> > Why do we need this for read?
>
> I'm no processor engineer, but I have read (and "seen") the following
> on ARMs. If I'm wrong, please correct me.
>
> Since the buffer is usually allocated cache-aligned on the stack,
> it is very possible that this buffer was previously used as stack
> memory. Thus, the corresponding cache lines can still hold dirty
> data that has not yet been evicted/flushed to memory. It is then
> possible that, between the device's DMA access and our cache
> invalidation, the CPU writes those dirty lines back to memory,
> overwriting whatever the NVMe just gave us.
OK, this makes sense. So if we allocate the buffer from the heap, we
would only need to flush on write, right? Can you test this?
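
To make the hazard concrete, here is a minimal sketch of the sequence
being described, in U-Boot style C. dma_read_into() is a hypothetical
stand-in for "submit the NVMe read and wait for completion", and the
exact headers vary between U-Boot versions:

#include <cpu_func.h>		/* flush_dcache_range(), invalidate_dcache_range() */
#include <asm/cache.h>		/* ARCH_DMA_MINALIGN */
#include <linux/errno.h>

/* Hypothetical helper: submit a device-to-memory DMA and wait for it. */
int dma_read_into(void *buf, unsigned long len);

int read_into_stack_buffer(void)
{
	char buf[512] __aligned(ARCH_DMA_MINALIGN);
	unsigned long start = (unsigned long)buf;
	unsigned long stop = start + sizeof(buf);

	/*
	 * The cache lines covering buf may still be dirty from earlier
	 * stack use. Without cleaning them first, the CPU could evict
	 * them to DRAM *after* the device has DMA'd its data there,
	 * corrupting the result.
	 */
	flush_dcache_range(start, stop);

	if (dma_read_into(buf, sizeof(buf)))
		return -EIO;

	/*
	 * Drop any lines the CPU (speculatively) fetched while the DMA
	 * was in flight, so subsequent loads see the device's data and
	 * not stale cache contents.
	 */
	invalidate_dcache_range(start, stop);

	return 0;
}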
>
> > > + invalidate_dcache_range((unsigned long)buffer,
> > > + (unsigned long)buffer + total_len);
> > >
> > > c.rw.opcode = read ? nvme_cmd_read : nvme_cmd_write;
> > > c.rw.flags = 0;
> > > @@ -755,9 +759,8 @@ static ulong nvme_blk_rw(struct udevice *udev, lbaint_t blknr,
> > > buffer += lbas << ns->lba_shift;
> > > }
> > >
> > > - if (read)
> > > - invalidate_dcache_range((unsigned long)buffer,
> > > - (unsigned long)buffer + total_len);
> > > + invalidate_dcache_range((unsigned long)buffer,
> > > + (unsigned long)buffer + total_len);
> >
> > Why do we need this for write?
>
> That's a good point. After the transaction, if it was a read, then
> we need to invalidate the buffer, as the CPU might have speculatively
> read it back into the cache. On a write, we don't care about the
> buffer's contents afterwards. I will test it without this chunk and
> report back.
>
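For reference, the per-direction asymmetry discussed in this thread can
be sketched as follows (simplified; the real logic lives in
nvme_blk_rw() in drivers/nvme/nvme.c, and the command submission is
only indicated by a comment):

#include <cpu_func.h>	/* flush_dcache_range(), invalidate_dcache_range() */
#include <stdbool.h>

static void rw_cache_maintenance_sketch(void *buf, unsigned long len,
					bool read)
{
	unsigned long start = (unsigned long)buf;
	unsigned long stop = start + len;

	/*
	 * Before the transfer, both directions: push dirty lines out.
	 * For a write this publishes the data the device will fetch;
	 * for a read it prevents later evictions from overwriting the
	 * DMA'd data.
	 */
	flush_dcache_range(start, stop);

	/* ... submit the NVMe read/write command and wait for it ... */

	/*
	 * After the transfer, reads only: discard lines the CPU may
	 * have speculatively refetched in the meantime. A write leaves
	 * the buffer unchanged, so nothing is needed here.
	 */
	if (read)
		invalidate_dcache_range(start, stop);
}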
Regards,
Bin