[RFC PATCH] nvme: Always invalidate whole cqes[] array
Neil Armstrong
narmstrong at baylibre.com
Fri Feb 26 17:12:37 CET 2021
On 08/02/2021 14:31, Andre Przywara wrote:
> At the moment nvme_read_completion_status() tries to invidate a single
> member of the cqes[] array, which is shady as just a single entry is
> not cache line aligned.
> The structure is dictated by hardware, and with 16 bytes is smaller than
> any cache line we usually deal with. Also multiple entries need to be
> consecutive in memory, so we can't pad them to cover a whole cache line.
>
> As a consequence we can only always invalidate all of them - U-Boot just
> uses two of them anyway. This is fine, as they are only ever read by the
> CPU (apart from the initial zeroing), so they can't become dirty.
>
> Make this obvious by always invalidating the whole array, regardless of
> the entry number we are about to read.
> Also blow up the allocation size to cover whole cache lines, to avoid
> other heap allocations to sneak in.
>
> Signed-off-by: Andre Przywara <andre.przywara at arm.com>
> ---
> Hi,
>
> this is just compile tested, and should fix the only questionable
> cache invalidate call in this driver.
> Please verify if this fixes any issues!
>
> Cheers,
> Andre
>
> drivers/nvme/nvme.c | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/nvme/nvme.c b/drivers/nvme/nvme.c
> index 5d6331ad346..c9efeff4bc9 100644
> --- a/drivers/nvme/nvme.c
> +++ b/drivers/nvme/nvme.c
> @@ -22,6 +22,8 @@
> #define NVME_AQ_DEPTH 2
> #define NVME_SQ_SIZE(depth) (depth * sizeof(struct nvme_command))
> #define NVME_CQ_SIZE(depth) (depth * sizeof(struct nvme_completion))
> +#define NVME_CQ_ALLOCATION ALIGN(NVME_CQ_SIZE(NVME_Q_DEPTH), \
> + ARCH_DMA_MINALIGN)
> #define ADMIN_TIMEOUT 60
> #define IO_TIMEOUT 30
> #define MAX_PRP_POOL 512
> @@ -144,8 +146,14 @@ static __le16 nvme_get_cmd_id(void)
>
> static u16 nvme_read_completion_status(struct nvme_queue *nvmeq, u16 index)
> {
> - u64 start = (ulong)&nvmeq->cqes[index];
> - u64 stop = start + sizeof(struct nvme_completion);
> + /*
> + * Single CQ entries are always smaller than a cache line, so we
> + * can't invalidate them individually. However CQ entries are
> + * read only by the CPU, so it's safe to always invalidate all of them,
> + * as the cache line should never become dirty.
> + */
> + ulong start = (ulong)&nvmeq->cqes[0];
> + ulong stop = start + NVME_CQ_ALLOCATION;
>
> invalidate_dcache_range(start, stop);
>
> @@ -241,7 +249,7 @@ static struct nvme_queue *nvme_alloc_queue(struct nvme_dev *dev,
> return NULL;
> memset(nvmeq, 0, sizeof(*nvmeq));
>
> - nvmeq->cqes = (void *)memalign(4096, NVME_CQ_SIZE(depth));
> + nvmeq->cqes = (void *)memalign(4096, NVME_CQ_ALLOCATION);
> if (!nvmeq->cqes)
> goto free_nvmeq;
> memset((void *)nvmeq->cqes, 0, NVME_CQ_SIZE(depth));
> @@ -339,7 +347,7 @@ static void nvme_init_queue(struct nvme_queue *nvmeq, u16 qid)
> nvmeq->q_db = &dev->dbs[qid * 2 * dev->db_stride];
> memset((void *)nvmeq->cqes, 0, NVME_CQ_SIZE(nvmeq->q_depth));
> flush_dcache_range((ulong)nvmeq->cqes,
> - (ulong)nvmeq->cqes + NVME_CQ_SIZE(nvmeq->q_depth));
> + (ulong)nvmeq->cqes + NVME_CQ_ALLOCATION);
> dev->online_queues++;
> }
>
>
On Amlogic A311D, fixes timeout on nvme scan:
Tested-by: Neil Armstrong <narmstrong at baylibre.com>
More information about the U-Boot
mailing list