[v4 03/12] spi: aspeed: Add ASPEED SPI controller driver

Cédric Le Goater clg at kaod.org
Mon Jul 4 17:24:38 CEST 2022


Hello Chin-Ting,

On 7/3/22 10:47, Chin-Ting Kuo wrote:
> Hi Cédric,
> 
> Thanks for the review.
> 
>> -----Original Message-----
>> From: Cédric Le Goater <clg at kaod.org>
>> Sent: Friday, July 1, 2022 5:28 PM
>> To: Chin-Ting Kuo <chin-ting_kuo at aspeedtech.com>; ChiaWei Wang
>> <chiawei_wang at aspeedtech.com>; lukma at denx.de; seanga2 at gmail.com;
>> Ryan Chen <ryan_chen at aspeedtech.com>; BMC-SW
>> <BMC-SW at aspeedtech.com>; jagan at amarulasolutions.com; vigneshr at ti.com;
>> u-boot at lists.denx.de; p.yadav at ti.com
>> Subject: Re: [v4 03/12] spi: aspeed: Add ASPEED SPI controller driver
>>
>> Hello Chin-Ting,
>>
>> On 5/24/22 07:56, Chin-Ting Kuo wrote:
>>> Add ASPEED BMC FMC/SPI memory controller driver with spi-mem interface
>>> for AST2500 and AST2600 platform.
>>>
>>> There are three SPI memory controllers embedded in an ASPEED SoC.
>>> - FMC: Named as Firmware Memory Controller. After AC on, MCU ROM
>>>          fetches initial device boot image from FMC chip select(CS) 0.
>>>
>>> - SPI1: Play the role of a SPI Master controller. Or, there is a
>>>           dedicated path for HOST(X86) to access its BIOS flash mounted
>>>           under BMC. spi-aspeed.c implements the control sequence when
>>>           SPI1 is a SPI master.
>>>
>>> - SPI2: It is a pure SPI flash controller. For most scenarios, flashes
>>>           mounted under it are for pure storage purpose.
>>>
>>> ASPEED SPI controller supports 1-1-1, 1-1-2 and 1-1-4 SPI flash mode.
>>> Three types of command mode are supported, normal mode, command
>>> read/write mode and user mode.
>>> - Normal mode: Default mode. After power on, normal read command 03h
>> or
>>>                  13h is used to fetch boot image from SPI flash.
>>>                  - AST2500: Only 03h command can be used after power
>> on
>>>                             or reset.
>>>                  - AST2600: If FMC04[6:4] is set, 13h command is used,
>>>                             otherwise, 03h command.
>>>                  The address length is decided by FMC04[2:0].
>>>
>>> - Command mode: SPI controller can send command and address
>>>                   automatically when CPU read/write the related
>> remapped
>>>                   or decoded address area. The command used by this
>> mode
>>>                   can be configured by FMC10/14/18[23:16]. Also, the
>>>                   address length is decided by FMC04[2:0]. This mode
>> will
>>>                   be implemented in the following patch series.
>>>
>>> - User mode: It is a traditional and pure SPI operation, where
>>>                SPI transmission is controlled by CPU. It is the main
>>>                mode in this patch.
>>>
>>> Each SPI controller in ASPEED SoC has its own decoded address mapping.
>>> Within each SPI controller decoded address, driver can assign a
>>> specific address region for each CS of a SPI controller. The decoded
>>> address cannot overlap to each other. With normal mode and command
>>> mode, the decoded address accessed by the CPU determines which CS is
>> active.
>>> When user mode is adopted, the CS decoded address is a FIFO, CPU can
>>> send/receive any SPI transmission by accessing the related decoded
>>> address for the target CS.
>>>
>>> Signed-off-by: Chin-Ting Kuo <chin-ting_kuo at aspeedtech.com>
>>
>> I would split the patch furthermore to ease reading.
> 
> Okay, this will be update in the next version.
> 
>>    1 - Add basic support
>>
>>        with default decoding ranges set for all possible CS, even
>>        without a device.
>>
>>        WE only have USER mode for now. So it's not important to
>>        correctly set the ranges since we won't use them before
>>        direct mapping is introduced. They should not overlap,
>>        that's all.
>>
>>    2 - decoding range adjustments
>>
>>        On that topic, we might want to take the simple DT approach
>>        with a "ranges" property defining the mapping windows of each
>>        CE. I think it is safer than trying to compute perfect ranges
>>        like on Linux.
>>
>>    3 - clock settings
>>
>>        That should simply be the property defined in the DT
>>
>>
>>> ---
>>> v2: Remove defconfig files from this patch.
>>>
>>>    drivers/spi/Kconfig      |   8 +
>>>    drivers/spi/Makefile     |   1 +
>>>    drivers/spi/spi-aspeed.c | 822
>> +++++++++++++++++++++++++++++++++++++++
>>>    3 files changed, 831 insertions(+)
>>>    create mode 100644 drivers/spi/spi-aspeed.c
>>>
>>> diff --git a/drivers/spi/Kconfig b/drivers/spi/Kconfig index
>>> a1e515cb2b..a616294910 100644
>>> --- a/drivers/spi/Kconfig
>>> +++ b/drivers/spi/Kconfig
>>> @@ -387,6 +387,14 @@ config SANDBOX_SPI
>>>    		};
>>>    	  };
>>>
>>> +config SPI_ASPEED
>>> +	bool "ASPEED SPI controller driver"
>>> +	depends on DM_SPI && SPI_MEM
>>> +	default n
>>> +	help
>>> +	  Enable ASPEED SPI controller driver for AST2500
>>> +	  and AST2600 SoCs.
>>> +
>>>    config SPI_SIFIVE
>>>    	bool "SiFive SPI driver"
>>>    	help
>>> diff --git a/drivers/spi/Makefile b/drivers/spi/Makefile index
>>> 06e81b465b..36a4bd5dce 100644
>>> --- a/drivers/spi/Makefile
>>> +++ b/drivers/spi/Makefile
>>> @@ -9,6 +9,7 @@ obj-y += spi-uclass.o
>>>    obj-$(CONFIG_CADENCE_QSPI) += cadence_qspi.o cadence_qspi_apb.o
>>>    obj-$(CONFIG_SANDBOX) += spi-emul-uclass.o
>>>    obj-$(CONFIG_SOFT_SPI) += soft_spi.o
>>> +obj-$(CONFIG_SPI_ASPEED) += spi-aspeed.o
>>>    obj-$(CONFIG_SPI_MEM) += spi-mem.o
>>>    obj-$(CONFIG_TI_QSPI) += ti_qspi.o
>>>    obj-$(CONFIG_FSL_QSPI) += fsl_qspi.o diff --git
>>> a/drivers/spi/spi-aspeed.c b/drivers/spi/spi-aspeed.c new file mode
>>> 100644 index 0000000000..9574aff793
>>> --- /dev/null
>>> +++ b/drivers/spi/spi-aspeed.c
>>> @@ -0,0 +1,822 @@
>>> +// SPDX-License-Identifier: GPL-2.0+
>>> +/*
>>> + * ASPEED FMC/SPI Controller driver
>>> + *
>>> + * Copyright (c) 2022 ASPEED Corporation.
>>> + * Copyright (c) 2022 IBM Corporation.
>>> + *
>>> + * Author:
>>> + *     Chin-Ting Kuo <chin-ting_kuo at aspeedtech.com>
>>> + *     Cedric Le Goater <clg at kaod.org>
>>> + */
>>> +
>>> +#include <asm/io.h>
>>> +#include <clk.h>
>>> +#include <common.h>
>>> +#include <dm.h>
>>> +#include <dm/device_compat.h>
>>> +#include <linux/bitops.h>
>>> +#include <linux/bug.h>
>>> +#include <linux/err.h>
>>> +#include <linux/iopoll.h>
>>> +#include <linux/kernel.h>
>>> +#include <linux/mtd/spi-nor.h>
>>> +#include <linux/sizes.h>
>>> +#include <malloc.h>
>>> +#include <spi.h>
>>> +#include <spi-mem.h>
>>> +
>>> +/* ASPEED FMC/SPI memory control register related */
>>> +#define REG_CE_TYPE_SETTING          0x00
>>> +#define REG_CE_ADDR_MODE_CTRL        0x04
>>> +#define REG_INTR_CTRL_STATUS         0x08
>>> +#define REG_CE0_CTRL_REG             0x10
>>> +#define REG_CE0_DECODED_ADDR_REG     0x30
>>> +
>>> +#define ASPEED_SPI_MAX_CS       3
>>> +#define FLASH_CALIBRATION_LEN   0x400
>>> +
>>> +#define CTRL_IO_SINGLE_DATA     0
>>> +#define CTRL_IO_QUAD_DATA       BIT(30)
>>> +#define CTRL_IO_DUAL_DATA       BIT(29)
>>> +
>>> +#define CTRL_IO_MODE_USER       GENMASK(1, 0)
>>> +#define CTRL_IO_MODE_CMD_READ   BIT(0)
>>> +#define CTRL_IO_MODE_CMD_WRITE  BIT(1)
>>> +#define CTRL_STOP_ACTIVE        BIT(2)
>>> +
>>> +struct aspeed_spi_plat {
>>> +	fdt_addr_t ctrl_base;
>>
>> are these the registers ?
> 
> No, "struct aspeed_spi_plat" is used to record some basic information of this platform.
> 
>>> +	void __iomem *ahb_base; /* AHB address base for all flash devices. */
>>> +	fdt_size_t ahb_sz; /* Overall AHB window size for all flash device. */
>>> +	u32 hclk_rate; /* AHB clock rate */
>>> +	u8 max_cs;
>>
>>
>> I don't think we need a "max_cs" in the controller struct and a "num-cs"
>> property in the DT. We could simply use a HW maxmimum defined in
>> aspeed_spi_info.
>>
> 
> "num-cs" is used to detect the number of active flash node.
> This property is mainly used to maintain the address decoded range.

I am not sure we need "num-cs" since we should configure all decoding
range registers to avoid overlaps. At least on some of the aST2500
controllers.

Does the AST2600 FMC and SPI1/2 allow to configure a single CE to cover
the whole mapping window of the controller while the other CE windows
are closed ?
  
> "max-cs" is used for controlling register access.
> We need to know the maximum CS number supported by the current controller.

Yes. That's fine. It is an HW limit per controller and we need to keep
this property in the DT or in the driver.

I have been defining this HW property in the driver. That might have
been a wrong choice. Something to fix in Linux may be.

>>
>>> +};
>>> +
>>> +struct aspeed_spi_flash {
>>> +	u8 cs;
>>> +	void __iomem *ahb_base;
>>> +	u32 ahb_win_sz;
>>> +	u32 ce_ctrl_user;
>>> +	u32 ce_ctrl_read;
>>> +	u32 max_freq;
>>> +	bool trimmed_decoded_sz;
>>
>> I wonder what this is for. We need to split the patches :)
> 
> Oh, it is the redundant one and it will be removed in the next patch version.
> 
>>
>>> +};
>>> +
>>> +struct aspeed_spi_priv {
>>> +	u32 num_cs;
>>
>> See above.
>>
>>> +	struct aspeed_spi_info *info;
>>> +	struct aspeed_spi_flash flashes[ASPEED_SPI_MAX_CS];
>>> +	u32 decoded_sz_arr[ASPEED_SPI_MAX_CS];
>>> +};
>>
>>
>> Couldn't we have a 'struct aspeed_spi_regs' defining the layout of the
>> registers ?
>>
> 
> Why? The register offset has been defined by macro previously.

Not always. Look at the *_regs struct in the different drivers and :

https://github.com/legoater/u-boot/blob/aspeed-v2022.04/arch/arm/include/asm/arch-aspeed/spi.h#L72

I am not sure what is the preferred practice in U-Boot. But a few
Aspeed drivers have been folowing this scheme.

> 
>>> +struct aspeed_spi_info {
>>> +	u32 cmd_io_ctrl_mask;
>>> +	u32 clk_ctrl_mask;
>>> +	u32 max_data_bus_width;
>>> +	u32 min_decoded_sz;
>>> +	void (*set_4byte)(struct udevice *bus, u32 cs);
>>> +	u32 (*segment_start)(struct udevice *bus, u32 reg);
>>> +	u32 (*segment_end)(struct udevice *bus, u32 reg);
>>> +	u32 (*segment_reg)(u32 start, u32 end);
>>> +	int (*adjust_decoded_sz)(struct udevice *bus, u32 decoded_sz_arr[]);
>>> +	u32 (*get_clk_setting)(struct udevice *dev, uint hz); };
>>> +
>>> +static int aspeed_spi_trim_decoded_size(struct udevice *bus,
>>> +					u32 decoded_sz_arr[]);
>>> +
>>> +static u32 aspeed_spi_get_io_mode(u32 bus_width) {
>>> +	switch (bus_width) {
>>> +	case 1:
>>> +		return CTRL_IO_SINGLE_DATA;
>>> +	case 2:
>>> +		return CTRL_IO_DUAL_DATA;
>>> +	case 4:
>>> +		return CTRL_IO_QUAD_DATA;
>>> +	default:
>>> +		return CTRL_IO_SINGLE_DATA;
>>> +	}
>>> +}
>>> +
>>> +static u32 ast2500_spi_segment_start(struct udevice *bus, u32 reg) {
>>> +	struct aspeed_spi_plat *plat = dev_get_plat(bus);
>>> +	u32 start_offset = ((reg >> 16) & 0xff) << 23;
>>> +
>>> +	if (start_offset == 0)
>>> +		return (u32)plat->ahb_base;
>>> +
>>> +	return (u32)plat->ahb_base + start_offset; }
>>> +
>>> +static u32 ast2500_spi_segment_end(struct udevice *bus, u32 reg) {
>>> +	struct aspeed_spi_plat *plat = dev_get_plat(bus);
>>> +	u32 end_offset = ((reg >> 24) & 0xff) << 23;
>>> +
>>> +	/* Meaningless end_offset, set to physical ahb base. */
>>> +	if (end_offset == 0)
>>> +		return (u32)plat->ahb_base;
>>> +
>>> +	return (u32)plat->ahb_base + end_offset + 0x100000; }
>>> +
>>> +static u32 ast2500_spi_segment_reg(u32 start, u32 end) {
>>> +	if (start == end)
>>> +		return 0;
>>> +
>>> +	return ((((start) >> 23) & 0xff) << 16) | ((((end) >> 23) & 0xff) <<
>>> +24); }
>>> +
>>> +static void ast2500_spi_chip_set_4byte(struct udevice *bus, u32 cs) {
>>> +	struct aspeed_spi_plat *plat = dev_get_plat(bus);
>>> +	u32 reg_val;
>>> +
>>> +	reg_val = readl(plat->ctrl_base + REG_CE_ADDR_MODE_CTRL);
>>> +	reg_val |= 0x1 << cs;
>>> +	writel(reg_val, plat->ctrl_base + REG_CE_ADDR_MODE_CTRL); }
>>> +
>>> +/*
>>> + * For AST2500, the minimum address decoded size for each CS
>>> + * is 8MB instead of zero. This address decoded size is
>>> + * mandatory for each CS no matter whether it will be used.
>>> + * This is a HW limitation.
>>> + */
>>> +static int ast2500_adjust_decoded_size(struct udevice *bus,
>>> +				       u32 decoded_sz_arr[])
>>> +{
>>> +	struct aspeed_spi_plat *plat = dev_get_plat(bus);
>>> +	struct aspeed_spi_priv *priv = dev_get_priv(bus);
>>> +	int ret;
>>> +	int cs;
>>> +
>>> +	/* Assign min_decoded_sz to unused CS. */
>>> +	for (cs = priv->num_cs; cs < plat->max_cs; cs++) {
>>> +		if (decoded_sz_arr[cs] < priv->info->min_decoded_sz)
>>> +			decoded_sz_arr[cs] = priv->info->min_decoded_sz;
>>> +	}
>>> +
>>> +	ret = aspeed_spi_trim_decoded_size(bus, decoded_sz_arr);
>>> +	if (ret != 0)
>>> +		return ret;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +/* Transfer maximum clock frequency to register setting */ static u32
>>> +ast2500_get_clk_setting(struct udevice *dev, uint max_hz) {
>>> +	struct aspeed_spi_plat *plat = dev_get_plat(dev->parent);
>>> +	struct aspeed_spi_priv *priv = dev_get_priv(dev->parent);
>>> +	struct dm_spi_slave_plat *slave_plat = dev_get_parent_plat(dev);
>>> +	u32 hclk_clk = plat->hclk_rate;
>>> +	u32 hclk_div = 0x000; /* default value */
>>> +	u32 i;
>>> +	bool found = false;
>>> +	/* HCLK/1 ..	HCLK/16 */
>>> +	u32 hclk_masks[] = {15, 7, 14, 6, 13, 5, 12, 4,
>>> +				 11, 3, 10, 2, 9,  1, 8,  0};
>>> +
>>> +	/* FMC/SPIR10[11:8] */
>>> +	for (i = 0; i < ARRAY_SIZE(hclk_masks); i++) {
>>> +		if (hclk_clk / (i + 1) <= max_hz) {
>>> +			found = true;
>>> +			priv->flashes[slave_plat->cs].max_freq =
>>> +							hclk_clk / (i + 1);
>>> +			break;
>>> +		}
>>> +	}
>>> +
>>> +	if (found) {
>>> +		hclk_div = hclk_masks[i] << 8;
>>> +		goto end;
>>> +	}
>>> +
>>> +	for (i = 0; i < ARRAY_SIZE(hclk_masks); i++) {
>>> +		if (hclk_clk / ((i + 1) * 4) <= max_hz) {
>>> +			found = true;
>>> +			priv->flashes[slave_plat->cs].max_freq =
>>> +						hclk_clk / ((i + 1) * 4);
>>> +			break;
>>> +		}
>>> +	}
>>> +
>>> +	if (found)
>>> +		hclk_div = BIT(13) | (hclk_masks[i] << 8);
>>> +
>>> +end:
>>> +	dev_dbg(dev, "found: %s, hclk: %d, max_clk: %d\n", found ? "yes" : "no",
>>> +		hclk_clk, max_hz);
>>> +
>>> +	if (found) {
>>> +		dev_dbg(dev, "h_div: %d (mask %x), speed: %d\n",
>>> +			i + 1, hclk_masks[i], priv->flashes[slave_plat->cs].max_freq);
>>> +	}
>>> +
>>> +	return hclk_div;
>>> +}
>>> +
>>> +static u32 ast2600_spi_segment_start(struct udevice *bus, u32 reg) {
>>> +	struct aspeed_spi_plat *plat = dev_get_plat(bus);
>>> +	u32 start_offset = (reg << 16) & 0x0ff00000;
>>> +
>>> +	if (start_offset == 0)
>>> +		return (u32)plat->ahb_base;
>>> +
>>> +	return (u32)plat->ahb_base + start_offset; }
>>> +
>>> +static u32 ast2600_spi_segment_end(struct udevice *bus, u32 reg) {
>>> +	struct aspeed_spi_plat *plat = dev_get_plat(bus);
>>> +	u32 end_offset = reg & 0x0ff00000;
>>> +
>>> +	/* Meaningless end_offset, set to physical ahb base. */
>>> +	if (end_offset == 0)
>>> +		return (u32)plat->ahb_base;
>>> +
>>> +	return (u32)plat->ahb_base + end_offset + 0x100000; }
>>> +
>>> +static u32 ast2600_spi_segment_reg(u32 start, u32 end) {
>>> +	if (start == end)
>>> +		return 0;
>>> +
>>> +	return ((start & 0x0ff00000) >> 16) | ((end - 0x100000) &
>>> +0x0ff00000); }
>>> +
>>> +static void ast2600_spi_chip_set_4byte(struct udevice *bus, u32 cs) {
>>> +	struct aspeed_spi_plat *plat = dev_get_plat(bus);
>>> +	u32 reg_val;
>>> +
>>> +	reg_val = readl(plat->ctrl_base + REG_CE_ADDR_MODE_CTRL);
>>> +	reg_val |= 0x11 << cs;
>>> +	writel(reg_val, plat->ctrl_base + REG_CE_ADDR_MODE_CTRL); }
>>> +
>>> +static int ast2600_adjust_decoded_size(struct udevice *bus,
>>> +				       u32 decoded_sz_arr[])
>>> +{
>>> +	struct aspeed_spi_priv *priv = dev_get_priv(bus);
>>> +	int ret;
>>> +	int i;
>>> +	int cs;
>>> +	u32 pre_sz;
>>> +	u32 lack_sz;
>>> +
>>> +	/*
>>> +	 * If commnad mode or normal mode is used, the start address of a
>>> +	 * decoded range should be multiple of its related flash size.
>>> +	 * Namely, the total decoded size from flash 0 to flash N should
>>> +	 * be multiple of the size of flash (N + 1).
>>> +	 */
>>> +	for (cs = priv->num_cs - 1; cs >= 0; cs--) {
>>> +		pre_sz = 0;
>>> +		for (i = 0; i < cs; i++)
>>> +			pre_sz += decoded_sz_arr[i];
>>> +
>>> +		if (decoded_sz_arr[cs] != 0 && (pre_sz % decoded_sz_arr[cs]) != 0) {
>>> +			lack_sz = decoded_sz_arr[cs] - (pre_sz % decoded_sz_arr[cs]);
>>> +			decoded_sz_arr[0] += lack_sz;
>>> +		}
>>> +	}
>>> +
>>> +	ret = aspeed_spi_trim_decoded_size(bus, decoded_sz_arr);
>>> +	if (ret != 0)
>>> +		return ret;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +/* Transfer maximum clock frequency to register setting */ static u32
>>> +ast2600_get_clk_setting(struct udevice *dev, uint max_hz) {
>>> +	struct aspeed_spi_plat *plat = dev_get_plat(dev->parent);
>>> +	struct aspeed_spi_priv *priv = dev_get_priv(dev->parent);
>>> +	struct dm_spi_slave_plat *slave_plat = dev_get_parent_plat(dev);
>>> +	u32 hclk_clk = plat->hclk_rate;
>>> +	u32 hclk_div = 0x400; /* default value */
>>> +	u32 i, j;
>>> +	bool found = false;
>>> +	/* HCLK/1 ..	HCLK/16 */
>>> +	u32 hclk_masks[] = {15, 7, 14, 6, 13, 5, 12, 4,
>>> +				 11, 3, 10, 2, 9,  1, 8,  0};
>>> +
>>> +	/* FMC/SPIR10[27:24] */
>>> +	for (j = 0; j < 0xf; j++) {
>>> +		/* FMC/SPIR10[11:8] */
>>> +		for (i = 0; i < ARRAY_SIZE(hclk_masks); i++) {
>>> +			if (i == 0 && j == 0)
>>> +				continue;
>>> +
>>> +			if (hclk_clk / (i + 1 + (j * 16)) <= max_hz) {
>>> +				found = true;
>>> +				break;
>>> +			}
>>> +		}
>>> +
>>> +		if (found) {
>>> +			hclk_div = ((j << 24) | hclk_masks[i] << 8);
>>> +			priv->flashes[slave_plat->cs].max_freq =
>>> +						hclk_clk / (i + 1 + j * 16);
>>> +			break;
>>> +		}
>>> +	}
>>> +
>>> +	dev_dbg(dev, "found: %s, hclk: %d, max_clk: %d\n", found ? "yes" : "no",
>>> +		hclk_clk, max_hz);
>>> +
>>> +	if (found) {
>>> +		dev_dbg(dev, "base_clk: %d, h_div: %d (mask %x), speed: %d\n",
>>> +			j, i + 1, hclk_masks[i], priv->flashes[slave_plat->cs].max_freq);
>>> +	}
>>> +
>>> +	return hclk_div;
>>> +}
>>> +
>>> +/*
>>> + * As the flash size grows up, we need to trim some decoded
>>> + * size if needed for the sake of conforming the maximum
>>> + * decoded size. We trim the decoded size from the largest
>>> + * CS in order to avoid affecting the default boot up sequence
>>> + * from CS0 where command mode or normal mode is used.
>>> + * Notice, if a CS decoded size is trimmed, command mode may
>>> + * not work perfectly on that CS.
>>> + */
>>> +static int aspeed_spi_trim_decoded_size(struct udevice *bus,
>>> +					u32 decoded_sz_arr[])
>>> +{
>>> +	struct aspeed_spi_plat *plat = dev_get_plat(bus);
>>> +	struct aspeed_spi_priv *priv = dev_get_priv(bus);
>>> +	u32 total_sz;
>>> +	int cs = plat->max_cs - 1;
>>> +	u32 i;
>>> +
>>> +	do {
>>> +		total_sz = 0;
>>> +		for (i = 0; i < plat->max_cs; i++)
>>> +			total_sz += decoded_sz_arr[i];
>>> +
>>> +		if (decoded_sz_arr[cs] <= priv->info->min_decoded_sz)
>>> +			cs--;
>>> +
>>> +		if (cs < 0)
>>> +			return -ENOMEM;
>>> +
>>> +		if (total_sz > plat->ahb_sz) {
>>> +			decoded_sz_arr[cs] -= priv->info->min_decoded_sz;
>>> +			total_sz -= priv->info->min_decoded_sz;
>>> +			priv->flashes[cs].trimmed_decoded_sz = true;
>>> +		}
>>> +	} while (total_sz > plat->ahb_sz);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int aspeed_spi_read_from_ahb(void __iomem *ahb_base, void *buf,
>>> +				    size_t len)
>>> +{
>>> +	size_t offset = 0;
>>> +
>>> +	if (IS_ALIGNED((uintptr_t)ahb_base, sizeof(uintptr_t)) &&
>>> +	    IS_ALIGNED((uintptr_t)buf, sizeof(uintptr_t))) {
>>> +		readsl(ahb_base, buf, len >> 2);
>>> +		offset = len & ~0x3;
>>> +		len -= offset;
>>> +	}
>>> +
>>> +	readsb(ahb_base, (u8 *)buf + offset, len);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int aspeed_spi_write_to_ahb(void __iomem *ahb_base, const void
>> *buf,
>>> +				   size_t len)
>>> +{
>>> +	size_t offset = 0;
>>> +
>>> +	if (IS_ALIGNED((uintptr_t)ahb_base, sizeof(uintptr_t)) &&
>>> +	    IS_ALIGNED((uintptr_t)buf, sizeof(uintptr_t))) {
>>> +		writesl(ahb_base, buf, len >> 2);
>>> +		offset = len & ~0x3;
>>> +		len -= offset;
>>> +	}
>>> +
>>> +	writesb(ahb_base, (u8 *)buf + offset, len);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +/*
>>> + * Currently, only support 1-1-1, 1-1-2 or 1-1-4
>>> + * SPI NOR flash operation format.
>>> + */
>>> +static bool aspeed_spi_supports_op(struct spi_slave *slave,
>>> +				   const struct spi_mem_op *op)
>>> +{
>>> +	struct udevice *bus = slave->dev->parent;
>>> +	struct aspeed_spi_priv *priv = dev_get_priv(bus);
>>> +
>>> +	if (op->cmd.buswidth > 1)
>>> +		return false;
>>> +
>>> +	if (op->addr.buswidth > 1 || op->addr.nbytes > 4)
>>> +		return false;
>>> +
>>> +	if (op->dummy.buswidth > 1 || op->dummy.nbytes > 7)
>>> +		return false;
>>> +
>>> +	if (op->data.buswidth > priv->info->max_data_bus_width)
>>> +		return false;
>>> +
>>> +	if (!spi_mem_default_supports_op(slave, op))
>>> +		return false;
>>> +
>>> +	return true;
>>> +}
>>
>> You could copy the Linux aspeed_spi_supports_op()
>>
> 
> Okay, this patch series may be too old.
> 
>>> +
>>> +static int aspeed_spi_exec_op_user_mode(struct spi_slave *slave,
>>> +					const struct spi_mem_op *op)
>>> +{
>>> +	struct udevice *dev = slave->dev;
>>> +	struct udevice *bus = dev->parent;
>>> +	struct aspeed_spi_plat *plat = dev_get_plat(bus);
>>> +	struct aspeed_spi_priv *priv = dev_get_priv(bus);
>>> +	struct dm_spi_slave_plat *slave_plat = dev_get_parent_plat(slave->dev);
>>> +	u32 cs = slave_plat->cs;
>>> +	fdt_addr_t ctrl_reg = plat->ctrl_base + REG_CE0_CTRL_REG + cs * 4;
>>> +	struct aspeed_spi_flash *flash = &priv->flashes[cs];
>>> +	u32 ctrl_val;
>>> +	u8 dummy_data[16] = {0};
>>> +	u8 addr[4] = {0};
>>> +	int i;
>>> +
>>> +	dev_dbg(dev,
>> "cmd:%x(%d),addr:%llx(%d),dummy:%d(%d),data_len:0x%x(%d)\n",
>>> +		op->cmd.opcode, op->cmd.buswidth, op->addr.val,
>>> +		op->addr.buswidth, op->dummy.nbytes, op->dummy.buswidth,
>>> +		op->data.nbytes, op->data.buswidth);
>>> +
>>> +	/* Start user mode */
>>> +	ctrl_val = flash->ce_ctrl_user;
>>> +	writel(ctrl_val, ctrl_reg);
>>> +	ctrl_val &= (~CTRL_STOP_ACTIVE);
>>> +	writel(ctrl_val, ctrl_reg);
>>> +
>>> +	/* Send command */
>>> +	aspeed_spi_write_to_ahb(flash->ahb_base, &op->cmd.opcode, 1);
>>> +
>>> +	/* Send address */
>>> +	for (i = op->addr.nbytes; i > 0; i--) {
>>> +		addr[op->addr.nbytes - i] =
>>> +			((u32)op->addr.val >> ((i - 1) * 8)) & 0xff;
>>> +	}
>>> +	aspeed_spi_write_to_ahb(flash->ahb_base, addr, op->addr.nbytes);
>>
>> This could be writing 3 bytes. Not optimal.
> 
> Why? Doesn't it depend on the value of "op->addr.nbytes"? This function has been verified with different flash parts.

It works but you could issue a single 32-bit transaction if the OP was
included in the write, as Linux does. This is an optimisation to avoid
4 writes of one byte. This is minor.

>>
>>> +
>>> +	/* Send dummy cycle */
>>> +	aspeed_spi_write_to_ahb(flash->ahb_base, dummy_data,
>>> +op->dummy.nbytes);
>>> +
>>> +	/* Change io_mode */
>>> +	ctrl_val |= aspeed_spi_get_io_mode(op->data.buswidth);
>>> +	writel(ctrl_val, ctrl_reg);
>>> +
>>> +	/* Send data */
>>> +	if (op->data.dir == SPI_MEM_DATA_OUT) {
>>> +		aspeed_spi_write_to_ahb(flash->ahb_base, op->data.buf.out,
>>> +					op->data.nbytes);
>>> +	} else {
>>> +		aspeed_spi_read_from_ahb(flash->ahb_base, op->data.buf.in,
>>> +					 op->data.nbytes);
>>> +	}
>>> +
>>> +	ctrl_val |= CTRL_STOP_ACTIVE;
>>> +	writel(ctrl_val, ctrl_reg);
>>> +
>>> +	/* Restore controller setting. */
>>> +	writel(flash->ce_ctrl_read, ctrl_reg);
>>> +
>>> +	/* Set controller to 4-byte mode when flash is in 4-byte mode. */
>>> +	if (op->cmd.opcode == SPINOR_OP_EN4B)
>>> +		priv->info->set_4byte(bus, cs);
>>
>> We don't need to set 4B earlier ? I trust you there.
> 
> Do you mean early in this function? It may be okay.
> 
>>
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static struct aspeed_spi_flash *aspeed_spi_get_flash(struct udevice
>>> +*dev) {
>>> +	struct udevice *bus = dev->parent;
>>> +	struct dm_spi_slave_plat *slave_plat = dev_get_parent_plat(dev);
>>> +	struct aspeed_spi_plat *plat = dev_get_plat(bus);
>>> +	struct aspeed_spi_priv *priv = dev_get_priv(bus);
>>> +	u32 cs = slave_plat->cs;
>>> +
>>> +	if (cs >= plat->max_cs) {
>>> +		dev_err(dev, "invalid CS %u\n", cs);
>>> +		return NULL;
>>> +	}
>>> +
>>> +	return &priv->flashes[cs];
>>> +}
>>> +
>>> +static int aspeed_spi_decoded_range_config(struct udevice *bus,
>>> +					   u32 decoded_sz_arr[])
>>> +{
>>> +	struct aspeed_spi_plat *plat = dev_get_plat(bus);
>>> +	struct aspeed_spi_priv *priv = dev_get_priv(bus);
>>> +	int ret;
>>> +	u32 cs;
>>> +	u32 decoded_reg_val;
>>> +	u32 start_addr;
>>> +	u32 end_addr = 0;
>>> +
>>> +	ret = priv->info->adjust_decoded_sz(bus, decoded_sz_arr);
>>> +	if (ret != 0)
>>> +		return ret;
>>> +
>>> +	/* Configure each CS decoded range */
>>> +	for (cs = 0; cs < plat->max_cs; cs++) {
>>> +		if (cs == 0)
>>> +			start_addr = (u32)plat->ahb_base;
>>> +		else
>>> +			start_addr = end_addr;
>>> +		priv->flashes[cs].ahb_base = (void __iomem *)start_addr;
>>> +		priv->flashes[cs].ahb_win_sz = decoded_sz_arr[cs];
>>> +
>>> +		end_addr = start_addr + decoded_sz_arr[cs];
>>> +		decoded_reg_val = priv->info->segment_reg(start_addr, end_addr);
>>> +
>>> +		writel(decoded_reg_val,
>>> +		       plat->ctrl_base + REG_CE0_DECODED_ADDR_REG + cs * 4);
>>> +
>>> +		dev_dbg(bus, "cs: %d, decoded_reg: 0x%x, start: 0x%x, end: 0x%x\n",
>>> +			cs, decoded_reg_val, start_addr, end_addr);
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +/*
>>> + * Initialize SPI controller for each chip select.
>>> + * Here, only the minimum decode range is configured
>>> + * in order to get device (SPI NOR flash) information
>>> + * at the early stage.
>>> + */
>>> +static int aspeed_spi_ctrl_init(struct udevice *bus) {
>>> +	int ret;
>>> +	struct aspeed_spi_plat *plat = dev_get_plat(bus);
>>> +	struct aspeed_spi_priv *priv = dev_get_priv(bus);
>>> +	u32 cs;
>>> +	u32 reg_val;
>>> +	u32 decoded_sz;
>>> +
>>> +	/* Enable write capability for all CS. */
>>> +	reg_val = readl(plat->ctrl_base + REG_CE_TYPE_SETTING);
>>> +	writel(reg_val | (GENMASK(plat->max_cs - 1, 0) << 16),
>>> +	       plat->ctrl_base + REG_CE_TYPE_SETTING);
>>> +
>>> +	memset(priv->flashes, 0x0,
>>> +	       sizeof(struct aspeed_spi_flash) * ASPEED_SPI_MAX_CS);
>>> +
>>> +	/* Initial each CS controller register */
>>> +	for (cs = 0; cs < priv->num_cs; cs++) {
>>> +		priv->flashes[cs].ce_ctrl_user &=
>>> +				~(priv->info->cmd_io_ctrl_mask);
>>> +		priv->flashes[cs].ce_ctrl_user |=
>>> +				(CTRL_STOP_ACTIVE | CTRL_IO_MODE_USER);
>>> +		writel(priv->flashes[cs].ce_ctrl_user,
>>> +		       plat->ctrl_base + REG_CE0_CTRL_REG + cs * 4);
>>> +	}
>>>
>>
>> and we should start by setting sane defaults for the ranges.
>>
>> It's too early to add the decoding ranges calculation.
> 
> Okay.
> 
>>
>> Thanks,
>>
>> C.
>>
>>> +	memset(priv->decoded_sz_arr, 0x0, sizeof(u32) * ASPEED_SPI_MAX_CS);
>>> +
>>> +	for (cs = 0; cs < priv->num_cs; cs++) {
>>> +		reg_val = readl(plat->ctrl_base + REG_CE0_DECODED_ADDR_REG +
>> cs * 4);
>>> +		decoded_sz = priv->info->segment_end(bus, reg_val) -
>>> +			     priv->info->segment_start(bus, reg_val);
>>> +
>>> +		/*
>>> +		 * For CS0, if the default address decoded area exists,
>>> +		 * keep its value in order to make sure that the whole boot
>>> +		 * image can be accessed with normal read mode.
>>> +		 */
>>> +		if (cs == 0 && decoded_sz != 0)
>>> +			priv->decoded_sz_arr[cs] = decoded_sz;
>>> +		else
>>> +			priv->decoded_sz_arr[cs] = priv->info->min_decoded_sz;
>>> +	}
>>> +
>>> +	ret = aspeed_spi_decoded_range_config(bus, priv->decoded_sz_arr);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static const struct aspeed_spi_info ast2500_fmc_info = {
>>> +	.max_data_bus_width = 2,
>>> +	.cmd_io_ctrl_mask = 0x70ff40c3,
>>> +	.clk_ctrl_mask = 0x00002f00,
>>> +	.min_decoded_sz = 0x800000,
>>> +	.set_4byte = ast2500_spi_chip_set_4byte,
>>> +	.segment_start = ast2500_spi_segment_start,
>>> +	.segment_end = ast2500_spi_segment_end,
>>> +	.segment_reg = ast2500_spi_segment_reg,
>>> +	.adjust_decoded_sz = ast2500_adjust_decoded_size,
>>> +	.get_clk_setting = ast2500_get_clk_setting, };
>>> +
>>> +/*
>>> + * There are some different between FMC and SPI controllers.
>>> + * For example, DMA operation, but this isn't implemented currently.
>>> + */
>>> +static const struct aspeed_spi_info ast2500_spi_info = {
>>> +	.max_data_bus_width = 2,
>>> +	.cmd_io_ctrl_mask = 0x70ff40c3,
>>> +	.clk_ctrl_mask = 0x00002f00,
>>> +	.min_decoded_sz = 0x800000,
>>> +	.set_4byte = ast2500_spi_chip_set_4byte,
>>> +	.segment_start = ast2500_spi_segment_start,
>>> +	.segment_end = ast2500_spi_segment_end,
>>> +	.segment_reg = ast2500_spi_segment_reg,
>>> +	.adjust_decoded_sz = ast2500_adjust_decoded_size,
>>> +	.get_clk_setting = ast2500_get_clk_setting, };
>>> +
>>> +static const struct aspeed_spi_info ast2600_fmc_info = {
>>> +	.max_data_bus_width = 4,
>>> +	.cmd_io_ctrl_mask = 0xf0ff40c3,
>>> +	.clk_ctrl_mask = 0x0f000f00,
>>> +	.min_decoded_sz = 0x200000,
>>> +	.set_4byte = ast2600_spi_chip_set_4byte,
>>> +	.segment_start = ast2600_spi_segment_start,
>>> +	.segment_end = ast2600_spi_segment_end,
>>> +	.segment_reg = ast2600_spi_segment_reg,
>>> +	.adjust_decoded_sz = ast2600_adjust_decoded_size,
>>> +	.get_clk_setting = ast2600_get_clk_setting, };
>>> +
>>> +static const struct aspeed_spi_info ast2600_spi_info = {
>>> +	.max_data_bus_width = 4,
>>> +	.cmd_io_ctrl_mask = 0xf0ff40c3,
>>> +	.clk_ctrl_mask = 0x0f000f00,
>>> +	.min_decoded_sz = 0x200000,
>>> +	.set_4byte = ast2600_spi_chip_set_4byte,
>>> +	.segment_start = ast2600_spi_segment_start,
>>> +	.segment_end = ast2600_spi_segment_end,
>>> +	.segment_reg = ast2600_spi_segment_reg,
>>> +	.adjust_decoded_sz = ast2600_adjust_decoded_size,
>>> +	.get_clk_setting = ast2600_get_clk_setting, };
>>> +
>>> +static int aspeed_spi_claim_bus(struct udevice *dev) {
>>> +	struct udevice *bus = dev->parent;
>>> +	struct aspeed_spi_priv *priv = dev_get_priv(dev->parent);
>>> +	struct dm_spi_slave_plat *slave_plat = dev_get_parent_plat(dev);
>>> +	struct aspeed_spi_flash *flash = &priv->flashes[slave_plat->cs];
>>> +	u32 clk_setting;
>>> +
>>> +	dev_dbg(bus, "%s: claim bus CS%u\n", bus->name, slave_plat->cs);
>>> +
>>> +	if (flash->max_freq == 0) {
>>> +		clk_setting = priv->info->get_clk_setting(dev, slave_plat->max_hz);
>>> +		flash->ce_ctrl_user &= ~(priv->info->clk_ctrl_mask);
>>> +		flash->ce_ctrl_user |= clk_setting;
>>> +		flash->ce_ctrl_read &= ~(priv->info->clk_ctrl_mask);
>>> +		flash->ce_ctrl_read |= clk_setting;
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int aspeed_spi_release_bus(struct udevice *dev) {
>>> +	struct udevice *bus = dev->parent;
>>> +	struct dm_spi_slave_plat *slave_plat = dev_get_parent_plat(dev);
>>> +
>>> +	dev_dbg(bus, "%s: release bus CS%u\n", bus->name, slave_plat->cs);
>>> +
>>> +	if (!aspeed_spi_get_flash(dev))
>>> +		return -ENODEV;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int aspeed_spi_set_mode(struct udevice *bus, uint mode) {
>>> +	dev_dbg(bus, "%s: setting mode to %x\n", bus->name, mode);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int aspeed_spi_set_speed(struct udevice *bus, uint hz) {
>>> +	dev_dbg(bus, "%s: setting speed to %u\n", bus->name, hz);
>>> +	/* ASPEED SPI controller supports multiple CS with different
>>> +	 * clock frequency. We cannot distinguish which CS here.
>>> +	 * Thus, the related implementation is postponed to claim_bus.
>>> +	 */
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int apseed_spi_of_to_plat(struct udevice *bus) {
>>> +	struct aspeed_spi_plat *plat = dev_get_plat(bus);
>>> +	struct clk hclk;
>>> +	int ret;
>>> +
>>> +	plat->ctrl_base = devfdt_get_addr_index(bus, 0);
>>> +	if ((u32)plat->ctrl_base == FDT_ADDR_T_NONE) {
>>> +		dev_err(bus, "wrong AHB base\n");
>>> +		return -ENODEV;
>>> +	}
>>> +
>>> +	plat->ahb_base =
>>> +		(void __iomem *)devfdt_get_addr_size_index(bus, 1, &plat->ahb_sz);
>>> +	if ((u32)plat->ahb_base == FDT_ADDR_T_NONE) {
>>> +		dev_err(bus, "wrong AHB base\n");
>>> +		return -ENODEV;
>>> +	}
>>> +
>>> +	ret = clk_get_by_index(bus, 0, &hclk);
>>> +	if (ret < 0) {
>>> +		dev_err(bus, "%s could not get clock: %d\n", bus->name, ret);
>>> +		return ret;
>>> +	}
>>> +
>>> +	plat->hclk_rate = clk_get_rate(&hclk);
>>> +	clk_free(&hclk);
>>> +
>>> +	plat->max_cs = dev_read_u32_default(bus, "num-cs",
>> ASPEED_SPI_MAX_CS);
>>> +	if (plat->max_cs > ASPEED_SPI_MAX_CS)
>>> +		return -EINVAL;
>>> +
>>> +	dev_dbg(bus, "ctrl_base = 0x%lx, ahb_base = 0x%p, size = 0x%lx\n",
>>> +		plat->ctrl_base, plat->ahb_base, plat->ahb_sz);
>>> +	dev_dbg(bus, "hclk = %dMHz, max_cs = %d\n",
>>> +		plat->hclk_rate / 1000000, plat->max_cs);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int aspeed_spi_probe(struct udevice *bus) {
>>> +	int ret;
>>> +	struct aspeed_spi_priv *priv = dev_get_priv(bus);
>>> +	struct udevice *dev;
>>> +
>>> +	priv->info = (struct aspeed_spi_info *)dev_get_driver_data(bus);
>>> +
>>> +	priv->num_cs = 0;
>>> +	for (device_find_first_child(bus, &dev); dev;
>>> +	     device_find_next_child(&dev)) {
>>> +		priv->num_cs++;
>>> +	}
>>> +
>>> +	if (priv->num_cs > ASPEED_SPI_MAX_CS)
>>> +		return -EINVAL;
>>> +
>>> +	ret = aspeed_spi_ctrl_init(bus);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static const struct spi_controller_mem_ops aspeed_spi_mem_ops = {
>>> +	.supports_op = aspeed_spi_supports_op,
>>> +	.exec_op = aspeed_spi_exec_op_user_mode, };
>>> +
>>> +static const struct dm_spi_ops aspeed_spi_ops = {
>>> +	.claim_bus = aspeed_spi_claim_bus,
>>> +	.release_bus = aspeed_spi_release_bus,
>>> +	.set_speed = aspeed_spi_set_speed,
>>> +	.set_mode = aspeed_spi_set_mode,
>>> +	.mem_ops = &aspeed_spi_mem_ops,
>>> +};
>>> +
>>> +static const struct udevice_id aspeed_spi_ids[] = {
>>> +	{ .compatible = "aspeed,ast2500-fmc", .data =
>> (ulong)&ast2500_fmc_info, },
>>> +	{ .compatible = "aspeed,ast2500-spi", .data = (ulong)&ast2500_spi_info, },
>>> +	{ .compatible = "aspeed,ast2600-fmc", .data =
>> (ulong)&ast2600_fmc_info, },
>>> +	{ .compatible = "aspeed,ast2600-spi", .data = (ulong)&ast2600_spi_info, },
>>> +	{ }
>>> +};
>>> +
>>> +U_BOOT_DRIVER(aspeed_spi) = {
>>> +	.name = "aspeed_spi",
>>> +	.id = UCLASS_SPI,
>>> +	.of_match = aspeed_spi_ids,
>>> +	.ops = &aspeed_spi_ops,
>>> +	.of_to_plat = apseed_spi_of_to_plat,
>>> +	.plat_auto = sizeof(struct aspeed_spi_plat),
>>> +	.priv_auto = sizeof(struct aspeed_spi_priv),
>>> +	.probe = aspeed_spi_probe,
>>> +};
> 



More information about the U-Boot mailing list