[U-Boot] [PATCH 3/4] mtd: nand: add Freescale NFC driver
Stefan Agner
stefan at agner.ch
Wed Aug 13 13:20:35 CEST 2014
Am 2014-08-13 00:58, schrieb Bill Pringlemeir:
[snip]
>>>>> +static u32 nfc_read(struct mtd_info *mtd, uint reg)
>>>>> +{
>>>>> + struct fsl_nfc *nfc = mtd_to_nfc(mtd);
>>>>> +
>>>>> + if (reg == NFC_FLASH_STATUS1 ||
>>>>> + reg == NFC_FLASH_STATUS2 ||
>>>>> + reg == NFC_IRQ_STATUS)
>>>>> + return __raw_readl(nfc->regs + reg);
>>>>> + /* Gang read/writes together for most registers. */
>>>>> + else
>>>>> + return *(u32 *)(nfc->regs + reg);
>>>>> +}
>>>>> +
>>>>> +static void nfc_write(struct mtd_info *mtd, uint reg, u32 val)
>>>>> +{
>>>>> + struct fsl_nfc *nfc = mtd_to_nfc(mtd);
>>>>> +
>>>>> + if (reg == NFC_FLASH_STATUS1 ||
>>>>> + reg == NFC_FLASH_STATUS2 ||
>>>>> + reg == NFC_IRQ_STATUS)
>>>>> + __raw_writel(val, nfc->regs + reg);
>>>>> + /* Gang read/writes together for most registers. */
>>>>> + else
>>>>> + *(u32 *)(nfc->regs + reg) = val;
>>>>> +}
>>>>
>>>> You should always be using raw I/O accessors. If the intent is to
>>>> bypass I/O ordering for certain registers, the raw accessors should
>>>> already be skipping that.
>
>>> Ok, will do.
>
>> Sorry, the above should say "always be using I/O accesors", not always
>> raw ones.
>
> This is probably because of me. There are lines like this for
> configuration,
>
> /* Set configuration register. */
> nfc_clear(mtd, NFC_FLASH_CONFIG, CONFIG_ADDR_AUTO_INCR_BIT);
> nfc_clear(mtd, NFC_FLASH_CONFIG, CONFIG_BUFNO_AUTO_INCR_BIT);
> nfc_clear(mtd, NFC_FLASH_CONFIG, CONFIG_BOOT_MODE_BIT);
> nfc_clear(mtd, NFC_FLASH_CONFIG, CONFIG_DMA_REQ_BIT);
> nfc_set(mtd, NFC_FLASH_CONFIG, CONFIG_FAST_FLASH_BIT);
>
> If you use some sort of 'volatile', you are saying to the compiler that
> these lines are *not*,
>
> ldr r0, [r1, #CONFIG] # read previous value
> ldr r2, =~MASK
> orr r0, #FAST_FLASH_BIT # set one bit.
> and r0, r2 # clear all bits at once.
> str r0, [r1, #CONFIG] # write it back.
>
> Instead it will change into five different load/modify/stores. The
> memory itself is just like SRAM except a few registers that are actually
> volatile; ie changed via the hardware.
>
> Particularly bad are the memcpy_io() on the ARM. Using these results
> in about 1/2 to 1/4 of the performance of the drivers through-put on the
> Vybrid. I can see that using accessors is good, but just replacing this
> with 'writel' will result in significantly more code and slower
> throughput.
>
> If people insist on this, then something like,
>
> val = nfc_read(mtd, NFC_REG);
> val = nfc_clear(val, CONFIG_ADDR_AUTO_INCR_BIT);
> val = nfc_clear(val, CONFIG_BUFNO_AUTO_INCR_BIT);
> val = nfc_clear(val, CONFIG_BOOT_MODE_BIT);
> val = nfc_clear(val, CONFIG_DMA_REQ_BIT);
> val = nfc_set(val, CONFIG_FAST_FLASH_BIT);
> nfc_write(mtd, NFC_REG, val);
>
> would result in the same code with 'nfc_read()' and 'nfc_write()' using
> the I/O accessors. I found this more difficult to read for the higher
> level functions. Instead some some standard macros for setting and
> clearing bits could be used. The original driver was using 'nfc_set()'
> and 'nfc_clear()'. Maybe they should just go.
>
I just applied this change:
diff --git a/drivers/mtd/nand/fsl_nfc.c b/drivers/mtd/nand/fsl_nfc.c
index df2c8be..01c010f 100644
--- a/drivers/mtd/nand/fsl_nfc.c
+++ b/drivers/mtd/nand/fsl_nfc.c
@@ -191,26 +191,14 @@ static u32 nfc_read(struct mtd_info *mtd, uint
reg)
{
struct fsl_nfc *nfc = mtd_to_nfc(mtd);
- if (reg == NFC_FLASH_STATUS1 ||
- reg == NFC_FLASH_STATUS2 ||
- reg == NFC_IRQ_STATUS)
- return __raw_readl(nfc->regs + reg);
- /* Gang read/writes together for most registers. */
- else
- return *(u32 *)(nfc->regs + reg);
+ return __raw_readl(nfc->regs + reg);
}
static void nfc_write(struct mtd_info *mtd, uint reg, u32 val)
{
struct fsl_nfc *nfc = mtd_to_nfc(mtd);
- if (reg == NFC_FLASH_STATUS1 ||
- reg == NFC_FLASH_STATUS2 ||
- reg == NFC_IRQ_STATUS)
- __raw_writel(val, nfc->regs + reg);
- /* Gang read/writes together for most registers. */
- else
- *(u32 *)(nfc->regs + reg) = val;
+ __raw_writel(val, nfc->regs + reg);
}
static void nfc_set(struct mtd_info *mtd, uint reg, u32 bits)
And did some performance measurement:
=> nand read ${loadaddr} 0x0 0x2000000
===> With if/else
NAND read: device 0 offset 0x0, size 0x2000000
33554432 bytes read in 8494 ms: OK (3.8 MiB/s)
===> With __raw_readl only...
NAND read: device 0 offset 0x0, size 0x2000000
33554432 bytes read in 8184 ms: OK (3.9 MiB/s)
=> nand write ${loadaddr} rootfs-ubi 0x2000000
===> With if/else
NAND write: device 0 offset 0xa00000, size 0x2000000
33554432 bytes written in 18200 ms: OK (1.8 MiB/s)
===> With __raw_readl only...
NAND write: device 0 offset 0xa00000, size 0x2000000
33554432 bytes written in 15095 ms: OK (2.1 MiB/s)
These values are reproducible. I guess by removing the conditional
branch in the nfc_read/nfc_write functions we even gain performance.
IMHO we should use the raw_writel only and "hand optimize" for functions
which are used often. For the initialization/configuration functions,
there is little value to save some register access.
--
Stefan
More information about the U-Boot
mailing list