[U-Boot] [PATCH v2 3/4] mtd: nand: add Freescale NFC driver
Stefan Agner
stefan at agner.ch
Mon Aug 18 11:41:54 CEST 2014
Am 2014-08-14 23:12, schrieb Bill Pringlemeir:
>> On 14 Aug 2014, stefan at agner.ch wrote:
>>
>> This adds initial support for Freescale NFC (NAND Flash Controller)
>> found in ARM Vybrid SoC's, Power Architecture MPC5125 and others.
>> However, this driver is only tested on Vybrid.
>
> This is only to expand on the nand controller register and SRAM use.
>
> [snip]
>
>> diff --git a/drivers/mtd/nand/vf610_nfc.c b/drivers/mtd/nand/vf610_nfc.c
>> new file mode 100644
>> index 0000000..3150ac1
>> --- /dev/null
>> +++ b/drivers/mtd/nand/vf610_nfc.c
>
> [snip]
>
>> +static inline u32 vf610_nfc_read(struct mtd_info *mtd, uint reg)
>> +{
>> + struct vf610_nfc *nfc = mtd_to_nfc(mtd);
>> +
>> + return readl(nfc->regs + reg);
>> +}
>> +
>> +static inline void vf610_nfc_write(struct mtd_info *mtd, uint reg, u32 val)
>> +{
>> + struct vf610_nfc *nfc = mtd_to_nfc(mtd);
>> +
>> + writel(val, nfc->regs + reg);
>> +}
>
> Ok, we always use readl/writel. This is fine, but a little slower and
> bigger. I may try a register cache if I resubmit to the Linux MTD as
> per Scott's suggestion. Especially, this version is good for an
> incremental patch.
I measured the difference and get 1MB/s
Full pages, readl/writel:
NAND read: device 0 offset 0x200000, size 0x800000
8388608 bytes read in 772 ms (10.4 MiB/s): OK
Full pages, __raw_readl/__raw_writel
NAND read: device 0 offset 0x200000, size 0x800000
8388608 bytes read in 696 ms (11.5 MiB/s): OK
Ok, this is actually quite a lot. Especially since I already optimized
the C code (by not using the helper functions like nfc_set/nfc_clear in
vf610_nfc_send_command), one would think there is now almost no
optimization potential. I looked into the disassembled code and could
narrow down the issue. Due to the memory barriers, all offsets were
calculated on each register access (nfc base to reg base, and add reg
offset), multiple instances of:
20: e59cc120 ldr ip, [ip, #288] ; 0x120
24: e59cc134 ldr ip, [ip, #308] ; 0x134
I optimized the code again and calculate the offsets manually and access
__raw_readl/__raw_writel rather then vf610_nfc_read/write in the
vf610_nfc_send_command(s) function, I get the full speed again:
NAND read: device 0 offset 0x200000, size 0x800000
8388608 bytes read in 687 ms (11.6 MiB/s): OK
>
> I think these are in 'arch/arm/include/asm/io.h' of U-Boot.
>
> #define dmb() __asm__ __volatile__ ("" : : : "memory")
> #define __iormb() dmb()
> #define __iowmb() dmb()
>
> #define readl(c) ({ u32 __v = __arch_getl(c); __iormb(); __v; })
> #define writel(v,c) ({ u32 __v = v; __iowmb(); __arch_putl(__v,c); __v; })
>
> Currently, these look like compiler barriers to me. Fine so far.
>
>> +
>> +static inline void vf610_nfc_set(struct mtd_info *mtd, uint reg, u32 bits)
>> +{
>> + vf610_nfc_write(mtd, reg, vf610_nfc_read(mtd, reg) | bits);
>> +}
>> +
>> +static inline void vf610_nfc_clear(struct mtd_info *mtd, uint reg, u32 bits)
>> +{
>> + vf610_nfc_write(mtd, reg, vf610_nfc_read(mtd, reg) & ~bits);
>> +}
>> +
>> +static inline void vf610_nfc_set_field(struct mtd_info *mtd, u32 reg,
>> + u32 mask, u32 shift, u32 val)
>> +{
>> + vf610_nfc_write(mtd, reg,
>> + (vf610_nfc_read(mtd, reg) & (~mask)) | val << shift);
>> +}
>> +
>> +/* Clear flags for upcoming command */
>> +static inline void vf610_nfc_clear_status(struct mtd_info *mtd)
>> +{
>> + u32 tmp = vf610_nfc_read(mtd, NFC_IRQ_STATUS);
>> + tmp |= CMD_DONE_CLEAR_BIT | IDLE_CLEAR_BIT;
>> + vf610_nfc_write(mtd, NFC_IRQ_STATUS, tmp);
>> +}
>> +
>> +/* Wait for complete operation */
>> +static inline void vf610_nfc_done(struct mtd_info *mtd)
>> +{
>> + uint start;
>> +
>> + vf610_nfc_set(mtd, NFC_FLASH_CMD2, START_BIT);
>> + barrier();
>
> This barrier() is not needed then. The vf610_nfc_set() should have
> done it twice already, plus everything is volatile.
>
Agreed, this is not needed any more.
> [snip]
>
>> +static inline void vf610_nfc_read_spare(struct mtd_info *mtd, void *buf,
>> + int len)
>> +{
>> + struct vf610_nfc *nfc = mtd_to_nfc(mtd);
>> +
>> + len = min(mtd->oobsize, (uint)len);
>> + if (len > 0)
>> + memcpy(buf, nfc->regs + mtd->writesize, len);
>
> Notice the 'memcpy(.. nfc->regs);'...
>
>> +}
>> +
>> +/* Read data from NFC buffers */
>> +static void vf610_nfc_read_buf(struct mtd_info *mtd, u_char *buf, int len)
>> +{
>> + struct vf610_nfc *nfc = mtd_to_nfc(mtd);
>> + uint c = nfc->column;
>> + uint l;
>> +
>> + /* Handle main area */
>> + if (!nfc->spareonly) {
>> +
>> + l = min((uint)len, mtd->writesize - c);
>> + nfc->column += l;
>> +
>> + if (!nfc->alt_buf)
>> + memcpy(buf, nfc->regs + NFC_MAIN_AREA(0) + c, l);
>
> Another 'memcpy(.. nfc->regs);'...
>
>> + else
>> + if (nfc->alt_buf & ALT_BUF_ID)
>> + *buf = vf610_nfc_get_id(mtd, c);
>> + else
>> + *buf = vf610_nfc_get_status(mtd);
>> +
>> + buf += l;
>> + len -= l;
>> + }
>> +
>> + /* Handle spare area access */
>> + if (len) {
>> + nfc->column += len;
>> + vf610_nfc_read_spare(mtd, buf, len);
>> + }
>> +}
>> +
>> +/* Write data to NFC buffers */
>> +static void vf610_nfc_write_buf(struct mtd_info *mtd, const u_char *buf,
>> + int len)
>> +{
>> + struct vf610_nfc *nfc = mtd_to_nfc(mtd);
>> + uint c = nfc->column;
>> + uint l;
>> +
>> + l = min((uint)len, mtd->writesize + mtd->oobsize - c);
>> + nfc->column += l;
>> + memcpy(nfc->regs + NFC_MAIN_AREA(0) + c, buf, l);
>
> Another 'memcpy(.. nfc->regs);'...
>
> [snip]
>
> These memcpy's are the same 'bus' interface as the registers. We should
> be just as worried about this SRAM buffer memory as the memory mapped
> registers, shouldn't we? Is a barrier() before reading and a barrier()
> after writing fine for U-Boot? Personally, I think they are safe as
> only the 'vf610_nfc_set(mtd, NFC_FLASH_CMD2, START_BIT)' needs some
I also think that that this is the only place a barrier is really
needed. However, as Scott stated:
On Wed, 2014-08-13 at 22:32, Scott Wood wrote:
> raw_writel() is itself something that should only be used for
> hand-optimized sections. For non-performance-critical code you should
> use normal writel() so that you don't need to worry about manually
> adding I/O barriers.
The reason I choosed readl/writel instead of the raw variants is to
preserve align with other drivers...
> care. Maybe a comment is fine? It seems the Vybrid is safe for
> different access sizes, but it is possible that some other CPU might not
> be able to access this memory via 32/16/8 bit accesses and 'memcpy()'
> may not be appropriate. It seems that 'natural' size of the NFC
> controller itself is 32bits and the CPU interface does lane masking.
> Ie, boot mode documentation talks about remapping
> 'sram_physical_addr[13:3] = {cpu_addr[11:3],cpu_addr[13:12]}' saying
> that bits 2,1 are not used (hopefully one based numbers). This is just
> my guess...
What assumptions do you make how memcpy accesses memory? This latest
patch now uses the optimized versions from the kernel... Maybe they even
try to access 64-bit width (the NIC interconnect supports 64-bit access)
>
> The VF6xx page has a documentation tab,
> http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=VF6xx
>
> There is an app note, AN4947 'Understanding Vybrid Architecture', which
> describes some timing details for the AHB bus (where this flash
> controller is connected). Pg21 Table 7 of that document gives some
> measurements. The QSPI is a similar peripheral on the AHB. The first
> and second lines give accesses of 4408 and subsequent accesses are 2770
> Cortex-A5 clocks. Normal SDRAM is 258 and 8 clocks. Ie, it is quite
> important in places to minimize accesses and try to make them
> sequential.
We also have caches, hence I don't think the access will take that long.
And on the other side, the SRAM is much faster.
>
> However, it looks like most U-Boot NAND drivers use the memcpy()? With
> the exceptions of fsl_elbc_nand.c, fsl_ifc_nand.c, and mpc5121_nfc.c.
> Maybe it doesn't matter...
--
Stefan
More information about the U-Boot
mailing list