[PATCH] cmd: ti: Add DDRSS ECC test command

Neha Malcom Francis n-francis at ti.com
Mon Oct 6 07:26:03 CEST 2025


Hi Udit

On 01/10/25 14:34, Kumar, Udit wrote:
> 
> On 10/1/2025 1:50 PM, Neha Malcom Francis wrote:
>> From: Georgi Vlaev <g-vlaev at ti.com>
>>
>> Introduce a new version of the Keystone-II "ddr" command for testing the
>> inline ECC support in the DDRSS bridge available on K3 devices. The ECC
>> hardware support in K3's DDRSS and the test method differ substantially
>> from what we support in the K2 variant of the command. The name of the
>> new command is "ddrss" and it presently supports only single controller
>> testing.
> 
> what about keeping command name same as of k2 devices,
> 
> and file name as cmd/ti/ddr4.c

That makes sense, will change it.
> 
>> The ECC test procedure follows these steps:
>> 1) Flush and disable the data cache.
>> 2) Disable the protected ECC Rx range.
>> 3) Flip a bit in the address.
>> 4) Restore the range to original.
>> 5) Read the modified value (corrected).
>> 6) Re-enable the data cache.
>>
>> This will cause the 1-bit ECC error count to increase while the read
>> will return the corrected value.
> 
> code checks for 2-bit ecc as well, Please add more description when
> 2-bit errors are expected

2-bit ECC errors trigger a synchronous abort, this U-Boot test case does
not catch it. I will add in the description about this expected behavior.
> 
> 
>> The K3 version of the command extends the syntax for the "ecc_err"
>> argument by also introducing an argument for range which specifies which
>> range (0, 1, 2) the address is located in.
>>
>> Signed-off-by: Georgi Vlaev <g-vlaev at ti.com>
>> Signed-off-by: Santhosh Kumar K <s-k6 at ti.com>
>> [n-francis at ti.com: Add J7 and multiple-region support, simplify logic]
>> Signed-off-by: Neha Malcom Francis <n-francis at ti.com>
>> ---
>> Test logs on J784S4-EVM (after modifying for single controller with
>> ECC enabled)
>> https://gist.github.com/nehamalcom/80437234ddd2e22007dec3d1c37dcd6a
>>
>>   cmd/ti/Kconfig  |   7 ++
>>   cmd/ti/Makefile |   1 +
>>   cmd/ti/ddrss.c  | 243 ++++++++++++++++++++++++++++++++++++++++++++++++
>>   3 files changed, 251 insertions(+)
>>   create mode 100644 cmd/ti/ddrss.c
>>
>> diff --git a/cmd/ti/Kconfig b/cmd/ti/Kconfig
>> index 9442c9993c1..0d6c86bad3e 100644
>> --- a/cmd/ti/Kconfig
>> +++ b/cmd/ti/Kconfig
>> @@ -8,6 +8,13 @@ config CMD_DDR3
>>          supports memory verification, memory comapre and ecc
>>          verification if supported.
>>   +config CMD_DDRSS
>> +    bool "command for verifying DDRSS Inline ECC features"
> 
> I think you should add depends upon CONFIG_ARCH_K3
> 

Will do.

> 
>> +    help
>> +       Support for testing DDRSS on TI platforms. This command supports
>> +       memory verification, memory compare and inline ECC verification
>> +       if supported.
>> +
>>   config CMD_PD
>>       bool "command for verifying power domains"
>>       depends on TI_POWER_DOMAIN
>> diff --git a/cmd/ti/Makefile b/cmd/ti/Makefile
>> index 5f9c64f598a..d0555e7edf6 100644
>> --- a/cmd/ti/Makefile
>> +++ b/cmd/ti/Makefile
>> @@ -2,4 +2,5 @@
>>   # Copyright (C) 2017 Texas Instruments Incorporated -
>> https://www.ti.com/
>>     obj-$(CONFIG_CMD_DDR3) += ddr3.o
>> +obj-$(CONFIG_CMD_DDRSS) += ddrss.o
>>   obj-$(CONFIG_CMD_PD) += pd.o
>> diff --git a/cmd/ti/ddrss.c b/cmd/ti/ddrss.c
>> new file mode 100644
>> index 00000000000..d1481547938
>> --- /dev/null
>> +++ b/cmd/ti/ddrss.c
>> @@ -0,0 +1,243 @@
>> +// SPDX-License-Identifier: GPL-2.0+
>> +/*
>> + * DDRSS: DDR 1-bit inline ECC test command
>> + *
>> + * Copyright (C) 2025 Texas Instruments Incorporated -
>> http://www.ti.com/
>> + */
>> +
>> +#include <asm/io.h>
>> +#include <asm/cache.h>
>> +#include <asm/global_data.h>
>> +#include <command.h>
>> +#include <cpu_func.h>
>> +#include <linux/bitops.h>
>> +
>> +DECLARE_GLOBAL_DATA_PTR;
> 
> Do you plan to use gd at later stage ?

Yes we use gd to grab ram_base, dram banks etc. below
> 
> 
>> +
>> +#define K3_DDRSS_MAX_ECC_REGIONS    3
>> +
>> +#if (IS_ENABLED(CONFIG_SOC_K3_J784S4) ||\
>> +    IS_ENABLED(CONFIG_SOC_K3_J721S2) ||\
>> +    IS_ENABLED(CONFIG_SOC_K3_J7200) ||\
>> +    IS_ENABLED(CONFIG_SOC_K3_J721E))
>> +#define DDRSS_BASE            0x2980000
>> +#else
>> +#define DDRSS_BASE            0x0f300000
>> +#endif
>> +
>> +#define DDRSS_V2A_CTL_REG        0x0020
>> +#define DDRSS_V2A_INT_RAW_REG        0x00a0
>> +#define DDRSS_V2A_INT_STAT_REG        0x00a4
>> +#define DDRSS_V2A_INT_ECC1BERR        BIT(3)
>> +#define DDRSS_V2A_INT_ECC2BERR        BIT(4)
>> +#define DDRSS_V2A_INT_ECCM1BERR        BIT(5)
>> +#define DDRSS_ECC_CTRL_REG        0x0120
>> +#define DDRSS_ECC_CTRL_REG_ECC_EN    BIT(0)
>> +#define DDRSS_ECC_CTRL_REG_RMW_EN    BIT(1)
>> +#define DDRSS_ECC_CTRL_REG_ECC_CK    BIT(2)
>> +#define DDRSS_ECC_CTRL_REG_WR_ALLOC    BIT(4)
>> +#define DDRSS_ECC_R0_STR_ADDR_REG    0x0130
>> +#define DDRSS_ECC_Rx_STR_ADDR_REG(x)    (0x0130 + ((x) * 8))
>> +#define DDRSS_ECC_Rx_END_ADDR_REG(x)    (0x0134 + ((x) * 8))
>> +#define DDRSS_ECC_1B_ERR_CNT_REG    0x0150
>> +#define DDRSS_ECC_1B_ERR_THRSH_REG    0x0154
>> +#define DDRSS_ECC_1B_ERR_ADR_REG    0x0158
>> +#define DDRSS_ECC_1B_ERR_MSK_LOG_REG    0x015c
>> +
>> +static inline u32 ddrss_read(u32 reg)
>> +{
>> +    return readl((unsigned long)(DDRSS_BASE + reg));
>> +}
>> +
>> +static inline void ddrss_write(u32 value, u32 reg)
>> +{
>> +    writel(value, (unsigned long)(DDRSS_BASE + reg));
>> +}
>> +
>> +/* ddrss_check_ecc_status()
>> + * Report the ECC state after test. Check/clear the interrupt
>> + * status register, dump the ECC err counters and ECC error offset.
>> + */
>> +static void ddrss_check_ecc_status(void)
>> +{
>> +    u32 ecc_1b_err_cnt, v2a_int_raw, ecc_1b_err_msk;
>> +    phys_addr_t ecc_1b_err_adr;
>> +
>> +    v2a_int_raw = ddrss_read(DDRSS_V2A_INT_RAW_REG);
>> +
>> +    /* 1-bit correctable */
>> +    if (v2a_int_raw & DDRSS_V2A_INT_ECC1BERR) {
>> +        puts("\tECC test: DDR ECC 1-bit error\n");
>> +
>> +        /* Dump the 1-bit counter and reset it, as we want a
>> +         * new interrupt to be generated when above the error
>> +         * threshold
> 
> its interrupt or poll for status ?
> 

A single poll on the interrupt's status register for status.

> 
>> +         */
>> +        ecc_1b_err_cnt = ddrss_read(DDRSS_ECC_1B_ERR_CNT_REG);
>> +        if (ecc_1b_err_cnt) {
>> +            printf("\tECC test: 1-bit ECC err count: %u\n",
>> +                   ecc_1b_err_cnt & 0xffff);
>> +            ddrss_write(1, DDRSS_ECC_1B_ERR_CNT_REG);
>> +        }
> 
> in case no ecc count then does this mean test failed ?

If ECC was enabled in that region then it means it's a fail, else it
means ECC was not enabled there.
> 
> 
>> +
>> +        /* ECC fault addresses are also recorded in a 2-word deep
>> +         * FIFO. Calculate and report the 8-byte range of the error
>> +         */
>> +        ecc_1b_err_adr = ddrss_read(DDRSS_ECC_1B_ERR_ADR_REG);
>> +        ecc_1b_err_msk = ddrss_read(DDRSS_ECC_1B_ERR_MSK_LOG_REG);
>> +        if (ecc_1b_err_msk) {
>> +            if ((IS_ENABLED(CONFIG_SOC_K3_AM642)) ||
>> +                (IS_ENABLED(CONFIG_SOC_K3_AM625))) {
>> +                /* AM64/AM62:
>> +                 * The address of the ecc error is 16-byte aligned.
>> +                 * Each bit in 4 bit mask represents 8 bytes ECC quanta
>> +                 * that has the 1-bit error
>> +                 */
>> +                ecc_1b_err_msk &= 0xf;
>> +                ecc_1b_err_adr <<= 4;
>> +                ecc_1b_err_adr += (fls(ecc_1b_err_msk) - 1) * 8;
>> +            } else {
>> +                /* AM62A/AM62P:
>> +                 * The address of the ecc error is 32-byte aligned.
>> +                 * Each bit in 8 bit mask represents 8 bytes ECC quanta
>> +                 * that has the 1-bit error
>> +                 */
>> +                ecc_1b_err_msk &= 0xff;
>> +                ecc_1b_err_adr <<= 5;
>> +                ecc_1b_err_adr += (fls(ecc_1b_err_msk) - 1) * 8;
>> +            }
>> +
>> +            printf("\tECC test: 1-bit error in [0x%llx:0x%llx]\n",
>> +                   ecc_1b_err_adr, ecc_1b_err_adr + 8);
>> +            /* Pop the top of the addr/mask FIFOs */
>> +            ddrss_write(1, DDRSS_ECC_1B_ERR_ADR_REG);
>> +            ddrss_write(1, DDRSS_ECC_1B_ERR_MSK_LOG_REG);
>> +        }
>> +        ddrss_write(DDRSS_V2A_INT_ECC1BERR, DDRSS_V2A_INT_STAT_REG);
>> +    }
>> +
>> +    /* 2-bit uncorrectable */
>> +    if (v2a_int_raw & DDRSS_V2A_INT_ECC2BERR) {
>> +        puts("\tECC test: DDR ECC 2-bit error\n");
>> +        ddrss_write(DDRSS_V2A_INT_ECC2BERR, DDRSS_V2A_INT_STAT_REG);
>> +    }
> 
> 
>> +    /* multiple 1-bit errors (uncorrectable) in multiple words */
>> +    if (v2a_int_raw & DDRSS_V2A_INT_ECCM1BERR) {
>> +        puts("\tECC test: DDR ECC multi 1-bit errors\n");
>> +        ddrss_write(DDRSS_V2A_INT_ECCM1BERR, DDRSS_V2A_INT_STAT_REG);
>> +    }
> 
> 
> is this error case, if you are hitting uncorrectable errors ?

This is in the case of multiple 1-bit errors which also depends on the
configuration for the threshold for the number of 1-bit errors that
should cause an uncorrectable error. If the threshold is 2 or more, it
will not become a synchronous abort.

> 
> 
>> +}
>> +
>> +/* ddrss_memory_ecc_err()
>> + * Simulate an ECC error - change a 32b word at address in an ECC
>> enabled
> 
> variables used are u64

Will correct.

> 
> 
>> + * range. This removes the tested address from the ECC checks, changes a
>> + * word, and then restores the ECC range as configured by k3_ddrss in
>> R5 SPL.
>> + */
>> +static int ddrss_memory_ecc_err(u64 addr, u64 ecc_err, int range)
>> +{
>> +    u64 ecc_start_addr, ecc_end_addr, ecc_temp_addr;
>> +    u64 val1, val2, val3;
>> +
>> +    /* Flush and disable dcache */
>> +    flush_dcache_all();
>> +    dcache_disable();
>> +
>> +    /* Setup a threshold for 1-bit errors to generate interrupt */
>> +    ddrss_write(1, DDRSS_ECC_1B_ERR_THRSH_REG);
>> +
>> +    puts("Testing DDR ECC:\n");
>> +    /* Get the Rx range configuration */
>> +    ecc_start_addr = ddrss_read(DDRSS_ECC_Rx_STR_ADDR_REG(range));
>> +    ecc_end_addr = ddrss_read(DDRSS_ECC_Rx_END_ADDR_REG(range));
>> +
>> +    /* Calculate the end of the Rx ECC region up to the tested
>> address */
>> +    ecc_temp_addr = (addr - gd->ram_base) >> 16;
>> +
>> +    puts("\tECC test: Disabling DDR ECC ...\n");
>> +    /* Disable entire region */
>> +    ddrss_write(ecc_start_addr, DDRSS_ECC_Rx_END_ADDR_REG(range));
>> +    ddrss_write(ecc_end_addr, DDRSS_ECC_Rx_STR_ADDR_REG(range));
>> +
>> +    /* Inject error in the address */
>> +    val1 = readl((unsigned long long)addr);
>> +    val2 = val1 ^ ecc_err;
>> +    writel(val2, (unsigned long long)addr);
>> +    val3 = readl((unsigned long long)addr);
>> +
>> +    /* Re-enable the ECC checks for the R0 region */
>> +    ddrss_write(ecc_end_addr, DDRSS_ECC_Rx_END_ADDR_REG(range));
>> +    ddrss_write(ecc_start_addr, DDRSS_ECC_Rx_STR_ADDR_REG(range));
>> +    /* Make sure the ECC range is restored before doing anything else */
>> +    mb();
>> +
>> +    printf("\tECC test: addr 0x%llx, read data 0x%llx, written data
>> 0x%llx, err pattern: 0x%llx, read after write data 0x%llx\n",
>> +           addr, val1, val2, ecc_err, val3);
>> +
>> +    puts("\tECC test: Enabled DDR ECC ...\n");
>> +    /* Read again from the address. This creates an ECC 1-bit error
>> +     * condition, and returns the corrected value
>> +     */
>> +    val1 = readl((unsigned long long)addr);
>> +    printf("\tECC test: addr 0x%llx, read data 0x%llx\n", addr, val1);
>> +
>> +    /* Set threshold for 1-bit errors to 0 to disable the interrupt */
>> +    ddrss_write(0, DDRSS_ECC_1B_ERR_THRSH_REG);
>> +    /* Report the ECC status */
>> +    ddrss_check_ecc_status();
>> +
>> +    dcache_enable();
>> +
> 
> Do we have some cases, where test may fail ?

This depends on the configuration of ECC, which is why we are not
explicitly saying pass/fail; rather based on the configuration given and
the behavior seen, the user takes the call to decide if it was a pass/fail.

> 
> 
>> +    return 0;
>> +}
>> +
>> +/* ddrss_is_ecc_enabled()
>> + * Report if ECC is enabled.
>> + */
>> +static int ddrss_is_ecc_enabled(void)
>> +{
>> +    u32 ecc_ctrl = ddrss_read(DDRSS_ECC_CTRL_REG);
>> +
>> +    /* Assume ECC is enabled only if all bits set by k3_ddrss are set */
>> +    return (ecc_ctrl & (DDRSS_ECC_CTRL_REG_ECC_EN |
>> +                DDRSS_ECC_CTRL_REG_RMW_EN |
>> +                DDRSS_ECC_CTRL_REG_WR_ALLOC |
>> +                DDRSS_ECC_CTRL_REG_ECC_CK));
>> +}
>> +
>> +static int do_ddrss_test(struct cmd_tbl *cmdtp, int flag, int argc,
>> +             char *const argv[])
>> +{
>> +    u64 start_addr, ecc_err, x;
> 
> x to range, please
> 
> 

Will change

>> +
>> +    if (!(argc == 5 && (strncmp(argv[1], "ecc_err", 8) == 0)))
>> +        return cmd_usage(cmdtp);
>> +
>> +    if (!ddrss_is_ecc_enabled()) {
>> +        puts("ECC not enabled. Please enable ECC any try again\n");
>> +        return CMD_RET_FAILURE;
>> +    }
>> +
>> +    start_addr = simple_strtoul(argv[2], NULL, 16);
>> +    ecc_err = simple_strtoul(argv[3], NULL, 16);
> 
> Do you want to impose some max side on ecc_err ?
> 
> as per description, 32 bits are used for

Yes I will add a check for this.

> 
> 
>> +    x = simple_strtoul(argv[4], NULL, 16);
> check for range (wrong value)
>> +
>> +    if (!((start_addr >= gd->bd->bi_dram[0].start &&
>> +           (start_addr <= (gd->bd->bi_dram[0].start +
>> gd->bd->bi_dram[0].size - 1))) ||
>> +          (start_addr >= gd->bd->bi_dram[1].start &&
>> +           (start_addr <= (gd->bd->bi_dram[1].start +
>> gd->bd->bi_dram[1].size - 1))))) {
>> +        puts("Address is not in the DDR range\n");
>> +        return CMD_RET_FAILURE;
>> +    }
>> +
>> +    ddrss_memory_ecc_err(start_addr, ecc_err, x);
> 
> results of ddrss_memory_ecc_err, should be reported back as CMD_FAIL or
> CMD_PASS ,
> 
> In case, you are assuming ddrss_memory_ecc_err will always pass . Please
> add in comments.

Will add in the comments.

> 
>> +    return 0;
>> +}
>> +
>> +U_BOOT_CMD(ddrss, 5, 1, do_ddrss_test,
>> +       "DDRSS test",
>> +       "ecc_err <addr in hex> <bit_err in hex> <range 0/1/2> -
>> generate bit errors\n"
>> +       "    in DDR data at <addr>, the command will read a 32-bit
>> data\n"
>> +       "    from <addr>, and write (data ^ bit_err) back to <addr>\n"
>> +       "    in range 0, 1, or 2 (if default full region ECC is
>> enabled, choose 0)\n"
>> +);

Thanks for the review!

-- 
Thanking You
Neha Malcom Francis


More information about the U-Boot mailing list