[V1 PATCH 1/2] rockchip: sdram: Support getting banks from TPL for rk3568 and rk3588

Fri Apr 5 13:19:23 CEST 2024

Hi Chris,

On 4/4/24 23:33, Chris Morgan wrote:
> On Tue, Apr 02, 2024 at 06:38:59PM +0200, Quentin Schulz wrote:
>> Hi Chris,
>>
>> On 4/1/24 20:14, Chris Morgan wrote:
>>> From: Chris Morgan <macromorgan at hotmail.com>
>>>
>>> Allow RK3568 and RK3588 based boards to get the RAM bank configuration
>>> from the ROCKCHIP_TPL stage instead of the current logic. This fixes
>>> both an issue where 256MB of RAM is blocked for devices with >= 4GB
>>> of RAM and where memory holes need to be defined for devices with
>>>> = 16GB of RAM. In the event that neither SOC is used and the
>>> ROCKCHIP_TPL stage is not used, fall back to existing logic.
>>>
>>> Signed-off-by: Chris Morgan <macromorgan at hotmail.com>
>>> ---
>>>    arch/arm/mach-rockchip/sdram.c | 100 +++++++++++++++++++++++++++++++++
>>>    1 file changed, 100 insertions(+)
>>>
>>> diff --git a/arch/arm/mach-rockchip/sdram.c b/arch/arm/mach-rockchip/sdram.c
>>> index 0d9a0aef6f..e02fb03c5f 100644
>>> --- a/arch/arm/mach-rockchip/sdram.c
>>> +++ b/arch/arm/mach-rockchip/sdram.c
>>> @@ -12,6 +12,7 @@
>>>    #include <asm/io.h>
>>>    #include <asm/arch-rockchip/sdram.h>
>>>    #include <dm/uclass-internal.h>
>>> +#include <linux/errno.h>
>>>    DECLARE_GLOBAL_DATA_PTR;
>>> @@ -35,11 +36,110 @@ struct tos_parameter_t {
>>>    	s64 reserve[8];
>>>    };
>>> +/* Tag magic */
>>> +#define ATAGS_CORE_MAGIC	0x54410001
>>> +#define ATAGS_DDR_MEM_MAGIC	0x54410052
>>> +
>>> +/* Tag size and offset */
>>> +#define ATAGS_SIZE		SZ_8K
>>> +#define ATAGS_OFFSET		(SZ_2M - ATAGS_SIZE)
>>> +#define ATAGS_PHYS_BASE		(CFG_SYS_SDRAM_BASE + ATAGS_OFFSET)
>>> +
>>> +/* ATAGS memory structure. */
>>> +struct tag_ddr_mem {
>>> +	u32 count;
>>> +	u32 version;
>>> +	u64 bank[20];
>>> +	u32 flags;
>>> +	u32 data[2];
>>> +	u32 hash;
>>> +} __packed;
>>> +
>>> +/**
>>> + * rockchip_dram_init_banksize() - Get RAM banks from Rockchip TPL
>>> + *
>>> + * Iterate through the defined ATAGS memory location to first find a
>>> + * valid core header, then find a valid ddr_info header. Sanity check
>>> + * the number of banks found. Then, iterate through the data to add
>>> + * each individual memory bank. Perform fixups on memory banks that
>>> + * overlap with a reserved space. If an error condition is received,
>>> + * it is expected that memory bank setup will fall back on existing
>>> + * logic. If ROCKCHIP_EXTERNAL_TPL is false then immediately return,
>>> + * and if neither ROCKCHIP_RK3588 or ROCKCHIP_RK3568 is enabled
>>> + * immediately return.
>>> + *
>>> + * Return number of banks found on success or negative on error.
>>> + */
>>> +__weak int rockchip_dram_init_banksize(void)
>>> +{
>>> +	struct tag_ddr_mem *ddr_info;
>>> +	size_t val;
>>> +	size_t addr = ATAGS_PHYS_BASE;
>>
>> I think this should be phys_addr_t instead of size_t?
>>
>> size_t is an unsigned long on aarch64 and phys_addr_t is an unsigned long
>> long so 4B vs 8B.
>>
>> This however would likely prevent us from reusing this code on aarch32
>> machines, but maybe it's a problem for the people who'll look into
>> supporting this :) (also, aarch32 and >= 3.75GiB may be a bit optimistic :)
>> ).
> 
> Could I just specify a size and not worry about it? A u32 should be more
> than enough to hold the maximum RAM address of an RK3588 board (32GB).
> That would allow this to work on both 32 and 64 right? Otherwise I could
> further restrict this code to the AARCH64 ifdef.
> 

There's even a bigger issue here as phys_t is an 8B structure (on 
Aarch64) while the atags are actually 4B aligned, so that would 
unnecessarily increase the complexity for arithmetic on those addresses. 
So u32 would probably be fine then.

>>
>>> +	int i;
>>> +
>>
>> u8 is plenty enough here :)
> 
> I use it as a return value where there are negative numbers (though
> obviously this should never be negative since it's an increment
> counter). Does that matter?
> 

You could return i still and let the compiler do the conversion.

No, it's not a blocker, but that may unnecessarily increase the size of 
the TPL.

[...]

>>> +			break;
>>> +		addr += 4;
>>
>> This is an incorrect step size, addr is 4B, so this will result in 16B
>> increments, which may be too much. Additionally, we shouldn't read every 4B
>> as the tag is only ever guaranteed to be 4B aligned, not that we would have
>> a tag every 4B. This also means that it's possible somehow the content of a
>> tag at a 4B-aligned offset has the CORE_MAGIC for some reason, but we
>> shouldn't match on it.
>>
> 
> I'm not quite sure I follow. Are you saying I need to increment every
> 4 * the value of size in the tag_header? The value I show is 0x5 in
> my header meaning increment every 0x14?
> 

What we have in memory is (each 4B) (correct me if I misread Rockchip's 
code)

tag1.size = 6
tag1.magic
data1[0]
data1[1]
data1[2]
data1[3]
tag2.size = 4
tag2.magic
data2[0]
data2[1]

...

You start with addr = &tag1.size
when you do addr +4, you now have addr= &data1[2]

By casting data1[2] into a tag struct, you would have

tagX.size = data1[2]
tagX.magic = data1[3]

If somehow data1[3] matches the magic, you'll detect tag data as a tag 
header, and that's no good. Also, you may be missing a tag by checking 
every 16B, which isn't guaranteed by Rockchip's ATAGS (only guaranteed 
to be 4B aligned)

tag1.size is the size of the tag1 in multiples of 4B (u32), so the 
header + data.

That's what I got from

"""
#define tag_next(t)     ((struct tag *)((u32 *)(t) + (t)->hdr.size))
"""

because of pointer arithmetic there.

[...]

>>> +	}
>>> +	if (addr >= (ATAGS_PHYS_BASE + ATAGS_SIZE))
>>> +		return -ENODATA;
>>> +
>>> +	ddr_info = (void *)addr + 4;
>>
>> It seems that arithmetic operations on void pointers is illegal in the C
>> standard. This also quite obfuscate what you want to do here.
>>
>> Considering that in this patch you're iterating for each 4B until you get
>> the MAGIC, the next 4B are data for that header of that magic.
>>
>> If you go for the tag_header I suggested above, I would recommend to do the
>> following instead:
>>
>> """
>> ddr_info = (u8*)addr + sizeof(addr);
>> """
>>
>> This is a bit more explicit wrt to what's expected, we want to have the
>> address of the data right after the tag_header we point to currently. (and
>> in-code comments also do not hurt :) ).
>>
> 
> Should I use a (u8*) though? While that works for now, if the ATAGS get
> put at a higher offset wouldn't something like a u32 make more sense?
> 

That's a good point, but because of pointer arithmetic you cannot simply 
have

"""
ddr_info = (u32*)addr + sizeof(addr);
"""

Because that would make ddr_info 4* 4B (sizeof(addr) * u32 pointer 
arithmetic).

What you could do is:

"""
bool core_found = false;
u32* addr = (void *)ATAGS_PHYS_BASE;
struct tag_header *tag = addr;

while (addr < (ATAGS_PHYS_BASE + ATAGS_SIZE)) {
	val = tag->magic;
	if (!core_found) {
         	if (val == ATAGS_CORE_MAGIC)
			core_found = true;
         } else if (val == ATAGS_DDR_MEM_MAGIC) {
		break;
	}

	addr += tag->size;
         tag = addr;
}

if (addr >= (ATAGS_PHYS_BASE + ATAGS_SIZE))
	return -ENODATA;

ddr_info = addr + sizeof(tag) / sizeof(u32);
if (!ddr_info->count || ddr_info->count > CONFIG_NR_DRAM_BANKS)
	return -ENODATA;
"""

(NOT COMPILED/TESTED)

Cheers,
Quentin