[U-Boot] [PATCH 3/9] zymqmp: Replace home grown mmu code with generic table approach

Michal Simek michal.simek at xilinx.com
Fri Feb 26 09:29:01 CET 2016


On 26.2.2016 01:49, Alexander Graf wrote:
> 
> 
> On 23.02.16 14:07, Michal Simek wrote:
>> Hi,
>>
>> before I comment the rest. You need to also fix gem driver because it is
>> using 1MB mapping.
>>
>> diff --git a/drivers/net/zynq_gem.c b/drivers/net/zynq_gem.c
>> index b3821c31a91d..cf1376ce1bd7 100644
>> --- a/drivers/net/zynq_gem.c
>> +++ b/drivers/net/zynq_gem.c
>> @@ -150,10 +150,10 @@ struct emac_bd {
>>  };
>>
>>  #define RX_BUF 32
>> -/* Page table entries are set to 1MB, or multiples of 1MB
>> - * (not < 1MB). driver uses less bd's so use 1MB bdspace.
>> +/* Page table entries are set to 2MB, or multiples of 2MB
>> + * (not < 2MB). driver uses less bd's so use 2MB bdspace.
>>   */
>> -#define BD_SPACE       0x100000
>> +#define BD_SPACE       0x200000
>>  /* BD separation space */
>>  #define BD_SEPRN_SPACE (RX_BUF * sizeof(struct emac_bd))
> 
> Looks like I didn't reply to this before. The change above makes a lot
> of space, but is not required for this patch set. We simply break the
> page table down to 4k maps.

Unfortunately 1MB can be divided by 4k pages but with the v1 series I
was testing this driver stopped to work.


> So to keep the bits I touch in this patch set as small as possible, I
> think we should (if we really want to) move this into a separate,
> follow-up patch.

It means this has to be solved before this series is applied because
none wants to break any driver.


>>
>>
>>
>>
>> On 23.2.2016 12:33, Alexander Graf wrote:
>>>
>>>
>>> On 23.02.16 12:04, Michal Simek wrote:
>>>> On 22.2.2016 02:57, Alexander Graf wrote:
>>>>> Now that we have nice table driven page table creating code that gives
>>>>> us everything we need, move to that.
>>>>>
>>>>> Signed-off-by: Alexander Graf <agraf at suse.de>
>>>>> ---
>>>>>  arch/arm/cpu/armv8/zynqmp/cpu.c | 169 ----------------------------------------
>>>>>  include/configs/xilinx_zynqmp.h |  44 +++++++++++
>>>>>  2 files changed, 44 insertions(+), 169 deletions(-)
>>>>>
>>>>> diff --git a/arch/arm/cpu/armv8/zynqmp/cpu.c b/arch/arm/cpu/armv8/zynqmp/cpu.c
>>>>> index c71f291..3f661a9 100644
>>>>> --- a/arch/arm/cpu/armv8/zynqmp/cpu.c
>>>>> +++ b/arch/arm/cpu/armv8/zynqmp/cpu.c
>>>>> @@ -44,172 +44,3 @@ unsigned int zynqmp_get_silicon_version(void)
>>>>>  
>>>>>  	return ZYNQMP_CSU_VERSION_SILICON;
>>>>>  }
>>>>> -
>>>>> -#ifndef CONFIG_SYS_DCACHE_OFF
>>>>> -#include <asm/armv8/mmu.h>
>>>>> -
>>>>> -#define SECTION_SHIFT_L1	30UL
>>>>> -#define SECTION_SHIFT_L2	21UL
>>>>> -#define BLOCK_SIZE_L0		0x8000000000UL
>>>>> -#define BLOCK_SIZE_L1		(1 << SECTION_SHIFT_L1)
>>>>> -#define BLOCK_SIZE_L2		(1 << SECTION_SHIFT_L2)
>>>>> -
>>>>> -#define TCR_TG1_4K		(1 << 31)
>>>>> -#define TCR_EPD1_DISABLE	(1 << 23)
>>>>> -#define ZYNQMO_VA_BITS		40
>>>>> -#define ZYNQMP_TCR		TCR_TG1_4K | \
>>>>> -				TCR_EPD1_DISABLE | \
>>>>> -				TCR_SHARED_OUTER | \
>>>>> -				TCR_SHARED_INNER | \
>>>>> -				TCR_IRGN_WBWA | \
>>>>> -				TCR_ORGN_WBWA | \
>>>>> -				TCR_T0SZ(ZYNQMO_VA_BITS)
>>>>> -
>>>>> -#define MEMORY_ATTR	PMD_SECT_AF | PMD_SECT_INNER_SHARE |	\
>>>>> -			PMD_ATTRINDX(MT_NORMAL) |	\
>>>>> -			PMD_TYPE_SECT
>>>>> -#define DEVICE_ATTR	PMD_SECT_AF | PMD_SECT_PXN |	\
>>>>> -			PMD_SECT_UXN | PMD_ATTRINDX(MT_DEVICE_NGNRNE) |	\
>>>>> -			PMD_TYPE_SECT
>>>>> -
>>>>> -/* 4K size is required to place 512 entries in each level */
>>>>> -#define TLB_TABLE_SIZE	0x1000
>>>>> -
>>>>> -struct attr_tbl {
>>>>> -	u32 num;
>>>>> -	u64 attr;
>>>>> -};
>>>>> -
>>>>> -static struct attr_tbl attr_tbll1t0[4] = { {16, 0x0},
>>>>> -					   {8, DEVICE_ATTR},
>>>>> -					   {32, MEMORY_ATTR},
>>>>> -					   {456, DEVICE_ATTR}
>>>>> -					 };
>>>>> -static struct attr_tbl attr_tbll2t3[4] = { {0x180, DEVICE_ATTR},
>>>>> -					   {0x40, 0x0},
>>>>> -					   {0x3F, DEVICE_ATTR},
>>>>> -					   {0x1, MEMORY_ATTR}
>>>>> -					 };
>>>>> -
>>>>> -/*
>>>>> - * This mmu table looks as below
>>>>> - * Level 0 table contains two entries to 512GB sizes. One is Level1 Table 0
>>>>> - * and other Level1 Table1.
>>>>> - * Level1 Table0 contains entries for each 1GB from 0 to 511GB.
>>>>> - * Level1 Table1 contains entries for each 1GB from 512GB to 1TB.
>>>>> - * Level2 Table0, Level2 Table1, Level2 Table2 and Level2 Table3 contains
>>>>> - * entries for each 2MB starting from 0GB, 1GB, 2GB and 3GB respectively.
>>>>> - */
>>>>> -static void zynqmp_mmu_setup(void)
>>>>> -{
>>>>> -	int el;
>>>>> -	u32 index_attr;
>>>>> -	u64 i, section_l1t0, section_l1t1;
>>>>> -	u64 section_l2t0, section_l2t1, section_l2t2, section_l2t3;
>>>>> -	u64 *level0_table = (u64 *)gd->arch.tlb_addr;
>>>>> -	u64 *level1_table_0 = (u64 *)(gd->arch.tlb_addr + TLB_TABLE_SIZE);
>>>>> -	u64 *level1_table_1 = (u64 *)(gd->arch.tlb_addr + (2 * TLB_TABLE_SIZE));
>>>>> -	u64 *level2_table_0 = (u64 *)(gd->arch.tlb_addr + (3 * TLB_TABLE_SIZE));
>>>>> -	u64 *level2_table_1 = (u64 *)(gd->arch.tlb_addr + (4 * TLB_TABLE_SIZE));
>>>>> -	u64 *level2_table_2 = (u64 *)(gd->arch.tlb_addr + (5 * TLB_TABLE_SIZE));
>>>>> -	u64 *level2_table_3 = (u64 *)(gd->arch.tlb_addr + (6 * TLB_TABLE_SIZE));
>>>>> -
>>>>> -	level0_table[0] =
>>>>> -		(u64)level1_table_0 | PMD_TYPE_TABLE;
>>>>> -	level0_table[1] =
>>>>> -		(u64)level1_table_1 | PMD_TYPE_TABLE;
>>>>> -
>>>>> -	/*
>>>>> -	 * set level 1 table 0, covering 0 to 512GB
>>>>> -	 * set level 1 table 1, covering 512GB to 1TB
>>>>> -	 */
>>>>> -	section_l1t0 = 0;
>>>>> -	section_l1t1 = BLOCK_SIZE_L0;
>>>>> -
>>>>> -	index_attr = 0;
>>>>> -	for (i = 0; i < 512; i++) {
>>>>> -		level1_table_0[i] = section_l1t0;
>>>>> -		level1_table_0[i] |= attr_tbll1t0[index_attr].attr;
>>>>> -		attr_tbll1t0[index_attr].num--;
>>>>> -		if (attr_tbll1t0[index_attr].num == 0)
>>>>> -			index_attr++;
>>>>> -		level1_table_1[i] = section_l1t1;
>>>>> -		level1_table_1[i] |= DEVICE_ATTR;
>>>>> -		section_l1t0 += BLOCK_SIZE_L1;
>>>>> -		section_l1t1 += BLOCK_SIZE_L1;
>>>>> -	}
>>>>> -
>>>>> -	level1_table_0[0] =
>>>>> -		(u64)level2_table_0 | PMD_TYPE_TABLE;
>>>>> -	level1_table_0[1] =
>>>>> -		(u64)level2_table_1 | PMD_TYPE_TABLE;
>>>>> -	level1_table_0[2] =
>>>>> -		(u64)level2_table_2 | PMD_TYPE_TABLE;
>>>>> -	level1_table_0[3] =
>>>>> -		(u64)level2_table_3 | PMD_TYPE_TABLE;
>>>>> -
>>>>> -	section_l2t0 = 0;
>>>>> -	section_l2t1 = section_l2t0 + BLOCK_SIZE_L1; /* 1GB */
>>>>> -	section_l2t2 = section_l2t1 + BLOCK_SIZE_L1; /* 2GB */
>>>>> -	section_l2t3 = section_l2t2 + BLOCK_SIZE_L1; /* 3GB */
>>>>> -
>>>>> -	index_attr = 0;
>>>>> -
>>>>> -	for (i = 0; i < 512; i++) {
>>>>> -		level2_table_0[i] = section_l2t0 | MEMORY_ATTR;
>>>>> -		level2_table_1[i] = section_l2t1 | MEMORY_ATTR;
>>>>> -		level2_table_2[i] = section_l2t2 | DEVICE_ATTR;
>>>>> -		level2_table_3[i] = section_l2t3 |
>>>>> -				    attr_tbll2t3[index_attr].attr;
>>>>> -		attr_tbll2t3[index_attr].num--;
>>>>> -		if (attr_tbll2t3[index_attr].num == 0)
>>>>> -			index_attr++;
>>>>> -		section_l2t0 += BLOCK_SIZE_L2;
>>>>> -		section_l2t1 += BLOCK_SIZE_L2;
>>>>> -		section_l2t2 += BLOCK_SIZE_L2;
>>>>> -		section_l2t3 += BLOCK_SIZE_L2;
>>>>> -	}
>>>>> -
>>>>> -	/* flush new MMU table */
>>>>> -	flush_dcache_range(gd->arch.tlb_addr,
>>>>> -			   gd->arch.tlb_addr + gd->arch.tlb_size);
>>>>> -
>>>>> -	/* point TTBR to the new table */
>>>>> -	el = current_el();
>>>>> -	set_ttbr_tcr_mair(el, gd->arch.tlb_addr,
>>>>> -			  ZYNQMP_TCR, MEMORY_ATTRIBUTES);
>>>>> -
>>>>> -	set_sctlr(get_sctlr() | CR_M);
>>>>> -}
>>>>> -
>>>>> -int arch_cpu_init(void)
>>>>> -{
>>>>> -	icache_enable();
>>>>> -	__asm_invalidate_dcache_all();
>>>>> -	__asm_invalidate_tlb_all();
>>>>> -	return 0;
>>>>> -}
>>>>> -
>>>>> -/*
>>>>> - * This function is called from lib/board.c.
>>>>> - * It recreates MMU table in main memory. MMU and d-cache are enabled earlier.
>>>>> - * There is no need to disable d-cache for this operation.
>>>>> - */
>>>>> -void enable_caches(void)
>>>>> -{
>>>>> -	/* The data cache is not active unless the mmu is enabled */
>>>>> -	if (!(get_sctlr() & CR_M)) {
>>>>> -		invalidate_dcache_all();
>>>>> -		__asm_invalidate_tlb_all();
>>>>> -		zynqmp_mmu_setup();
>>>>> -	}
>>>>> -	puts("Enabling Caches...\n");
>>>>> -
>>>>> -	set_sctlr(get_sctlr() | CR_C);
>>>>> -}
>>>>> -
>>>>> -u64 *arch_get_page_table(void)
>>>>> -{
>>>>> -	return (u64 *)(gd->arch.tlb_addr + 0x3000);
>>>>> -}
>>>>> -#endif
>>>>> diff --git a/include/configs/xilinx_zynqmp.h b/include/configs/xilinx_zynqmp.h
>>>>> index 28622de..439f063 100644
>>>>> --- a/include/configs/xilinx_zynqmp.h
>>>>> +++ b/include/configs/xilinx_zynqmp.h
>>>>> @@ -29,6 +29,50 @@
>>>>>  #define CONFIG_SYS_MEMTEST_START	CONFIG_SYS_SDRAM_BASE
>>>>>  #define CONFIG_SYS_MEMTEST_END		CONFIG_SYS_SDRAM_SIZE
>>>>>  
>>>>> +#define CONFIG_SYS_FULL_VA
>>>>> +#define CONFIG_SYS_MEM_MAP {						\
>>>>> +	{								\
>>>>> +		.base = 0x0UL,						\
>>>>> +		.size = 0x80000000UL,					\
>>>>> +		.attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) |			\
>>>>> +			 PTE_BLOCK_INNER_SHARE				\
>>>>> +	}, {								\
>>>>> +		.base = 0x80000000UL,					\
>>>>> +		.size = 0x70000000UL,					\
>>>>> +		.attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |		\
>>>>> +			 PTE_BLOCK_NON_SHARE |				\
>>>>> +			 PTE_BLOCK_PXN | PTE_BLOCK_UXN			\
>>>>> +	}, {								\
>>>>> +		.base = 0xf8000000UL,					\
>>>>> +		.size = 0x07e00000UL,					\
>>>>> +		.attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |		\
>>>>> +			 PTE_BLOCK_NON_SHARE |				\
>>>>> +			 PTE_BLOCK_PXN | PTE_BLOCK_UXN			\
>>>>> +	}, {								\
>>>>> +		.base = 0xffe00000UL,					\
>>>>> +		.size = 0x00200000UL,					\
>>>>> +		.attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) |			\
>>>>> +			 PTE_BLOCK_INNER_SHARE				\
>>>>> +	}, {								\
>>>>> +		.base = 0x400000000UL,					\
>>>>> +		.size = 0x200000000UL,					\
>>>>> +		.attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |		\
>>>>> +			 PTE_BLOCK_NON_SHARE |				\
>>>>> +			 PTE_BLOCK_PXN | PTE_BLOCK_UXN			\
>>>>> +	}, {								\
>>>>> +		.base = 0x600000000UL,					\
>>>>> +		.size = 0x800000000UL,					\
>>>>> +		.attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) |			\
>>>>> +			 PTE_BLOCK_INNER_SHARE				\
>>>>> +	}, {								\
>>>>> +		.base = 0xe00000000UL,					\
>>>>> +		.size = 0xf200000000UL,					\
>>>>> +		.attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |		\
>>>>> +			 PTE_BLOCK_NON_SHARE |				\
>>>>> +			 PTE_BLOCK_PXN | PTE_BLOCK_UXN			\
>>>>> +	},								\
>>>>> +	}
>>>>> +
>>>>>  /* Have release address at the end of 256MB for now */
>>>>>  #define CPU_RELEASE_ADDR	0xFFFFFF0
>>>>>  
>>>>
>>>> No problem with this default map. I didn't check every entry. Siva will
>>>> look at it.
>>>> My problem is that this is static. You should enable option option to
>>>> generate this table dynamically. The reason is that systems can run just
>>>
>>> The way the code works today, I could easily just use a 0-size element
>>> as array terminator and use an external pointer rather than the local
>>> memory map array.
>>>
>>> However, that contradicts with a runtime memory contrained approach to
>>> determine the page table size. The only good path I could come up with
>>> here is to generate the page table during build time and then save how
>>> big it was. We couldn't do that with runtime changing tables.
>>
>> 0-size element is not enough. The problem are base addresses too.
>> ZynqMP memory map is just this
>>
>> 1. 0-2GB PS memory up to 2GB
>> 2. 2GB-3GB PL part
>> 3. 3GB-4GB PCIE + device + small memories OCM/TCM
>> 4. 4GB-16GB reserved
>> 5. 16-24GB PL part
>> 6. 24-32GB PCIe
>> 7. 32-64GB DDR
>> 8. 64-512GB PL
>> 9. 512-768 PCIe
>> 10. 768-1TB DDR
>> 11. 1TB-16TB PL
>>
>> PL regions 2, 5, 11 can also contain memories but you are able to find
>> them via DT for particular HW design.
>> And start addresses can be in this range.
>> In Xilinx tree I have patches for reading memory configuration from DT
>> and saving them to gd->bd->bi_dram based on CONFIG_NR_DRAM_BANKS setting
>> and I think this should be wired.
>> It means you should setup just memory attributes for memories which
>> u-boot is aware of. For the rest of address space it is not that sensitive.
> 
> I think that makes a lot of sense. The current patch set simply moves
> the logic as it stands today into tables rather than open coded code. To
> modify the tables dynamically as next step sounds like a good idea to me.

ok

>>>> with PL DDR not PS and you can't map memory which is not there which can
>>>> end up in system lock.
>>>> We want to change current code to setup MMU table for memories at run
>>>> time based on DT.
>>>
>>> So in your use case, the regions would still be there, just their size
>>> changes. That means we could still have static tables that then get
>>> modified by runtime code later on to just span less?
>>
>> Not entirely. Definitely the part of table can be static but I tend to
>> have especially memory mapping more dynamic.
> 
> I guess the way the table is now a struct in a .c file works for you? It
> means you can modify it to whatever extent you like, maybe even generate
> it completely dynamically if you wanted to.

I have seen that and will look at it again when I start to work with HW.

Thanks,
Michal




More information about the U-Boot mailing list