[U-Boot] [PATCH] board_f: fix noncached reservation calculation

Stephen Warren swarren at wwwdotorg.org
Tue Aug 27 22:49:32 UTC 2019


On 8/27/19 4:10 PM, Vikas MANOCHA wrote:
> Hi Stephen,
> 
>> -----Original Message-----
>> From: Stephen Warren <swarren at wwwdotorg.org>
>> Sent: Tuesday, August 27, 2019 10:55 AM
>> To: Tom Rini <trini at konsulko.com>
>> Cc: twarren at wwwdotorg.org; u-boot at lists.denx.de; Stephen Warren
>> <swarren at nvidia.com>; Vikas MANOCHA <vikas.manocha at st.com>
>> Subject: [PATCH] board_f: fix noncached reservation calculation
>>
>> From: Stephen Warren <swarren at nvidia.com>
>>
>> The current code in reserve_noncached() has two issues:
>>
>> 1) The first update of gd->start_addr_sp always rounds down to a section
>> start. However, the equivalent calculation in cache.c:noncached_init()
>> always first rounds up to a section start, then subtracts a section size.
>> These two calculations differ if the initial value is already rounded to section
>> alignment.
> 
> It shouldn't cause any issue, first one round down to section size. Second
> one(cache.c: noncached_init()) rounds up, so needs section size subtraction.

Here's an example where it fails, based on code before my patch:

Assume that MMU section size is 2, and that mem_malloc_start and 
gd->start_addr_sp are both 1000M on entry to the functions, and the 
noncached region is 1 (what Jetson TX1 uses). The example uses values 
assumed to be multiples of 1M to make the numbers easier to read.

noncached_init:

// mem_malloc_start = 1000
end = ALIGN(mem_malloc_start, MMU_SECTION_SIZE) - MMU_SECTION_SIZE;
// end = 1000 - 2 = 998 // was already aligned, so 1000 not 1002
size = ALIGN(CONFIG_SYS_NONCACHED_MEMORY, MMU_SECTION_SIZE);
// size = 2
start = end - size;
// start = 998 - 2 = 996
// region is 996...998

reserve_noncached:

// gd->start_addr_sp = 1000
gd->start_addr_sp &= ~(MMU_SECTION_SIZE - 1);
// gd->start_addr_sp = 1000
gd->start_addr_sp -= CONFIG_SYS_NONCACHED_MEMORY;
// gd->start_addr_sp = 1000 - 1 = 999
// region is 999...1000

So, the end of the region that's been reserved is 1000, yet the code 
that sets up the noncached region believes the end of the region is at 
998. Even ignoring the difference in size calculation due to issue (2) 
below, that still means the reservation is in the wrong place, and the 
stack can end up overlaid with the noncached reservation, or even other 
data below it.

>> 2) The second update of gd->start_addr_sp subtracts exactly
>> CONFIG_SYS_NONCACHED_MEMORY, whereas the equivalent calculation in
>> cache.c:noncached_init() rounds the noncached size up to section alignment
>> before subtracting it. The two calculations differ if the noncached region size
>> is not a multiple of the MMU section size.
> 
> Never thought CONFIG_SYS_NON_CACACHED_MEMORY could be non-multiple of
> MMU section size for basic MMU setup in u-boot. It has granularity of section size.
> Is it the case with Jetson TX1 ?

Yes, on Jetson TX1, the MMU section size is 2M, yet the noncached region 
is 1M. Nothing in the README docs for the nocached region state or imply 
that the noncached region needs to be a multiple of the MMU section 
size, and all code that uses the config symbol before your patch rounds 
the config symbol to MMU section size, implying that its value doens't 
need to be rounded already.

>> In practice, one/both of those issues causes a practical problem on Jetson
>> TX1; U-Boot triggers a synchronous abort during initialization, likely due to
>> overlapping use of some memory region.
>>
>> This change fixes both these issues by duplicating the exact calculations from
>> noncached_init() into reserve_noncached().
>>
>> However, this fix assumes that gd->start_addr_sp on entry to
>> reserve_noncached() exactly matches mem_malloc_start on entry to
>> noncached_init(). I haven't traced the code to see whether it absolutely
>> guarantees this in all (or indeed any!) cases. Consequently, I added some
>> comments in the hope that this condition will continue to be true.
> 
> It is enforced it in the code, reserve_noncached is called from 
> reserve_malloc()after malloc area reservation.

That's a bit implicit still; nothing in reserve_malloc sets or uses the 
value of mem_malloc_start, so the two could easily become decoupled if 
the reservation calculations don't match the code which actually sets up 
the region usage, which incidentally is exactly what happened here, and 
hence why I found this bug.

>> diff --git a/common/board_r.c b/common/board_r.c index
>> b7f68bba4a7e..d6fb5047a265 100644
>> --- a/common/board_r.c
>> +++ b/common/board_r.c
>> @@ -247,6 +247,10 @@ static int initr_malloc(void)
>>   	      gd->malloc_ptr / 1024);
>>   #endif
>>   	/* The malloc area is immediately below the monitor copy in DRAM
>> */
>> +	/*
>> +	 * This value MUST match the value of gd->start_addr_sp in
>> board_f.c:
>> +	 * reserve_noncached().
>> +	 */
> 
> minor cosmetic suggestion:
> gd->start_addr_sp is moving pointer, difficult to comprehend sometimes here, same is true for malloc
> area also, how about merging two comments like:
> 
> 	/* The malloc area is immediately below the monitor copy in DRAM followed by noncached
> 	*/

I'd rather have an explicit separate comment which mentions the other 
function and variable names; if someone searches the code later, it's 
more likely they'll find this comment that way. I guess I could replace 
the intermediate /* and */ lines with just * to merge the comments 
without changing the text in them if you want.


More information about the U-Boot mailing list