drivers/ram/rockchip/sdram_common.c :: sdram_detect_dbw() LPDDR3/2 calculations seem wrong
Bjoern A. Zeeb
bzeeb-lists at lists.zabbadoz.net
Wed Nov 11 10:46:24 CET 2020
Hi,
I got a nanopc-t4 amongst others which shipped with:
DDR Version 1.15 20181010
Channel 0: LPDDR3, 933MHz
Bus Width=32 Col=10 Bank=8 Row=15/15 CS=2 Die Bus-Width=32 Size=2048MB
..
I have since upgraded to more recent u-boot versions:
U-Boot TPL 2020.07 (Sep 27 2020 - 12:34:15)
Channel 0: LPDDR3, 933MHz
BW=32 Col=10 Bk=8 CS0 Row=15 CS1 Row=15 CS=2 Die BW=16 Size=2048MB
U-Boot TPL 2020.10 (Nov 10 2020 - 13:37:45)
Channel 0: LPDDR3, 933MHz
BW=32 Col=10 Bk=8 CS0 Row=15 CS1 Row=15 CS=2 Die BW=16 Size=2048MB
The machine was highly unstable, showing memory and locking issues.
When only using two little cores, it was a lot more stable.
I went and also tried:
DDR Version 1.24 20191016
Channel 0: LPDDR3, 933MHz
Bus Width=32 Col=10 Bank=8 Row=15/15 CS=2 Die Bus-Width=16 Size=2048MB
which seems to match recent u-boot, but all of them differ from the
original Die BW of 32, which I currently assume to be correct for the
Samsung K4E6E304EC-EGCG (so possibly the error also migrated into
rockchip-linux/rkbin?).
Looking at sdram_common.c::sdram_detect_dbw()
	cs_cap = (1 << (row + col + bk + bw - 20));
	if (bw == 2) {
		if (cs_cap <= 0x2000000) /* 256Mb */
			die_bw_0 = (col < 9) ? 2 : 1;
		else if (cs_cap <= 0x10000000) /* 2Gb */
			die_bw_0 = (col < 10) ? 2 : 1;
		else if (cs_cap <= 0x40000000) /* 8Gb */
			die_bw_0 = (col < 11) ? 2 : 1;
		else
			die_bw_0 = (col < 12) ? 2 : 1;
		if (cs > 1) {
			row = cap_info->cs1_row;
			cs_cap = (1 << (row + col + bk + bw - 20));
			if (cs_cap <= 0x2000000) /* 256Mb */
				die_bw_0 = (col < 9) ? 2 : 1;
			else if (cs_cap <= 0x10000000) /* 2Gb */
				die_bw_0 = (col < 10) ? 2 : 1;
			else if (cs_cap <= 0x40000000) /* 8Gb */
				die_bw_0 = (col < 11) ? 2 : 1;
			else
				die_bw_0 = (col < 12) ? 2 : 1;
		}
	} else {
cs_cap is off by 20 bits compared to the values it is being compared
against; in my case it is 0x400 and not 0x40000000:
type 6 row 15 col 10 bk 3 cs 2 bw 2 cs_cap 8 cs1_row 15
1 << (15 + 10 + 3 + 2 - 20) == 1 << 10 == 0x400
The same applies to the second case, which uses cs1_row when cs > 1.
Now I know very little about all the memory chips out there but it seems
very unlikely to regain these 20 bits in these calculations.
So either the "- 20" has to go or the values cs_cap is compared against
need adjustment.
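To make the mismatch concrete, here is a small standalone sketch. It is
not U-Boot code: the function names are made up for illustration, and
only the shift expression and the three thresholds are copied from the
snippet quoted above. It assumes that simply dropping the "- 20" is one
of the two possible fixes, which I have not confirmed to be the right
one.

```c
#include <assert.h>
#include <stdint.h>

/* Thresholds copied from the quoted sdram_detect_dbw() snippet. */
#define CAP_LIMIT_0 0x2000000u
#define CAP_LIMIT_1 0x10000000u
#define CAP_LIMIT_2 0x40000000u

/* cs_cap as the quoted code computes it, including the "- 20". */
uint64_t cs_cap_scaled(unsigned row, unsigned col, unsigned bk, unsigned bw)
{
	unsigned bits = row + col + bk + bw;

	/* Guard against a negative or oversized shift count. */
	assert(bits >= 20 && bits - 20 < 64);
	return 1ULL << (bits - 20);
}

/* Hypothetical variant with the "- 20" dropped. */
uint64_t cs_cap_unscaled(unsigned row, unsigned col, unsigned bk, unsigned bw)
{
	unsigned bits = row + col + bk + bw;

	assert(bits < 64);
	return 1ULL << bits;
}

/* The bw == 2 comparison chain from the quoted snippet, for one CS. */
unsigned die_bw_from_cap(uint64_t cs_cap, unsigned col)
{
	if (cs_cap <= CAP_LIMIT_0)
		return (col < 9) ? 2 : 1;
	else if (cs_cap <= CAP_LIMIT_1)
		return (col < 10) ? 2 : 1;
	else if (cs_cap <= CAP_LIMIT_2)
		return (col < 11) ? 2 : 1;
	return (col < 12) ? 2 : 1;
}
```

With the values from my board (row 15, col 10, bk 3, bw 2) the scaled
variant always lands in the first branch and yields 1, while the
unscaled variant lands in the third branch and yields 2. Assuming the
printed Die BW is derived from this value, that would match the
difference between the 16 and 32 seen in the boot logs above.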
The problem now comes from the fact that cap_info->dbw gets the wrong
value from die_bw_0 this way and given it is LPDDR3 I assume that
set_cap_relate_config() in sdram_rk3399.c later restores the wrong
values for the “memdata_ratio”.
There might be more problems lingering, but after changing this my
machine became a lot more reliable. I still see memory errors when I
push it to its (temperature) limits running on all 6 cores, even with
decent cooling, but that might be a secondary problem.
Can someone with a lot more insight into this magic have a look and if
needed please fix it?
/bz