[PATCH v5 11/11] riscv: Add FPIOA and GPIO support for Kendryte K210

Sat Sep 5 16:40:08 CEST 2020

On 9/2/20 11:59 AM, Sean Anderson wrote:
> On 9/2/20 8:26 AM, Heinrich Schuchardt wrote:
>> After adding some debug functions the error appeared and disappeared
>> when changing the code in function panic(). So my guess is that there is
>> some alignment problem in the static data section.
> 
> I investigated this further using the following script
> 
> while true; do
> 	sed -i 's/nop();$/nop(); nop();/g' board/sipeed/maix/maix.c &&
> 	git commit --amend --no-edit board/sipeed/maix/maix.c &&
> 	CROSS_COMPILE=riscv64-linux-gnu- make -j$(nproc) &&
> 	kflash -tp /dev/ttyUSB0 -B bit_mic -b 1500000 u-boot-dtb.bin
> done
> 
> To start this process, create a commit which adds a nop() to
> board/sipeed/maix/maix.c. On every iteration, this script will amend
> that commit by adding another nop. I tried up to 65 nops. If the amount
> of nops is 0, 24, 28, 29, 30, 31, 32, 40, 44, 46, 49, 56, 60, 61, 62,
> 63, or 64 the board fails to boot. Of these failures, all printed up to
> "DRAM: ..." except for those with 28, 29, 30, 31, 60, 61, 62, or 64
> nops. There is clearly a pattern with failures occurring at or near
> (but not always exactly on) multiples of 4, and in the lead-up to
> multiples of 32.

These patterns are caused by two different bugs, which I will refer to as the
"multiple-of-four" bug and the "periodic-32" bug.
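
To make the two patterns easier to see, here is a small sketch which
prints each failing nop count from the list quoted above together with
its remainder modulo 4 and modulo 32. The function names (residues,
print_table) are only for illustration, and this is plain arithmetic on
the reported numbers. It does not by itself say which bug caused which
failure.

```shell
#!/bin/sh
# Failing nop counts, copied from the report quoted above.
FAILURES="0 24 28 29 30 31 32 40 44 46 49 56 60 61 62 63 64"

# residues N: print N, N mod 4, and N mod 32 on one line.
residues() {
	echo "$1 $(($1 % 4)) $(($1 % 32))"
}

# print_table: one header line, then one line per failing count.
print_table() {
	echo "nops mod4 mod32"
	for n in $FAILURES; do
		residues "$n"
	done
}

print_table
```

Counts such as 46 and 49 have a non-zero remainder modulo 4 and are not
in the 28..31 range modulo 32, which is why the pattern is described as
"at or near" rather than exact.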

The multiple-of-four bug is fixed by [1]. This bug is not present in
u-boot/master atm, but I am not sure why. Perhaps adding more drivers
which depend on the device tree triggers the behavior.

>> On 01.09.20 03:19, Rick Chen wrote:
>>> To see if this way can pin down which instruction or the crucial code
>>> to cause the bus hang problem. And guess what maybe the root-cause.
>>>
>>> If you can find the instruction which may cause the bus hang, you can
>>> info all-registers and compare the differences between NG and OK. And
>>> guess what maybe the root-cause.
>>
>> Trying to narrow down on the problem I found the following:
>>
>> The system hangs before arch_cpu_init_dm() is called.
> 
> This is not always the case. On most boots, the following output is
> present:
> 
> U-Boot 2020.10-rc3-00045-g7532b003f0 (Sep 02 2020 - 11:09:16 -0400)
> 
> DRAM:  8 MiB
> 
> which means at least everything up to dram_init gets called.

This output is symptomatic of the multiple-of-four bug. After applying
[1], either a successful boot should take place, or there should be no
output at all (corresponding to the periodic-32 bug).

>>>
>>> Maybe you can try to set a break and access the bus, if the bus access
>>> fail, then you re-set a break a bit ahead until the bus access NOT
>>> fail.
> 
> Yeah, I was investigating that, however I was unable to get the k210 to
> break at 0x80000000. I suspect this may be a problem with openocd, as
> the k210 port is rather buggy (e.g. it can cause address misaligned
> errors, and sometimes leaves the pc in the debug section of memory). I
> *can* get it to break in the otp (0x88000000), so perhaps I just need to
> identify the address before it jumps to U-Boot.

When attempting to boot with a U-Boot which has the periodic-32 bug,
there is no output, even with an early uart and debug logs enabled. I
have tried using two versions of openocd to determine the cause. The kendryte
openocd [2] supports attaching to either core of the k210, but must be
restarted to debug the other core. It also keeps the non-debugged core
halted. The riscv openocd [3] only supports debugging core 0. However,
it contains numerous bugfixes and improvements which the kendryte
openocd does not contain.

When attaching the kendryte openocd, the following output (among other
things) is printed:

Core [1] halted at 0x80015590 due to debug interrupt
Core [0] halted at 0x88008c00 due to debug interrupt

Core 1's pc is located within _sifive_serial_putc, and this is
reflected by the console containing a partially-printed announcement
for the first printed initcall. This initcall is initf_bootstage, which
is the first initcall after log_init. If openocd is attached to core 1,
and it is resumed, then U-Boot boots as normal (in contrast to the
behavior exhibited if openocd is not attached).

What is more interesting is the pc of core 0. It is located in the boot
rom of the K210. According to [4] (and as verified by myself), this
address corresponds with the uarths_getc function. This function is
only called by isp_run. This means core 0 is stuck in ISP mode, which is
used to flash firmware.
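
As a quick sanity check on the two halt addresses, the sketch below
sorts a pc value by the two base addresses mentioned in this thread
(0x80000000 and 0x88000000). The helper name pc_region is made up for
illustration, and since region sizes are not given here, it only
compares against the bases. Treat the result as a rough hint, not a
memory map.

```shell
#!/bin/sh
# Base addresses mentioned in this thread. Region sizes are not given
# here, so only the bases are compared (an assumption of this sketch).
RAM_BASE=0x80000000   # start of the region U-Boot runs from
ROM_BASE=0x88000000   # start of the boot rom / otp region

# pc_region ADDR: print "rom", "ram", or "other" for a pc value.
pc_region() {
	addr=$(($1))
	if [ "$addr" -ge "$(($ROM_BASE))" ]; then
		echo rom
	elif [ "$addr" -ge "$(($RAM_BASE))" ]; then
		echo ram
	else
		echo other
	fi
}

pc_region 0x80015590   # core 1, prints "ram"
pc_region 0x88008c00   # core 0, prints "rom"
```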

Attaching a debugger clearly affects the boot process of this core, but
I am not sure in what manner. I am attempting to modify the riscv
openocd to support multiple cores, but it is slow going (there is a lot
of undocumented code with side-effects).

--Sean

[1] https://patchwork.ozlabs.org/project/uboot/patch/20200905132211.412711-1-seanga2@gmail.com/
[2] https://github.com/kendryte/openocd-kendryte
[3] https://github.com/riscv/riscv-openocd
[4] https://github.com/kelas/k210-sdk-stuff/blob/master/rom/k210.rom.h