RPi4 U-Boot freeze

Sean Anderson seanga2 at gmail.com
Sat Sep 19 23:20:54 CEST 2020


On 9/19/20 7:55 AM, Stefan Agner wrote:
> On 2020-09-14 10:15, Matthias Brugger wrote:
>> On 10/09/2020 23:12, Stefan Agner wrote:
>>> On 2020-09-07 16:36, Peter Robinson wrote:
>>>>> Any thoughts on this issue?
>>>>
>>>> Any reason why you're using 2020.01 and not at least 2020.07, or at
>>>> least seeing if it's reproducible on 2020.10-rc3? The RPi4 support has
>>>> changed quite a bit since then I suspect.
>>>>
>>>
>>> Hi Peter,
>>>
>>> It's a stable release and we support a couple of devices with the same
>>> U-Boot version. I'd rather prefer to stay with 2020.01 for RPi4 as well.
>>>
>>> We are on 2020.07 on development branch, and it does work fine there. So
>>> I thought it can't be that hard, just bisect and backport whatever fixes
>>> it... Unfortunately, it seems that there is no particular commit which
>>> fixes it (the bisect ended up in a random unrelated commit, and it seems
>>> that the issue appears/disappears depending on alignment/size...).
>>>
>>> I also did applied pretty much every RPi4 related commit made after
>>> 2020.01 up until master back to 2020.01, no success either.
>>>
>>
>> Which version of the Raspberry Pi firmware did you use?
>> Unfortunately changes in the FW breaks stuff on U-Boot from time to time.
>>
> 
> Ok, I am now able to reproduce the issue on master as well as 2020.07
> with standard rpi_4_32b_defconfig, but I still need to have parts of a
> change which seems to trigger the issue in. From what I can tell, the
> change *really* should not lead to a freeze. The change is just
> accessing global variables from the data section... (see below).
> 
> To me it still seems as if relocation somehow did not work correctly in
> one way or another.
> 
> Are there maybe restrictions in U-Boot when the data section can be
> accessed? E.g. is it not legal to access the data section from the
> serial driver?

One thing bit me recently, and might be relevant here. Because
putc_retry is initialized to zero, it is located in bss and not data.
In U-Boot, bss is not accessable before relocation. The serial driver is
one of the devices which U-Boot needs before relocating, so setting
putc_retry may overwrite data in the device tree. To get around this,
you could try adding __attribute__((section(".data"))) to that variable.

--Sean

> 
> 
> diff --git a/drivers/serial/serial_bcm283x_mu.c
> b/drivers/serial/serial_bcm283x_mu.c
> index 8a4af87eb6..74de6801ab 100644
> --- a/drivers/serial/serial_bcm283x_mu.c
> +++ b/drivers/serial/serial_bcm283x_mu.c
> @@ -50,7 +50,8 @@ struct bcm283x_mu_regs {
>  struct bcm283x_mu_priv {
>         struct bcm283x_mu_regs *regs;
>  };
> -
> +static char *fs_argv[15];
> +static uint32_t putc_retry = 0;
>  static int bcm283x_mu_serial_getc(struct udevice *dev);
>  
>  static int bcm283x_mu_serial_setbrg(struct udevice *dev, int baudrate)
> @@ -95,6 +96,8 @@ static int bcm283x_mu_serial_putc(struct udevice *dev,
> const char data)
>         struct bcm283x_mu_priv *priv = dev_get_priv(dev);
>         struct bcm283x_mu_regs *regs = priv->regs;
>  
> +       putc_retry++;
> +
>         /* Wait until there is space in the FIFO */
>         if (!(readl(&regs->lsr) & BCM283X_MU_LSR_TX_EMPTY))
>                 return -EAGAIN;
> @@ -162,6 +165,10 @@ static int bcm283x_mu_serial_probe(struct udevice
> *dev)
>         struct bcm283x_mu_priv *priv = dev_get_priv(dev);
>         fdt_addr_t addr;
>  
> +       /* Make sure compiler does not optimize out this fs_argv
> instance */
> +       if (fs_argv[0])
> +               fs_argv[0] = "test";
> +
>         /* Don't spawn the device if it's not muxed */
>         if (!bcm283x_is_serial_muxed())
>                 return -ENODEV;
> 
> Most curious of all, it seems that the name (!!!) of the variable
> fs_argv matters! I am not sure if that changes order of variables in
> data section or something. I can also reproduce the issue with two
> compilers (GCC 8.3 and GCC 9.2), so a compiler error seems somewhat
> unlikely...
> 
> Any ideas? I am a bit out of idea how to debug this (I guess JTAG/gdb
> might help, but I don't have such a setup).
> 
> FWIW, I plan to just drop the change which seems to at least partially
> cause the isssue
> (https://github.com/home-assistant/operating-system/blob/dev/buildroot-external/board/raspberrypi/patches/uboot/0002-avoid-block-uart-write.patch).
> Still I think there is something wrong which will show itself someday in
> a certain configuration.
> 
> --
> Stefan
> 
> 
>> Regards,
>> Mathias
>>
>>> I fear that the problem in fact is still in master, but only appears if
>>> certain things align a certain way... That is why I thought I bring it
>>> up, to see if anybody else has noticed something along this lines. We do
>>> have a rather trimmed down configuration, which might make the problem
>>> appear more (fit in a D cache...).
>>>
>>> I probably will just disable cached around relocation for 2020.01 and
>>> see if it resurfaces on development branch.
>>>
>>> --
>>> Stefan
>>>
>>>
>>>>> Just to be sure, I did some memory testing on the 2GB module, but no
>>>>> issues found.
>>>>>
>>>>> I still somehow suspected that something else might be wrong with my
>>>>> hardware, so I bought a new RPi4 (this time with 4GB of RAM) but the
>>>>> very same with that:
>>>>>
>>>>> U-Boot 2020.01 (Aug 23 2020 - 22:02:31 +0000)
>>>>>
>>>>> DRAM:  3.9 GiB
>>>>> <freeze>
>>>>>
>>>>> I still think there is something wrong with caching. From what I
>>>>> understand caches are enabled by the RPi (4) firmware. Is it safe to run
>>>>> 32-bit ARM U-Boot with enabled caches?
>>>>>
>>>>> --
>>>>> Stefan
>>>>>
>>>>> On 2020-08-23 19:06, Stefan Agner wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I noticed a quite common freeze when running 32-bit U-Boot 2020.01
>>>>>> (rpi_4_32b_defconfig) on a 2GB RPi4 model:
>>>>>>
>>>>>> U-Boot 2020.01 (Aug 07 2020 - 13:00:23 +0000)
>>>>>>
>>>>>> DRAM:  1.9 GiB
>>>>>> <freeze, no more output>
>>>>>>
>>>>>> This happens fairly often, I would say 4 out of 5 boot tries. However,
>>>>>> if it boots, everything seems to run fine.
>>>>>>
>>>>>> The issue seems to go away when using 2020.04 or any newer release,
>>>>>> however, when trying to find the actual patch fixing the issue using git
>>>>>> bisect I ended up with a MMC merge request which really seems unrelated
>>>>>> (36bdcf7f3b). It seems that the problem is quite evasive and disappears
>>>>>> if certain structure are aligned differently etc.
>>>>>>
>>>>>> Enabling initcall debugging showed that U-Boot crashes right after
>>>>>> relocation:
>>>>>>
>>>>>> ...
>>>>>> initcall: 00016f2c
>>>>>>
>>>>>> RAM Configuration:
>>>>>> Bank #0: 0 948 MiB
>>>>>> Bank #1: 40000000 1 GiB
>>>>>> Bank #2: 0 0 Bytes
>>>>>> Bank #3: 0 0 Bytes
>>>>>>
>>>>>> DRAM:  1.9 GiB
>>>>>> initcall: 00016bb8
>>>>>> New Stack Pointer is: 3af6d9e0
>>>>>> initcall: 00016da4
>>>>>> initcall: 00016ef0
>>>>>> initcall: 00016ef8
>>>>>> initcall: 00016d38
>>>>>> Relocation Offset is: 3b375000
>>>>>> Relocating to 3b37d000, new gd at 3af78ec0, sp at 3af6d9e0
>>>>>> initcall: 00016ec8 [clear_bss]
>>>>>> initcall: 0004465c [display_options?? only appears sometimes]
>>>>>> <freeze>
>>>>>>
>>>>>> I realized when using CONFIG_SYS_(I|D)CACHE_OFF=y the problem seems to
>>>>>> disappear. But to be 100% certain that it is cache related, I used my
>>>>>> original configuration (which is known to "reliably" freeze), and
>>>>>> replaced 00016ec8 with 00008688 manually in the binary, essentially
>>>>>> swapping out function pointers in "init_sequence_f" [00008688 is
>>>>>> cleanup_before_linux, which flushes and disables I-cache/D-cache]. And
>>>>>> indeed, that hacked up binary does boot reliably every time:
>>>>>>
>>>>>> ...
>>>>>> initcall: 00016f2c
>>>>>>
>>>>>> RAM Configuration:
>>>>>> Bank #0: 0 948 MiB
>>>>>> Bank #1: 40000000 1 GiB
>>>>>> Bank #2: 0 0 Bytes
>>>>>> Bank #3: 0 0 Bytes
>>>>>>
>>>>>> DRAM:  1.9 GiB
>>>>>> initcall: 00016bb8
>>>>>> New Stack Pointer is: 3af6d9e0
>>>>>> initcall: 00016da4
>>>>>> initcall: 00016ef0
>>>>>> initcall: 00016ef8
>>>>>> initcall: 00016d38
>>>>>> Relocation Offset is: 3b375000
>>>>>> Relocating to 3b37d000, new gd at 3af78ec0, sp at 3af6d9e0
>>>>>> initcall: 00008688
>>>>>> initcall: 3b38c10c
>>>>>> initcall: 3b38c114
>>>>>> initcall: 000172e0 (relocated to 3b38c2e0)
>>>>>> initcall: 0001712c (relocated to 3b38c12c)
>>>>>> ...
>>>>>>
>>>>>>  From what I understand on RPi4 caches are enabled when entering U-Boot.
>>>>>> I was wondering if the relocation code really can handle that?
>>>>>>
>>>>>> --
>>>>>> Stefan
>>>



More information about the U-Boot mailing list