i.MX8MP SPL failures due to memory corruption/overflow?

Emanuele Ghidoli ghidoliemanuele at gmail.com
Wed Mar 15 22:25:14 CET 2023


On 15/03/2023 16:24, Frieder Schrempf wrote:
> On 15.03.23 15:42, Frieder Schrempf wrote:
>> On 15.03.23 15:17, Michael Nazzareno Trimarchi wrote:
>>> Hi
>>>
>>> On Wed, Mar 15, 2023 at 3:13 PM Frieder Schrempf
>>> <frieder.schrempf at kontron.de> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm trying to bring up a new board based on the i.MX8MP and I have an
>>>> issue I'm hoping someone can help solving.
>>>>
>>>> I'm seeing failures in the early SPL code, usually in the DDR
>>>> initialization. Often they look like:
>>>>
>>>>    U-Boot SPL 2023.04-rc3 (Mar 07 2023 - 14:32:34 +0000)
>>>>    Training FAILED
>>>>    Failed to initialize DDR RAM!
>>>>    ### ERROR ### Please RESET the board ###
>>>>
>>>> But sometimes ddr_init() doesn't even return an error and only the
>>>> get_ram_size() afterwards which tries to allocate the memory fails.
>>>>
>>>
>>> In my experience you don't have enough space inside the CPU internal memory.
>>> It means that some stack overlaps with the code. Changing the printf moves
>>> things a bit. So you have a problem, but how it shows up depends on what
>>> you are going to destroy.
>>
>> Thanks for your reply. That's exactly what I'm thinking, too.
>>
>>>
>>>> The strange thing is that the issues appear or disappear
>>>> deterministically on the binary level. This means I sometimes get a
>>>> U-Boot binary which runs just fine in 100% of cases. Then I change for
>>>> example one of the following:
>>>>
>>>> * Adding a single printf() somewhere in the board's spl.c
>>>> * Using the same binary but booting from SD card instead of USB loader
>>>> * Using the same source but switching from the OS cross compiler to the
>>>> one from Yocto/OE
>>>>
>>>> And afterwards I get 100% failure rate with an error as described above.
>>>>
>>>> My suspicion is that there is some memory corruption/conflict. My SPL is
>>>> quite large and I wonder if it exceeds some limit.
>>>>
>>>> SPL is loaded to 0x920000 and CONFIG_SPL_STACK is set to 0x960000, which
>>>> leaves 256 KiB in between for the SPL. But all i.MX8MP boards seem to
>>>> set CONFIG_SPL_MAX_SIZE=0x26000 (152 KiB) for some reason. My
>>>> u-boot-spl-ddr.bin currently has around 193 KiB but I don't get any
>>>> warning about exceeding the SPL_MAX_SIZE.
>>>>
>>>> My questions:
>>>>
>>>> * Why is CONFIG_SPL_MAX_SIZE set to 152 KiB?
>>
>> I guess the remainder between the SPL code and the SPL stack is for the
>> DDR firmware. Which explains why I get failures with SPL exceeding 152
>> KiB size.
> 
> Still, it doesn't really make sense to me at the moment as the
> u-boot-spl-ddr.bin already contains the DDR firmware it should be fine
> to exceed the 152 KiB size. My u-boot-spl.bin (without DDR firmware) is
> only 135 KiB.
> 
> Sorry for spamming you by thinking out loud... ;)
> 
>>
>> Now I also understand the reason why the power init code was implemented
>> using legacy non-DM drivers in other i.MX8MP boards. I probably also
>> need to do this to save some space.
>>
>>>> * Why is there no warning in my case?
>>
>> Still, I fail to see why there isn't any error or where the size check
>> is even implemented.
>>
>>>> * Any other ideas or pointers?
>>>>
>>>> Thanks for your help!
>>>>
>>>> Best regards
>>>> Frieder
> 

Hello,
I ran into a similar problem.

Some hints:
- commit 5004901efb3b ("board_init: Do not reserve MALLOC_F area on stack
   if non-zero MALLOC_F_ADDR") - but you should already have it
- Reduce SPL_SYS_MALLOC_F_LEN (set it to something other than the default).
   Normally that area is not used much. The stack starts before the heap area
   and, if I remember correctly, the start address of the heap area depends on
   this option. Also, its default value is equal to SYS_MALLOC_F_LEN, which is
   normally large.

The suggestions from Rasmus are valuable. I used a rather similar approach to
find that the stack / gd (global data) was overlapping the DDR firmware / cfg.
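
To make this kind of overlap easier to reason about, here is a rough host-side
Python sketch of the arithmetic. The load address, stack address, SPL_MAX_SIZE
and the approximate image size are the numbers quoted in this thread. The
malloc_f length and the gd size are hypothetical placeholders, so substitute the
values from your own .config and build. initial_sp() is a hypothetical helper
that only loosely mirrors how the early init code reserves space below the
stack top; it is not a copy of the real code, and it assumes the whole
u-boot-spl-ddr.bin sits contiguously at the load address.

```python
KIB = 1024

# Values quoted in this thread.
SPL_LOAD_ADDR = 0x920000
SPL_STACK = 0x960000
SPL_MAX_SIZE = 0x26000
IMAGE_SIZE = 193 * KIB  # "around 193 KiB", so only approximate

# Hypothetical placeholders, NOT taken from any real board config.
MALLOC_F_LEN = 0x10000
GD_SIZE = 0x200


def round_down(value, align):
    """Round value down to a multiple of align."""
    return value - (value % align)


def initial_sp(stack_top, malloc_f_len, gd_size, malloc_f_on_stack):
    """Address where the stack would start after the early reservations.

    Simplified model: optionally reserve the malloc_f area below the
    stack top, then reserve gd and align down to 16 bytes.
    """
    top = stack_top
    if malloc_f_on_stack:
        top -= malloc_f_len
    return round_down(top - gd_size, 16)


image_end = SPL_LOAD_ADDR + IMAGE_SIZE
print(f"window load..stack: {(SPL_STACK - SPL_LOAD_ADDR) // KIB} KiB")
print(f"SPL_MAX_SIZE:       {SPL_MAX_SIZE // KIB} KiB")
print(f"image end:          {image_end:#x}")

for on_stack in (True, False):
    sp = initial_sp(SPL_STACK, MALLOC_F_LEN, GD_SIZE, on_stack)
    headroom = sp - image_end  # negative means the stack starts inside the image
    status = "OVERLAP" if headroom < 0 else "ok"
    print(f"malloc_f on stack={on_stack}: sp={sp:#x}, "
          f"headroom={headroom} bytes, {status}")
```

With these placeholder numbers the headroom is negative when the malloc_f area
is reserved on the stack and positive when it is not, which matches the two
hints above. The model ignores actual stack usage and anything else placed in
that memory, so treat a positive result only as a necessary condition.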

Best regards,
Emanuele Ghidoli