[PATCH] Revert "spl: Drop bd_info in the data section"

Alex G. mr.nuke.me at gmail.com
Mon Apr 12 15:51:11 CEST 2021



On 4/12/21 8:25 AM, Tom Rini wrote:
> On Fri, Apr 09, 2021 at 05:29:36PM -0700, Tim Harvey wrote:
>> On Fri, Apr 9, 2021 at 1:53 PM Tom Rini <trini at konsulko.com> wrote:
>>>
>>> On Fri, Apr 09, 2021 at 03:24:41PM -0500, Adam Ford wrote:
>>>> On Fri, Apr 9, 2021 at 2:20 PM Alex G. <mr.nuke.me at gmail.com> wrote:
>>>>>
>>>>> Hi Simon
>>>>>
>>>>> On 4/8/21 6:55 PM, Simon Glass wrote:
>>>>>> Hi Alexandru,
>>>>>>
>>>>>> On Fri, 9 Apr 2021 at 04:56, Alexandru Gagniuc <mr.nuke.me at gmail.com> wrote:
>>>>>>>
>>>>>>> This reverts commit 38d6b7ebdaee3e0e8426ef1b9df88bdce8ae2e75.
>>>>>>>
>>>>>>> struct global_data contains a pointer to the bd_info structure. This
>>>>>>> pointer was populated spl_set_bd() to a pre-allocated bd_info in the
>>>>>>> ".data" section. The referenced commit replaced this mechanism to one
>>>>>>> that uses malloc(). That new mechanism is only used if SPL_ALLOC_BD=y.
>>>>>>> which very few boards do.
>>>>>>>
>>>>>>> The result is that (struct global_data)->bd is NULL in SPL on most
>>>>>>> platforms. This breaks falcon mode, since arch_fixup_fdt() tries to
>>>>>>> access (struct global_data)->bd and set the "/memory" node in the
>>>>>>> devicetree. The result is that the "/memory" node contains garbage
>>>>>>> values, causing linux to panic() as it sets up the page table.
>>>>>>>
>>>>>>> Instead of trying to fix the mess, potentially causing other issues,
>>>>>>> revert to the code that worked, while this change is reworked.
>>>>>>
>>>>>> The goal here is to drop a feature that few boards use and reduce
>>>>>> memory usage in SPL. It has been in place for two releases now.
>>>>>>
>>>>>> If Falcon mode needs it, perhaps you could add an 'imply' in the
>>>>>> Kconfig for that feature? Is there one? Or perhaps
>>>>>> CONFIG_ARCH_FIXUP_FDT_MEMORY ?
>>>>>>
>>>>>> One option would be to return an error in arch_fixup_fdt(). In
>>>>>> general, fixups make things tricky because there is no way to
>>>>>> determine when they are used but at least this one has a CONFIG.
>>>>>>
>>>>>
>>>>> The argument that this has been in place for two releases is incorrect.
>>>>> Commit 38d6b7ebdaee3e0e8426ef1b9df88bdce8ae2e75 was only introduced with
>>>>> the v2021.04 release. It definitely was not in 2021.01. It's only in the
>>>>> last release, which is four days old t the time of this writing.
>>>
>>> Yes.  But another way of saying that is that we're 4 days in to the
>>> merge window.  That's a bit early to say we must revert the change.  If
>>> this was just before the release, yes, revert would be the right answer.
>>> It's also the case the original commit fixes some cases while also
>>> saving size, if I read it right.  So a strict revert isn't right, we'd
>>> need to also probably also default y SPL_ALLOC_BD in some cases.
>>>
>>>>> Although I was able to find one example, the reality is that we don't
>>>>> know the full extent of the breakage. The prudent thing at this point is
>>>>> to revert.
>>>>>
>>>>> The knowledge of how to init the platform is in the devicetree and code.
>>>>> Why should kconfig also be involved in storing this knowledge? By this
>>>>> model, as the number of boards increases without bounds, every "if"
>>>>> predicate tends to be Kconfig driven. That is not maintainable, and why
>>>>> I think the original change --and the proposed fixes-- are broken by design.
>>>>>
>>>>> Furthermore, I'm happy to discuss what to do about Falcon mode, and if
>>>>> we should kill it entirely (I have an alternative proposal).  But we
>>>>> shouldn't have that discussion in a manner holding my platform hostage.
>>>>> To be fair to you, I don't think reverting a 64-byte savings in .data
>>>>> will hold your platform hostage either.
>>>>
>>>> That original patch broke a bunch of OMAP boards, but enabling
>>>> SPL_ALLOC_BD was the solution to fix the issue.
>>>> Can you try enabling SPL_ALLOC_BD and see if that fixes it?  Maybe we
>>>> can make falcon mode imply it.
>>>
>>> It would be "select" since it needs it rather than imply.
>>>
>>
>> I just ran into this as well finding that commit 38d6b7ebdaee ("spl:
>> Drop bd_info in the data section") breaks Gateworks Ventana and
>> defining SPL_ALLOC_BD does resolve it. In this case, Falcon mode is
>> not being used and the breakage is because arch/arm/mach-imx/spl.c
>> dram_init_banksize() accesses gd->bd which is NULL.
>>
>> So I would guess that this probably broke a whole lot of IMX based
>> boards that use SPL.
>>
>> I don't quite know what the best solution is... we now have a v2021.04
>> that is broken for several or many boards and I dont' know if its
>> clear what cases break.
> 
> Looking forward, I think we need to rework this.  Simon, I gather you
> have some platforms where we need to set gd->bd to something that we
> malloc() ?  So perhaps spl_set_bd() should have been __weak so that
> architectures / platforms could override as needed, but also move it
> just past mem_malloc_init().

Let's try and avoid making new weak functions. Why not introduce a 
spl_platform_alloc_bd(), that each plat- *must* implement? No diversion 
to Kconfig and no __weak__ required.

Alex


More information about the U-Boot mailing list