[U-Boot] [PATCH v3 1/6] spl: add Kconfig option to clear bss early

Simon Goldschmidt simon.k.r.goldschmidt at gmail.com
Sun Mar 31 20:01:34 UTC 2019



On 30.03.19 23:37, Simon Glass wrote:
> Hi Simon,
> 
> On Sat, 30 Mar 2019 at 15:40, Simon Goldschmidt
> <simon.k.r.goldschmidt at gmail.com> wrote:
>>
>>
>>
>> Simon Glass <sjg at chromium.org> schrieb am Sa., 30. März 2019, 22:18:
>>>
>>> Hi Simon,
>>>
>>> On Sat, 30 Mar 2019 at 14:50, Simon Goldschmidt
>>> <simon.k.r.goldschmidt at gmail.com> wrote:
>>>>
>>>>
>>>>
>>>> Simon Goldschmidt <simon.k.r.goldschmidt at gmail.com> schrieb am Sa., 30. März 2019, 21:18:
>>>>>
>>>>>
>>>>>
>>>>> Simon Glass <sjg at chromium.org> schrieb am Sa., 30. März 2019, 21:06:
>>>>>>
>>>>>> Hi Simon,
>>>>>>
>>>>>> On Wed, 27 Mar 2019 at 13:40, Simon Goldschmidt
>>>>>> <simon.k.r.goldschmidt at gmail.com> wrote:
>>>>>>>
>>>>>>> This introduces a new Kconfig option SPL_CLEAR_BSS_F. If enabled, it clears
>>>>>>> the bss before calling board_init_f() instead of clearing it before calling
>>>>>>> board_init_r().
>>>>>>>
>>>>>>> This also ensures that variables placed in BSS can be shared between
>>>>>>> board_init_f() and board_init_r() in SPL.
>>>>>>>
>>>>>>> Such global variables are used, for example, when loading things from FAT
>>>>>>> before SDRAM is available: the full heap required for FAT uses global
>>>>>>> variables and clearing BSS after board_init_f() would reset the heap state.
>>>>>>> An example for such a usage is socfpa_arria10 where an FPGA configuration
>>>>>>> is required before SDRAM can be used.
>>>>>>>
>>>>>>> Make the new option depend on ARM for now until more implementations follow.
>>>>>>>
>>>>>>
>>>>>> I still have objections to this series and I think we should discuss
>>>>>> other ways of solving this problem.
>>>>>>
>>>>>> Does socfgpa have SRAM that could be used before SDRAM is available?
>>>>>> If so, can we not use that for the configuration? What various are
>>>>>> actually in BSS that are needed before board_init_r() is called? Can
>>>>>> they not be in a struct created from malloc()?
>>>>>
>>>>>
>>>>> The problem is the board needs to load an FPGA configuration from FAT before SDRAM is available. Yes, this is loaded into SRAM of course, but the whole code until that is done uses so many malloc/free iterations that The simple mall of implementation would require too much memory.
>>>>>
>>>>> And it's the full malloc state variables only that use BSS, not the FAT code.
>>>>>
>>>>> One way out could be to move the full mall of state variables into 'gd'...
>>>>
>>>>
>>>> Maybe I should point out again that the whole purpose of this series is to have an SPL that uses full malloc right from the start. This is currently not possible as full malloc needs BSS.
>>>
>>> Right, and our assumption/design is that full malloc() requires SRAM.
>>>
>>> Here we have an architecture that requires FAT just to init its SDRAM.
>>> FAT requires malloc() and free() and there is not enough SRAM to skip
>>> the free() calls. So we have to use full malloc() and that uses BSS.
>>> But BSS is not available before board_init_r(). But we need to init
>>> SDRAM before board_init_r().
>>>
>>> Firstly I'd point out that someone should speak to the chip designers.
>>> Did anyone try to boot U-Boot on the SoC model?
>>
>>
>> Well, it's a U-Boot thing to load it from FAT. It could probably be loaded from RAW mmc without problems, so I don't know if it's a chip designers issue. I think it's an issue that we need to fix in U-Boot: we have a good full malloc implementation but it's not usable in all cases were it should be.
> 
> OK then why use FAT? I assumed it was a boot-ROM requirement. How does
> the boot ROM load SPL?

Honestly, I don't know. It's Altera/Intel that decided for this boot 
flow. However, since FPGA development is often done from Windows, I 
guess it's convenient for FPGA development to write the resulting binary 
to a FAT partition on a sdcard.

> 
>>
>>>
>>> I think it is possible to change dlmalloc to put its variables in a
>>> struct. Then I suppose the struct pointer could be kept in gd. That
>>> would be one solution. Then full malloc() could be inited in
>>> spl_common_init() perhaps.
>>
>>
>> That might be worth a try.
> 
> Yes shouldn't be too painful. I suspect it would be neutral on code size, too.
> 
>>
>>>
>>> Another option might be to update the FAT code to use
>>> ALLOC_CACHE_ALIGN_BUFFER() instead of malloc(), as it already does in
>>> places, and perhaps to disable all caching for this case. Then it
>>> might work with simple malloc().
>>
>>
>> Hmm, then the next platform will have problems because allocating via malloc would be preferable. If really rather fix using dlmalloc instead.
> 
> Hmm but there is no obvious analysis behind your preference. If we
> have code like this:
> 
> do_something_with_fat()
> {
>     void *buf = malloc(...);
> 
>   ...
> 
>     free(buf);
> }
> 
> it seems to me we should convert it to use the stack instead, unless
> there is some recursion problem, etc. Otherwise we are just causing
> pointless churn on malloc().

I don't think it's that easy. There are platforms where a big stack size 
hurts. I don't think such big buffers like MAX_CLUSTERSIZE should be 
allocated on stack.

> 
>>
>>>
>>> Another option would be to put the FPGA image in a known position on
>>> the media, outside the FAT partition. The position could perhaps be
>>> written into a FAT file, and reading just that file in SPL might not
>>> involve many free() operations.
>>
>>
>> Sounds like a workaround, too. I think the U-Boot infrastructure should work for the boards, not placing restrictions on them.
> 
> I'll await your answer to my first question in this email before
> passing judgement.
> 
>>
>>>
>>> I hesitate to suggest enhancing simple malloc() to support a free
>>> list. That would increase the code size and I'm not sure it would be
>>> better than using full malloc().
>>
>>
>> Yes, I think that might result in making a second dlmalloc.... :-)
>>
>> Thanks for your thoughts and input!
> 
> Overall I feel that the current trade-offs and phases of boot are
> reasonable. We should be suspicious of attempts to make them more
> complex.

I think denying BSS access in SPL before board_init_r is a bit outdated. 
In times where board_init_f was ASM, it could have been OK, but nowadays 
where this is C and you can't check the code easily to *not* use BSS, 
I'm not too sure this limitation should holf.

Anyway, I realize there are people wanting to keep this up, so I'll work 
in that direction.

The initial intention of this series was to use full heap just like tiny 
heap in SPL. Tiny heap is configured via Kconfig by setting its size, 
full heap (available some time during board_init_r) currently needs 
CONFIG_SYS_SPL_MALLOC_SIZE and CONFIG_SYS_SPL_MALLOC_START

I'd prefer it if the full heap implementation could use the same 
allocation strategy as the tiny heap. That should make configuration 
easier, not more complex. Maybe I started it the wrong way, maybe you 
have a better idea of how to make is simpler in that direction?

Regards,
Simon

> 
> Regards,
> Simon
> 


More information about the U-Boot mailing list