[PATCH] riscv: bypass malloc when spl fit boots from ram

Wed Dec 28 05:26:00 CET 2022

On 12/27/22 21:22, Rick Chen wrote:
> Hi Samuel,
> 
> Samuel Holland <samuel at sholland.org> 於 2022年12月28日 週三 上午10:47寫道：
>>
>> On 12/22/22 01:21, Rick Chen wrote:
>>> When fit image boots from ram, the payload will
>>> be prepared in the address of SPL_LOAD_FIT_ADDRESS.
>>> In spl fit generic flow, it will malloc another
>>> memory address and copy whole fit image to this
>>> malloc address.  But it is un-necessary for booting
>>> from RAM.
>>
>> This should mostly be solved by using `mkimage -E` or binman's
>> `fit,external-offset`. Then only the FIT structure, without any data,
>> will be copied.
>>
>>> This patch improves this flow by declare the
>>> board_spl_fit_buffer_addr() to replace the original one.
>>> The larger image size (eq: Kernel Image 10~20MB), it
>>> can save more booting time.
>>>
>>> Also enhance memcpy function by checking source and
>>> destination address. If they are the same address,
>>> just return and don't copy data anymore.
>>>
>>> Signed-off-by: Rick Chen <rick at andestech.com>
>>> ---
>>>  arch/riscv/lib/memcpy.S |  2 ++
>>>  arch/riscv/lib/spl.c    | 16 ++++++++++++++++
>>>  2 files changed, 18 insertions(+)
>>>
>>> diff --git a/arch/riscv/lib/memcpy.S b/arch/riscv/lib/memcpy.S
>>> index 00672c19ad..9884077c93 100644
>>> --- a/arch/riscv/lib/memcpy.S
>>> +++ b/arch/riscv/lib/memcpy.S
>>> @@ -9,6 +9,7 @@
>>>  /* void *memcpy(void *, const void *, size_t) */
>>>  ENTRY(__memcpy)
>>>  WEAK(memcpy)
>>> +     beq     a0, a1, .copy_end
>>
>> I see one call to memcpy() in common/spl/spl_ram.c. I think it would
>> make more sense to do the check there instead of adding a branch to
>> every memcpy() call.
> 
> There is not only one memcpy being called in spl_ram.c during spl
> booting progress, but also in spl_fit.c .

Either your FIT uses external data, and the one in spl_ram.c is trivial,
or it uses internal data, and the one in spl_fit.c is unavoidable.
Either way, there is only one place where this change makes a difference.

> Otherwise I prefer not to modify it in generic flow level but arch
> level, because the src and dest checking
> may not be necessary for other booting devices.

I see arch/arm/lib/memcpy.S has similar logic, so other code may be
depending on this optimization, and there is precedent for this change.

>>
>>>       /* Save for return value */
>>>       mv      t6, a0
>>>
>>> @@ -121,6 +122,7 @@ WEAK(memcpy)
>>>  2:
>>>
>>>       mv      a0, t6
>>> +.copy_end:
>>>       ret
>>>
>>>  .Lmisaligned_word_copy:
>>> diff --git a/arch/riscv/lib/spl.c b/arch/riscv/lib/spl.c
>>> index f4d3b67e5d..18f86ee207 100644
>>> --- a/arch/riscv/lib/spl.c
>>> +++ b/arch/riscv/lib/spl.c
>>> @@ -61,3 +61,19 @@ void __noreturn jump_to_image_no_args(struct spl_image_info *spl_image)
>>>  #endif
>>>       image_entry(gd->arch.boot_hart, fdt_blob);
>>>  }
>>> +
>>> +#ifdef CONFIG_SPL_RAM_SUPPORT
>>> +struct legacy_img_hdr *spl_get_load_buffer(ssize_t offset, size_t size)
>>> +{
>>> +     return (void *)(CONFIG_SPL_LOAD_FIT_ADDRESS + offset);
>>> +}
>>> +
>>> +void *board_spl_fit_buffer_addr(ulong fit_size, int sectors, int bl_len)
>>> +{
>>> +     void *buf;
>>> +
>>> +     buf = spl_get_load_buffer(0, sectors * bl_len);
>>> +
>>> +     return buf;
>>> +}
>>> +#endif
>>
>> You cannot provide strong definitions of these functions at an
>> architecture level, as this prevents any board from overriding them.
> 
> Originally I wanted to put it in arch/riscv/cpu/ax25/spl.c .
> However It is a fundamental improvement for spl_ram flow, but not
> likely a private change.
> That's why I put it here, so that all boards of RISC-V can be benefited.

This change isn't in any way specific to RISC-V. If you want to make
this generic, it would belong in spl_ram.c. But I still think it is
inappropriate to provide a strong definition of board_* functions
outside board code.

Regards,
Samuel