[U-Boot] Early malloc() summary

Sun Aug 19 15:21:23 CEST 2012

Hello Graeme!

On Fri, Aug 17, 2012 at 3:15 AM, Graeme Russ <graeme.russ at gmail.com> wrote:
> dm_malloc(bytes, driver *)
>   |
>   +-> early_malloc(bytes, reloc_helper *)  /* Pre-Relocation */
>   |     |
>   |     +->register_helper(reloc_helper *)
>   |     |
>   |     +->pre_reloc_malloc(size_t bytes)
>   |
>   +-> malloc(bytes)                        /* Post-Relocation */
>
>
> Drivers call dm_malloc(), helper functions call early_malloc()
>
> dm_malloc() is implemented in the DM core code and checks for whether the
> call is pre- or post- relocation. If pre-relocation, it checks for the
> driver having a relocation helper (or the 'I don't need one' flag)
>
> early_malloc() is implemented in the early malloc code, separate from the
> DM code.
>
> **** WARNING!!! STOP READING NOW!!! ****
>
> early_malloc() registers a relocation function (if provided) which will be
> called during relocation. DM core will strip this out as it will (for the
> time being) handle the calling of the relocation helper for each of the
> registered drivers. In the long term, I think that responsibility might be
> able to be taken away from DM core (but there may be call-order issues that
> might make that impossible)
>
> The way I imagine it in the future, any code that might possibly allocate
> memory prior to relocation would do something like:
>
> static int my_relocator(void *data)
> {
>   struct foo *new_bar;
>
>   new_bar = malloc(sizeof(struct foo));
>   memcpy(new_bar, data, sizeof(struct foo));
>
>   /* Tweak internal new_bar members */
>
>   return 0;
> }
>
> int some_function()
> {
>   struct foo *bar;
>
>   bar = malloc(sizeof(struct foo));
>   register_helper(bar, my_relocator);
>
>   return 0;
> }
>
>
> And behind the scenes we have:
>
> data = malloc(bytes);
>           |
>           +->data = pre_reloc_malloc(size_t bytes)   /* Pre-Relocation */
>           |     |
>           |     +->add_to_reloc_list(data)
>           |     |
>           |     +->return data;
>           |
>           +->malloc(size_t bytes);                   /* Post-Relocation */
>
> register_helper(data, reloc_helper *)
>           |
>           +->update_reloc_list(data, reloc_helper *) /* Pre-Relocation */
>           |
>           +->Do Nothing                              /* Post-Relocation */
>
> During relocation, the 'reloc list' is processed. Each 'data' entry with no
> 'reloc_helper' will elicit a (debug) warning to let you know about data
> that was allocated but will not be relocated.

OK, I got this. It seems to me that everything starts with
pre_reloc_malloc(), and I think it is roughly equivalent to the
void *early_malloc(size_t) function in my previous experimental patches.
But I am not sure that pre_reloc_malloc() is the right name for this
function. On archs without a strict separation of board_init_f and
board_init_r, where U-Boot runs in RAM from the very beginning and no
relocation is needed (microblaze, nios2, openrisc, sh), the name does
not reflect the actual use: it is simply the function used to obtain
an allocation from early_heap. And I think that on those architectures
we still need early_heap and a working dm_malloc() before the true
malloc() is initialized, because we might need to create the DM tree
before malloc is initialized in order to facilitate the DM part of the
actual memory and malloc initialization.
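
To illustrate the naming point, here is a minimal sketch of an allocator
front end that switches on "is the true malloc() usable yet?" rather than
on "has relocation happened?". All identifiers in it (dm_alloc,
early_heap_alloc, malloc_ready, EARLY_HEAP_SIZE, is_early_ptr) are
assumptions made up for this example and are not existing U-Boot API; the
early heap is a plain static buffer only to keep the example
self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* All names below are hypothetical; none of them is existing U-Boot API. */
#define EARLY_HEAP_SIZE 64

/* The union only forces pointer alignment of the byte buffer. */
union early_heap_storage {
	unsigned char bytes[EARLY_HEAP_SIZE];
	void *align;
};

union early_heap_storage early_heap;
size_t early_heap_used;
int malloc_ready;	/* set once the true malloc() is initialized */

/* Bump allocation from the fixed early heap; NULL when it is exhausted. */
void *early_heap_alloc(size_t bytes)
{
	size_t aligned = (bytes + sizeof(void *) - 1) & ~(sizeof(void *) - 1);
	void *p;

	/* The first test catches wrap-around caused by the rounding. */
	if (aligned < bytes || aligned > EARLY_HEAP_SIZE - early_heap_used)
		return NULL;

	p = &early_heap.bytes[early_heap_used];
	early_heap_used += aligned;
	return p;
}

/* Callers do not need to know which allocator is currently active. */
void *dm_alloc(size_t bytes)
{
	if (malloc_ready)
		return malloc(bytes);

	return early_heap_alloc(bytes);
}

/* Tell whether a pointer was handed out from the early heap. */
int is_early_ptr(const void *p)
{
	uintptr_t addr = (uintptr_t)p;
	uintptr_t base = (uintptr_t)early_heap.bytes;

	return addr >= base && addr < base + EARLY_HEAP_SIZE;
}
```

With such a switch the same front end works on archs that relocate and
on those that run from RAM from the start; only the moment at which the
flag is set differs.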

I am thinking about a way to obtain some space for the first
early_heap (assuming that I have the heap header you suggested some
time ago, with a void *next_early_heap member for future expansion
through arch-specific or CPU/board-specific ways of grabbing
non-contiguous early_heap). Do you know of an elegant way to obtain
early_heap space that would work on all the architectures in question?
It occurred to me that I could steal the space from the early stack
with something like this:

#define DECLARE_EARLY_HEAP_ON_STACK \
	char __early_heap[CONFIG_SYS_EARLY_HEAP_SIZE]; \
	gd->early_heap_first = (void *)__early_heap

void board_init_f()
{
	...
	/* memset() of gd happens here */
	...
	DECLARE_EARLY_HEAP_ON_STACK;
	...
}

Although this is more or less architecture independent, I am not sure
whether grabbing early_heap space like that is acceptable. We would need
a sensible value of CONFIG_SYS_EARLY_HEAP_SIZE, and it is perhaps not
feasible on x86, which has three init stages (board_init_f,
board_init_f_r and board_init_r): the stack is lost between
board_init_f and board_init_f_r, but the true malloc() is initialized
only in board_init_r, if I understand it correctly.
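
For discussion, a rough sketch of how a header with a next_early_heap
pointer could let the first region live in a caller-provided buffer (for
example on the early stack) and be extended later with further,
non-contiguous regions. Only the next_early_heap name comes from the
text above; the struct layout, the size/used fields and the functions
early_heap_init(), early_heap_add() and early_heap_alloc() are
assumptions invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header placed at the start of every early heap region. */
struct early_heap_hdr {
	struct early_heap_hdr *next_early_heap;	/* next region or NULL */
	size_t size;	/* total bytes in this region, header included */
	size_t used;	/* bytes consumed so far, header included */
};

#define EARLY_ALIGN	sizeof(void *)
#define EARLY_ROUND(n)	(((n) + EARLY_ALIGN - 1) & ~(EARLY_ALIGN - 1))

/* Turn a raw, pointer-aligned buffer into an empty early heap region. */
struct early_heap_hdr *early_heap_init(void *buf, size_t size)
{
	struct early_heap_hdr *h = buf;

	assert(((uintptr_t)buf % EARLY_ALIGN) == 0);
	assert(size >= EARLY_ROUND(sizeof(*h)));

	h->next_early_heap = NULL;
	h->size = size;
	h->used = EARLY_ROUND(sizeof(*h));
	return h;
}

/* Append a further region to the end of the chain. */
void early_heap_add(struct early_heap_hdr *first, struct early_heap_hdr *extra)
{
	while (first->next_early_heap)
		first = first->next_early_heap;
	first->next_early_heap = extra;
}

/* Bump allocation from the first region in the chain that has room. */
void *early_heap_alloc(struct early_heap_hdr *first, size_t bytes)
{
	size_t need = EARLY_ROUND(bytes);
	struct early_heap_hdr *h;

	if (need < bytes)	/* rounding wrapped around */
		return NULL;

	for (h = first; h; h = h->next_early_heap) {
		if (need <= h->size - h->used) {
			void *p = (unsigned char *)h + h->used;

			h->used += need;
			return p;
		}
	}
	return NULL;
}
```

A region placed on the stack stays valid only as long as that stack
frame does, which is exactly the x86 concern above; a chained region
obtained some other way would not have that limitation.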

My intention is to keep the prospective patch with early_heap and
pre_reloc_malloc() relatively low-profile and to avoid unnecessary
architecture/CPU/board-specific code where possible. Anyway, I think
we are going to need as little as 20 bytes of early_heap for the root
DM node on the vast majority of boards, so we could start with a
really small early_heap.

What do you think?

Tomas

-- 
Tomáš Hlaváček <tmshlvck at gmail.com>