[U-Boot] [PATCH] Fix board init code to use a valid C runtime environment (LONG, sorry)

Tue Nov 10 08:14:26 CET 2015

Hello Bin,

On Tue, 10 Nov 2015 11:54:08 +0800, Bin Meng <bmeng.cn at gmail.com> wrote:
> Hi Albert,
> 
> On Tue, Nov 10, 2015 at 8:20 AM, Albert ARIBAUD
> <albert.u.boot at aribaud.net> wrote:
> > board_init_f_mem() alters the C runtime environment's
> > stack it ls actually already using. This is not a valid
> > C runtime environment and may conflict with the C compiler's
> > expectations.
> 
> What C compiler's expectations conflicts with current
> board_init_f_mem()? Did you see any actual error on any board?

I did not see any error directly caused by board_init_f_mem()
manipulating its own stack and approximating its own usage of it in
the process --just like, if you'll allow me this analogy, I saw quite a
few worn tires but never saw personally a car crashing because of that;
I do however change my own car's tires when they're worn. It's not a
question of fixing an /failure/, is a question of fixing an /obvious/
/risk/.

Back to board_init_f_mem(): it is a C function, therefore, the calling
context provides a working C environment for it and expects it to not
abuse that environment; and once the function has returned, the calling
context may do whatever it likes with this environment as long as it
does not trash any currently running function's stack use and does not
lose the function's return value.

This is somewhat akin to 'programming by contract': if any side
(calling or callee) breaks the contract, anything could happen. The C
environment provides a writable stack hopefully large enough to
accommodate the function's local storage needs, but nothing says where
that stack comes from or where it will go. The C stack could even be at
a different location for every C function and still respect the caller
vs callee contract.

Now, board_init_f_mem:

The 64 bytes reserve at the start of board_init_f_mem() does in fact
acknowledge a breach of contract: it is an ad hoc fix to the fact that
the C stack passed to board_init_f_mem() is actually the one used by its
prologue for local storage before the first instruction is hit.

And this value is arbitrary: for now it works, but there is no guarantee
that it will work for every architecture and configuration. Worse, add
a few local variables and it /will/ not work any more.

And as I hinted, you don't know what the caller of board_init_f_mem
might do on the stack when it gets control back, and this may include
trashing whatever board_init_f_mem() put there -- Yes, right /now/ we
*do* know; but later on we /might not/ necessarily.

IOW, the current "contract" between board_init_f_mem() and its caller
is unique, diverges from the usual C contract, and is plain /weird/.
which implies that whenever modifying either the caller or callee, one
has to keep that divergent and unusual contract in mind, and pray that
the compiler's own (standard) view won't interfere (see aside below).

So it's best if we keep our code compliant with the assumptions of
the C run-time that the C stack is provided to the /compiler/ of a
function for local variable storage and temporary register saves, not
to the function itself to work on.

[ASIDE]

Unrelated to this patch (at least unrelated what board_init_f_mem
does), here is an example of assuming the compiler will behave one way
and being bitten hard when it does not.

in gcc, option -ffixed-reg directs the compiler to leave register 'reg'
alone so that it keeps its value and can be used as some sort of system-
wide global variable. In ARM, we use -ffixed-r9 (because the ARM EABI
says that of all registers, r9 is the one to be used as that system-wide
global), and r9 points to our GD. This is efficient, and compliant with
the 'contract' which defines how '--fixed-reg' works. As long as the
compiler respects it, we're fine.

Previously, r9 was set to point to GD outside the C environment. Then
came the arch_setup_gd() function which did it from inside the C env,
and that went fine -- after all, that function was the only one which
did affected r9 since in all C functions the /compiler/ wad told to
leave r9 alone by -ffixed-r9, and the user code only touched it in one
place.

Now comes gcc 5.2.1. It does not break the contract: in ARM builds,
nothing changes, and in Thumb-1 builds, the only change is that r9 may
be saved on entry to the function and then restored on exit. This
still leaves r9 alone as far as user code is concerned (I haven't done
ad hoc testing, but the test I've done shows so, see below).

But this breaks arch_setup_gd() in Thumb-1 builds, because now, on
entry it saves r9, then t sets it to GD, then... it restores it.

Once I 'fixed' this by making arch_setup_gd an empty-body function and
setting r9 from crt0.S outside the C environment execution, the Thumb-1
build of U-Boot on Open-RD worked: everywhere GD was accessed by user
code, r9 was correct, despite just about any C function pushing and
popping r9 all the time.

The break of arch_setup_gd() was because of how strict we and the GCC
folks interpret(ed) the 'leave reg alone if -ffixed-reg is given'
contract.

With arch_setup_gd(), we went from stricty assuming 'reg is constant
from the user code perspective' to less strictly assuming 'reg is
constant for both user and compiler-generated code'.

The GCC folks kept strictly assuming the former, and their ARM and
Thumb-1 code generations both respect their constant assumption; but
only their ARM code generation respects ours in gcc 5.2.1.

Had we always assumed the strictest definition for the -ffixed-reg
contract, arch_setup_gd() would not have broken with GCC 5.2.1 Thumb-1
(because it would not have been written in C, due to the fact that by
definition a C function cannot alter the fixed register).

[END OF ASIDE]

That is the same with board_init_f_mem() here: we are not in control of
how the compiler generates a call sequence or the prologue or epilogue,
especially across all architectures and compiler options, and for
instance we can't tell if there is a configuration of compiler options
which breaks the 64-byte assumption.

We /could/ assume that a C function altering its own stack is OK as
long as we are careful; but we've just seen with the r9 aside that
"careful" may not always be careful enough.

Or we can strictly assume that a C function altering its own stack is
*not* OK, and do all stack alterations outside any C function. We can
still leave size computations and data initializations in C functions
where they are clearer and easier to maintain within the constraints of
the C runtime.

Think of it as discovering some out-of-spec value written to a
register's reserved area which apparently cause no harm (and possibly
even were copied from code issued by the manufacturer) because the
reserved area is not used. Then comes a new version of the silicon
where the area is still described as reserved but for which the value
has now a functional effect. Boom. And then (or, more probably, after
the few hours it will take to finally find out that the cause of the
boom was the out-of-spec value) you'll wish that you had fixed that
out-of-spec value years before.

Same here: we have an out-of-spec situation; let's make it in-spec.

Amicalement,
-- 
Albert.