[U-Boot] [RFC] [PATCH] rewrite doc/README.arm-unaligned-accesses

Albert ARIBAUD albert.u.boot at aribaud.net
Fri Feb 21 11:25:36 CET 2014


Hi Tom,

On Thu, 20 Feb 2014 16:34:44 -0500, Tom Rini <trini at ti.com> wrote:

> On Tue, Feb 18, 2014 at 01:52:42PM +0100, Albert ARIBAUD wrote:
> 
> > There has been a few back-and-forths (and sideways too) about how
> > unaligned accesses are considered in ARM U-Boot. This post is to (try
> > and) get us together in one place, get things straight about what is
> > currently done and why as far as alignment is concerned, and to get
> > doc/README.arm-unaligned-accesses is clear and consistent with it.
> 
> I agree that we need to get on the same page and understand what is and
> isn't (and is and cannot be done for us) going on, so we can get back to
> bigger problems that need solving.
> 
> I wonder if we shouldn't largely crib from
> https://www.kernel.org/doc/Documentation/unaligned-memory-access.txt and
> modify the examples to be relevant to our sources, and perhaps drop the
> networking section (as that doesn't directly apply).
> 
> And as an aside, building with -Wcast-align shows some places we _may_
> need to investigate.

Thanks for the suggestion. If investigation shows it to be a good sign
of some alignment issue, then maybe we could add it systematically if
it does not adversely affect build or run performance. 

> [snip]
> > +ii) In order to make sure the following is self-sufficient, it goes
> > +through the basics of alignment and assumes only good, not expert,
> > +knowledge of the C language.
> > +
> > +1. C99 alignment requirements
> > +
> > +The C99 standard [1] (henceforth: 'C99') defines alignment requirements
> > +as a "requirement that objects of a particular type be located on
> > +storage boundaries with addresses that are particular multiples of a
> > +byte address".
> > +
> > +In C99, unaligned accesses (those which which do not respect alignment
> > +requirements of the object type being accessed) are deemed 'undefined'.
> > +This means programs which contain unaligned accesses might build and
> > +execute as if there were no alignment constraints, or build and execute
> > +but not as expected, or build and crash at execution, or not build
> > +at all--or even crash the compiler.
> 
> OK.
> 
> > +2. Implementation alignment requirements
> > +
> > +While C99 does define alignment requirements, it does not lay out any
> > +actual alignment requirements, because these depend greatly on the
> > +hardware involved; they are thus defined by C99 implementations.
> > +
> > +For ARM, the C alignment requirements are laid out in the ARM EABI [2].
> > +For instance, is is the ARM EABI which sets the alignment constraint of
> > +type 'long' to four-byte boundaries.
> > +
> > +Alignment requirements for a given architecture may differ from
> > +hardware capabilities, i.e. they might be stricter than what the
> > +hardware can actually do. One example is (32-bit) x86, which can do
> > +unaligned accesses at the hardware level at some performance cost, but
> > +has stricter requirement analogous to the ARM EABI ones. ARM is a mixed
> > +bag: some older ARM hardware cannot perform unaligned access at all;
> > +some can but at a cost; some can at virtually no cost.
> > +
> > +Further, even when a given architecture is capable of emitting such
> > +unaligned accesses at the core or CPU level, at a higher (system) level
> > +they might be forbidden because the target address falls within a
> > +region in which only aligned accesses are possible.
> 
> The compiler must make reasonable decisions based on what it knows about
> a particular architecture and ABI based on what it can know.  The
> developer must make correct decisions based on things the compiler
> cannot know such as memory region constraints.

OK -- is this an addition suggestion?

> > +3. Native vs emulated unaligned accesses.
> > +
> > +There is a different between the alignment of accesses in a C program
> > +source code and the alignment of accesses at the core or CPU level. In
> > +the following, translated accesses (core or CPU accesses) will be
> > +qualified as /native/ accesses to distinguish them from (untranslated) C
> > +program accesses.
> > +
> > +A C99 implementation might translate an unaligned access into a native
> > +unaligned access, or it might emulate it by a combination of native
> > +aligned accesses. 
> 
> This isn't relevant, given the contraint on the compiler to do sane
> things for a given architecture.

"Sane things", yes; but not one sane thing only: for instance, both
options are sane with an architecture where native unaligned accesses
are possible at some cost, possibly without one option being clearly
better than the other" so there is a choice here.

So I think this is relevant here, in order to inform the reader that
there are options in the way unaligned accesses should be translated,
and that this option might have to be decided on a case-by-case, or
even access-by-access, basis.

I probably should make that explicit in the text.

> > +4. Alignment requirements in U-Boot
> > +
> > +As U-Boot runs on a variety of hardware using a variety of C99
> > +implementations, alignment requirements for the U-Boot code are the
> > +strictest possible so that all implementations building U-Boot will have
> > +their requirements satisfied. For instance, aligning longs to four-byte
> > +boundaries is compatible with all implementations and is thus the
> > +requirement for U-Boot.
> > +
> > +[U-Boot alignment requirements are actually stricter than just C99: in
> > +U-Boot, at least some accesses of a given width must not be broken into
> > +smaller accesses or lumped into a wider access, because they are done
> > +to or from a device for which each physical access may have side
> > +effects. C99 does not explicitly state such a constraint except in
> > +general terms, that a C99 implementation .]
> > +
> > +The strict alignment requirements imply that all type declarations in
> > +U-Boot should respect alignment requirements; especially, that no
> > +structure should contain unaligned members.
> 
> This is where I moderately disagree.  I agree we don't want to make
> things unaligned for no reason in data structures.
>
> > +5. Purposeful unaligned accesses in U-Boot
> > +
> > +However, there might be cases where some data representation may need
> > +to include objects which are not aligned to their requirements. This
> > +happens in structs representing protocols such as IP or USB, for
> > +instance, or de facto standards such as FAT. Taking FAT as an example,
> 
> Right.  We should combine 4 and 5 to an extent.  We don't normally want,
> but have cases we end up with it.

Agreed on this and the previous comment; I'll rewrite sections 4 and 5
to make it clearer that the alignment rule on struct fields is not
entirely strict.

> > +here is the start of the FAT header, expressed as a succession of C
> > +types from position 0 in the first disk sector:
> > +
> > +        0       3       jump instruction
> > +        3       8       Name of the tool which formatted the FAT
> > +        11      2       Number of bytes per sector
> > +
> > +Since the third field is a short, as per ARM EABI it should be
> > +even-aligned. However, it is not: its offset is 11, an odd value.
> > +Therefore, a struct representing the FAT header (examples below are
> > +adapted from actual code in fs/fdos/dos.h) would either have to split
> > +the field in two smaller bytes...
> > +
> > +    typedef struct bootsector
> > +    {
> > +        unsigned char jump [3];  /* 0  Jump to boot code */
> > +        char banner[BANNER_LG];	 /* 3  OEM name & version */
> > +        unsigned char secsiz_lo; /* 11 Bytes per sector LO */
> > +        unsigned char secsiz_hi; /* 12 Bytes per sector HI */
> > +        [...]
> > +    } Directory_t;
> > +
> > +... but this is awkward and makes accesses to the field more
> > +complicated to express and read. The alternative is to use a short
> > +field, made unaligned by way of a 'packed' attribute on the struct:
> > +
> > +    typedef struct bootsector
> > +    {
> > +        unsigned char jump [3]; /* 0  Jump to boot code */
> > +        char banner[BANNER_LG];	/* 3  OEM name & version */
> > +        unsigned short secsiz;  /* 11 Bytes per sector */
> > +        [...]
> > +    } __attribute__ ((packed))  Directory_t;
> > +
> > +The latter is what is done in the U-Boot code, and the access is then
> > +explicitly turned from one unaligned 16-bit to two aligned 8-bit ones
> > +by using the macro __le16_to_cpu (which is slightly below U-Boot
> > +requirements, as we should be using put_unaligned_le16 instead).
> 
> FWIW, this doesn't match the U-Boot sources where we just let the
> compiler do padding and now that I type that I wonder if that's not part
> of the blame for some of our fatwrite problems?

Hmm... I'd taken the code straight from my master branch. Which part of
the FAT code does padding? In any case, you are right that we need to
investigate.

> But here's the non-problem, demonstrated with the 'gpt write' code on an
> armv5te target.  Having described a struct to the compiler, and given
> the compiler knows architecture aligment requirements it will either go:
> (a) The struct is not packed, invisble padding time
> (b) The struct is packed, this is unaligned, do it correctly.

I'd say (b) is "do it as instructed" rather than "do it correctly",
precisely because options exist to control how unaligned accesses go.

Apart from this, I agree about the packed vs non-packed behavior --
which of course is not specific to ARMv5te or even ARM. It is pretty
much a de facto standard although to be complete, (b) does not
conform to C99 per se.

However, ARMv5TE is not able to do native unaligned accesses, and the
following does not seem specific to ARMv5TE so I am not sure why
ARMv5TE specifically matters here.

> > +6. Unintended unaligned accesses in U-Boot
> 
> We indeed don't want to allow actually bad things to be written and
> work.  On processors such as ARM where we may, or may not, do a silent
> fixup, we choose not to do a silent fixup and instead fail.  For
> example, the following blows right up as we want it to today, and
> continues to blow up with the patch I sent where we send
> -mno-unaligned-access.  It _only_ works if we cleared the SCTRL.A bit
> which _we_do_not_want_.
> 
> diff --git a/common/cmd_misc.c b/common/cmd_misc.c
> index 93f9eab..fa4a146 100644
> --- a/common/cmd_misc.c
> +++ b/common/cmd_misc.c
> @@ -38,6 +38,18 @@ U_BOOT_CMD(
>  	"    - delay execution for N seconds (N is _decimal_ !!!)"
>  );
>  
> +static int do_unaligned(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[])
> +{
> +	unsigned char buf[8] = { 0 };
> +	return *(int*)((unsigned long)buf | 1);
> +}
> +
> +U_BOOT_CMD(
> +	unaligned,    1,    1,     do_unaligned,
> +	"Make some unaligned accesses happen",
> +	"Make some unaligned accesses happen"
> +);
> +
>  #ifdef CONFIG_CMD_TIMER
>  static int do_timer(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[])
>  {

Agreed so far: this code is an example of usually unpredictable
run-time unaligned access.

> And that's where we need to stop, more or less.  The above is the type
> of unaligned access problem we care about because it fails, everywhere
> _unless_ we fix it up, and we don't want to.  All of the problems we're
> encountering now are either:
> - Self inflicted (due to the mismatch in what the compiler requires and
>   what we've not done)
> - Potentially abusing the packed attribute.

Indeed, and "that's where we need to stop" is where we disagree, the
point of disagreement being that unpredictable run-time unaligned
accesses would be "the type of unaligned access problem we care about
because it fails everywhere", if by this you meant "the *only* type".

I don't only care about code which fails everywhere; I care about code
which I can tell will fail elsewhere even though it "works for me",
whether "me" means ARMv7 or ARM in general, or any other architecture
for that matter.

(and I am and will be grateful to people using other architectures than
ARM if they decide to suffer some small cost in order to try and catch
an issue that might bite us even though it would not bite them.)

Having -munaligned-access in conjunction with SRC.A allows us to trap
unintended unaligned accesses, at the only expense of having to make
intended ones explicit rather than relying on a compiler option. That
is a small cost to us, and a benefit overall.

> In the case of the GPT code, I'm still not convinced there's a problem
> that isn't due to how we're building code that validly describes the
> standard and how we tell the compiler to build it.  It should be
> __packed to match the on-disk format.  Being __packed means the compiler
> does the right thing about access or it's a compiler bug.

Being packed just tells the compiler not to pad, it does not tell the
compiler how to access the unaligned fields; how to access unaligned
fields is what -m[no-]unaligned-access is about, which means the
decision is in the hands of the developer who builds the code, based
on what this developer intends the code to do. 

Amicalement,
-- 
Albert.


More information about the U-Boot mailing list