[U-Boot] [PATCH v7] socfpga: Add socfpga preloader signing to mkimage

Charles Manning cdhmanning at gmail.com
Mon Mar 10 04:04:58 CET 2014


Hello Gerhard

Thank you for that feedback.

On Sunday 09 March 2014 05:51:23 Gerhard Sittig wrote:
> On Thu, Mar 06, 2014 at 15:40 +1300, Charles Manning wrote:
> > [ ... ]
> > Unfortunately the CRC used in this boot ROM is not the same as the
> > Adler CRC in lib/crc32.c. Indeed the Adler code is not technically a
> > CRC but is more correctly described as a checksum.
>
> I don't quite get why you say that the zlib implementation is not
> a CRC.  IIUC all of the CRC-32 methods follow a common algorithm,
> and just happen to differ in their polynomials or initial and
> final masks.  So they result in different CRC values for the same
> data stream, yet all of them are CRCs.
>
> But strictly speaking this remark is not specific to this patch
> submission, and might be omitted.  The only relevant thing is
> that the zlib CRC-32 approach does not result in the value that
> the SoC's ROM loader expects.

Ok I will change that comment

>
> > Thus, the appropriate CRC generator is added to lib/ as crc32_alt.c.
>
> What does "alt" stand for, and can you provide an appropriate
> description with the commit log (or just use a better name)?  As
> previous communication suggests, it's certainly not "alternative"
> -- as there just are too many alternatives available to mark one
> of them as _the_ alternative.  It's neither "apple" where you
> appear to have found the matching polynomial first.  If it's
> "Altera", I'd suggest to use either "altr" as this is what
> elsewhere is used for Altera (the stock ticker), or plain
> "altera" instead of something this short and ambiguous.  Or
> "bzlib" since this is the most recent implementation that you are
> using.

I agree thank you for pointing out that error in my comment.

> > Signed-off-by: Charles Manning <cdhmanning at gmail.com>
> > ---
> >
> > Changes for v3:
> >  - Fix some coding style issues.
> >  - Move from a standalone tool to the mkimgae framework.
> >
> > Changes for v4:
> >  - Fix more coding style issues.
> >  - Fix typos in Makefile.
> >  - Rebase on master (previous version was not on master, but on a
> >    working socfpga branch).
> >
> > Changes for v5:
> >  - Fix more coding style issues.
> >  - Add some more comments.
> >  - Remove some unused defines.
> >  - Move the local CRC32 code into lib/crc32_alt.c.
> >
> > Changes for v6:
> >  - Fix more coding style issues.
> >  - Rejig socfpgaimage_vrec_header() function so that it has one return
> >    path and does stricter size checks.
> >
> > Changes for v7:
> >  - Use bzlib's crc table instead of adding another one.
> >  - Use existing code and a packed structure for header marshalling.
>
> One of the reasons I got tired of providing feedback is that this
> change log is rather terse.  Several identical "fix coding style"
> phrases are another way of telling readers that they should
> figure out for themselves what might have changed (especially
> when newer versions have functional changes that are not
> announced as such), and whether feedback was considered or got
> ignored.  Seeing several iterations which suffer from the same
> issues that have been identified multiple versions before isn't
> helpful either when asking yourself whether to spend more time
> and getting ignored another time.

Ok, I will try to be less terse.

>
> > --- /dev/null
> > +++ b/include/bzlib_crc32.h
> > @@ -0,0 +1,17 @@
> > +/*
> > + * Copyright (C) 2014 Charles Manning <cdhmanning at gmail.com>
> > + *
> > + * SPDX-License-Identifier:	GPL-2.0+
> > + *
> > + * Note that the CRC is **not** the zlib/Adler crc32 in crc32.c.
> > + * It is the CRC-32 used in bzip2, ethernet and elsewhere.
> > + */
> > +
> > +#ifndef __BZLIB_CRC32_H__
> > +#define __BZLIB_CRC32_H__
> > +
> > +#include <stdint.h>
> > +
> > +uint32_t bzlib_crc32(uint32_t crc, const void *_buf, int length);
> > +
> > +#endif
>
> and
>
> > --- /dev/null
> > +++ b/lib/bzlib_crc32.c
> > @@ -0,0 +1,26 @@
> > +/*
> > + * Copyright (C) 2014 Charles Manning <cdhmanning at gmail.com>
> > + *
> > + * SPDX-License-Identifier:	GPL-2.0+
> > + *
> > + * This provides a CRC-32 using the tables in bzlib_crctable.c
> > + */
> > +
> > +#include <bzlib_crc32.h>
> > +#include "bzlib_private.h"
> > +
> > +uint32_t bzlib_crc32(uint32_t crc, const void *_buf, int length)
> > +{
> > +	const uint8_t *buf = _buf;
> > +
> > +	crc ^= ~0;
> > +
> > +	while (length--) {
> > +		crc = (crc << 8) ^ BZ2_crc32Table[((crc >> 24) ^ *buf) & 0xff];
> > +		buf++;
> > +	}
> > +
> > +	crc ^= ~0;
> > +
> > +	return crc;
> > +}
>
> Since you already include the private bzlib header for the
> BZ2_crc32Table[] declaration, you might as well use the
> BZ_INITIALISE_CRC(), BZ_FINALISE_CRC(), and BZ_UPDATE_CRC()
> macros declared from there (right next to the table's
> declaration), instead of re-inventing how the CRC gets
> calculated.

If you think that makes it more clear I shall do that. I don't think though 
that it really is any clearer.

>
> You probably should determine whether you want to provide one
> routine to do the complete calculate over a given byte stream,
> without any "previous" CRC value -- this is what your current use
> case is.  Or whether you want to provide three primitives to
> initialize, update, and finalize a CRC (which is not used yet).
> Or whether you want to provide the three primitives plus one
> convenience wrapper.

I try to use a single "familiar" function, like crc32() does.

You can start with 0, then use it with the results.

eg 

crc = crc32(0, first_buff,...);
crc = crc32(crc, second_bufer,...);

>
> > --- a/tools/Makefile
> > +++ b/tools/Makefile
> > @@ -70,6 +70,8 @@ RSA_OBJS-$(CONFIG_FIT_SIGNATURE) := rsa-sign.o
> >  # common objs for dumpimage and mkimage
> >  dumpimage-mkimage-objs := aisimage.o \
> >  			$(FIT_SIG_OBJS-y) \
> > +			bzlib_crc32.o \
> > +			bzlib_crctable.o \
> >  			crc32.o \
> >  			default_image.o \
> >  			fit_image.o \
> > @@ -85,6 +87,7 @@ dumpimage-mkimage-objs := aisimage.o \
> >  			os_support.o \
> >  			pblimage.o \
> >  			sha1.o \
> > +			socfpgaimage.o \
> >  			ublimage.o \
> >  			$(LIBFDT_OBJS) \
> >  			$(RSA_OBJS-y)
> > diff --git a/tools/bzlib_crc32.c b/tools/bzlib_crc32.c
> > new file mode 100644
> > index 0000000..b7f7aa5
> > --- /dev/null
> > +++ b/tools/bzlib_crc32.c
> > @@ -0,0 +1 @@
> > +#include "../lib/bzlib_crc32.c"
> > diff --git a/tools/bzlib_crctable.c b/tools/bzlib_crctable.c
> > new file mode 100644
> > index 0000000..53d38ef
> > --- /dev/null
> > +++ b/tools/bzlib_crctable.c
> > @@ -0,0 +1 @@
> > +#include "../bzlib_crctable.c"
> > diff --git a/tools/bzlib_private.h b/tools/bzlib_private.h
> > new file mode 100644
> > index 0000000..cfb74be
> > --- /dev/null
> > +++ b/tools/bzlib_private.h
> > @@ -0,0 +1 @@
> > +#include "lib/bzlib_private.h"
>
> Now this is incredibly ugly.  You are duplicating lib/ stuff in
> tools/ and build it there as well?  Or not at all build it in
> lib/ where the code actually resides? 

> It's only a side note that 
> #include for "*.c" is problematic as some compilers do
> semi-intelligent stuff when the filename does not end in ".h" and
> then break compilation, so this hack should neither get spread
> nor suggested for use.

I absolutely agree this is very, very, ugly.

I was just following the current way of doing things. This is how crc32.c, 
sha1.c, md5.c, fdt.c and many others are handled.

I expect that this is worth a refactorisation effort, but I think that should 
be a separate exercise from doing this signer.

>
> Is there any (real, technical) reason why the bzip stuff (the
> CRC-32 calculation that has been made available separately)
> cannot get built and used as a library, and the tools/
> application just gets linked against it as one would expect?

From my limited understanding, lib/ is not built for the host. It is only 
built for the target. That is why we have all these ugly C files: crc32.c, 
sha1.c,... in tools/.

>
> > --- /dev/null
> > +++ b/tools/socfpgaimage.c
> > @@ -0,0 +1,255 @@
> > +/*
> > + * Copyright (C) 2014 Charles Manning <cdhmanning at gmail.com>
> > + *
> > + * SPDX-License-Identifier:	GPL-2.0+
> > + *
> > + * Reference doc
> > http://www.altera.com.cn/literature/hb/cyclone-v/cv_5400A.pdf + * Note
> > this doc is not entirely accurate. Of particular interest to us is the +
> > * "header" length field being in U32s and not bytes.
>
> Can you rephrase this?  The current form suggests that you cannot
> trust the Altera documentation, while you fail to state what
> exactly you have identified as being wrong (whether this one
> length spec is all the problems, or whether there are more).
>
> Looking at "Loading the Preloader" on page A-7 I see that it says
> the "Program length" is in bytes, while the "CRC" description on
> page A-8 mentions "Program Length * 4".  So the Handbook is
> inconsistent, and available data suggests that "length in units
> of 32bit words" is correct.

The definitive data point is that one works and the other does not :-).

>
> The polynomial cited there BTW translates to 0x04c11db7, which
> may be more specific than "what is used in ethernet and
> elsewhere", and might be useful to have in comments and commit
> logs.  The initial value and final XOR and "no reflection" are
> the other attributes of this specific CRC-32 method.

Ok, I will use that.

>
> > + *
> > + * "Header" is a structure of the following format.
> > + * this is positioned at 0x40.
> > + *
> > + * Endian is LSB.
> > + *
> > + * Offset   Length   Usage
> > + * -----------------------
> > + *   0x40        4   Validation word 0x31305341
> > + *   0x44        1   Version (whatever, zero is fine)
> > + *   0x45        1   Flags   (unused, zero is fine)
> > + *   0x46        2   Length  (in units of u32, including the end
> > checksum). + *   0x48        2   Zero
> > + *   0x4A        2   Checksum over the heder. NB Not CRC32
>
> This is wrong (and is not what the Handbook Appendix says).  The
> length is 32bits in width.  It just happens that in little endian
> representation and with a maximum of 60KB of payload data, the
> upper 16bits naturally remain zero.  Still the documentation and
> implementation should work with a 32bit length.

The implementation surely only has to work with the ROM. There is no point in 
it working for arbitrary cases.

>
> I agree that the term "checksum" might be confusing, because it's
> a mere sum of the preceeding bytes, and "by coincidence" the sum
> is used to check plausibility of the header.  I wouldn't either
> know how to best put this into a short comment.  From personal
> experience I can tell that "sum" alone would have helped a lot.

What is a checksum but a sum that is used for checking? The term checksum on 
its own does not imply any particular algorithm. If you think sum is better, 
I shall use that.
>
> > + *
> > + * At the end of the code we have a 32-bit CRC checksum over whole
> > binary + * excluding the CRC.
>
> This probably should say "CRC over the all payload data from
> offset zero up to but not including the position of the CRC".
> Depending on whether the 64KB padding is considered part of the
> binary or some "afterwards thing", this might help.

Ok.

>
> Background information:  The SPL is supposed to be up to 60KB in
> size (the upper 4KB are for the ROM loader's internal use, and
> for data shared among the preloader and the bootloader).  The SPL
> memory image has a gap at 0x40 (after the vectors, and before the
> "useful code") where the preloader header/signature gets put
> into.  After the signature application, a CRC-32 is calculated
> over the complete data and gets appended to the data (the length
> spec in the header includes the CRC location).  Then the image
> gets padded to full 64KB and gets duplicated four times (to
> increase robustness by means of redundancy, assuming boot media
> might have read errors).

Is 60K a hard limitation for SPL? I know that people are considering SPLs of 
larger size. Yeah... I know... what happened to 4k bootloaders?

>
> Your submission's motivation is to initially stamp a preloader
> that's the fresh result of a compilation.  I can see that it
> would be useful and desirable to apply the tool to existing
> binaries as well, i.e. throw a 64KB binary at the tool to
> re-create the signature.

I am confused by what you are saying here. There is only one point to this: 
sign a preloader so that a SocFPGA boot ROM will use it. There are no other 
use cases that make any sense.

Or put another way... would I run the omap signer on an arbitrary binary? No.
Same argument here.

By "signing existing binaries" do you mean existing preloaders? If so, people 
can to that with mkimage -T socfpgaimage ... (once this addition is in 
place).




>
> > + *
> > + * Note that the CRC used here is **not** the zlib/Adler crc32. It is
> > the + * CRC-32 used in bzip2, ethernet and elsewhere.
>
> If you want to be most specific, cite the parameters of the
> CRC-32 used here.  See the above comments and
> http://opensource.apple.com/source/CommonCrypto/CommonCrypto-60049/libcn/cr
>c32-bzip2.c and maybe http://www.ross.net/crc/download/crc_v3.txt
>
> BTW I'm still under the impression that zlib uses the same
> polynomial as well as identical initial values and final XOR
> masks (see lib/crc32.c).  So where is the difference between the
> zlib and the bzlib CRC-32 parameters?
>
>   $ perl -e 'use Compress::Zlib; printf "%x\n", crc32("123456789");'
>   cbf43926

From my understanding zlib crc32() == Adler.

Please compare the tables in lib/crc32.c and lib/bzlib_crc32.c. They are not 
the same.

 I have run many tests.

My original socfpga signer used crc32() and zlib. It did not give the values 
the ROM desired. As much as I would have liked to use zlib, my preferences 
are of no consequence. We must use the values the ROM will work with. If we 
do not, then the ROM will refuse to boot the code, which is not really the 
point of signing it...

I know Wikipedia isn't always accurate, but please also look at:

http://en.wikipedia.org/wiki/Cyclic_redundancy_check#Commonly_used_and_standardized_CRCs


I have not investigated this fully, but I think the crc I use here is the same 
as that used in pblimage.c and mxsimage.c

What I propose is that once this socfpga signer is bedded down, it makes sense 
to refactor those. I am prepared to do this. However, it is probably a better 
exercise to make refactoring a separate step from committing new code.

>
> > + *
> > + * The image is padded out to 64k, because that is what is
> > + * typically used to write the image to the boot medium.
> > + */
> > +
> > +#include <bzlib_crc32.h>
> > +#include "imagetool.h"
> > +#include <image.h>
> > +
> > +#define HEADER_OFFSET	0x40
> > +#define HEADER_SIZE	0xC
> > +#define VALIDATION_WORD	0x31305341
> > +#define PADDED_SIZE	0x10000
> > +
> > +/* To allow for adding CRC, the max input size is a bit smaller. */
> > +#define MAX_INPUT_SIZE	(PADDED_SIZE - sizeof(uint32_t))
>
> See above, it's not a blocker for the initial motivation, but
> might be useful to run the tool on existing 64KB blobs as well.
> But then it's hard to determine the "right" length in reliable
> ways, and the benefit may not be as big as I think it is, hmm...
> So OK, forget this idea, running the tool on fresh compiler
> output of at most 60KB size is quite appropriate an assumption.

I think we can safely say that for all useful use cases, this is a 
one-trick-pony: signing socfpga images.

>
> > +
> > +static uint8_t buffer[PADDED_SIZE];
> > +
> > +static struct {
> > +	uint32_t validation;
> > +	uint8_t  version;
> > +	uint8_t  flags;
> > +	uint16_t length_u32;
> > +	uint16_t zero;
> > +	uint16_t checksum;
> > +} __attribute__((packed)) header;
>
> You are aware of the "packed" and "alignment" discussion threads
> here, aren't you? 

Sorry I am not aware of these discussions. I personally **hate** using packed 
structure and assuming binary alignment.


> I considered the "serializer" accessors of 
> previous patch iterations most appropriate (except for their
> names), and feel this approach with a struct is inferior
> (especially because it doesn't cover the CRC later).  But others
> need not agree.

I too prefer the older serialisers but was told I could not use my own. Yes, 
there are some in target space: put_aligned_le16() etc, but from my 
understanding it is unhealthy to use target space functions from host space 
(mkimage etc). 


>
> Having a "length_u32" field which is of 'uint16_t' data type
> doesn't look right.

It means length in units of u32.

I will change that to length_in_u32 or something like that.

>
> > +
> > +/*
> > + * The header checksum is just a very simple checksum over
> > + * the header area.
> > + * There is still a crc32 over the whole lot.
> > + */
> > +static uint16_t hdr_checksum(const uint8_t *buf, int len)
> > +{
> > +	uint16_t ret = 0;
> > +	int i;
> > +
> > +	for (i = 0; i < len; i++) {
> > +		ret += (((uint16_t) *buf) & 0xff);
> > +		buf++;
> > +	}
> > +	return ret;
> > +}
> > +
> > +
>
> What is the typecast good for?  Why are there still whitespace
> issues?  Can you either fix them or tell why this is OK?  I.e.
> can you _somehow_ respond to feedback you receive?  Ignoring
> review comments makes you recive less feedback later.

Ok, I will do away with the cast. It does not serve any function.

I am unsure what whitespace issue do you mean? Does checkpatch not catch 
things like this?

>
> The usual C language style comments apply:  Please don't mix
> declarations and instructions.  

What do you mean by "declaration and instructions"? Do you mean the initial 
value mixing?


> You can eliminate variables.  Use 
> names which better reflect what is happening.  Something like
> this is more idiomatic:
>
> static uint16_t header_sum(const uint8_t *buf, int len)
> {
> 	uint16_t sum;
>
> 	sum = 0;
> 	while (len-- > 0) {
> 		sum += *buf++;
> 	}
> 	return sum;
> }

Ok I will do that.

>
> > +static void build_header(uint8_t *buf,
> > +			  uint8_t version,
> > +			  uint8_t flags,
> > +			  uint16_t length_bytes)
> > +{
> > +	header.validation = htole32(VALIDATION_WORD);
> > +	header.version = version;
> > +	header.flags = flags;
> > +	header.length_u32 = htole16(length_bytes/4);
> > +	header.zero = 0;
> > +	header.checksum = htole16(hdr_checksum((const uint8_t *)&header, 10));
> > +
> > +	memcpy(buf, &header, sizeof(header));
> > +}
>
> I'd suggest to s/4/sizeof(uint32_t)/

Ok, I will do that.

>
> Can you rephrase the "10" in terms of "header size minus the
> width of its sum"?  Very similar to the CRC which covers "all of
> the data except the CRC field".
>
> And there may be a problem with "direct" assignments to a
> "packed" struct.  Since you fill in a local struct and memcpy to
> the image then, you might as well use "normal serializers".

As explained above, I would prefer to use serialisers, they are much clearer 
for this sort of application.

>
> > +/* Sign the buffer and return the signed buffer size */
> > +static int sign_buffer(uint8_t *buf,
> > +			uint8_t version, uint8_t flags,
> > +			int len, int pad_64k)
> > +{
> > +	uint32_t calc_crc;
> > +
> > +	/* Align the length up */
> > +	len = (len + 3) & (~3);
>
> Is there nothing you can re-use?  If not can you introduce
> something generic to roundup() for user space tools to use?  

Sure I define something for reuse... 

Can you suggest where I might put it?


> Can 
> you rephrase the condition such that it becomes evident you are
> doing ROUNDUP(len, sizeof(uint32_t)) and avoid the magic "3" that
> has no such meaning for readers?
>
> > +
> > +	/* Build header, adding 4 bytes to length to hold the CRC32. */
> > +	build_header(buf + HEADER_OFFSET,  version, flags, len + 4);
> > +
> > +	/* Calculate and apply the CRC */
> > +	calc_crc = bzlib_crc32(0, buf, len);
> > +
> > +	*((uint32_t *)(buf + len)) = htole32(calc_crc);
>
> This is what I was talking about above.  It may be better to use
> a serializer to write both header fields as well as the CRC.

I absolutely agree. I prefer a serializer, but I moved to packed structures 
(which I generally mistrust for various reasons including arcane aliassing 
rules) because I was told not to add my own.

Is it Ok if I add my own? If I can, where do I put them, if not, I cannot see 
where to find some.

>
> > +
> > +	if (!pad_64k)
> > +		return len + 4;
> > +
> > +	return PADDED_SIZE;
> > +}
> >
> > +static void socfpgaimage_print_header(const void *ptr)
> > +{
> > +	if (verify_buffer(ptr) == 0)
> > +		printf("Looks like a sane SOCFPGA preloader\n");
> > +	else
> > +		printf("Not a sane SOCFPGA preloader\n");
> > +}
>
> Would it be useful to print a little more information than "it's
> a preloader" without any further details?  I'm not familiar with
> the context this routine gets called in, what do users expect?

It gets called at the end of making an image.

Ok, I will print out some numbers too. They are probably of little use to 
anyone, but they look "technical" :-).

Thanks

Charles




More information about the U-Boot mailing list