[U-Boot] [PATCH v2] Implement generalised RSA public exponents for verified boot

Michael van der Westhuizen michael at smart-africa.com
Fri May 30 23:03:24 CEST 2014


Hi Simon,

On 30 May 2014, at 10:50 PM, Simon Glass <sjg at chromium.org> wrote:

> Hi Michael,
> 
> On 30 May 2014 14:47, Michael van der Westhuizen
> <michael at smart-africa.com> wrote:
>> Hi Simon,
>> 
>> Thanks for the feedback.
>> 
>> I'll take care of the nits and look into removing some special casing.
>> 
>> On 30 May 2014, at 9:04 PM, Simon Glass <sjg at chromium.org> wrote:
>> 
>>> Hi Michael,
>>> 
>> <snip>
>>>> 
>>>> /**
>>>> + * num_pub_exponent_bits() - Number of bits in the public exponent
>>>> + *
>>>> + * @key:       RSA key
>>>> + * @k:         Storage for the number of public exponent bits
>>>> + */
>>>> +static int num_public_exponent_bits(const struct rsa_public_key *key, int *k)
>>> 
>>> s/k/keyp/ or something like that.
>>> 
>>> Also is this function able to just use ffs() or similar?
>> 
>> flsll() yes, but that's not portable to Linux.
> 
> Does that matter?

This code compiles on the host, so unfortunately yes.  That's the same reason I had to work around the lack of handy *_u64 fdt helpers when reading the public exponent.

> 
>> 
>>> 
>> <snip>
>> 
>>>> +}
>>>> +
>>>> +/**
>>>> * pow_mod() - in-place public exponentiation
>>>> *
>>>> * @key:       RSA key
>>>> @@ -132,6 +171,7 @@ static int pow_mod(const struct rsa_public_key *key, uint32_t *inout)
>>>> {
>>>>       uint32_t *result, *ptr;
>>>>       uint i;
>>>> +       int j, k;
>>>> 
>>>>       /* Sanity check for stack size - key->len is in 32-bit words */
>>>>       if (key->len > RSA_MAX_KEY_BITS / 32) {
>>>> @@ -141,18 +181,49 @@ static int pow_mod(const struct rsa_public_key *key, uint32_t *inout)
>>>>       }
>>>> 
>>>>       uint32_t val[key->len], acc[key->len], tmp[key->len];
>>>> +       uint32_t a_scaled[key->len];
>>>>       result = tmp;  /* Re-use location. */
>>>> 
>>>>       /* Convert from big endian byte array to little endian word array. */
>>>>       for (i = 0, ptr = inout + key->len - 1; i < key->len; i++, ptr--)
>>>>               val[i] = get_unaligned_be32(ptr);
>>>> 
>>>> -       montgomery_mul(key, acc, val, key->rr);  /* axx = a * RR / R mod M */
>>>> -       for (i = 0; i < 16; i += 2) {
>>>> -               montgomery_mul(key, tmp, acc, acc); /* tmp = acc^2 / R mod M */
>>>> -               montgomery_mul(key, acc, tmp, tmp); /* acc = tmp^2 / R mod M */
>>>> +       if (0 != num_public_exponent_bits(key, &k))
>>>> +               return -EINVAL;
>>>> +
>>>> +       if (k < 2) {
>>>> +               debug("Public exponent is too short (%d bits, minimum 2)\n",
>>>> +                     k);
>>>> +               return -EINVAL;
>>>> +       }
>>>> +
>>>> +       if (!is_public_exponent_bit_set(key, k - 1) ||
>>>> +           !is_public_exponent_bit_set(key, 0)) {
>>>> +               debug("Invalid RSA public exponent 0x%llx.\n", key->exponent);
>>> 
>>> What is invalid about it? Perhaps expand the message?
>>> 
>>>> +               return -EINVAL;
>>>> +       }
>>>> +
>>>> +       /* the bit at e[k-1] is 1 by definition, so start with: C := M */
>>>> +       montgomery_mul(key, acc, val, key->rr); /* acc = a * RR / R mod n */
>>>> +       /* retain scaled version for intermediate use */
>>> 
>>> Join these two comments, also the style should be:
>>> 
>>> /*
>>> * The bit at ...
>>> * ...
>>> */
>>> 
>>>> +       memcpy(a_scaled, acc, key->len * 4);
>>> 
>>> What is 4? is it sizeof(uint32_t?). I wonder if we could replace the
>>> line immediately above and remove this memcpy()?
>>> 
>>> For example, if you did:
>>> 
>>>      montgomery_mul(key, result, val, key->rr); /* acc = a * RR / R mod n */
>> 
>> Yes, it's sizeof(uint32_t) - that would probably be a good thing to spell out too.
>> 
>> result points to tmp, which is going to be repeatedly overwritten in the loop, but we need a_scaled to stay constant (as the result of the first montgomery_mul) throughout... or did I misunderstand your point?
> 
> OK I see thanks. Perhaps add a new temporary instead of using memcpy()?

And set it up using montgomery_mul?  Would that not be more expensive than a memcpy?

Thanks,
Michael


More information about the U-Boot mailing list