[U-Boot] [PATCH 3/4] sun8i: On H3 Use word 1 instead of word 3 from the SID

Siarhei Siamashka siarhei.siamashka at gmail.com
Fri Jul 29 13:12:52 CEST 2016


Hi,

On Thu, 28 Jul 2016 20:35:21 +0200
Hans de Goede <hdegoede at redhat.com> wrote:

> Hi,
> 
> On 28-07-16 05:13, Chen-Yu Tsai wrote:
> > Hi,
> >
> > On Thu, Jul 28, 2016 at 3:14 AM, Siarhei Siamashka
> > <siarhei.siamashka at gmail.com> wrote:  
> >> Hello Hans,
> >>
> >> On Wed, 27 Jul 2016 18:10:34 +0200
> >> Hans de Goede <hdegoede at redhat.com> wrote:
> >>  
> >>> It seems that bytes 13-14 of the SID / bytes 1-2 from word 3 of the SID
> >>> are always 0 on H3 making it a poor candidate to use as source for the
> >>> serialnr / mac-address, switch to word1 which seems to be more random.
> >>>
> >>> Cc: Chen-Yu Tsai <wens at csie.org>
> >>> Cc: Corentin LABBE <clabbe.montjoie at gmail.com>
> >>> Cc: Amit Singh Tomar <amittomer25 at gmail.com>
> >>> Signed-off-by: Hans de Goede <hdegoede at redhat.com>
> >>> ---
> >>>  board/sunxi/board.c | 23 ++++++++++++++---------
> >>>  1 file changed, 14 insertions(+), 9 deletions(-)
> >>>
> >>> diff --git a/board/sunxi/board.c b/board/sunxi/board.c
> >>> index ef3fe26..bbe5340 100644
> >>> --- a/board/sunxi/board.c
> >>> +++ b/board/sunxi/board.c
> >>> @@ -620,12 +620,17 @@ static void setup_environment(const void *fdt)
> >>>       uint8_t mac_addr[6];
> >>>       char ethaddr[16];
> >>>       int i, ret;
> >>> +#ifdef CONFIG_MACH_SUN8I_H3
> >>> +     const int idx0 = 0, idx1 = 1;
> >>> +#else
> >>> +     const int idx0 = 0, idx1 = 3;
> >>> +#endif
> >>>
> >>>       ret = sunxi_get_sid(sid);
> >>> -     if (ret == 0 && sid[0] != 0 && sid[3] != 0) {
> >>> +     if (ret == 0 && sid[idx0] != 0 && sid[idx1] != 0) {
> >>>               /* Ensure the NIC specific bytes of the mac are not all 0 */
> >>> -             if ((sid[3] & 0xffffff) == 0)
> >>> -                     sid[3] |= 0x800000;
> >>> +             if ((sid[idx1] & 0xffffff) == 0)
> >>> +                     sid[idx1] |= 0x800000;
> >>>
> >>>               for (i = 0; i < 4; i++) {
> >>>                       sprintf(ethaddr, "ethernet%d", i);
> >>> @@ -642,18 +647,18 @@ static void setup_environment(const void *fdt)
> >>>
> >>>                       /* Non OUI / registered MAC address */
> >>>                       mac_addr[0] = (i << 4) | 0x02;
> >>> -                     mac_addr[1] = (sid[0] >>  0) & 0xff;
> >>> -                     mac_addr[2] = (sid[3] >> 24) & 0xff;
> >>> -                     mac_addr[3] = (sid[3] >> 16) & 0xff;
> >>> -                     mac_addr[4] = (sid[3] >>  8) & 0xff;
> >>> -                     mac_addr[5] = (sid[3] >>  0) & 0xff;
> >>> +                     mac_addr[1] = (sid[idx0] >>  0) & 0xff;
> >>> +                     mac_addr[2] = (sid[idx1] >> 24) & 0xff;
> >>> +                     mac_addr[3] = (sid[idx1] >> 16) & 0xff;
> >>> +                     mac_addr[4] = (sid[idx1] >>  8) & 0xff;
> >>> +                     mac_addr[5] = (sid[idx1] >>  0) & 0xff;
> >>>
> >>>                       eth_setenv_enetaddr(ethaddr, mac_addr);
> >>>               }
> >>>
> >>>               if (!getenv("serial#")) {
> >>>                       snprintf(serial_string, sizeof(serial_string),
> >>> -                             "%08x%08x", sid[0], sid[3]);
> >>> +                             "%08x%08x", sid[idx0], sid[idx1]);
> >>>
> >>>                       setenv("serial#", serial_string);
> >>>               }  
> >>
> >> Is it really a good idea trying to guess which parts of the SID are
> >> sufficiently unique, depending on the SoC generation?  
> 
> Probably not, but that is what we've been doing so far.
> 
> >> We can calculate some kind of hash (CRC32? SHA256?) over the whole
> >> SID. This will ensure that all the SID bits are accounted for.  
> 
> That seems like a good idea.
> 
> I'm thinking about doing the following (for H3 at least, and probably
> also for other new SoCs):
> 
> 	sid[3] ^= sid[1] ^ sid[2];

This is not a very good idea. For example, if one of the words is a
encoding the batch number and another word is encoding the serial
number in a batch, then there will be MAC address collisions:

    0x0000000 ^ 0x00000000 = 0x00000000
    0x0000000 ^ 0x00000001 = 0x00000001
    0x0000001 ^ 0x00000000 = 0x00000001
    0x0000001 ^ 0x00000001 = 0x00000000

There is a reason why CRC32 is used for calculating checksums instead
of doing simple XOR or ADD. And even added as a special instruction on
x86 and ARM. It just ensures that changing one bit in the input affects
many bits in the output and can detect single bit or multiple bit flips
in the corrupted data. 

    https://en.wikipedia.org/wiki/Avalanche_effect

I would say that if we only want to combine the SID data into 32
deterministic pseudo-random bits, then the use of CRC32 is a much
better choice than XOR. U-Boot should already have the crc32 function
linked in and it's only a single function call to do this. But if we
want more than 32 bits, than taking any N bits of any cryptographic
hash would do the job. As an additional bonus, if some users have
privacy concerns, than a cryptographic hash would make it difficult
to recover the original SID value from the MAC address.

> No need to use a crpytho graphically secure hash, and AFAIK we
> don't have enough data for that anyways (without using some seed).
> 
> Before using sid[3] to get the low 4 bytes of the mac / serial.
> 
> I believe that using sid[0] as the first 4 bytes of the serial
> is still a good idea, as it gives us info on which SoC is being used.

It might be a waste of the 8 bits of space in the MAC address for
encoding something like only 4 bits of real data. Moreover, the
current code is only taking the lowest 8 bits of the sid[0], which
do not contain the SoC type on older Allwinner chips (A10/A13/A20).
And we are in the assumptions territory again for the newer chips
(H3/A83/A64/....), but making assumptions is not very reliable.

With using a good hash for combining the SID data, we only have a
risk of random collisions of the generated MAC addresses:

    http://preshing.com/20110504/hash-collision-probabilities/

Using more bits for the deterministic pseudo-random part of the MAC
address significantly reduces the probability of having collisions.

With 32 bits of the pseudo-random part, we can expect 1% probability
of a collision between at least one pair of devices in a set of 9292
devices.

With 44 bits of the pseudo-random part, we can expect 1% probability
of a collision between at least one pair of devices in a set of 594656
devices.
 
> >>
> >> Also changing the MAC address generation algorithm is an invasive
> >> change, which is affecting the end users. So IMHO it's best to have
> >> it implemented properly right from the start, rather than evolving
> >> via a trial and error method. What if it turns out that word1
> >> from the SID is actually not sufficiently random on H3? How large
> >> was your sample set?  
> >
> > I've added the SID values from whatever H3 boards I have to:
> >
> >     http://linux-sunxi.org/SID_Register_Guide#Currently_known_SID.27s  
> 
> I had done the same for my boards yesterday, but apparently never saved
> the preview :|
> 
> I've just done this a second time.
> 
> > And it seems word1 is more like a batch number. Note the last 3 boards,
> > which have identical word1's. These were given to me by Steven at the
> > same time. word2 seems to follow the pattern 5xxxxxxe.
> >
> > In short there are quite a few bits that are the same.  
> 
> Ack, so nack to my own patch as using word1 is worse then using word3,
> I suggest we simply ex-or word1-3 together and use that, any comments
> on that ?

I still think that it's best to use SHA1 or SHA256 and calculate it
over the whole SID data. The self-test code included in the sources

    http://git.denx.de/?p=u-boot.git;a=blob;f=lib/sha1.c
    http://git.denx.de/?p=u-boot.git;a=blob;f=lib/sha256.c

can serve as a usage example. There is really nothing difficult. And
then we can use as many bits as necessary for the pseudo-random part
of the MAC.

-- 
Best regards,
Siarhei Siamashka


More information about the U-Boot mailing list