[PATCH v5 00/27] Integrate MbedTLS v3.6 LTS with U-Boot
Raymond Mao
raymond.mao at linaro.org
Wed Aug 14 15:42:08 CEST 2024
Hi Tom,
On Fri, 2 Aug 2024 at 11:34, Raymond Mao <raymond.mao at linaro.org> wrote:
> Hi Tom,
>
> On Thu, 1 Aug 2024 at 16:46, Tom Rini <trini at konsulko.com> wrote:
>
>> On Wed, Jul 31, 2024 at 10:25:10AM -0700, Raymond Mao wrote:
>> >
>> > Integrate MbedTLS v3.6 LTS (currently v3.6.0) with U-Boot.
>> >
>> > Motivations:
>> > ------------
>> >
>> > 1. MbedTLS is well maintained with LTS versions.
>> > 2. LWIP is integrated with MbedTLS, which makes it easy to enable HTTPS.
>> > 3. MbedTLS recently switched license back to GPLv2.
>> >
>> > Prerequisite:
>> > -------------
>> >
>> > This patch series requires mbedtls git repo to be added as a
>> > subtree to the main U-Boot repo via:
>> > $ git subtree add --prefix lib/mbedtls/external/mbedtls \
>> > https://github.com/Mbed-TLS/mbedtls.git \
>> > v3.6.0 --squash
>> > Moreover, due to the Windows-style files from mbedtls git repo,
>> > we need to convert the CRLF endings to LF and do a commit manually:
>> > $ git add --renormalize .
>> > $ git commit
>> >
>> > New Kconfig options:
>> > --------------------
>> >
>> > `MBEDTLS_LIB` is the general switch for MbedTLS.
>> > `MBEDTLS_LIB_CRYPTO` replaces the original digest and crypto libs with
>> > MbedTLS.
>> > `MBEDTLS_LIB_X509` replaces the original X509, PKCS7, MSCode, ASN1,
>> > and Pubkey parsers with MbedTLS.
>> > `LEGACY_CRYPTO` is introduced as the main switch for the legacy crypto
>> > library.
>> > `LEGACY_CRYPTO_BASIC` is for the basic crypto functionalities and
>> > `LEGACY_CRYPTO_CERT` is for the certificate-related functionalities.
>> > For each algorithm, a pair of `<alg>_LEGACY` and `<alg>_MBEDTLS`
>> > Kconfig options is introduced. Corresponding `SPL_` Kconfig options are
>> > introduced as well.
>> >
>> > In this patch set, MBEDTLS_LIB, MBEDTLS_LIB_CRYPTO and MBEDTLS_LIB_X509
>> > are by default enabled in qemu_arm64_defconfig and sandbox_defconfig
>> > for testing purposes.
>> >
>> > Patches for external MbedTLS project:
>> > -------------------------------------
>> >
>> > Since U-Boot uses Microsoft Authentication Code to verify PE/COFF
>> > executables, which MbedTLS does not support at the moment,
>> > additional patches for MbedTLS are created to adapt it to the EFI loader:
>> > 1. Decoding of Microsoft Authentication Code.
>> > 2. Decoding of PKCS#9 Authenticated Attributes.
>> > 3. Extending MbedTLS PKCS#7 lib to support multiple signer's certificates.
>> > 4. MbedTLS native test suites for PKCS#7 signer's info.
>> >
>> > All 4 patches above (tagged with `mbedtls/external`) have been submitted to
>> > the MbedTLS project and are under review; eventually they should be part of
>> > an MbedTLS LTS release.
>> > Until then, please merge them into U-Boot, otherwise the build
>> > will break when MBEDTLS_LIB_X509 is enabled.
>> >
>> > See below PR link for the reference:
>> > https://github.com/Mbed-TLS/mbedtls/pull/9001
>> >
>> > Miscellaneous:
>> > --------------
>> >
>> > Optimized MbedTLS library size by tailoring the config file
>> > and disabling all features unnecessary for the EFI loader.
>> > From v2, original libs (rsa, asn1_decoder, rsa_helper, md5, sha1, sha256,
>> > sha512) are completely replaced when MbedTLS is enabled.
>> > From v3, the size-growth is slightly reduced by refactoring Hash functions.
>> >
>> > Target(QEMU arm64) size-growth when enabling MbedTLS:
>> > v1: 6.03%
>> > v2: 4.66%
>> > From v3: 4.55%
>> >
>> > Please see the latest output from buildman for size-growth on QEMU arm64,
>> > Sandbox and Nanopi A64. [1]
>>
>> Let us inline the growth on qemu_arm64 for a moment:
>> aarch64: (for 1/1 boards) all +6916.0 bss -32.0 data -64.0 rodata +200.0 text +6812.0
>> qemu_arm64 : all +6916 bss -32 data -64 rodata +200 text +6812
>> u-boot: add: 28/-17, grow: 12/-16 bytes: 15492/-8304 (7188)
>> function                           old   new  delta
>> mbedtls_internal_sha1_process        -  4540  +4540
>> mbedtls_internal_md5_process         -  2928  +2928
>> mbedtls_internal_sha256_process      -  2052  +2052
>> mbedtls_internal_sha512_process      -  1056  +1056
>> K                                    -   896   +896
>> mbedtls_sha512_finish                -   556   +556
>> mbedtls_sha256_finish                -   484   +484
>> mbedtls_sha1_finish                  -   420   +420
>> mbedtls_sha512_starts                -   340   +340
>> mbedtls_md5_finish                   -   336   +336
>> mbedtls_sha512_update                -   264   +264
>> mbedtls_sha256_update                -   252   +252
>> mbedtls_sha1_update                  -   236   +236
>> mbedtls_md5_update                   -   236   +236
>> mbedtls_sha512                       -   148   +148
>> mbedtls_sha256_starts                -   124   +124
>> hash_init_sha512                    52   128    +76
>> hash_init_sha256                    52   128    +76
>> mbedtls_sha1_starts                  -    72    +72
>> mbedtls_md5_starts                   -    60    +60
>> hash_init_sha1                      52   112    +60
>> mbedtls_platform_zeroize             -    56    +56
>> mbedtls_sha512_free                  -    16    +16
>> mbedtls_sha256_free                  -    16    +16
>> mbedtls_sha1_free                    -    16    +16
>> mbedtls_md5_free                     -    16    +16
>> hash_finish_sha512                  72    88    +16
>> hash_finish_sha256                  72    88    +16
>> hash_finish_sha1                    72    88    +16
>> sha512_csum_wd                      68    80    +12
>> sha256_csum_wd                      68    80    +12
>> sha1_csum_wd                        68    80    +12
>> md5_wd                              68    80    +12
>> mbedtls_sha512_init                  -    12    +12
>> mbedtls_sha256_init                  -    12    +12
>> mbedtls_sha1_init                    -    12    +12
>> mbedtls_md5_init                     -    12    +12
>> memset_func                          -     8     +8
>> sha512_update                        4     8     +4
>> sha384_update                        4     8     +4
>> sha256_update                       12     8     -4
>> sha1_update                         12     8     -4
>> sha256_process                      16     -    -16
>> sha1_process                        16     -    -16
>> hash_update_sha512                  36    16    -20
>> hash_update_sha256                  36    16    -20
>> hash_update_sha1                    36    16    -20
>> MD5Init                             56    36    -20
>> sha1_starts                         60    36    -24
>> hash_update_sha384                  36     -    -36
>> hash_init_sha384                    52     -    -52
>> sha384_csum_wd                      68    12    -56
>> sha256_starts                      104    40    -64
>> sha256_padding                      64     -    -64
>> sha1_padding                        64     -    -64
>> hash_finish_sha384                  72     -    -72
>> sha512_finish                      152    36   -116
>> sha512_starts                      168    40   -128
>> sha384_starts                      168    40   -128
>> sha384_finish                      152     4   -148
>> MD5Final                           196    44   -152
>> sha512_base_do_finalize            160     -   -160
>> static.sha256_update               228     -   -228
>> static.sha1_update                 240     -   -240
>> sha512_base_do_update              244     -   -244
>> MD5Update                          260     -   -260
>> sha1_finish                        300    36   -264
>> sha256_finish                      404    36   -368
>> sha256_armv8_ce_process            428     -   -428
>> sha1_armv8_ce_process              484     -   -484
>> sha512_K                           640     -   -640
>> sha512_block_fn                   1212     -  -1212
>> MD5Transform                      2552     -  -2552
>>
>> And to start with, that's not bad. In fact, tossing LTO in before mbedTLS
>> only changes the top-line a little:
>> aarch64: (for 1/1 boards) all +5120.0 bss -16.0 data -64.0 rodata +200.0 text +5000.0
>> qemu_arm64 : all +5120 bss -16 data -64 rodata +200 text +5000
>> u-boot: add: 19/-18, grow: 11/-7 bytes: 14696/-7884 (6812)
>>
>> But, is there something we can do still? mbedTLS is a more robust
>> solution and I'm accepting there will be growth. But still the
>> process/start/finish is much larger. Is there something configurable
>> there?
>>
> I have investigated the large MbedTLS native functions
> (_process/_update/_finish).
> For MD5 and SHA1, there are no tunable configs.
> For SHA256 and SHA512, there are a few configs:
> 1. Performance configs, only for Armv8/a64.
> I didn't turn these on; they might affect the target size as well.
> 2. A smaller implementation (only for non-Armv8/a64) at the expense of
> performance.
> I didn't enable either: #1 is aimed at performance and might potentially
> increase the target size;
> #2 compromises performance and is only for non-Armv8/a64.
> It looks like neither helps in reducing the size of qemu_arm64.
> But I will try #1 on qemu_arm64 and #2 on sandbox and let you know
> the size impact soon.
>
The smaller-footprint implementation for SHA256/512 can significantly reduce
the size of the "<hash>_process()" functions. Please see the buildman output
below:
```
aarch64: (for 2/2 boards) all -1468.0 bss +16.0 data -64.0 rodata +200.0 text -1620.0
qemu_arm64 : all +4608 bss +80 data -64 rodata +200 text +4392
u-boot: add: 29/-17, grow: 12/-16 bytes: 13072/-8304 (4768)
nanopi_a64 : all -7544 bss -48 data -64 rodata +200 text -7632
u-boot: add: 21/-8, grow: 4/-8 bytes: 10692/-4364 (6328)
sandbox: (for 1/1 boards) all +19312.0 data +1440.0 rodata -4128.0 text +22000.0
sandbox : all +19312 data +1440 rodata -4128 text +22000
u-boot: add: 258/-206, grow: 122/-59 bytes: 90286/-76286 (14000)
```
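For anyone reading the "bytes:" figures in the buildman summaries quoted in this thread: in every summary here, the number in parentheses equals the added bytes minus the removed bytes. Below is a minimal Python sketch to check that on a summary line. The helper names `parse_summary` and `is_consistent` and the regular expression are my own illustrative assumptions, not part of buildman.

```python
import re

# Matches the tail of a summary line such as
#   "u-boot: add: 29/-17, grow: 12/-16 bytes: 13072/-8304 (4768)"
# The pattern is an assumption based only on the lines quoted in this thread.
SUMMARY_RE = re.compile(r"bytes:\s*(\d+)/-(\d+)\s*\((-?\d+)\)")


def parse_summary(line):
    """Return (added, removed, reported_net) as integers."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("no 'bytes: A/-R (N)' field in: %r" % line)
    added, removed, reported_net = (int(g) for g in match.groups())
    return added, removed, reported_net


def is_consistent(line):
    """True when added - removed equals the reported net figure."""
    added, removed, reported_net = parse_summary(line)
    return added - removed == reported_net


if __name__ == "__main__":
    line = "u-boot: add: 29/-17, grow: 12/-16 bytes: 13072/-8304 (4768)"
    print(parse_summary(line), is_consistent(line))
```

The same check can be run on the nanopi_a64 and sandbox lines to compare net function-level growth across boards.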
Since this is a trade-off between size and performance, I will add one more
Kconfig option to allow the user to turn it on/off. What are your thoughts?
On the other hand, the "Armv8/a64 only" options depend on NEON instructions,
so I will keep them off.
Regards,
Raymond