[PATCH v2 1/2] lib: zstd: update to latest Linux zstd 1.5.2

Tom Rini trini at konsulko.com
Sat Jan 7 18:38:42 CET 2023


On Fri, Jan 06, 2023 at 11:26:36PM +0000, Maier, Brandon L Collins wrote:
> > From: Tom Rini <trini at konsulko.com>
> > On Mon, Jan 02, 2023 at 04:02:06PM +0000, Maier, Brandon L
> > >
> > > I understand Linux replaced their custom zstd code with the Facebook zstd
> > > code, so the changes are widespread. The new zstd does support many
> > > optional performance features, as documented in zstd's lib/README[1],
> > > and I have already added a Kconfig option to disable all of them in this
> > > patch. For reference, if I enable the performance options, the build size
> > > increases as follows:
> > >
> > > aarch64: (for 2/2 boards) all +27432.0 rodata +320.0 text +27112.0
> > >        arm: (for 5/5 boards) all +21801.6 rodata +256.0 text +21545.6
> > >    sandbox: (for 2/2 boards) all +59004.0 bss +16.0 rodata +800.0 text +58188.0
> > >
> > > I haven't had any luck finding another way to decrease build size.
> > >
> > > [1] https://github.com/facebook/zstd/blob/dev/lib/README.md#modular-build
> > 
> > After skimming over that, did you check whether not disabling inlining, and
> > letting the compiler figure it out, is a big size penalty?
> 
> That is one of the settings I am already enabling. To clarify from that README, the settings I am enabling in U-Boot are HUF_FORCE_DECOMPRESS_X1, ZSTD_FORCE_DECOMPRESS_SEQUENCES_SHORT, ZSTD_NO_INLINE, and ZSTD_STRIP_ERROR_STRINGS, which is what upstream zstd uses for its own ZSTD_LIB_MINIFY option in lib/libzstd.mk.
> 
> If I drop the ZSTD_NO_INLINE, the build sizes increase as follows:
> 
> aarch64: (for 2/2 boards) all +6212.0 rodata +8.0 text +6204.0
>        arm: (for 5/5 boards) all +7408.0 text +7408.0
>    sandbox: (for 2/2 boards) all +21928.0 bss -16.0 data +16.0 rodata +224.0 text +21704.0

On top of the previous numbers, I assume? OK, darn.

> > Also, while the arch-level details are good, a specific platform set of numbers
> > instead might give some ideas on where maybe reductions could come from.
>  
> I looked over each platform, but the size deltas are consistent within each architecture.
> 
> What stands out, though, is that the old ZSTD ARM code is about 4.6 KB smaller (25%) than the old ZSTD AARCH64 code. The newer ZSTD doesn't do that; both ARM and AARCH64 are about the same size. That is why there is a 4 KB jump in size for just the arm boards. Comparing the function lists between the two, both archs appear similar, except that ZSTD_decompressBlock_internal() grows significantly on aarch64 for some reason. I haven't seen any obvious root cause, though.
> 
> Here is the full breakdown of platform size deltas:
> 
>    aarch64: (for 2/2 boards) all +535.5 rodata +965.5 text -430.0
>             turris_mox     : all +572 rodata +968 text -396
>             mvebu_espressobin-88f3720: all +499 rodata +963 text -464
>        arm: (for 5/5 boards) all +4489.6 rodata +940.0 text +3549.6
>             turris_omnia   : all +4608 rodata +940 text +3668
>             m53menlo       : all +4460 rodata +940 text +3520
>             dh_imx6        : all +4460 rodata +940 text +3520
>             stm32mp15_dhcom_basic: all +4460 rodata +940 text +3520
>             stm32mp15_dhcor_basic: all +4460 rodata +940 text +3520
>    sandbox: (for 2/2 boards) all +8.0 rodata +784.0 text -776.0
>             sandbox        : all +528 bss -32 rodata +768 text -208
>             sandbox64      : all -512 bss +32 rodata +800 text -1344
> 
> 
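For readers cross-checking the breakdown above: each per-architecture summary row appears to be the arithmetic mean of the per-board deltas listed under it. The following is a minimal Python sketch of that check. The helper names `parse_deltas` and `arch_mean` are hypothetical and not part of buildman or U-Boot; the numbers are copied from the per-board rows above.

```python
import re

# Per-board size deltas (bytes), copied from the breakdown above.
BOARDS = {
    "aarch64": [
        "turris_mox     : all +572 rodata +968 text -396",
        "mvebu_espressobin-88f3720: all +499 rodata +963 text -464",
    ],
    "arm": [
        "turris_omnia   : all +4608 rodata +940 text +3668",
        "m53menlo       : all +4460 rodata +940 text +3520",
        "dh_imx6        : all +4460 rodata +940 text +3520",
        "stm32mp15_dhcom_basic: all +4460 rodata +940 text +3520",
        "stm32mp15_dhcor_basic: all +4460 rodata +940 text +3520",
    ],
    "sandbox": [
        "sandbox        : all +528 bss -32 rodata +768 text -208",
        "sandbox64      : all -512 bss +32 rodata +800 text -1344",
    ],
}


def parse_deltas(line):
    """Split 'board : all +572 rodata +968 ...' into (board, {section: delta})."""
    board, _, rest = line.partition(":")
    pairs = re.findall(r"(\w+) ([+-]\d+(?:\.\d+)?)", rest)
    return board.strip(), {section: float(delta) for section, delta in pairs}


def arch_mean(lines):
    """Average each section's delta over all boards of one architecture.

    A section missing from a board's row is treated as a zero delta.
    """
    boards = [parse_deltas(line)[1] for line in lines]
    sections = sorted({name for deltas in boards for name in deltas})
    return {
        name: sum(deltas.get(name, 0.0) for deltas in boards) / len(boards)
        for name in sections
    }


if __name__ == "__main__":
    for arch, lines in BOARDS.items():
        print(arch, arch_mean(lines))
```

Under this reading, the sandbox bss deltas (-32 and +32) cancel to zero, which would presumably explain why bss does not appear in the sandbox summary row.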
> > Finally, what is the whole list of platforms that grow?
> 
> All of them except sandbox64 grow by some amount.
> 
> > It's all not ideal, yes, but it looks like BTRFS is the main user, right now, which
> > isn't widely enabled. So maybe we can look towards improving upstream a
> > bit here, if motivated.
> 
> I can't speak for the BTRFS code as I don't have a suitable platform
> to test it on. But the zstd code does appear to be broken on master.
> Running the zstd compression test segfaults in both sandbox and
> sandbox64. And I have similar problems trying to decompress FIT images
> on our AARCH64 platform. The primary motivation for this patch is that
> the new version does work. And this seems like the appropriate fix
> given Linux has abandoned the original version of this code.

Well, I guess in the end, non-functional smaller code is factually worse
than larger functional code, so, I'll pick this up post v2023.01,
thanks!

-- 
Tom