[PATCH v2 1/2] lib: zstd: update to latest Linux zstd 1.5.2
Tom Rini
trini at konsulko.com
Sat Jan 7 18:38:42 CET 2023
On Fri, Jan 06, 2023 at 11:26:36PM +0000, Maier, Brandon L Collins wrote:
> > From: Tom Rini <trini at konsulko.com>
> > On Mon, Jan 02, 2023 at 04:02:06PM +0000, Maier, Brandon L
> > >
> > > I understand Linux replaced their custom zstd code with the Facebook zstd
> > code, so the changes are widespread. The new zstd does support many
> > optional performance features as documented in zstd's lib/README[1],
> > which I have already added a KConfig to disable all of them in this patch. For
> > reference, if I enable the performance options the build size increases as so:
> > >
> > > aarch64: (for 2/2 boards) all +27432.0 rodata +320.0 text +27112.0
> > > arm: (for 5/5 boards) all +21801.6 rodata +256.0 text +21545.6
> > > sandbox: (for 2/2 boards) all +59004.0 bss +16.0 rodata +800.0 text
> > +58188.0
> > >
> > > I haven't had any luck finding another way to decrease build size.
> > >
> > > [1]
> > https://github.com/facebook/zstd/blob/dev/lib/README.md#modular-build
> >
> > After skimming over that, did you see if not disabling inlining, and
> > letting the compiler figure it out is a big size penalty?
>
> That is one of the settings I am already enabling. To clarify from that README, the settings I am enabling in U-Boot are HUF_FORCE_DECOMPRESS_X1, ZSTD_FORCE_DECOMPRESS_SEQUENCES_SHORT, ZSTD_NO_INLINE, and ZSTD_STRIP_ERROR_STRINGS. Which is what upstream zstd uses for their own ZSTD_LIB_MINIFY option in lib/libzstd.mk.
>
> If I drop the ZSTD_NO_INLINE, the build sizes increase as follows:
>
> aarch64: (for 2/2 boards) all +6212.0 rodata +8.0 text +6204.0
> arm: (for 5/5 boards) all +7408.0 text +7408.0
> sandbox: (for 2/2 boards) all +21928.0 bss -16.0 data +16.0 rodata +224.0 text +21704.0
On top of the previous I assume, OK, darn.
> > Also, while the arch-level details are good, a specific platform set of numbers
> > instead might give some ideas on where maybe reductions could come from.
>
> I looked over each platform, but the size deltas are consistent within each architecture.
>
> What stands out though is that the old ZSTD ARM code is about 4.6kb smaller (25%) than the old ZSTD AARCH64 code. The newer ZSTD doesn't do that, both ARM and AARCH64 are about the same in size. That is why there is a 4k jump in size for just the arm boards. Comparing the functions list between the two, both archs appear similar except ZSTD_decompressBlock_internal() grows significantly on aarch64 for some reason. I haven't seen any obvious root cause though.
>
> Here is the full breakdown of platform size deltas
>
> aarch64: (for 2/2 boards) all +535.5 rodata +965.5 text -430.0
> turris_mox : all +572 rodata +968 text -396
> mvebu_espressobin-88f3720: all +499 rodata +963 text -464
> arm: (for 5/5 boards) all +4489.6 rodata +940.0 text +3549.6
> turris_omnia : all +4608 rodata +940 text +3668
> m53menlo : all +4460 rodata +940 text +3520
> dh_imx6 : all +4460 rodata +940 text +3520
> stm32mp15_dhcom_basic: all +4460 rodata +940 text +3520
> stm32mp15_dhcor_basic: all +4460 rodata +940 text +3520
> sandbox: (for 2/2 boards) all +8.0 rodata +784.0 text -776.0
> sandbox : all +528 bss -32 rodata +768 text -208
> sandbox64 : all -512 bss +32 rodata +800 text -1344
>
>
> > Finally, what is the whole list of platforms that grow?
>
> All of them except sandbox64 grow by some amount.
>
> > It's all not ideal, yes, but it looks like BTRFS is the main user, right now, which
> > isn't widely enabled. So maybe we can look towards improving upstream a
> > bit here, if motivated.
>
> I can't speak for the BTRFS code as I don't have a suitable platform
> to test it on. But the zstd code does appear to be broken on master.
> Running the zstd compression test in sandbox or sandbox64 both
> segfault. And I have similar problems trying to decompress FIT images
> on our AARCH64 platform. The primary motivation to this patch is that
> the new version does work. And this seems like the appropriate fix
> given Linux has abandoned the original version of this code.
Well, I guess in the end, non-functional smaller code is factually worse
than larger functional code, so, I'll pick this up post v2023.01,
thanks!
--
Tom
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 659 bytes
Desc: not available
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20230107/2cdcfe59/attachment.sig>
More information about the U-Boot
mailing list