[PATCH] zlib: Fix big performance regression

Tom Rini trini at konsulko.com
Mon Jul 1 03:06:52 CEST 2024


On Sun, Jun 30, 2024 at 08:49:08PM +0200, Christophe Leroy wrote:
> 
> 
> Le 30/06/2024 à 20:28, Christophe Leroy a écrit :
> > 
> > 
> > Le 30/06/2024 à 16:54, Tom Rini a écrit :
> > > On Sun, Jun 30, 2024 at 10:47:06AM +0200, Christophe Leroy wrote:
> > > > 
> > > > 
> > > > Le 27/06/2024 à 21:40, Tom Rini a écrit :
> > > > > On Thu, Jun 27, 2024 at 01:34:55PM -0600, Tom Rini wrote:
> > > > > > On Thu, Jun 27, 2024 at 10:25:21AM +0200, Christophe Leroy wrote:
> > > > > > 
> > > > > > > Commit 340fdf1303dc ("zlib: Port fix for CVE-2016-9841 to U-Boot")
> > > > > > > brings a big performance regression in inflate_fast(), which leads
> > > > > > > to watchdog timer reset on powerpc 8xx.
> > > > > > > 
> > > > > > > It looks like that commit does more than what it describe, it
> > > > > > > especially removed an important optimisation that was doing copies
> > > > > > > using halfwords instead of bytes. That unexpected change multiplied
> > > > > > > by almost 4 the time spent in inflate_fast() and increased by 40%
> > > > > > > the overall time needed to uncompress linux kernel image.
> > > > > > > 
> > > > > > > So partially revert that commit but keep post incrementation as it
> > > > > > > is the initial purpose of said commit.
> > > > > > > 
> > > > > > > Fixes: 340fdf1303dc ("zlib: Port fix for CVE-2016-9841 to U-Boot")
> > > > > > > Signed-off-by: Christophe Leroy <christophe.leroy at csgroup.eu>
> > > > > > > ---
> > > > > > >    lib/zlib/inffast.c | 51
> > > > > > > ++++++++++++++++++++++++++++++++++++----------
> > > > > > >    1 file changed, 40 insertions(+), 11 deletions(-)
> > > > > > 
> > > > > > Both this, and my mostly revert lead to CI failures around compression
> > > > > > tests:
> > > > > > https://source.denx.de/u-boot/u-boot/-/jobs/859329
> > > > > 
> > > > > A full revert however, is fine.
> > > > > 
> > > > 
> > > > Is it that ? :
> > > > 
> > > > FAILED
> > > > test/py/tests/test_ut.py::test_ut[ut_compression_compression_test_gzip]
> > > > 
> > > > It seems like when the optimisation was added by commit
> > > > cd514aeb996e ("zlib:
> > > > Optimize decompression"), only the pre-increment implementation was
> > > > available.
> > > > 
> > > > When POSTINC was added by commit e89516f031db ("zlib: split up to match
> > > > original source tree"), I guess it was not verified because POSTINC is
> > > > #undef by zlib.h.
> > > > 
> > > > With the following change I don't get the FAILED
> > > > ut_compression_compression_test_gzip CI anymore
> > > > (https://source.denx.de/u-boot/custodians/u-boot-mpc8xx/-/jobs/859937)
> > > > 
> > > > diff --git a/lib/zlib/inffast.c b/lib/zlib/inffast.c
> > > > index c271d85ea1..5dc0574202 100644
> > > > --- a/lib/zlib/inffast.c
> > > > +++ b/lib/zlib/inffast.c
> > > > @@ -257,7 +257,7 @@ void inflate_fast(z_streamp strm, unsigned start)
> > > >                          sfrom = (unsigned short *)(from - OFF);
> > > >                          loops = len >> 1;
> > > >                          do
> > > > -                           PUP(sout) = get_unaligned(++sfrom);
> > > > +                           PUP(sout) = get_unaligned(sfrom++);
> > > >                          while (--loops);
> > > >                          out = (unsigned char *)sout + OFF;
> > > >                          from = (unsigned char *)sfrom + OFF;
> > > 
> > > Ah, thanks! Testing again now.
> > > 
> > 
> > That's probably not enough. I thought other failures were unrelated but
> > I gave it a try with a full revert and I get no failure at all with that
> > so there must be other things.
> > 
> > I find that pat16 = *(sout-2+2*OFF) suspicious, sout being a short it
> > means -4 bytes which is not what is expected I think.
> 
> That's it. I get all green with the following, see
> https://source.denx.de/u-boot/custodians/u-boot-mpc8xx/-/pipelines/21391
> 
> diff --git a/lib/zlib/inffast.c b/lib/zlib/inffast.c
> index c271d85ea1..69268caee8 100644
> --- a/lib/zlib/inffast.c
> +++ b/lib/zlib/inffast.c
> @@ -257,14 +257,14 @@ void inflate_fast(z_streamp strm, unsigned start)
>  			sfrom = (unsigned short *)(from - OFF);
>  			loops = len >> 1;
>  			do
> -			    PUP(sout) = get_unaligned(++sfrom);
> +			    PUP(sout) = get_unaligned(sfrom++);
>  			while (--loops);
>  			out = (unsigned char *)sout + OFF;
>  			from = (unsigned char *)sfrom + OFF;
>  		    } else { /* dist == 1 or dist == 2 */
>  			unsigned short pat16;
> 
> -			pat16 = *(sout-2+2*OFF);
> +			pat16 = *(sout-1+OFF);
>  			if (dist == 1)
>  #if defined(__BIG_ENDIAN)
>  			    pat16 = (pat16 & 0xff) | ((pat16 & 0xff ) << 8);
> 

OK, so at this point we're changing a lot of common paths, to things
that haven't been tested before. I suspect it's all fine, but for the
release on July 1 I'm going to revert the zlib update, and then we can
perform these corrections and re-introduce the optimization after
release, when they'll get the most amount of testing.

-- 
Tom
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 659 bytes
Desc: not available
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20240630/ea20113e/attachment.sig>


More information about the U-Boot mailing list