BTRFS use-after-free bug at free_extent_buffer_internal

Sachi King nakato at nakato.io
Tue Apr 23 06:53:45 CEST 2024


On Monday 22 April 2024 5:15:50 PM AEST Qu Wenruo wrote:
> 
> 在 2024/4/22 16:07, Sachi King 写道:
> > Hi,
> >
> > I've hit a bug with u-boot on my BTRFS filesystem, and I'm fairly certain
> > it's a bug and not a corruption issue.
> >
> > A bit of history on the filesystem.  It is a fairly new filesystem as it was
> > being used to give me access to test a wayland application on a
> > Raspberry Pi.  The filesystem was about 3 days old when I hit the bug, and
> > I'm fairly certain it never had an unclean shutdown.  I have checked the
> > filesystem with "btrfs check" which has found no errors.  The filesystem
> > mounts on Linux and is functional.
> >
> >> # btrfs check --check-data-csum /dev/sda2
> >> Opening filesystem to check...
> >> Checking filesystem on /dev/sda2
> >> UUID: 18db6211-ac36-42c1-a22f-5e15e1486e0d
> >> [1/7] checking root items
> >> [2/7] checking extents
> >> [3/7] checking free space tree
> >> [4/7] checking fs roots
> >> [5/7] checking csums against data
> >> [6/7] checking root refs
> >> [7/7] checking quota groups skipped (not enabled on this FS)
> >> found 5070573568 bytes used, no error found
> >> total csum bytes: 4451620
> >> total tree bytes: 370458624
> >> total fs tree bytes: 353124352
> >> total extent tree bytes: 10010624
> >> btree space waste bytes: 62303284
> >> file data blocks allocated: 6786519040
> >>   referenced 6328619008
> 
> Since btrfs check reports no error, the fs must be valid.
> 
> But considering how new it is, it may be related to some new features
> not properly implemented in Uboot.
> 
> Is it possible to provide the whole binary dump of the fs?

Sure, here's a copy of the disk image.  The BTRFS is at partition 2.
http://maribel.nakato.io/u-boot-newtest.img.zstd

> >
> >
> > I've made an image of the filesystem so I could reproduce the bug in an
> > environment that doesn't require the physical SBC, and have reproduced
> > the issue using the head of the master branch with "qemu-x86_64_defconfig".
> >
> > My testing qemu was produced with the following:
> >> # make qemu-x86_64_defconfig
> >> # cat << EOF >> .config
> >> CONFIG_AUTOBOOT=y
> >> CONFIG_BOOTDELAY=1
> >> CONFIG_USE_BOOTCOMMAND=y
> >> CONFIG_BOOTSTD_DEFAULTS=y
> >> CONFIG_BOOTSTD_FULL=y
> >> CONFIG_CMD_BOOTFLOW_FULL=y
> >> CONFIG_BOOTCOMMAND="bootflow scan -lb"
> >> CONFIG_ENV_IS_NOWHERE=y
> >> CONFIG_LZ4=y
> >> CONFIG_BZIP2=y
> >> CONFIG_ZSTD=y
> >> CONFIG_FS_BTRFS=y
> >> CONFIG_CMD_BTRFS=y
> >> CONFIG_GZIP=y
> >> CONFIG_DEVICE_TREE_INCLUDES="bootstd.dtsi"
> >> EOF
> >> # make -j24
> >
> > bootstd.dtsi is placed at "arch/x86/dts/bootstd.dtsi" and contains:
> >> / {
> >>          bootstd {
> >>                  compatible = "u-boot,boot-std";
> >>                  filename-prefixes = "/@boot/", "/boot/", "/";
> >>                  bootdev-order = "scsi";
> >>                  extlinux {
> >>                          compatible = "u-boot,extlinux";
> >>                  };
> >>          };
> >> };
> >
> >
> > The VM was run with
> >> qemu-system-x86_64 -bios u-boot.rom -nographic -M q35 -action reboot=shutdown -drive file=/mnt/dbg/u-boot-debug.img
> >
> > The error message I recive on boot is
> >> BUG at fs/btrfs/extent-io.c:629/free_extent_buffer_internal()!
> >> BUG!
> >> resetting ...
> >
> >
> > I added a print statement to free_extent_buffer_internal that prints the
> > start address of the extent_buffer as I'm not sure what to be looking for
> > here.  This print statement is before the decrement.
> >> printf("free_extent_buffer_internal: eb->start[%llx] eb->refs[%i]\n", eb->start, eb->refs);
> >
> > The last message before the crash reported eb->start to be "0", with 0 refs.
> >> free_extent_buffer_internal: eb->start[0] eb->refs[0]
> >
> > The extent at 0 struck me as odd, so I tried commenting out the freeing, by
> > removing the call to free_extent_buffer_final, and this resulted in bootflow
> > succeeding and showing me the boot menu, which suprised me.
> > I expected to see the bug reproduce itself, with refs being zero, but eb->start
> > pointing somewhere valid, but I instead got a valid address with refs at 2.
> >
> > I'm assuming that the order free_extent_buffer_internal is called is
> > deterministic, so by counting the print outputs the line that prior held
> > the extent_buffer with a 0 start was replaced with:
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >
> > Interestingly, as can be seen in the full logs with my included print
> > messages, 249c000 is being used just before this, with a ref count of
> > 2.  249c000 does appear to reach a point where it should have been freed
> > in the past, before it gets used again as seen in both logs.
> >
> > The failing boot log:
> >> U-Boot SPL 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:32:37 +1000)
> >> Trying to boot from SPI
> >> Jumping to 64-bit U-Boot: Note many features are missing
> >>
> >>
> >> U-Boot 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:32:37 +1000)
> >>
> >> CPU:   QEMU Virtual CPU version 2.5+
> >> DRAM:  128 MiB
> >> Core:  13 devices, 13 uclasses, devicetree: separate
> >> Loading Environment from nowhere... OK
> >> Model: QEMU x86 (I440FX)
> >> Net:   e1000: 00:00:00:00:00:00
> >>
> >> Error: e1000#0 No valid MAC address found.
> >>        eth_initialize() No ethernet found.
> >>
> >>
> >> Hit any key to stop autoboot:  0
> >> Scanning for bootflows in all bootdevs
> >> Seq  Method       State   Uclass    Part  Name                      Filename
> >> ---  -----------  ------  --------  ----  ------------------------  ----------------
> >> scanning bus for devices...
> >> Target spinup took 0 ms.
> >> SATA link 1 timeout.
> >> Target spinup took 0 ms.
> >> SATA link 3 timeout.
> >> SATA link 4 timeout.
> >> SATA link 5 timeout.
> >> AHCI 0001.0000 32 slots 6 ports 1.5 Gbps 0x3f impl SATA mode
> >> flags: 64bit ncq only
> >> free_extent_buffer_internal: eb->start[10000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[1548000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[150c000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[154c000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[1544000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[2490000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[6de0000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[1544000] eb->refs[1]
> >>    Device 0: (0:0) Vendor: ATA Prod.: QEMU HARDDISK Rev: 2.5+
> >>              Type: Hard Disk
> >>              Capacity: 58680.0 MB = 57.3 GB (120176640 x 512)
> >> timeout exit!
> >> Scanning bootdev 'ahci_scsi.id0lun0.bootdev':
> >> free_extent_buffer_internal: eb->start[10000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[1548000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[150c000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[154c000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[1544000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[2490000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a4000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[2490000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[6de0000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[1544000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[10000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[1548000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[150c000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[154c000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[1544000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[2490000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[2490000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a4000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a4000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a4000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[0] eb->refs[0]
> >> BUG at fs/btrfs/extent-io.c:626/free_extent_buffer_internal()!
> 
> The eb[0] seems very weird.
> Even for superblock reading, we got eb at 64K, and every eb should have
> at least one ref at creation time.
> 
> So this indeed looks like the direct cause.
> But without a full call trace (is it possible inside U-boot runtime?), I
> do not have an immediate clue.
> 
> Thanks,
> Qu
> 
> >> BUG!
> >> resetting ...
> >
> > The succeeding log with the removed freeing:
> >> U-Boot SPL 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:38:25 +1000)
> >> Trying to boot from SPI
> >> Jumping to 64-bit U-Boot: Note many features are missing
> >>
> >>
> >> U-Boot 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:38:25 +1000)
> >>
> >> CPU:   QEMU Virtual CPU version 2.5+
> >> DRAM:  128 MiB
> >> Core:  13 devices, 13 uclasses, devicetree: separate
> >> Loading Environment from nowhere... OK
> >> Model: QEMU x86 (I440FX)
> >> Net:   e1000: 00:00:00:00:00:00
> >>
> >> Error: e1000#0 No valid MAC address found.
> >>        eth_initialize() No ethernet found.
> >>
> >>
> >> Hit any key to stop autoboot:  0
> >> Scanning for bootflows in all bootdevs
> >> Seq  Method       State   Uclass    Part  Name                      Filename
> >> ---  -----------  ------  --------  ----  ------------------------  ----------------
> >> scanning bus for devices...
> >> Target spinup took 0 ms.
> >> SATA link 1 timeout.
> >> Target spinup took 0 ms.
> >> SATA link 3 timeout.
> >> SATA link 4 timeout.
> >> SATA link 5 timeout.
> >> AHCI 0001.0000 32 slots 6 ports 1.5 Gbps 0x3f impl SATA mode
> >> flags: 64bit ncq only
> >> free_extent_buffer_internal: eb->start[10000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[1548000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[150c000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[154c000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[1544000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[2490000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[6de0000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[1544000] eb->refs[1]
> >>    Device 0: (0:0) Vendor: ATA Prod.: QEMU HARDDISK Rev: 2.5+
> >>              Type: Hard Disk
> >>              Capacity: 58680.0 MB = 57.3 GB (120176640 x 512)
> >> timeout exit!
> >> Scanning bootdev 'ahci_scsi.id0lun0.bootdev':
> >> free_extent_buffer_internal: eb->start[10000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[1548000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[150c000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[154c000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[1544000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[2490000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a4000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[2490000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[6de0000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[1544000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[10000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[1548000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[150c000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[154c000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[1544000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[2490000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[2490000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a4000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a4000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[24a4000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
> >> free_extent_buffer_internal: eb->start[2490000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[6de0000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[28f4000] eb->refs[1]
> >> free_extent_buffer_internal: eb->start[1544000] eb->refs[1]
> >>    0  extlinux     ready   scsi         2  ahci_scsi.id0lun0.bootdev /@boot/extlinux/extlinux.conf
> >> ** Booting bootflow 'ahci_scsi.id0lun0.bootdev.part_2' with extlinux
> >> ------------------------------------------------------------
> >> 1:      NixOS - Default
> >> 2:      NixOS - Configuration 5 (2024-03-04 11:00 - 24.05.20240205.faf912b)
> >> 3:      NixOS - Configuration 4 (2024-03-02 15:05 - 24.05.20240205.faf912b)
> >> 4:      NixOS - Configuration 3 (2024-03-02 14:26 - 24.05.20240205.faf912b)
> >> 5:      NixOS - Configuration 2 (2024-03-02 14:06 - 24.05.20240205.faf912b)
> >> 6:      NixOS - Configuration 1 (1970-01-01 10:00 - 24.05.20240205.faf912b)
> >> Enter choice: 1:        NixOS - Default
> >> Retrieving file: /@boot/extlinux/../nixos/gq8jsgxnhq2wvsjrni3cjw1wb5540wjw-linux-6.1.63-stable_20231123-Image
> >> free_extent_buffer_internal: eb->start[10000] eb->refs[1]
> >
> >
> > I'm not sure where to go from here with the bug, so I'm hoping for some
> > help in tracking it down so it can be fixed.
> >
> > Thanks,
> > Sachi
> >
> >
> >






More information about the U-Boot mailing list