[PATCH v2 1/2] fs: btrfs: implement opendir(), readdir() and closedir()
Alexey Charkov
alchark at flipper.net
Sat Jun 27 07:59:41 CEST 2026
On Sat, Jun 27, 2026 at 2:27 AM Qu Wenruo <quwenruo.btrfs at gmx.com> wrote:
>
>
>
> On 2026/6/27 00:48, Alexey Charkov wrote:
> > Add support for generic directory iteration with opendir(), readdir() and
> > closedir() in the btrfs filesystem driver.
> >
> > Signed-off-by: Alexey Charkov <alchark at flipper.net>
> > ---
> > fs/btrfs/btrfs.c | 98 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> > fs/btrfs/ctree.h | 2 ++
> > fs/btrfs/dir-item.c | 73 +++++++++++++++++++++++++++++++++++++++
> > fs/fs.c | 4 ++-
> > include/btrfs.h | 5 +++
> > 5 files changed, 181 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/btrfs/btrfs.c b/fs/btrfs/btrfs.c
> > index f3087f690fa4..6f034861da90 100644
> > --- a/fs/btrfs/btrfs.c
> > +++ b/fs/btrfs/btrfs.c
> > @@ -8,7 +8,9 @@
> > #include <config.h>
> > #include <malloc.h>
> > #include <u-boot/uuid.h>
> > +#include <linux/kernel.h>
> > #include <linux/time.h>
> > +#include <fs.h>
> > #include "btrfs.h"
> > #include "crypto/hash.h"
> > #include "disk-io.h"
> > @@ -159,6 +161,102 @@ int btrfs_ls(const char *path)
> > return 0;
> > }
> >
> > +/*
> > + * The fs layer closes and re-probes btrfs between readdir() calls (see
> > + * fs_readdir() in fs/fs.c), freeing and reallocating fs_info, so root cannot
> > + * be stored directly. The subvolume id and inode number are stable though, so
> > + * re-resolve the root from the current fs_info by subvolume id, which avoids
> > + * a full path walk and is much faster.
> > + */
> > +struct btrfs_dir_stream {
> > + struct fs_dir_stream parent;
> > + struct fs_dirent dirent;
> > + u64 subvolid;
> > + u64 ino;
> > + u64 offset;
> > +};
> > +
> > +int btrfs_opendir(const char *dirname, struct fs_dir_stream **dirsp)
> > +{
> > + struct btrfs_fs_info *fs_info = current_fs_info;
> > + struct btrfs_dir_stream *dirs;
> > + struct btrfs_root *root;
> > + u64 ino;
> > + u8 type;
> > + int ret;
> > +
> > + *dirsp = NULL;
> > + ASSERT(fs_info);
> > +
> > + ret = btrfs_lookup_path(fs_info->fs_root, BTRFS_FIRST_FREE_OBJECTID,
> > + dirname, &root, &ino, &type, 40);
> > + if (ret < 0)
> > + return ret;
> > + if (type != BTRFS_FT_DIR)
> > + return -ENOTDIR;
> > +
> > + dirs = calloc(1, sizeof(*dirs));
> > + if (!dirs)
> > + return -ENOMEM;
> > + dirs->subvolid = root->root_key.objectid;
> > + dirs->ino = ino;
> > +
> > + *dirsp = &dirs->parent;
> > + return 0;
> > +}
> > +
> > +static unsigned int btrfs_dirent_type_to_fs_type(u8 dirent_type)
> > +{
> > + switch (dirent_type) {
> > + case BTRFS_FT_DIR:
> > + return FS_DT_DIR;
> > + case BTRFS_FT_SYMLINK:
> > + return FS_DT_LNK;
> > + default:
> > + return FS_DT_REG;
> > + }
> > +}
> > +
> > +int btrfs_readdir(struct fs_dir_stream *fs_dirs, struct fs_dirent **dentp)
> > +{
> > + struct btrfs_dir_stream *dirs = container_of(fs_dirs, struct btrfs_dir_stream, parent);
> > + struct btrfs_fs_info *fs_info = current_fs_info;
> > + struct fs_dirent *dent = &dirs->dirent;
> > + struct btrfs_root *root;
> > + struct btrfs_key key;
> > + u8 type;
> > + int ret;
> > +
> > + *dentp = NULL;
> > + ASSERT(fs_info);
> > +
> > + key.objectid = dirs->subvolid;
> > + key.type = BTRFS_ROOT_ITEM_KEY;
> > + key.offset = (u64)-1;
> > + root = btrfs_read_fs_root(fs_info, &key);
>
> You can save the root pointer into btrfs_dir_stream, so we can avoid
> subvolume root lookup for each dentry.
That was in fact the first thing I tried, but it turns out not to
work. The generic fs opendir caller ends with fs_close, which closes
the fs and deallocates the fs_info structure. A root pointer created
with that fs_info references trees allocated for that instance, so it
is stale by the time readdir runs. The result is a synchronous abort.
Hence the comment above the btrfs_dir_stream definition and this
second-best solution.
Please let me know if I've missed a more elegant option!
Closing and reopening the fs on every call, as the FS layer does,
sounds like a huge overhead, but I assume it was unavoidable when
the interface was designed.
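
To illustrate the constraint, here is a toy model of the pattern, not
the actual driver code. All toy_* names are hypothetical stand-ins, and
the subvolume ids are just illustrative values. The stream keeps only
plain numbers and looks the root up again in whatever fs context the
current call has probed:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the U-Boot btrfs types. */
struct toy_root { uint64_t subvolid; };
struct toy_fs { struct toy_root *roots; size_t nr_roots; };

/* Stream state holds only numbers, which survive a close/re-probe cycle. */
struct toy_stream { uint64_t subvolid; uint64_t ino; uint64_t offset; };

/* Models a probe: every call allocates a fresh context with new roots. */
static struct toy_fs *toy_probe(void)
{
	struct toy_fs *fs = malloc(sizeof(*fs));

	if (!fs)
		return NULL;
	fs->nr_roots = 2;
	fs->roots = calloc(fs->nr_roots, sizeof(*fs->roots));
	if (!fs->roots) {
		free(fs);
		return NULL;
	}
	fs->roots[0].subvolid = 5;	/* illustrative ids only */
	fs->roots[1].subvolid = 256;
	return fs;
}

/* Models a close: any root pointer taken from this context is now stale. */
static void toy_close(struct toy_fs *fs)
{
	if (!fs)
		return;
	free(fs->roots);
	free(fs);
}

/* Re-resolve a root by id inside the context that is currently alive. */
static struct toy_root *toy_find_root(struct toy_fs *fs, uint64_t subvolid)
{
	for (size_t i = 0; i < fs->nr_roots; i++)
		if (fs->roots[i].subvolid == subvolid)
			return &fs->roots[i];
	return NULL;
}

/* One readdir-like step: probe, resolve by id, advance, close. */
static int toy_step(struct toy_stream *s)
{
	struct toy_fs *fs = toy_probe();
	int ret = -1;

	if (!fs)
		return -1;
	if (toy_find_root(fs, s->subvolid)) {
		s->offset++;	/* pretend one entry was consumed */
		ret = 0;
	}
	toy_close(fs);
	return ret;
}
```

Keeping a struct toy_root pointer in the stream instead would leave it
dangling after the first toy_close(), which is the failure mode
described above.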
> You do not need to worry about the lifespan of subvolume roots either,
> they are properly released when the fs is closed.
>
> Otherwise looks good to me, except an unrelated question just lines below.
>
> > + if (IS_ERR(root))
> > + return PTR_ERR(root);
> > +
> > + memset(dent, 0, sizeof(*dent));
> > + ret = btrfs_next_dir_entry(root, dirs->ino, &dirs->offset, dent->name,
> > + sizeof(dent->name), &type);
> > + if (ret < 0)
> > + return ret;
> > + if (ret > 0)
> > + return -ENOENT;
>
> I'm not sure what the proper/preferred/sane handling of end of directory is.
>
> Ext4/fat return -ENOENT, squashfs returns -1, erofs returns 1, exfat
> returns 0 but without populating @dentp.
>
> Personally I find the exfat behavior more sane, but considering the
> caller fs_ls_generic() doesn't really check the return value but only
> cares if @dent is populated, it should be fine either way.
>
> But still, returning -ENOENT will populate errno, which may be confusing
> for debugging.
>
> Anyway it's an unrelated nitpick.
Happy to switch to exfat-like behavior. I was only looking at the ext4
implementation, as that is likely the most widely used and tested in
various corner cases, but it sounds like either way should work.
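
As a rough sketch of what I understand by exfat-like behavior (a toy
model with hypothetical toy_* names, not the U-Boot fs API): the end
of the directory is reported as success with the output pointer left
NULL, so a caller that only checks whether an entry was populated
terminates cleanly and no error code is set.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the U-Boot fs types. */
struct toy_dirent { const char *name; };

struct toy_dir {
	const char **names;	/* backing list of entry names */
	size_t count;		/* number of entries */
	size_t pos;		/* next entry to hand out */
	struct toy_dirent dirent;
};

/*
 * Returns 0 both when an entry is produced and at end of directory.
 * The two cases are told apart by whether *dentp is NULL.
 */
static int toy_readdir(struct toy_dir *dir, struct toy_dirent **dentp)
{
	*dentp = NULL;
	if (dir->pos >= dir->count)
		return 0;	/* end of directory: success, no entry */

	dir->dirent.name = dir->names[dir->pos++];
	*dentp = &dir->dirent;
	return 0;
}

/* A caller in the style described above: it only looks at the entry. */
static size_t toy_count_entries(struct toy_dir *dir)
{
	struct toy_dirent *dent;
	size_t n = 0;

	while (toy_readdir(dir, &dent) == 0 && dent)
		n++;
	return n;
}
```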
Thanks a lot,
Alexey