e2find: new ext2/3/4 tool for fast directory entry iterations

Discussion:

Vincent Caron

2016-08-19 15:02:24 UTC

Hello ext users,

I regularly need to traverse large filesystems (10-350M inodes)
backed by spindle-based RAID arrays. I tried several solutions
(intercepting readdir and sorting by inode, playing with cache
hints, and such), to no avail.
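
The readdir-and-sort-by-inode approach mentioned above can be sketched in
a few lines. This is a minimal Python illustration, assuming that stat'ing
entries in inode order reduces seeks on rotating disks; the function name
`walk_sorted_by_inode` is hypothetical, and this is not how e2find works.

```python
import os
import stat


def walk_sorted_by_inode(root):
    """Yield (path, lstat result) for everything below root.

    Within each directory, entries are stat'ed in inode-number order
    instead of readdir order (hypothetical helper, for illustration).
    """
    pending = [root]
    while pending:
        directory = pending.pop()
        try:
            with os.scandir(directory) as it:
                # inode() comes from the directory entry itself, so the
                # sort does not trigger one stat call per entry.
                entries = sorted(it, key=lambda e: e.inode())
        except OSError:
            continue  # directory vanished or is unreadable
        for entry in entries:
            try:
                st = os.lstat(entry.path)
            except OSError:
                continue  # entry deleted between readdir and lstat
            yield entry.path, st
            if stat.S_ISDIR(st.st_mode):
                pending.append(entry.path)
```

Calling `for path, st in walk_sorted_by_inode("/srv"): ...` visits every
entry once. The directory blocks themselves are still read in whatever
order the tree is discovered.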

Since I'm mostly facing ext3 and ext4 filesystems, I wrote a tool
based on libext2fs which replaces the 'find /' part by reading the
ext data structures directly. It has been working great for me for several
months, and I've published it at https://github.com/bearstech/e2find

There's a data safety issue I'm not quite sure about. I'm using libext2fs
to open, read-only, block devices which are mounted and actively written to.
It's obviously unsafe (not from the data-loss p.o.v., but from the
coherent-retrieved-data p.o.v.), but I've never encountered a situation
where e2find would spit out incoherent information: all retrieved filenames
exist as seen from the VFS, except those deleted between the enumeration
and resolution phases of e2sync. Since I'm in userland and not locking
any of the on-disk data structures I'm reading, I wonder what kind of
surprises I should expect in the retrieved data.
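
The coherence check described above (every retrieved filename exists as
seen from the VFS) could be automated roughly as follows. This is a sketch
in Python, assuming the raw scan yields (inode, relative path) pairs;
`check_against_vfs` and that pair format are assumptions for illustration,
not part of e2find. A clean result only shows that no incoherence was
detected, not that none can occur.

```python
import os


def check_against_vfs(mount_point, scanned):
    """Compare (inode, relative_path) pairs from a raw scan with the VFS.

    Returns three lists: entries that match, entries that no longer
    exist (possibly deleted since the scan), and entries whose inode
    number differs (replaced since the scan, or read incoherently).
    """
    matching, missing, mismatched = [], [], []
    for inode, rel_path in scanned:
        full = os.path.join(mount_point, rel_path)
        try:
            st = os.lstat(full)
        except (FileNotFoundError, NotADirectoryError):
            missing.append((inode, rel_path))
            continue
        if st.st_ino == inode:
            matching.append((inode, rel_path))
        else:
            mismatched.append((inode, rel_path))
    return matching, missing, mismatched
```

On a busy filesystem some "missing" entries are expected. The "mismatched"
list is the one to look at when hunting for incoherent reads.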

Bodo Thiesen

2016-08-21 11:31:12 UTC


Post by Vincent Caron
Since I'm in userland and not locking
any on-disk data structure I'm reading, I wonder what kind of surprises I
should expect in the retrieved data.

Exactly the kind of surprises you're already expecting: stale data where
committed data exists but has not yet been written to its target
location, or data that has been overwritten in the meantime. Since there
is almost no restriction on what the "wrong" data could be (it could be an
mp3, or part of an ext2 image file that looks exactly like the data you're
expecting to see, with no way to know for sure), you can see *anything*.

For an ext2 fs with a journal, you could try interpreting the journal and
fixing up your cache to bring it up to date, like this:

1. Get a copy of the journal.
2. Read the blocks you're interested in (i.e. do the normal traversing
   step).
   -> From time to time, get a new copy of the journal, check what
      changed, and process the changes. This also means you need to
      keep some metadata about when and where you got your data from,
      so you can actually fix things up. Remember: while traversing,
      you can read any kind of trash.
3. Upon completion of step 2, get a final copy of the journal to bring
   your cache up to date.

The funny thing about this approach: by repeating step 3 you could keep
your cache up to date without ever needing to retraverse the file system,
as long as your check interval is short enough that you don't miss any
journal updates.
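
The steps above can be sketched with a deliberately simplified model. The
Python below assumes the journal has already been parsed into
(transaction id, block number, data) records sorted by transaction id;
that record format, `BlockCache` and `JournalGap` are hypothetical names
for illustration and do not reflect the real on-disk journal layout. The
sketch only refreshes blocks that are already cached, and it treats a jump
in transaction ids as a sign that updates were missed.

```python
class JournalGap(Exception):
    """Journal records were recycled before this reader saw them."""


class BlockCache:
    def __init__(self):
        self.blocks = {}      # block number -> bytes as last known
        self.last_tid = None  # highest transaction id already applied

    def store(self, block_nr, data):
        # Data read straight from the device while traversing; may be stale.
        self.blocks[block_nr] = data

    def apply_journal(self, records):
        """Replay records newer than last_tid; return how many were applied.

        records: list of (tid, block_nr, data) tuples sorted by tid
        (a simplified, assumed view of one journal copy).
        """
        if not records:
            return 0
        oldest = records[0][0]
        if self.last_tid is not None and oldest > self.last_tid + 1:
            # The check interval was too long: a full retraversal is needed.
            raise JournalGap(f"missed transactions after {self.last_tid}")
        applied = 0
        for tid, block_nr, data in records:
            if self.last_tid is not None and tid <= self.last_tid:
                continue  # already processed in an earlier journal copy
            if block_nr in self.blocks:
                self.blocks[block_nr] = data
                applied += 1
        newest = records[-1][0]
        if self.last_tid is None or newest > self.last_tid:
            self.last_tid = newest
        return applied
```

A traversal would call `store()` for each block it reads and
`apply_journal()` with a fresh journal copy from time to time, then once
more at the end. Repeating that last call periodically corresponds to
repeating step 3.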

I leave the details to your implementation skills, since I don't know
what your strategies in e2find are.

Regards, Bodo

Vincent Caron

2016-08-23 09:14:46 UTC


Hello,

thanks for your feedback!

Post by Bodo Thiesen
For an ext2 fs with journal, you could try interpreting the journal and
1. Get a copy of the journal.
2. Read the blocks you're interested in (i.e. do the normal traversing
   step).
   -> From time to time, get a new copy of the journal, check what
      changed, and process the changes. This also means you need to
      keep some metadata about when and where you got your data from,
      so you can actually fix things up. Remember: while traversing,
      you can read any kind of trash.
3. Upon completion of step 2, get a final copy of the journal to bring
   your cache up to date.

I see. However, libext2fs has some support for creating journals but does
not seem to have an API to read and interpret the journal inode data;
that would be much more complex for me to implement.

Post by Bodo Thiesen
I leave the details to your implementation skills, since I don't know
what your strategies in e2find are.

e2find does not track any file<->block relationship, which I guess is
needed in my case to map journal data back to files, and I would need
more I/O and way more memory to implement this. I feel I'm going to give
up on the journal idea and add a clear warning about this limitation of
e2find to its documentation ...
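
To illustrate why the file<->block relationship costs memory, here is a
minimal Python sketch of the reverse index that mapping journal data back
to directories would need. It assumes the traversal can report
(directory inode, block number) pairs; `build_block_index` and
`affected_directories` are hypothetical names, and nothing like this exists
in e2find. The index holds one entry per directory block, so it grows with
the size of the filesystem.

```python
def build_block_index(dir_blocks):
    """Map each directory data block to the inode of its directory.

    dir_blocks: iterable of (directory_inode, block_number) pairs
    (an assumed output of the traversal, for illustration only).
    """
    index = {}
    for dir_inode, block_nr in dir_blocks:
        index[block_nr] = dir_inode
    return index


def affected_directories(index, journaled_blocks):
    """Return the directory inodes whose blocks appear in the journal."""
    return {index[b] for b in journaled_blocks if b in index}
```

With such an index, only the directories returned by
`affected_directories()` would need to be re-read after a journal check.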