I was asked by an XFS developer to look at the feasibility
of using posix_fallocate(3) for /var/lib/rpm Berkeley DB files.
The motivation is apparently that /var/lib/rpm files have
many extents even after a fresh install using rpm.
E.g. here's the number of disk extents I see:
$ sudo filefrag /var/lib/rpm/Packages
/var/lib/rpm/Packages: 1474 extents found, perfection would be 1
What remains an open question is whether having multiple extents
actually matters. Linus thinks (I'm told) that having disk extents is
an issue. Note "thinks": I'm unable to find any credible measurement
of Berkeley DB performance with (or without) fragmentation.
But it's likely feasible to use posix_fallocate(3) with --rebuilddb.
Using posix_fallocate(3) doesn't necessarily guarantee fewer extents
(although that is likely the case for many file system types,
e.g. extent-based file systems); it only reserves disk space.
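For concreteness, here is a minimal sketch of what preallocating a file
with posix_fallocate(3) looks like. This is not rpm or Berkeley DB code:
preallocate_file() is a hypothetical helper name, and the size to reserve
is an assumption the caller would have to supply. It only reserves the
space; whether that yields fewer extents is still up to the file system.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of rpm or Berkeley DB): reserve `len`
 * bytes of disk space for `path`, creating the file if it does not exist.
 * Returns 0 on success, or an errno-style error number on failure.
 */
int preallocate_file(const char *path, off_t len)
{
    assert(path != NULL && len > 0);

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return errno;

    /*
     * posix_fallocate() returns the error number directly and does not
     * set errno. On success the range [0, len) is backed by allocated
     * blocks, and the file size is extended to at least `len`.
     */
    int rc = posix_fallocate(fd, 0, len);

    if (close(fd) != 0 && rc == 0)
        rc = errno;
    return rc;
}
```

Running filefrag on the file before and after such a call would be one
way to see whether the extent count actually changes on a given file
system.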
Berkeley DB (as used by rpm) also uses the mpool cache. Statistics
on the mpool cache are available by running
sudo rpmdb_stat -m
When I run this command, I'm seeing cache hit rates of 66% -> 96%,
average ~80%, hinting that extents (and file system and disk layout) of
the backing file system data store perhaps don't matter much at all.
However, that's a guess; I'd rather see an objective measurement.
If someone can demonstrate an improvement in rpmdb performance
with fewer disk extents, I'll attempt to add posix_fallocate(3) to --
and run objective benchmarks.
Otherwise, the change hardly seems worth the effort afaict from reading
about Linux fallocate(2) and ext file system fragmentation.
Any other opinions?
73 de Jeff
Received on Sun Sep 21 17:29:47 2008