RPM Community Forums

Mailing List Message of <rpm-devel>

R: Re: RPM DB, Berkley DB and journal files

From: <pinto.elia@gmail.com>
Date: Tue 28 Jun 2011 - 23:15:28 CEST
Message-ID: <4e0a447b.47c5e30a.4702.278e@mx.google.com>
Perhaps OT, but just a random thought of mine on this thread about rpm5 and Berkeley DB: most of what is discussed here is very similar, and sometimes identical, to similar questions about OpenLDAP. That may seem strange, but not so much if one thinks carefully about the rpm5 RPM ACID feature. Just as a reminder to myself, and to whoever is interested: the RPM ACID documentation, of which there is not much at the moment, also needs a tuning guide. Sounds like a plan? :-) Regards
----Original message----
From: Jeff Johnson
Sent: 28/06/2011, 20:38
To: rpm-devel@rpm5.org
Subject: Re: RPM DB, Berkley DB and journal files

On Jun 28, 2011, at 1:00 PM, Mark Hatle wrote:

> I was hoping someone can help me understand what is happening and how to address
> it for our embedded system needs.
> For reference:
> http://bugzilla.pokylinux.org/show_bug.cgi?id=1174
> from the above, the complaint is:
> When using zypper/rpm install/removal package, database files and log files
> cost a lot of disk space and may cause target image out of space.

Sure: that is true for all caching: there's a trade-off between
performance benefit and caching cost. You need to find your
balance point (most users don't perceive the performance win).

There are tunables to limit the number of files as well as both the total and per-file sizes.

Most of the tuning related to __db* size is in DB_CONFIG:

	set_cachesize          0 1048576 0

The last number is the number of cache regions (ncache), iirc. Note that I've _NEVER_
been able to capture a significant performance improvement
by changing cache size (but I haven't tried very hard). The
most significant performance gains come from having

	set_mp_mmapsize         268435456

and then no cache is needed whatsoever: all I/O is mmap'd straight
off the file system, if you can stand the memory mapping pressure
(the above is 256MB, from a design heuristic of 25% of a nominal 1GB host).
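
As a rough illustration of the arithmetic behind the two DB_CONFIG lines above, here is a minimal Python sketch. The helper names are hypothetical and the 25% fraction is simply the heuristic mentioned above; it only formats the values and does not touch Berkeley DB itself.

```python
# Sketch: derive the DB_CONFIG values discussed above from a host's RAM size.
# Helper names are illustrative assumptions, not part of rpm or Berkeley DB.

GIB = 1024 ** 3


def mmapsize_for(host_ram_bytes, fraction=0.25):
    """Return a set_mp_mmapsize value (bytes) as a fraction of host RAM."""
    return int(host_ram_bytes * fraction)


def cachesize_line(total_bytes, ncache=0):
    """Format a set_cachesize line as '<gbytes> <bytes> <ncache>'."""
    gbytes, rest = divmod(total_bytes, GIB)
    return f"set_cachesize {gbytes} {rest} {ncache}"


def mmapsize_line(nbytes):
    """Format a set_mp_mmapsize line; the argument is a size in bytes."""
    return f"set_mp_mmapsize {nbytes}"


if __name__ == "__main__":
    print(cachesize_line(1048576))               # the 1MB cache shown above
    print(mmapsize_line(mmapsize_for(1 * GIB)))  # 25% of a nominal 1GB host
```

On a smaller embedded target the same heuristic would give a proportionally smaller mmap size; whether 25% is the right fraction there is a judgment call.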

But the best possible answer for you is likely to move the __db* files
elsewhere, like /var/cache/rpm. The people who are complaining do not distinguish
that there are multiple types of items being stored, some important,
and some not, which just leads to Bloat! Bloat! Bloat! because
of the mixture, not anything else.

Note that moving forces re-thinking of permissions, and FHS, and
all the other "community" inputs and so I've chosen not to have that
meandering discussion that goes no place.

> The first time when we use RPM, a lot of database files will be created, larger
> than 10MB.

I don't know what "first time" means here. Presumably the issue
is that __db* files are created and populated, exactly how
caching is supposed to work.

Note that there are other uses than caching for __db* files, most notably
that locks are registered (and stale locks removed on next invocation),
and that __db* files are the means by which multiple processes
coordinate locking and the I (as in Isolation) of ACID, i.e. data visibility
between processes. I'd suggest that just leaving __db* files in place is
safer/saner than the extreme measures that Berkeley DB has to undertake
to avoid interprocess races on the file system when instantiating a dbenv.
YUM, e.g., chose to open-and-close a dbenv on every data access, adding
a huge overhead and Yet More Raciness, and was rather a disaster.

But the size of __db* is purely a caching related issue. Note also
that those files are likely sparse, so examine blocks allocated,
not the reported EOF offset, before undertaking to fix what is perhaps
a perceived problem that does not actually exist.
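
One way to compare the reported size with the blocks actually allocated is sketched below in Python. It assumes a POSIX system (where st_blocks is counted in 512-byte units); the function names and the default /var/lib/rpm path are assumptions for illustration, so pass your own database directory as the first argument.

```python
# Sketch: compare apparent size (EOF offset) with allocated size for __db* files.
# Assumes POSIX stat semantics; names and default path are illustrative only.
import glob
import os
import sys


def apparent_and_allocated(path):
    """Return (apparent_bytes, allocated_bytes) for one file."""
    st = os.stat(path)
    # st_blocks counts 512-byte units on POSIX systems.
    return st.st_size, st.st_blocks * 512


def report(dbdir):
    """Return [(name, apparent, allocated)] for every __db* file in dbdir."""
    rows = []
    for path in sorted(glob.glob(os.path.join(dbdir, "__db*"))):
        apparent, allocated = apparent_and_allocated(path)
        rows.append((os.path.basename(path), apparent, allocated))
    return rows


if __name__ == "__main__":
    dbdir = sys.argv[1] if len(sys.argv) > 1 else "/var/lib/rpm"  # assumed path
    for name, apparent, allocated in report(dbdir):
        print(f"{name}: apparent={apparent} allocated={allocated}")
```

If the allocated figure is much smaller than the apparent one, the file is sparse and is costing less disk than a plain directory listing suggests.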

Note that you have also missed the log/* files, which WILL be large, particularly
if you haven't enabled the flag to auto-remove unused logs ("unused" in the
"current" sense; logs are always relevant for historical serialization).

The flag to automate log clean up is
	set_flags              db_log_autoremove   on
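
To see whether log clean up is having any effect, a small Python sketch like the following can total the log files before and after a few transactions. It assumes the Berkeley DB naming convention of log.NNNNNNNNNN files inside the log directory; the function name and the default path are hypothetical.

```python
# Sketch: count and total the Berkeley DB log files in a log directory.
# Assumes files are named log.NNNNNNNNNN; name and default path are illustrative.
import glob
import os
import sys


def log_usage(logdir):
    """Return (number_of_log_files, total_bytes) for log.* files in logdir."""
    paths = sorted(glob.glob(os.path.join(logdir, "log.*")))
    total = sum(os.path.getsize(p) for p in paths)
    return len(paths), total


if __name__ == "__main__":
    logdir = sys.argv[1] if len(sys.argv) > 1 else "/var/lib/rpm/log"  # assumed
    count, total = log_usage(logdir)
    print(f"{count} log files, {total} bytes")
```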


73 de Jeff
> ---
> Each of the __db.* files created are large.  Nearly 10 MB each.  In addition in
> the "log" directory are additional files that are also nearly 10 MB each.
> Are there any suggestions for changing the Berkeley DB configuration, either the
> config file or RPM settings/usage, that would help me shrink down these sizes --
> or potentially help me eliminate the creation of these files?
> I'm currently using rpm-5.4.0 from rpm-5.4.0-0.20101229.src.rpm.
> --Mark
> ______________________________________________________________________
> RPM Package Manager                                    http://rpm5.org
> Developer Communication List                        rpm-devel@rpm5.org

Received on Tue Jun 28 23:15:43 2011