RPM Community Forums

Mailing List Message of <rpm-users>

Re: regrading Berkeley DB

From: Jeff Johnson <n3npq@mac.com>
Date: Mon 16 Nov 2009 - 20:47:26 CET
Message-id: <1C7AB2CE-2BEC-4362-909D-85208F7A9AC7@mac.com>

On Nov 16, 2009, at 9:46 AM, Jeff Johnson wrote:

> And one goal of the change is to rework an rpmdb so that SQL might become useful.

I should try to be more positive. Note that I haven't yet
started to write up publicly
	Transactionally Protected Package Management
which is driving the changes. The issue of "supported"
is largely a technicality: if you want sqlite3, well,
I can try to accommodate. But I cannot "support" code
that I do not use; the reasoning there should be obvious.

Attached is a SQL schema that more-or-less describes
what will happen to an rpmdb. Note that attempting to represent
a non-SQL Berkeley DB database in SQL has some issues. E.g.
there are quite a few SQL dialects around; the one in the attachment
is what is implemented in db-4.8.24 db_sql(1).

Second, I haven't bothered to represent the many-to-many
relations in SQL accurately. In fact, the indices marked TABLE
are also secondary INDEXes with duplicates onto Packages.

Third, the primary record is still a blob; the fields represent
the one-to-one relations to be able to use an INDEX keyword
but are otherwise a fiction. There is no limit on any NUL
terminated string as used in an rpmdb. An open issue is whether the
NUL should be included (or not) in rpmdb string keys.
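
To illustrate why the NUL question matters, here is a minimal sketch in Python. The helper make_key and the dict standing in for an index are hypothetical, not rpmdb code. It only shows that a key stored with a trailing NUL and one stored without are different byte strings, so writers and readers have to agree on one convention.

```python
def make_key(name: str, include_nul: bool) -> bytes:
    """Build an index key from a string, optionally keeping the C-style NUL.

    Hypothetical helper for illustration only.
    """
    raw = name.encode("utf-8")
    return raw + b"\x00" if include_nul else raw


# A stand-in for a secondary index whose keys were written WITH the NUL.
index = {make_key("bash", include_nul=True): [1]}

# A reader that omits the NUL builds a different (shorter) key and misses.
with_nul = make_key("bash", include_nul=True)
without_nul = make_key("bash", include_nul=False)
found_with = with_nul in index
found_without = without_nul in index
```

Whichever convention is chosen, it has to be applied at both insert and lookup time, since the key length is part of the key.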

And the goal is to get Header blobs out of an rpmdb, leaving
behind a path (or URL) to a *.rpm file from which a Header can be

If interested in RPM->sqlite3 (or any SQL), what is needed now
is to design the cascade that updates the secondary indices
when a package is added to the primary store.

That can be done (in sqlite3) with some trigger statements; there's
other schema syntax for other SQL implementations.
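
A minimal sketch of such a cascade, using Python's sqlite3 module. The schema here is an assumption made for illustration and is not the attached bdb.sql: the table names Packages and Name, the columns hnum, name, header, and key, and the helpers open_db and lookup are all hypothetical. It shows one secondary index kept in step with the primary store by an insert trigger and a delete trigger.

```python
import sqlite3

# Hypothetical, simplified schema (NOT the attached bdb.sql).
SCHEMA = """
CREATE TABLE Packages (
    hnum   INTEGER PRIMARY KEY,  -- package instance number
    name   TEXT NOT NULL,        -- one-to-one field exposed for indexing
    header BLOB                  -- primary record remains an opaque blob
);

CREATE TABLE Name (              -- secondary index, duplicates allowed
    key  TEXT NOT NULL,
    hnum INTEGER NOT NULL
);
CREATE INDEX Name_key ON Name(key);

-- Adding a package to the primary store updates the secondary index.
CREATE TRIGGER Packages_add AFTER INSERT ON Packages
BEGIN
    INSERT INTO Name(key, hnum) VALUES (NEW.name, NEW.hnum);
END;

-- Removing a package drops its secondary index entries.
CREATE TRIGGER Packages_del AFTER DELETE ON Packages
BEGIN
    DELETE FROM Name WHERE hnum = OLD.hnum;
END;
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def lookup(db: sqlite3.Connection, name: str) -> list:
    """Return the package instance numbers recorded under a name key."""
    rows = db.execute(
        "SELECT hnum FROM Name WHERE key = ? ORDER BY hnum", (name,)
    )
    return [row[0] for row in rows]
```

With this in place, inserting two rows named 'bash' into Packages makes lookup(db, 'bash') return both instance numbers, and deleting one of them removes its index entry. A real cascade would need one such trigger pair per secondary index, and the many-to-many indices would need more than a single column copied across.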

So the goal of changing is to be able to support SQL databases better,
with a reasonably well known SQL schema, not otherwise.

As for "supported", RPM itself needs only a single database implementation,
and my criterion is solely "performance"; reliability is assumed.

I personally believe that Berkeley DB will run rings around any
relational database, but show me the numbers, and I will instantly
switch to any other database.


73 de Jeff

  • application/octet-stream attachment: bdb.sql
  • application/pkcs7-signature attachment: smime.p7s
Received on Mon Nov 16 20:48:20 2009