RPM Community Forums

Mailing List Message of <rpm-devel>

Re: database support in rpm (Was: Re: RPM5 architectural decisions)

From: Jeff Johnson <n3npq@mac.com>
Date: Mon 30 Jul 2007 - 21:26:16 CEST
Message-Id: <FCCA6A12-1F5A-4E19-84CA-EBF1D0A06C8B@mac.com>

On Jul 30, 2007, at 2:58 PM, Thomas Lotterer wrote:

> Call for rpm5.org architectural decision.
>
> On Saturday, 28. July 2007 at 10:49 pm, "Thomas Lotterer" wrote:
>> I want to suggest we create and maintain a document describing
>> architectural decisions of rpm5.org. [...]
>> - one decision I'd like to see is whether rpm5.org
>>   is going to support BDB, SQLite, both, others etc.
>>
> Ralf and I like SQLite for various reasons not to be discussed here. We
> were delighted to see SQLite being available for use with RPM and wanted
> to use it for OpenPKG, which was at that time way behind using RPM
> 4.2.1. Now we are able to upgrade OpenPKG to the new RPM5. But we found
> out the SQLite implementation is very rudimentary from a technical
> point of view. The existing database abstraction inside RPM may be able
> to use any database as a key+value store, an approach that cannot unleash
> the power of any RDBMS and SQL. While posting several issues with SQLite
> we found out that the number of SQLite protagonists in the team is
> actually very, very small. In fact, many inquiries did not lead to
> discussions but more to database fights. Facing the tremendous work
> which would need to be done to leverage true SQL(ite) power with the
> available manpower to actually do it and also taking the acceptance of
> the existing team members into account, my conclusion, which could lead
> to our first architectural decision, is the following proposal:
>
> Issue:
> - database support in rpm
>
> Decision:
> - exclusively use Berkeley DB
>

I'd change "exclusively" to "primarily". Otherwise we have an external
feature regression.

> Reason:
> - marginal support for alternatives from rpm-team developers
> - key+value approach cannot unleash RDBMS power anyway
> - existing schema not suitable to create complex SQL queries
> - abstraction layer incomplete and focused on BDB features
> - no use case found which requires a BDB alternative
>

The engineering issues with respect to a SQL db should be addressed. No
matter what, a reference SQL schema for rpm package metadata has been
needed for years. I tried to trick Matthias into doing such a thing years
ago, but he's too smart and slippery ;-)

Candidates are the up2date server schema, older beehive code, or perhaps
something can be swiped from one of the Fedora tools.

Rolling a SQL schema from scratch is not impossible either.
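
As a rough illustration of what a from-scratch reference schema could look
like, here is a minimal sketch using Python's sqlite3 module. Every table,
column, and function name below is a hypothetical choice made for this
example; none of it is taken from rpm, the up2date server schema, or beehive.
The point is only that once package metadata is split into related tables,
queries such as "which package owns this file" become plain SQL joins rather
than key+value lookups.

```python
import sqlite3

# Hypothetical reference schema; names are illustrative assumptions only.
SCHEMA = """
CREATE TABLE packages (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    version TEXT NOT NULL,
    release TEXT NOT NULL,
    arch    TEXT NOT NULL,
    UNIQUE (name, version, release, arch)
);
CREATE TABLE files (
    package_id INTEGER NOT NULL REFERENCES packages(id),
    path       TEXT NOT NULL
);
CREATE TABLE requires (
    package_id INTEGER NOT NULL REFERENCES packages(id),
    capability TEXT NOT NULL
);
CREATE INDEX files_path ON files(path);
CREATE INDEX requires_capability ON requires(capability);
"""


def open_db(path=":memory:"):
    """Open a database and create the sketch schema in it."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_package(conn, name, version, release, arch, files=(), requires=()):
    """Insert one package row plus its file and dependency rows."""
    cur = conn.execute(
        "INSERT INTO packages (name, version, release, arch) VALUES (?, ?, ?, ?)",
        (name, version, release, arch),
    )
    pkg_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO files (package_id, path) VALUES (?, ?)",
        [(pkg_id, p) for p in files],
    )
    conn.executemany(
        "INSERT INTO requires (package_id, capability) VALUES (?, ?)",
        [(pkg_id, c) for c in requires],
    )
    return pkg_id


def owners_of(conn, path):
    """Names of packages that own the given file path."""
    rows = conn.execute(
        "SELECT p.name FROM packages p JOIN files f ON f.package_id = p.id "
        "WHERE f.path = ? ORDER BY p.name",
        (path,),
    )
    return [r[0] for r in rows]


def what_requires(conn, capability):
    """Names of packages that require the given capability."""
    rows = conn.execute(
        "SELECT p.name FROM packages p JOIN requires r ON r.package_id = p.id "
        "WHERE r.capability = ? ORDER BY p.name",
        (capability,),
    )
    return [r[0] for r in rows]
```

A real schema would need many more tags (provides, conflicts, epochs,
scriptlets, and so on); this sketch only shows the shape of the idea.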

> Consequences:
> - rip out other database support from concept, code, build, docs, ...
> - discontinue any attempts to establish a DB abstraction layer
>
> I prefer to have one DB supported well and with enthusiasm over
> fruitless discussions about improving incomplete concepts and
> maintaining zombie code which is impractical to use.
>

Alternatives to Berkeley DB for the 3-4 usage cases:
    1) licensing
    2) embedded and non-NPTL locking (although fcntl with BDB likely
       addresses this)
    3) NFS support
need to be identified. Licensing is the trickiest; Berkeley DB can do
2) and 3).

> Now comes the interesting part. How do we actually make a decision?
> Feedback to the actual topic and decision making much appreciated.
>

Dunno. But let's keep it simple and consensual please.

73 de Jeff
Received on Mon Jul 30 21:26:13 2007