Mailing List Message of <rpm-devel>

Header tag access optimizations

From: Jeff Johnson <n3npq@mac.com>
Date: Thu 01 Nov 2007 - 14:27:12 CET
Message-Id: <A615201D-AF8B-4747-9BD0-ED1D9B8EA3BA@mac.com>
Last night HEAD succeeded in performing a 380+ pkg upgrade.

Which means that headerGetExtension() et al are approaching
"stable". There are still some minor memory leaks,
and the RPM_I18NSTRING_TYPE tags (aka summary/description/group)
are likely a bit wonky, but almost all header accesses have now
been converted to use headerGetExtension().

The fundamental reason for changing the memory usage rules
for header tag retrieval was to uncouple header reference counts
from tag data. There were many tricky usage cases that forced
headers to be maintained in-memory so that
      h = headerFree(h)
did not free memory underneath dozens of retrieved tag items.
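
To make the coupling problem concrete, here is a minimal toy sketch in C.
It does not use the rpm library at all: struct toy_header and every toy_*
function are hypothetical names invented for illustration. The borrowed
getter returns a pointer into the header, so the header must stay alive
for as long as the pointer is used. The copying getter returns malloc'ed
memory that survives the final free of the header.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy header: one string tag plus a reference count. */
struct toy_header {
    int nrefs;
    char *name;                 /* tag data owned by the header */
};

struct toy_header *toy_header_new(const char *name)
{
    size_t n = strlen(name) + 1;
    struct toy_header *h = malloc(sizeof(*h));

    if (h == NULL)
        return NULL;
    h->name = malloc(n);
    if (h->name == NULL) {
        free(h);
        return NULL;
    }
    memcpy(h->name, name, n);
    h->nrefs = 1;
    return h;
}

/* Drop one reference. Always returns NULL, so callers can write
 * h = toy_header_free(h) in the style used above. */
struct toy_header *toy_header_free(struct toy_header *h)
{
    if (h != NULL && --h->nrefs <= 0) {
        free(h->name);
        free(h);
    }
    return NULL;
}

/* Coupled style: the result points into the header and dangles
 * once the last reference to the header is dropped. */
const char *toy_get_borrowed(const struct toy_header *h)
{
    assert(h != NULL);
    return h->name;
}

/* Uncoupled style: the result is a private copy that the caller frees. */
char *toy_get_copy(const struct toy_header *h)
{
    size_t n;
    char *p;

    assert(h != NULL);
    n = strlen(h->name) + 1;
    p = malloc(n);
    if (p != NULL)
        memcpy(p, h->name, n);
    return p;
}
```

With the copying getter a caller can retrieve the value, drop the header
right away, and keep using the value until it calls free() on it. With the
borrowed getter the same sequence would read freed memory.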

So it's time to figure out what the final API for retrieving header tag
data should look like.

Because the methods are in the API, but there are no symbols in the ABI,
there is a great deal of flexibility in choosing how tag data should
be retrieved, up to and including keeping exactly the old methods
coexisting with a parallel set of new methods.

ATM, HEAD has several experimental wrappings, mostly so that I could
double check that, indeed, rpmdb/hdrinline.h can be used to support
whatever API is necessary or desired.

Here's the old method
     xx = headerGetEntry(h, tag, &type, &data, &count);
and (likely) the new method
     xx = headerGetElement(h, he, 0);

Note that the real difference between the two methods is buried
in how the returned data needs to be freed:
OLD
     data = headerFreeData(data, type);
NEW
     data = _free(data);
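
A toy sketch of the two freeing rules, for anyone who wants to see the
difference spelled out. This is not rpm code. The toy_* names and the
enum are hypothetical, and the choice of which types count as
"malloc'ed" under the old rule is an assumption made for the example.
The point is only that the old rule makes the caller carry the type
around in order to free correctly, while the new rule is the same for
every retrieved value.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type codes standing in for the RPM_*_TYPE constants. */
enum toy_type { TOY_INT32, TOY_STRING, TOY_STRING_ARRAY, TOY_I18NSTRING };

/* Old-style rule (assumed for this sketch): only some types were
 * malloc'ed on retrieval, the rest point into the header itself. */
int toy_old_must_free(enum toy_type t)
{
    return t == TOY_STRING_ARRAY || t == TOY_I18NSTRING;
}

/* Old-style free: needs the type to decide. Always returns NULL. */
void *toy_old_free_data(void *data, enum toy_type t)
{
    if (data != NULL && toy_old_must_free(t))
        free(data);
    return NULL;
}

/* New-style free: everything retrieved is malloc'ed, so always free. */
void *toy_free(void *p)
{
    free(p);                    /* free(NULL) is a no-op */
    return NULL;
}

/* Call pattern for both rules. */
void toy_usage(void)
{
    int in_header = 42;         /* stands in for data inside the header */
    char *copy = malloc(16);    /* stands in for a malloc'ed tag value */
    void *p;

    /* OLD: the type decides, so this pointer is left alone. */
    p = toy_old_free_data(&in_header, TOY_INT32);
    assert(p == NULL && in_header == 42);

    /* NEW: one rule for every retrieved value. */
    copy = toy_free(copy);
    assert(copy == NULL);
}
```

Passing the wrong type to the old-style free either leaks or frees memory
that belongs to the header, which is the kind of mistake the single rule
avoids.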

I personally think that continuing the legacy header API into the murky
future is rather unnecessary because the (or my) goal is to use *.xar
extensibly for saving tag data, and so the old methods simply will
not matter. xar comes with its own retrieval methods that need to
be accommodated by devising a means to retrieve tag elements using
not only a RPMTAG_FOO integer, but also a string as a retrieval key.
Retrieval by string, not tag number, is necessary for finishing support
for arbitrary tags (and arbitrary types) in headers no matter what.
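
One way retrieval by either key could look, as a self-contained toy
sketch. Nothing here is rpm or xar API: struct toy_he is only loosely
inspired by the idea of a tag container, and toy_get(), the entry table,
and the tag numbers are all hypothetical values chosen for illustration.
The caller fills in either a name or a tag number, and the lookup fills
in the rest.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stored entry. tag == 0 means "arbitrary tag, no number". */
struct toy_entry {
    int tag;
    const char *name;
    const char *value;
};

/* Hypothetical request/result container: set name or tag before the call. */
struct toy_he {
    int tag;
    const char *name;
    const char *value;
};

/* Illustrative contents only. */
static const struct toy_entry toy_entries[] = {
    { 1000, "Name",        "bash" },
    { 1001, "Version",     "3.2" },
    { 0,    "Vendorextra", "arbitrary" },
};

/* Look up by name when one is given, otherwise by nonzero tag number.
 * Returns 1 and fills in *he on success, 0 if nothing matches. */
int toy_get(struct toy_he *he)
{
    size_t i;

    assert(he != NULL);
    for (i = 0; i < sizeof(toy_entries) / sizeof(toy_entries[0]); i++) {
        const struct toy_entry *e = &toy_entries[i];
        int hit;

        if (he->name != NULL)
            hit = (strcmp(he->name, e->name) == 0);
        else
            hit = (he->tag != 0 && he->tag == e->tag);

        if (hit) {
            he->tag = e->tag;
            he->name = e->name;
            he->value = e->value;
            return 1;
        }
    }
    return 0;
}
```

An entry without a tag number can only be reached through its name in
this sketch, which is the case that integer-only retrieval cannot cover.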

Any other opinions?

If nothing else is heard, I'll likely finish up the header API using
a HE_t tag container everywhere, removing all occurrences of the
previous API, and then eliminate the existing API entirely in ~2-4 weeks
before rpm-5.0 release.

73 de Jeff
Received on Thu Nov 1 14:27:20 2007