On Aug 24, 2008, at 10:27 AM, Alexey Tourbin wrote:
>
> This code is subject to infinite loop.
> Consider how it is called:
>
> mi = rpmdbInitIterator(db, RPMDBI_PACKAGES, hdrNum, sizeof(hdrNum));
> h = rpmdbNextIterator(mi);
>
BTW, I promised a forward looking rpmdb development plan but never got
a chance to write down what I think needs to be done.
The rpmdb iterator is perhaps one of the better APIs in rpm, certainly
one of the most heavily used by applications.
The paradigm for using the rpmdb iterator is:
    1) mi = creator()
    2) xx = modifier(mi)                  # set parameters, like RE's
    3) while ((h = next(mi)) != NULL)     # retrieve next header
    4)     do whatever with h or mi
    5) destructor(mi)
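To make the five steps concrete, here is a minimal self-contained sketch of the same create/modify/next/destroy shape. Everything in it (toy_header, toy_iter and the toy_* functions) is a hypothetical stand-in invented for illustration, not rpm's types or API; the "modifier" is just a name-prefix filter standing in for the RE parameters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only: NOT rpm's types or functions. */
typedef struct { const char *name; } toy_header;

typedef struct {
    const toy_header *items;  /* backing store being iterated */
    size_t count;             /* number of items in the store */
    size_t pos;               /* index of the next item to examine */
    const char *prefix;       /* optional filter; NULL matches everything */
} toy_iter;

/* 1) creator: bind an iterator to a store */
toy_iter toy_iter_init(const toy_header *items, size_t count)
{
    toy_iter mi = { items, count, 0, NULL };
    return mi;
}

/* 2) modifier: set a parameter that restricts what next() returns */
int toy_iter_set_prefix(toy_iter *mi, const char *prefix)
{
    mi->prefix = prefix;
    return 0;
}

/* 3) next: return the next matching item, or NULL when exhausted.
 * pos advances on every examined item, so the loop always terminates. */
const toy_header *toy_iter_next(toy_iter *mi)
{
    while (mi->pos < mi->count) {
        const toy_header *h = &mi->items[mi->pos++];
        if (mi->prefix == NULL ||
            strncmp(h->name, mi->prefix, strlen(mi->prefix)) == 0)
            return h;
    }
    return NULL;
}

/* 5) destructor: nothing is heap-allocated here, so just reset the state */
void toy_iter_free(toy_iter *mi)
{
    memset(mi, 0, sizeof(*mi));
}

/* Steps 1-5 put together: count the items whose name starts with prefix. */
size_t toy_count_matches(const toy_header *items, size_t count,
                         const char *prefix)
{
    toy_iter mi = toy_iter_init(items, count);        /* 1) */
    size_t n = 0;

    (void) toy_iter_set_prefix(&mi, prefix);          /* 2) */
    while (toy_iter_next(&mi) != NULL)                /* 3) */
        n++;                                          /* 4) */
    toy_iter_free(&mi);                               /* 5) */
    return n;
}
```

The caller only ever sees what next() hands back, which is why the return type of next() matters so much for everything that follows.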
The fundamental design flaw is the Header return type of next():
    Header rpmdbNextIterator();
Consider what is needed to do a better job with SQL tables. Most SQL
access is based on "rows" and "columns", not secondary lookup
of an indexed object (in this case, a Header h).
So somehow, a next() iterator that returns something that is more "row" like
is necessary to permit better integration with SQL databases. The
"column" like abstraction currently implemented is headerGet(),
which needs to be generalized from a Header h getter() method to
a more general getter() based on a container other than a Header h.
==> Best I've been able to think of so far to solve the issues I just
mentioned is to return a tree of arbitrary information from the underlying
storage through the iteration. If up to me, I'd likely use a HE_t (or
similar, tree node data structures are pretty easy to devise) tree node
because that is already widely used throughout rpm. Note the last line in the
rpmTagData union that could permit a HE_t to be used as a "tree node"
data structure:
union rpmDataType_u {
/*@null@*/
    void * ptr;
    rpmuint8_t * ui8p;          /*!< RPM_UINT8_TYPE | RPM_CHAR_TYPE */
    rpmuint16_t * ui16p;        /*!< RPM_UINT16_TYPE */
    rpmuint32_t * ui32p;        /*!< RPM_UINT32_TYPE */
    rpmuint64_t * ui64p;        /*!< RPM_UINT64_TYPE */
/*@relnull@*/
    const char * str;           /*!< RPM_STRING_TYPE */
    unsigned char * blob;       /*!< RPM_BIN_TYPE */
    const char ** argv;         /*!< RPM_STRING_ARRAY_TYPE */
    HE_t he;
};
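As a rough sketch of what "tree node" could mean here: a tagged node whose data union has one member pointing at child nodes, so a "row" is just a node whose children are the "columns". The toy_* names and the three-kind enum below are assumptions made up for this illustration; they are not HE_t, and they do not reflect how rpm lays out its data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tree node for illustration only: NOT rpm's HE_t. */
typedef struct toy_node toy_node;

enum toy_kind { TOY_UINT32, TOY_STRING, TOY_TREE };

union toy_data {
    const uint32_t *ui32p;      /* TOY_UINT32 */
    const char *str;            /* TOY_STRING */
    const toy_node *children;   /* TOY_TREE: plays the role of "HE_t he" */
};

struct toy_node {
    const char *tag;            /* name of this node ("column" name) */
    enum toy_kind kind;         /* which union member is valid */
    size_t count;               /* number of elements behind p */
    union toy_data p;
};

/* Build a leaf holding one string. */
toy_node toy_string_leaf(const char *tag, const char *str)
{
    toy_node n;
    n.tag = tag;
    n.kind = TOY_STRING;
    n.count = 1;
    n.p.str = str;
    return n;
}

/* Build an interior node over an array of children. */
toy_node toy_branch(const char *tag, const toy_node *children, size_t count)
{
    toy_node n;
    n.tag = tag;
    n.kind = TOY_TREE;
    n.count = count;
    n.p.children = children;
    return n;
}

/* Generic getter: depth-first search for the first node with this tag. */
const toy_node *toy_find(const toy_node *n, const char *tag)
{
    if (strcmp(n->tag, tag) == 0)
        return n;
    if (n->kind == TOY_TREE) {
        for (size_t i = 0; i < n->count; i++) {
            const toy_node *hit = toy_find(&n->p.children[i], tag);
            if (hit != NULL)
                return hit;
        }
    }
    return NULL;
}

/* Count the leaves (non-tree nodes) below and including n. */
size_t toy_count_leaves(const toy_node *n)
{
    size_t total = 0;

    if (n->kind != TOY_TREE)
        return 1;
    for (size_t i = 0; i < n->count; i++)
        total += toy_count_leaves(&n->p.children[i]);
    return total;
}
```

With something of this shape, next() could return a "row" node and the getter would work on that container rather than on a Header.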
I'd like to see a tree data structure within header tag data too, but
that is (perhaps) a different issue (and different implementation) than
returning a tree data structure from a rpmdb iteration.
Designing getter methods for retrieval of arbitrary information stored in
an underlying store should be done like libgcrypt does with RFC2440/4880
packets. I'd personally use YAML instead of XML, but what is important
is that an extensible metalanguage for retrieval needs to be put into place;
otherwise the number of needed getter methods in the ABI would cause its
own speshul pain.
Another fundamental design flaw is that rpmdb is doing joins (actually
secondary lookups) without assistance from the underlying db
implementation.
Having rpm do the "join" operation does have the benefit that porting
to any db implementation is rather easy. Only the dbi methods
    open/close
    get/put/del
    cursor open/close
need to be written.
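A small sketch of why that is easy to port: if the backend only has to supply a handful of key/value methods, the "join" (secondary lookup) lives entirely in the caller. The toy_store type, the toy_dbi_ops table and the in-memory backend below are hypothetical and are not rpm's dbi API; open/close and cursor methods are left out to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX 8

/* Hypothetical backend interface for illustration only: NOT rpm's dbi API. */
typedef struct toy_store toy_store;

typedef struct {
    const char *(*get)(const toy_store *s, const char *key);
    int (*put)(toy_store *s, const char *key, const char *val);
    int (*del)(toy_store *s, const char *key);
} toy_dbi_ops;

struct toy_store {
    const toy_dbi_ops *ops;     /* the only thing a new backend must supply */
    const char *keys[TOY_MAX];
    const char *vals[TOY_MAX];
    size_t used;
};

/* Trivial in-memory backend: linear search over a fixed array. */
static int mem_index(const toy_store *s, const char *key)
{
    for (size_t i = 0; i < s->used; i++)
        if (strcmp(s->keys[i], key) == 0)
            return (int) i;
    return -1;
}

static const char *mem_get(const toy_store *s, const char *key)
{
    int i = mem_index(s, key);
    return i < 0 ? NULL : s->vals[i];
}

static int mem_put(toy_store *s, const char *key, const char *val)
{
    int i = mem_index(s, key);
    if (i >= 0) {               /* overwrite an existing key */
        s->vals[i] = val;
        return 0;
    }
    if (s->used == TOY_MAX)     /* store is full */
        return -1;
    s->keys[s->used] = key;
    s->vals[s->used] = val;
    s->used++;
    return 0;
}

static int mem_del(toy_store *s, const char *key)
{
    int i = mem_index(s, key);
    if (i < 0)
        return -1;
    s->used--;                  /* move the last entry into the hole */
    s->keys[i] = s->keys[s->used];
    s->vals[i] = s->vals[s->used];
    return 0;
}

const toy_dbi_ops toy_mem_ops = { mem_get, mem_put, mem_del };

void toy_store_init(toy_store *s)
{
    memset(s, 0, sizeof(*s));
    s->ops = &toy_mem_ops;
}

/* The caller-side "join": the index maps a name to a primary key, and the
 * primary store maps that key to the record. The backend never sees the
 * relationship between the two stores. */
const char *toy_lookup_by_name(const toy_store *index,
                               const toy_store *packages, const char *name)
{
    const char *hdrnum = index->ops->get(index, name);
    return hdrnum != NULL ? packages->ops->get(packages, hdrnum) : NULL;
}
```

The flip side is visible in toy_lookup_by_name(): the backend cannot optimize a lookup it never sees as a single operation.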
(aside) The sqlite3 implementation was a proof-of-concept that porting
to a different db implementation is feasible with the existing join.
However, most SQL implementations have prepare()/exec() methods
which don't fit cleanly into the dbi API methods atm. Sure, additional
dbiPrepare() and dbiExec() methods could be added, but that just
complicates the iteration access, which then has two mutually exclusive
techniques for retrieving information, and my guess is that the list
of supported access would rapidly grow beyond just Berkeley DB and
sqlite3 et al in order to support alternative, possibly remote, data
stores through the rpmdb match iterator API.
Those are the most positive comments I have wrt rpmdb development.
I can go through the negative ("Gotta fix the hacks!") issues whenever you
wish, just holler. RPMTAG_BASENAMES and RPMDB_LABEL retrievals
(as you likely know by now) are excruciatingly complex.
hth
73 de Jeff
Received on Sun Aug 24 18:16:50 2008