On Aug 24, 2008, at 12:15 PM, Jeff Johnson wrote:
>
> Another fundamental design flaw is that rpmdb is doing joins (actually
> secondary lookup's) without assistance from the underlying db
> implementation.
>
>
While I'm on the subject ...
One of the consequences of doing manual "joins" between primary
lookups and 2ndary retrieval is that there is a RMW cycle introduced
that is likely the major performance hit doing rpmdbAdd().
(aside) rpmdbAdd i mebbe 10-15% of total install time at worst,
so divide any measured savings by (at least) 10. Note also that
the real flaw in rpm is that redundant data is repeatedly looked up.
Removing the redundant lookups would have a far larger
saving than anything done to rpmdb/rpmdb.c, caveat emptor.
That's likely an obscure statement, so let me illustrate with an
example.
Consider how
Requires: libc.so.6
is added to /var/lib/rpm/Requirename.
The string "libc.so.6" is used as a primary retrieval key.
The set of header instances is retrieved.
A new header instance integer is added to the array.
The array of header instances is qsort'ed.
The array is rewritten to disk (with O_SYNC or equivalent).
Lather rinse repeat. That's what I'm calling the "RMW cycle".
(aside) mergesort, rather than qucksort, should be attempted,
resorting an already sorted array is "worst case" quicksort
performance. I've looked, but just never gotten there, qsort()
in glibc already has tweaks to avoid the lose.
Permitting duplicate primary keys for "libc.so.6" breaks the
RMW cycle. E.g. the flow I outlined previously becomes
dbiPut is called with "libc.so.6" as primary key and header
instance value.
However, all the retrieves then need to handle possibly duplicate
primary keys.
The rpmdb/rpmdb.c code is maybe 60-70% complete to handle duplicate
primary keys in indices, thereby breaking the "RMW cycle" in rpmdbAdd().
If/when completed, all the silly and stoopid goopiness to avoid the
consequences
of O_SYNC used to ensure that the data in an rpmdb ends up on disk
likely
becomes superfluous.
I hope the above makes sense. If not, holler and I'll try to explain
better.
73 de Jeff
Received on Sun Aug 24 18:48:57 2008