On Dec 14, 2007, at 10:17 AM, Ralf S. Engelschall wrote:
> On Fri, Dec 14, 2007, Jeff Johnson wrote:
>
>> @@ -170,4 +170,8 @@
>> script kiddie progress bars across the screen.
>> - jbj: arbitrary "%foo -p /bar" scriptlets as pair'ed
>> RPMTAG_{FOO,FOOPROG}.
>> - jbj: arbitrary triggers, like scriptlets, but with a
>> condition check too.
>> + - jbj: the hash *ahem* algorithm at rpmio/rpmhash.c:77
>> hashFunctionString()
>> + is pathetic. Ditto, rpmdb/fprint.c:186 fpHashFunction(). FYI, the
>> + fingerprint hash is in the top 10 pigs when profiling rpm
>> installs, so
>> + better has immediate performance benefits. rpmio/lookup3.c ==
>> better.
>
> Question: what is the purpose of hashFunctionString()'s result:
> to be used for mostly uniquely identify the input (aka a more
> cryptographically strong message digest) or just to be used for filing
> the input in a hash table (aka just a standard hashing function). If
> the first is the intention, then the implementation is semantically
> bad, but a better implementation will be even much slower, of course.
> If the second is the intention, we can improve it: a few years ago
> I've
> measured lots of hash functions and have one of the fastest and
> best (in
> terms of equal distribution) ones at hand...
>
Uniformly distributed hash retrieval, not crypto quality, is the need.
The fpHashFunction() is far more performance critical, and
the SuSE patch that adds some uniqification to rpmdb
join keys is merely OK, better is gonna be needed some day.
Note that fingerprint lookup subtly guarantees immutable
items, so only the address needs to be compared by FP_EQUAL.
E.g. the SuSE uniqification added to rpmdb join keys forces --rebuilddb
when flipping between rpm versions; otherwise (older) rpm instantly
segfaults.
I'd be interested in seeing how lookup3.c compares,. lookup3 claims
(credibly imho) to be cache line optimized on 32/64 bit ix86. That was
the best I could find scrounging around abt a year or so ago.
No matter what the defaults currently used by rpmhash are pathetic
hacks of mine before I knew what I was really doing ...
73 de Jeff
Received on Fri Dec 14 16:42:38 2007