RPM Community Forums

Mailing List Message of <rpm-devel>

Re: Implementing EVR comparisons using *RE's ?

From: Jeff Johnson <n3npq@mac.com>
Date: Fri 08 May 2009 - 20:55:22 CEST
Message-id: <D37E21B8-E8A7-46DA-8D82-3104088B87AA@mac.com>

On May 8, 2009, at 1:40 PM, Ralf S. Engelschall wrote:

> On Fri, May 08, 2009, Jeff Johnson wrote:
>
>> On May 8, 2009, at 12:29 PM, Jeff Johnson wrote:
>>>
>>> RPM versions are inequalities represented as half-planes, not points.
>>
>> "half-planes" is of course neither precise nor correct. But
>> RPM EVR inequalities do not always have measure == 0 like
>> points do.
>
> Can you be a little bit more specific here, Jeff, please? Because I'm
> not sure I understand why the result of -1/0/+1 is so unreasonable.
>

I can try, but the short answer is I'm interested in generalizations
that you don't care about.

I certainly did not mean to imply "unreasonable".

Given precise definitions for {E,V,R} (such as permitted characters)
and a comparison operation, then distance metric comparison properties
like {LT,EQ,GT} are well-defined and easily understood.

And one can add an additional layer to convert the primitives of
{LT,EQ,GT} into boolean dependency assertions of TRUE or FALSE
for any inequality like
	Requires: N >= E:V-R
How to do that is taught in elementary school.
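
A minimal sketch of that extra layer, in Python. It assumes the
comparison has already produced -1/0/+1 for "installed EVR vs. required
EVR"; the names _SENSE and assertion_holds are made up for illustration
and are not rpm API.

```python
# Hypothetical helper: turn a three-way comparison result into TRUE/FALSE
# for a dependency operator. cmp_result is -1 (LT), 0 (EQ) or +1 (GT).
_SENSE = {
    "<": lambda c: c < 0,
    "<=": lambda c: c <= 0,
    "=": lambda c: c == 0,
    ">=": lambda c: c >= 0,
    ">": lambda c: c > 0,
}


def assertion_holds(cmp_result, op):
    """Return True if a dependency with operator `op` is satisfied."""
    return _SENSE[op](cmp_result)


# "Requires: N >= E:V-R" with an installed EVR that compares as GT:
print(assertion_holds(+1, ">="))
```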

And one can further decompose the definitions for {E,V,R} into
deeper primitives, like alpha/digit/other strings, and extract
using *RE patterns etc etc etc.
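
For illustration only, here is roughly what that decomposition looks
like with a regular expression. This is a simplified sketch, not a
reimplementation of rpmvercmp: it ignores separators entirely, treats a
numeric segment as newer than an alpha one, and lets the string with
more segments win a tie. The function names are hypothetical.

```python
import re

# One segment is either a run of digits or a run of ASCII letters;
# everything else ("other") is treated as a separator and dropped.
_SEGMENT = re.compile(r"[0-9]+|[A-Za-z]+")


def segments(s):
    """Split a version or release string into alpha/digit segments."""
    return _SEGMENT.findall(s)


def compare_segments(a, b):
    """Return -1, 0 or +1 by comparing segments pairwise."""
    sa, sb = segments(a), segments(b)
    for x, y in zip(sa, sb):
        if x.isdigit() and y.isdigit():
            xi, yi = int(x), int(y)
            if xi != yi:
                return -1 if xi < yi else 1
        elif x.isdigit() != y.isdigit():
            # Assumption of this sketch: digits beat letters.
            return 1 if x.isdigit() else -1
        elif x != y:
            return -1 if x < y else 1
    # All shared segments are equal: more segments counts as newer.
    return (len(sa) > len(sb)) - (len(sa) < len(sb))


print(segments("3.10.2"))
print(compare_segments("3.10.2", "3.9.0"))
```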

Nothing at all unreasonable with the above, just unnecessarily complex imho.

The goal of depsolvers is to assign TRUE/FALSE values to assertions.

There's literally no need for a distance metric, or permitted characters,
or anything else in order to evaluate an assertion.

In fact there are already EVR usage cases that are widely deployed that
do _NOT_ have any useful distance metric.

Here's some on my F10 box:

	$ rpm -q --provides ocaml | more
	ocaml(compiler) = 3.10.2
	ocaml(Ocamlbuild) = df8d0c74d80342ca6057bad41bde8971
	ocaml(Ocamlbuild_executor) = 846552307267a7beccbeafa1f378a030
	ocaml(Ocamlbuild_pack) = 70dd242c6e6bb93e89d226308888f9ba
	ocaml(Ocamlbuild_plugin) = 14eaca3963ed1f73c1da0680370a802c
	ocaml(Ocamlbuild_unix_plugin) = 91f524a8cc2f4e0cd69f3ef83c774116

There's a bunch of digests being pushed into EVR strings to track
kernel ABI too. There's the whole class of probe dependencies,
and also
	Requires: N = %%{_someEVRmacro}
that have already been implemented for several years.

(aside)
One might argue that probe dependencies and macros are utterly useless
because they can only be evaluated in situ, not a priori. And I certainly
would not disagree. OTOH, I do believe that
	Requires: signature(/path/to/file) = yaddayadda
and
	Requires: diskspace(/tmp) > 1Mb
most definitely are useful to RPM and packaging in spite of the
lack of any ability for a priori evaluation. But I digress ...

While those ocaml dependencies are only ever compared for equality, a
degenerate case that works just fine with a {LT,EQ,GT} distance metric
comparison (although there are certainly more efficient evaluations than
using *RE's to parse out primitive alpha/digit/other classes from what
is most definitely random gibberish), well, I think you can guess
where I'm heading now.

(aside)
I've only pointed out that a distance metric comparison is
mostly unnecessary for a depsolver assertion checker. The
other use of dependencies, ordering package installs, has
a similar analysis once one has an isaPrerequisite? primitive.
I.e. node identifiers != edge connection points in a graph.
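
A sketch of what ordering looks like when the only thing available is a
boolean prerequisite primitive. It uses the standard library's
graphlib.TopologicalSorter (Python 3.9+). The package names and the
edge set are invented for the example, and is_prerequisite stands in
for whatever rpm would actually consult.

```python
from graphlib import TopologicalSorter


def install_order(packages, is_prerequisite):
    """Order packages using only a boolean is_prerequisite(a, b) test,
    meaning "a must be installed before b". No versions are compared."""
    ts = TopologicalSorter()
    for p in packages:
        preds = [q for q in packages if q != p and is_prerequisite(q, p)]
        ts.add(p, *preds)
    return list(ts.static_order())


# Hypothetical edges: (a, b) means a is a prerequisite of b.
edges = {("glibc", "ocaml"), ("ocaml", "ocaml-findlib")}
order = install_order(
    ["ocaml-findlib", "ocaml", "glibc"],
    lambda a, b: (a, b) in edges,
)
print(order)
```

Nothing in install_order looks inside a package name or an EVR string;
the graph is built purely from TRUE/FALSE answers.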

> From my personal point of view, version identifiers (I intentionally
> avoid the term "number" here, as those strings are not really numbers)
> are text-representations of _points_ in time on particular product
> _branches_. Comparing two versions means to decide whether they are
> equal or which one is the _successor_ of the other -- while _successor_
> here is "defined" based on the particular product evolution process and
> the used branching scheme.
>

Sure, nothing at all wrong or unreasonable with that POV.

I do point out that modern distributed VCS systems have
largely dispensed with "identifiers" and use hashes
to connect up version control system graphs. What
is lost is the ability to just look at the "identifier"
and (from a deep understanding of how the identifier
is constructed) be able to know where the node fits
into a complicated graph structure. But it's pretty
easy to replace that functionality with tools that
walk the graph, displaying node "identifiers" without
the tyranny of committing to a node identifier
representation schema.

Another way of saying:

	Identifying a successor doesn't need to know anything about versions.
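
A small sketch of that idea, assuming a graph whose nodes are opaque
hash-like identifiers, each mapped to its parent identifiers. The
identifiers below are made up, and is_successor is a hypothetical name.

```python
def is_successor(parents, candidate, ancestor):
    """Return True if `ancestor` is reachable from `candidate` by
    following parent links. Identifiers are never parsed or ordered."""
    seen = set()
    stack = [candidate]
    while stack:
        node = stack.pop()
        for p in parents.get(node, ()):
            if p == ancestor:
                return True
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return False


# Invented identifiers: each node maps to the list of its parents.
parents = {
    "9f2c": ["41ab"],
    "41ab": ["e07d"],
    "e07d": [],
}
print(is_successor(parents, "9f2c", "e07d"))
print(is_successor(parents, "e07d", "9f2c"))
```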

> This means that if the branching scheme is well known and the
> text-representations of versions in this scheme are precisely defined
> (as in: "N.M.X correspond to the trunk at the N-th generation, there is
> a branch N.M forked off the trunk and a point X on this N.M branch",
> etc) then one can do a reasonable version comparison (even across
> branches) with the results -1/0/+1. If the scheme is not known or the
> text-representations of versions less precisely used, I agree: one can
> only return true/false.
>

Ultimately my concern is solely what should be implemented in RPM.

I'd rather see a pluggable "black box" (*rpmvercmp) hidden
under more useful assertion/ordering boolean primitives, than
exposing all the bleeping baggage of parsing out
alpha/digit/other segments into tuples for "easily used"
"white box" assertion checkers written outside of rpm.
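
To make the "black box" shape concrete, a hedged sketch: callers see
only a boolean satisfies(), and the comparator is a swappable callable.
AssertionChecker, dotted_int_cmp and opaque_cmp are all hypothetical
names for this example; none of this is rpm's actual interface.

```python
class AssertionChecker:
    """Boolean front end over a pluggable three-way comparator."""

    _SENSE = {
        "<": lambda c: c < 0,
        "<=": lambda c: c <= 0,
        "=": lambda c: c == 0,
        ">=": lambda c: c >= 0,
        ">": lambda c: c > 0,
    }

    def __init__(self, vercmp):
        # vercmp(a, b) -> negative, zero or positive; callers never see it.
        self._vercmp = vercmp

    def satisfies(self, have, op, want):
        return self._SENSE[op](self._vercmp(have, want))


def dotted_int_cmp(a, b):
    """Toy comparator for purely numeric dotted versions."""
    ta = tuple(int(x) for x in a.split("."))
    tb = tuple(int(x) for x in b.split("."))
    return (ta > tb) - (ta < tb)


def opaque_cmp(a, b):
    """Comparator for digests: only "=" is meaningful with this one."""
    return 0 if a == b else 1


numeric = AssertionChecker(dotted_int_cmp)
digest = AssertionChecker(opaque_cmp)
print(numeric.satisfies("3.10.2", ">=", "3.9"))
print(digest.satisfies("df8d0c74d80342ca6057bad41bde8971", "=",
                       "df8d0c74d80342ca6057bad41bde8971"))
```

Swapping the comparator changes nothing for the caller, which is the
point of keeping the segment parsing out of sight.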

When all is said and done, all that package management needs and
cares about is "works" and "reliable".

I hope that clarifies.

Now how should dependency ranges be implemented? ;-)

73 de Jeff
Received on Fri May 8 20:56:30 2009