RPM Community Forums

Mailing List Message of <rpm-devel>

Re: Implementing EVR comparisons using *RE's ?

From: Jeff Johnson <n3npq@mac.com>
Date: Fri 08 May 2009 - 18:29:33 CEST
Message-id: <A366E08E-8300-4877-AA0A-0618E72D0005@mac.com>

On May 8, 2009, at 4:21 AM, Ralf S. Engelschall wrote:

> On Wed, May 06, 2009, Jeff Johnson wrote:
>
>> While muddling through how to add rpmvercmp to JavaScript,
>> I find myself thinking Yet Again about how silly and feeble
>> and fragile and deficient rpmvercmp actually is.
>>
>> So I ask the question:
>>  Can *RE patterns that match newer but not older
>>  packages be devised?
>>
>> The general answer is boring: No, of course not.
>
> Yes. RE's are good for _parsing_ individual versions into their parts,
> but _comparing_ two versions should be done by comparing the parts
> theirself after an RE parsing step. And this comparison is  
> indepenent of
> the RE parsing step itself.
>
> So, IMHO what RPM should do is this:
>
> 1. There should be a PCRE-based version parsing macros defined
>   similar in spirit to what "__VER__" expands to in vcheck(1) but  
> which
>   has the focus on parsing a version into its _parts_:
>
>   %__rpm_version_part_parse    (\d+|[^\d]+)
>   %__rpm_version_part_map      (?g)[._-]+ "" \
>                                (?i)^(?:d|dev(?:el)?)$ 0 \
>                                (?i)^(?:s|snapshot)$   1 \
>                                (?i)^(?:a|alpha)$      2 \
>                                (?i)^(?:b|beta)$       3 \
>                                (?i)^(?:rc|candidate)$ 4 \
>                                (?i)^(?:|r|release)$   5 \
>                                (?i)^(?:pl|patch)$     6
>
>   This maps "0.9a7" to [0,5,9,2,7], "1.2.3" to [1,5,2,5,3], etc.
>
> 2. Then RPM should have a default implementation for comparison
>   which just compares the numbers of those result arrays, similar to
>   what rpmvercmp() does: first the first numbers, then the second
>   numbers, etc. Once one array is exhausted the comparison stops, too.
>   Additionally RPM should allow this comparison to be implemented
>   in an arbitrary embedded language via a macro:
>
>   %__rpm_version_compare       %{...: ....}
>
>   This macro receives the two arrays and has to return -1, 0, +1.
>

Note that strcmp {-1,0,+1} returns are not the right implementation.

RPM versions are inequalities represented as half-planes, not points.
So TRUE/FALSE, not {LT,EQ,GT}, are a better representation for
whether inequailities, not points, overlap (or not).

Drilling down into the details of the version comparison, with
segmentation into tuples so that an EVR comparison can be
fully performed outside of a block-box function is the wrong
approach (even if quite doable and perhaps useful for other reasons).

> This way the whole version comparing stuff is fully pluggable, as one
> can define how a version is parsed into its chunks and second how  
> those
> chunks are compared.
>
>> But %track (and /usr/lib/rpm/vcheck.pl) are based
>> on *RE's and Get It Right! (i.e. detecting newer
>> upstream versions of tarballs) sufficiently often
>> that newer EVR comnparisons might be done with *RE's
>> rather than the usual slice-n-dice of alpha/digit/other
>> character sets and using strcmp(3).
>
> The slice-n-dice is still present in my above RE solution, of course.
> I don't think one can get rid of this. Even vcheck(1) internally does
> something like this. Its __VER__ regex is just for matching the whole
> version, not for comparison.
>
>> Perhaps a better (as in easier to answer) question is
>>  How many types of versioning pattern templates would
>>  need to recognized for rpmvercmp to be done
>>  using *RE's instead?
>
> I think the above two macros and their REs are suffcient. The first  
> one
> is applied repeateadly to determine the parts, the second ones REs are
> applied in sequence on each resulting part for canoncalization/filter
> purposes. The result always should be an array of numerically (and  
> this
> way easily) comparable parts.
>

But isn't a mapping of E:V-R into a memoization dictionary that
preserves the comparison far less complex than splitting out
all the details into easily comparable parts?

Sure there's the daunting cost of setting up and maintaining
a memoization dictionary containing all needed EVR items.

But what is achieved is a simple numerical comparison on integers that
represent complex segmented strings without all the gory implementation
details.

73 de Jeff
Received on Fri May 8 18:29:56 2009
Driven by Jeff Johnson and the RPM project team.
Hosted by OpenPKG and Ralf S. Engelschall.
Powered by FreeBSD and OpenPKG.