On Feb 3, 2008, at 12:18 PM, Ralf S. Engelschall wrote:
> On Sun, Feb 03, 2008, Jeff Johnson wrote:
>
>> [...]
>>> I don't think you have to improve the glob. It is fully fine to
>>> first do the globbing and then the filtering. The regex is what
>>> has to be more precise and preferably even customizable.
>>
>> The question is more intellectual than practical:
>> Can a glob be written that ... ?
>> [...]
>
> I think: no. The Unix wildcard based patterns are usually too
> weak for more sophisticated matching -- well, from a theoretical POV
> they are already in a completely different class of expressiveness. As
> a result one nearly always ends up with additional filtering (based
> on regular expressions). So, your already doing an fts(3) walk instead
> of glob(3) is also just fine, as the regex filtering usually has to
> happen anyway...
Thanks for confirming "too weak". I've always wondered whether
negated character classes and ksh fnmatch(3) extensions might
be used more intelligently than what I have been able to devise
in, say, rpmcache path filtering and rpm -qa tag=pattern attempts.
From a design POV, REs are usually too much complexity for
users, who already just splat their way to some ill-specified
goal and complain bitterly when rpm doesn't DWIM! See
the weirdness at the top of rpmtsAddInstallElement(), whose
raison d'etre is to stop the bitter complaints. rpmtsAddInstallElement()
is not the right place for filtering. It is, however, the right place
to stop bitter complaints from lusers and script kiddies.
So I've always tried to design around {glob,fnmatch}(7) rather than
regex(7) when there was a need in rpm for patterns.
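
To make the "too weak" point concrete, here is a minimal Python sketch
(not rpm code). The file names are made up, and Python's fnmatch module
stands in for fnmatch(3): it supports negated classes as [!...] but not
the ksh extensions. The goal is "binary packages only, no *.src.rpm".

```python
import fnmatch
import re

# Hypothetical file names, for illustration only.
names = ["foo-1.0-1.i386.rpm", "foo-1.0-1.src.rpm", "tcc.rpm", "README"]

# A plain glob cannot tell binary packages from source packages.
plain_glob = fnmatch.filter(names, "*.rpm")

# A negated character class drops *.src.rpm, but it also drops every
# other name whose last character before ".rpm" happens to be "c".
negated_glob = fnmatch.filter(names, "*[!c].rpm")

# A tail-anchored RE with a negative lookbehind states the intent
# directly: ends in ".rpm", not preceded by ".src".
tail_re = re.compile(r"(?<!\.src)\.rpm$")
regex_hits = [n for n in names if tail_re.search(n)]
```

With these names the negated class loses tcc.rpm along with the source
package, while the RE keeps it. The price is the lookbehind syntax,
which is exactly the sort of thing users should not have to type.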
As you know, rpm's CLI arg processing is tricked up in some
highly unusual ways, such as having a second glob application
to argument lists, macro expansions, tag=pattern modifiers, manifests,
and now "+bing" and "-bang" path rewrites.
But the programming paradigm for file patterns in rpm is starting to
clarify:
1) macro expand a directory glob pattern, with possible contextual
   macros like %name defined on the fly.
2) glob the directory patterns to get an explicit list of existing
   directory roots.
3) perform a (possibly multi-rooted) fts(3) walk and apply a
   tail-anchored RE to filter by file path and/or suffix (and other
   criteria like file(1) magic or stat(2) metadata or ... ) to return
   an argv list of candidate file paths that match a criterion goal.
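
The three steps above can be sketched roughly as follows. This is a
Python illustration under stated assumptions, not rpm's implementation:
expand_macros() is a toy stand-in for rpm macro expansion that only
handles bare %name tokens, os.walk() stands in for fts(3), and the
function names are hypothetical.

```python
import glob
import os
import re


def expand_macros(pattern, macros):
    """Toy stand-in for rpm macro expansion: replaces bare %name tokens."""
    return re.sub(r"%(\w+)",
                  lambda m: macros.get(m.group(1), m.group(0)),
                  pattern)


def candidate_paths(dir_pattern, macros, tail_regex):
    """Return file paths under the globbed directory roots that match tail_regex."""
    # 1) Macro expand the directory glob pattern.
    expanded = expand_macros(dir_pattern, macros)

    # 2) Glob it into an explicit list of existing directory roots.
    roots = sorted(p for p in glob.glob(expanded) if os.path.isdir(p))

    # 3) Walk every root and keep paths accepted by the tail-anchored RE.
    matcher = re.compile(tail_regex)
    found = []
    for root in roots:
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # make the traversal order deterministic
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                if matcher.search(path):
                    found.append(path)
    return found
```

A call such as candidate_paths("/some/dir/%name-*", {"name": "foo"},
r"\.rpm$") would return the *.rpm files below every existing
/some/dir/foo-* directory. Further criteria, such as file(1) magic or
stat(2) metadata, would slot in next to the RE test in step 3.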
I expect the above paradigm can/will be used in many, many places in
rpm-5.0. rpmcliInstallPathElement() is a proof of concept, limited
solely to -i/-U. Moving the glop into rpmgi.c, where it will be used
by install/query/signing modes, is my eventual goal, with additional
side effects including fetching and database imports. Not today,
however: wrestling with the rpmgi sub-state machines to
populate/check/order transactions for query display will come along
when I get to "=boom" argument expansions.
73 de Jeff
Received on Sun Feb 3 18:57:05 2008