RPM Community Forums

Mailing List Message of <rpm-devel>

Re: [CVS] RPM: rpm/ CHANGES rpm/rpmio/ Makefile.am rpmgrep.1

From: Jeff Johnson <n3npq@mac.com>
Date: Wed 13 Feb 2008 - 23:45:53 CET
Message-Id: <CF05E769-777B-47EA-92D5-27ED105B8CE2@mac.com>
(Dunno why this msg is not getting through ... 3rd time's a charm)

For various rpmio development reasons, I needed PCRE expressions
applied to HTML content delivered by plain HTTP (not DAV enabled)  
transport.

In order to achieve that goal, I've rewritten pcregrep (from  
pcre-7.6) to use -lpopt and -lrpmio.

IMHO, the result has uses outside of rpm (and rpmio), so I'm going to
install the executable (at least through rpm-5.1 development) in bindir
(i.e. /usr/bin).

We'll see later about whether /usr/bin/rpmgrep should be included in
rpm-5.1 (or not). For now, I need to hear problem reports with rpmgrep,
and that simply isn't going to happen unless I install /usr/bin/rpmgrep
in PATH.

Here's a very brief intro to what rpmgrep adds to pcregrep (note the URL
argument, and I hope the HTML in the output spew makes it through mail):

     $ ./rpmgrep Fedora http://jbj.org/
                     <title>Test Page for the Apache HTTP Server on Fedora</title>
                     <h1>Fedora <strong>Test Page</strong></h1>
                                             <p>For information on Fedora, please visit the <a href="http://fedoraproject.org/">Fedora Project     website</a>.</p>
                                                     <p>You are free to use the images below on Apache and Fedora powered HTTP servers. Thanks for     using Apache and Fedora!</p>
                                                     <p><a href="http://httpd.apache.org/"><img src="/icons/apache_pb2.gif" alt="[ Powered by Apache     ]"/></a> <a href="http://fedoraproject.org/"><img src="/icons/poweredby.png" alt="[ Powered by Fedora ]" width="88" height="31" /></a></p>

(aside) Yes, the same functionality could have been achieved with
     $ curl http://jbj.org/ | grep Fedora
if I were writing a grep program.

(aside) I'm not writing a grep program, but rather using rpmgrep as an
external executable to stabilize PCRE patterns, hierarchical path
traversal, and HTTP transport before enabling the same functionality
within rpm itself.
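
For readers who want to experiment with the idea without building rpm, the
curl-plus-grep pipeline above can be approximated in a few lines of Python.
This is only a sketch under stated assumptions: it uses Python's `re` module
(similar to, but not the same as, PCRE) and `urllib` instead of rpmio, and the
function names `grep_lines` and `fetch_lines` are made up for illustration.

```python
import re
import sys
import urllib.request


def grep_lines(pattern, lines):
    """Yield each line in which the regular expression matches somewhere."""
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            yield line


def fetch_lines(url):
    """Fetch a URL and return its body as a list of text lines."""
    with urllib.request.urlopen(url) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace").splitlines()


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): python grep_url.py PATTERN URL
    for line in grep_lines(sys.argv[1], fetch_lines(sys.argv[2])):
        print(line)
```

The matching step is kept separate from the transport step on purpose, so the
same filter can be fed from a local file, standard input, or an HTTP response.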

The output of rpmgrep --help is appended below. Note that everywhere
"file" is mentioned, it should be possible to substitute a URI. That's
what rpmio is about.
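
The "a URI wherever a file is accepted" idea can be illustrated with a small
Python sketch. It is not how rpmio is implemented; it only shows one way to
dispatch on the argument. The helper names `is_uri` and `read_lines`, and the
set of schemes treated as remote, are assumptions made for this example.

```python
import sys
import urllib.request
from urllib.parse import urlsplit

# Assumption for this sketch: only these schemes are treated as remote.
REMOTE_SCHEMES = {"http", "https", "ftp"}


def is_uri(target):
    """Return True when the argument looks like a remote URI, not a path."""
    return urlsplit(target).scheme.lower() in REMOTE_SCHEMES


def read_lines(target):
    """Read text lines from "-" (stdin), a remote URI, or a local path."""
    if target == "-":
        return sys.stdin.read().splitlines()
    if is_uri(target):
        with urllib.request.urlopen(target) as response:
            return response.read().decode("utf-8", errors="replace").splitlines()
    with open(target, encoding="utf-8", errors="replace") as handle:
        return handle.read().splitlines()
```

A caller can then treat `read_lines("/etc/motd")`, `read_lines("-")`, and
`read_lines("http://jbj.org/")` the same way, which is the property the
message describes for rpmgrep's FILE arguments.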

I'll get a rpmgrep man page together soonishly ...

Enjoy!

73 de Jeff

====================================================
[jbj@wellfleet rpmio]$ ./rpmgrep --help
Usage: lt-rpmgrep [OPTION...]
   -A, --after-context=number      set number of following context lines
   -B, --before-context=number     set number of prior context lines
       --color                     matched text color option
       --colour                    matched text colour option
   -C, --context=number            set number of context lines, before & after
   -c, --count                     print only a count of matching lines per FILE
   -D, --devices=action            how to handle devices, FIFOs, and sockets
   -d, --directories=action        how to handle directories
   -e, --regex(p)                  specify pattern (may be used more than once)
   -F, --fixed_strings             patterns are sets of newline-separated
                                   strings
   -f, --file=path                 read patterns from file
       --file-offsets              output file offsets, not text
   -H, --with-filename             force the prefixing filename on output
   -h, --no-filename               suppress the prefixing filename on output
   -i, --ignore-case               ignore case distinctions
   -l, --files-with-matches        print only FILE names containing matches
   -L, --files-without-match       print only FILE names not containing matches
       --label=name                set name for standard input
       --line-offsets              output line numbers and offsets, not text
       --locale=locale             use the named locale
   -M, --multiline                 run in multiline mode
   -N, --newline=type              set newline type (CR, LF, CRLF, ANYCRLF or
                                   ANY)
   -n, --line-number               print line number with output lines
   -o, --only-matching             show only the part of the line that matched
   -q, --quiet                     suppress output, just set return code
   -r, --recursive                 recursively scan sub-directories
       --exclude=pattern           exclude matching files when recursing
       --include=pattern           include matching files when recursing
   -s, --no-messages               suppress error messages
   -u, --utf-8                     use UTF-8 mode
   -V, --version                   print version information and exit
   -v, --invert-match              select non-matching lines
   -w, --word-regex                force patterns to match only as words
   -x, --line-regex                force patterns to match only whole lines

Common options for all rpmio executables:
   -D, --define='MACRO EXPR'       define MACRO with value EXPR
       --undefine='MACRO'          undefine MACRO
   -E, --eval='EXPR'               print macro expansion of EXPR
   -r, --root=ROOT                 use ROOT as top level directory (default:
                                   "/")
       --quiet                     provide less detailed output
   -v, --verbose                   provide more detailed output
       --version                   print the version

Help options:
   -?, --help                      Show this help message
       --usage                     Display brief usage message
       --                          Terminate options

Usage: rpmgrep [OPTION...] [PATTERN] [FILE1 FILE2 ...]

   Search for PATTERN in each FILE or standard input.
   PATTERN must be present if neither -e nor -f is used.
   "-" can be used as a file name to mean STDIN.
   All files are read as plain files, without any interpretation.

Example: rpmgrep -i 'hello.*world' menu.h main.c

   When reading patterns from a file instead of using a command line option,
   trailing white space is removed and blank lines are ignored.

   With no FILEs, read standard input. If fewer than two FILEs given, assume -h.
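
The file name prefix rule in the help text above (-H forces it, -h suppresses
it, and fewer than two FILEs implies -h) can be sketched as follows. This is
an illustrative model in Python, not code from rpmgrep; the function names and
the `override` parameter are hypothetical, and the case where both -H and -h
are given is deliberately not modelled.

```python
def show_filename(file_count, override=None):
    """Decide whether matching lines are prefixed with the file name.

    override: True models -H, False models -h, None means neither was given.
    """
    if override is not None:
        return override
    # Per the help text: with fewer than two FILEs, assume -h.
    return file_count >= 2


def format_match(name, line, file_count, override=None):
    """Format one matching line, adding a "name:" prefix when appropriate."""
    if show_filename(file_count, override):
        return f"{name}:{line}"
    return line
```

For instance, `format_match("main.c", "hello world", 2)` yields the prefixed
form, while the same call with a file count of 1 returns the bare line.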


On Feb 13, 2008, at 5:40 PM, Jeff Johnson wrote:

>   RPM Package Manager, CVS Repository
>   http://rpm5.org/cvs/
>   ____________________________________________________________________________
>
>   Server: rpm5.org                         Name:   Jeff Johnson
>   Root:   /v/rpm/cvs                       Email:  jbj@rpm5.org
>   Module: rpm                              Date:   13-Feb-2008 23:40:59
>   Branch: HEAD                             Handle: 2008021322405801
>
>   Modified files:
>     rpm                     CHANGES
>     rpm/rpmio               Makefile.am rpmgrep.1
>
>   Log:
>     - jbj: rpmgrep: install in bindir with man page.
>
>   Summary:
>     Revision    Changes     Path
>     1.2175      +1  -0      rpm/CHANGES
>     1.133       +6  -1      rpm/rpmio/Makefile.am
>     1.2         +26 -26     rpm/rpmio/rpmgrep.1
>   ____________________________________________________________________________
>
>   patch -p0 <<'@@ .'
>   Index: rpm/CHANGES
>   ============================================================================
>   $ cvs diff -u -r1.2174 -r1.2175 CHANGES
>   --- rpm/CHANGES	12 Feb 2008 05:36:10 -0000	1.2174
>   +++ rpm/CHANGES	13 Feb 2008 22:40:58 -0000	1.2175
>   @@ -1,4 +1,5 @@
>    5.0.0 -> 5.1a1:
>   +    - jbj: rpmgrep: install in bindir with man page.
>        - rpm-maint: fix: limit exit codes to 254 to keep xargs happy.
>        - jbj: mire: add vallen argument to mireRegexec().
>        - jbj: borrow pcregrep.c from pcre-7.6, rename as rpmgrep.c.
>   @@ .
>   patch -p0 <<'@@ .'
>   Index: rpm/rpmio/Makefile.am
>   ============================================================================
>   $ cvs diff -u -r1.132 -r1.133 Makefile.am
>   --- rpm/rpmio/Makefile.am	11 Feb 2008 22:23:49 -0000	1.132
>   +++ rpm/rpmio/Makefile.am	13 Feb 2008 22:40:59 -0000	1.133
>   @@ -8,6 +8,9 @@
>
>    EXTRA_PROGRAMS = thkp thtml tinv tkey tmacro tmagic tput tpw trpmio tsw dumpasn1 lookup3
>
>   +bin_PROGRAMS =
>   +man_MANS =
>   +
>    TESTS =
>    check_PROGRAMS = tdir tfts tget tglob tmire
>    check_SCRIPTS = testit.sh
>   @@ -108,7 +111,9 @@
>
>    TESTS += RunGrepTest
>    dist_noinst_SCRIPTS += RunGrepTest
>   -check_PROGRAMS += rpmgrep
>   +bin_PROGRAMS += rpmgrep
>   +man_MANS +=	rpmgrep.1
>   +
>    rpmgrep_SOURCES = rpmgrep.c
>    rpmgrep_LDADD = $(RPMIO_LDADD)
>
>   @@ .
>   patch -p0 <<'@@ .'
>   Index: rpm/rpmio/rpmgrep.1
>   ============================================================================
>   $ cvs diff -u -r1.1 -r1.2 rpmgrep.1
>   --- rpm/rpmio/rpmgrep.1	13 Feb 2008 22:09:48 -0000	1.1
>   +++ rpm/rpmio/rpmgrep.1	13 Feb 2008 22:40:59 -0000	1.2
>   @@ -1,13 +1,13 @@
>    .TH PCREGREP 1
>    .SH NAME
>   -pcregrep - a grep with Perl-compatible regular expressions.
>   +rpmgrep - a grep with Perl-compatible regular expressions.
>    .SH SYNOPSIS
>   -.B pcregrep [options] [long options] [pattern] [path1 path2 ...]
>   +.B rpmgrep [options] [long options] [pattern] [path1 path2 ...]
>    .
>    .SH DESCRIPTION
>    .rs
>    .sp
>   -\fBpcregrep\fP searches files for character patterns, in the same way as other
>   +\fBrpmgrep\fP searches files for character patterns, in the same way as other
>    grep commands do, but it uses the PCRE regular expression library to support
>    patterns that are compatible with the regular expressions of Perl 5. See
>    .\" HREF
>   @@ -19,7 +19,7 @@
>    Patterns, whether supplied on the command line or in a separate file, are given
>    without delimiters. For example:
>    .sp
>   -  pcregrep Thursday /etc/motd
>   +  rpmgrep Thursday /etc/motd
>    .sp
>    If you attempt to use delimiters (for example, by surrounding a pattern with
>    slashes, as is common in Perl scripts), they are interpreted as part of the
>   @@ -33,16 +33,16 @@
>    arguments are treated as path names. At least one of \fB-e\fP, \fB-f\fP, or an
>    argument pattern must be provided.
>    .P
>   -If no files are specified, \fBpcregrep\fP reads the standard input. The
>   +If no files are specified, \fBrpmgrep\fP reads the standard input. The
>    standard input can also be referenced by a name consisting of a single hyphen.
>    For example:
>    .sp
>   -  pcregrep some-pattern /file1 - /file3
>   +  rpmgrep some-pattern /file1 - /file3
>    .sp
>    By default, each line that matches a pattern is copied to the standard
>    output, and if there is more than one file, the file name is output at the
>    start of each line, followed by a colon. However, there are options that can
>   -change how \fBpcregrep\fP behaves. In particular, the \fB-M\fP option makes it
>   +change how \fBrpmgrep\fP behaves. In particular, the \fB-M\fP option makes it
>    possible to search for patterns that span line boundaries. What defines a line
>    boundary is controlled by the \fB-N\fP (\fB--newline\fP) option.
>    .P
>   @@ -62,13 +62,13 @@
>    earlier part of the line.
>    .P
>    If the \fBLC_ALL\fP or \fBLC_CTYPE\fP environment variable is set,
>   -\fBpcregrep\fP uses the value to set a locale when calling the PCRE library.
>   +\fBrpmgrep\fP uses the value to set a locale when calling the PCRE library.
>    The \fB--locale\fP option can be used to override this.
>    .
>    .SH "SUPPORT FOR COMPRESSED FILES"
>    .rs
>    .sp
>   -It is possible to compile \fBpcregrep\fP so that it uses \fBlibz\fP or
>   +It is possible to compile \fBrpmgrep\fP so that it uses \fBlibz\fP or
>    \fBlibbz2\fP to read files whose names end in \fB.gz\fP or \fB.bz2\fP,
>    respectively. You can find out whether your binary has support for one or both
>    of these file types by running it with the \fB--help\fP option. If the
>   @@ -88,7 +88,7 @@
>    and/or line numbers are being output, a hyphen separator is used instead of a
>    colon for the context lines. A line containing "--" is output between each
>    group of lines, unless they are in fact contiguous in the input file. The value
>   -of \fInumber\fP is expected to be relatively small. However, \fBpcregrep\fP
>   +of \fInumber\fP is expected to be relatively small. However, \fBrpmgrep\fP
>    guarantees to have up to 8K of following text available for context output.
>    .TP
>    \fB-B\fP \fInumber\fP, \fB--before-context=\fP\fInumber\fP
>   @@ -96,7 +96,7 @@
>    and/or line numbers are being output, a hyphen separator is used instead of a
>    colon for the context lines. A line containing "--" is output between each
>    group of lines, unless they are in fact contiguous in the input file. The value
>   -of \fInumber\fP is expected to be relatively small. However, \fBpcregrep\fP
>   +of \fInumber\fP is expected to be relatively small. However, \fBrpmgrep\fP
>    guarantees to have up to 8K of preceding text available for context output.
>    .TP
>    \fB-C\fP \fInumber\fP, \fB--context=\fP\fInumber\fP
>   @@ -150,13 +150,13 @@
>    of the order in which these options are specified. Note that multiple use of
>    \fB-e\fP is not the same as a single pattern with alternatives. For example,
>    X|Y finds the first character in a line that is X or Y, whereas if the two
>   -patterns are given separately, \fBpcregrep\fP finds X if it is present, even if
>   +patterns are given separately, \fBrpmgrep\fP finds X if it is present, even if
>    it follows Y in the line. It finds Y only if there is no X in the line. This
>    really matters only if you are using \fB-o\fP to show the part(s) of the line
>    that matched.
>    .TP
>    \fB--exclude\fP=\fIpattern\fP
>   -When \fBpcregrep\fP is searching the files in a directory as a consequence of
>   +When \fBrpmgrep\fP is searching the files in a directory as a consequence of
>    the \fB-r\fP (recursive search) option, any files whose names match the pattern
>    are excluded. The pattern is a PCRE regular expression. If a file name matches
>    both \fB--include\fP and \fB--exclude\fP, it is excluded. There is no short
>   @@ -211,7 +211,7 @@
>    Ignore upper/lower case distinctions during comparisons.
>    .TP
>    \fB--include\fP=\fIpattern\fP
>   -When \fBpcregrep\fP is searching the files in a directory as a consequence of
>   +When \fBrpmgrep\fP is searching the files in a directory as a consequence of
>    the \fB-r\fP (recursive search) option, only those files whose names match the
>    pattern are included. The pattern is a PCRE regular expression. If a file name
>    matches both \fB--include\fP and \fB--exclude\fP, it is excluded. There is no
>   @@ -254,8 +254,8 @@
>    and $ characters. The output for any one match may consist of more than one
>    line. When this option is set, the PCRE library is called in "multiline" mode.
>    There is a limit to the number of lines that can be matched, imposed by the way
>   -that \fBpcregrep\fP buffers the input file as it scans it. However,
>   -\fBpcregrep\fP ensures that at least 8K characters or the rest of the document
>   +that \fBrpmgrep\fP buffers the input file as it scans it. However,
>   +\fBrpmgrep\fP ensures that at least 8K characters or the rest of the document
>    (whichever is the shorter) are available for forward matching, and similarly
>    the previous 8K characters (or all the previous characters, if fewer than 8K)
>    are guaranteed to be available for lookbehind assertions.
>   @@ -272,12 +272,12 @@
>    .sp
>    When the PCRE library is built, a default line-ending sequence is specified.
>    This is normally the standard sequence for the operating system. Unless
>   -otherwise specified by this option, \fBpcregrep\fP uses the library's default.
>   +otherwise specified by this option, \fBrpmgrep\fP uses the library's default.
>    The possible values for this option are CR, LF, CRLF, ANYCRLF, or ANY. This
>   -makes it possible to use \fBpcregrep\fP on files that have come from other
>   +makes it possible to use \fBrpmgrep\fP on files that have come from other
>    environments without having to modify their line endings. If the data that is
>    being scanned does not agree with the convention set by this option,
>   -\fBpcregrep\fP may behave in strange ways.
>   +\fBrpmgrep\fP may behave in strange ways.
>    .TP
>    \fB-n\fP, \fB--line-number\fP
>    Precede each output line by its line number in the file, followed by a colon
>   @@ -316,7 +316,7 @@
>    UTF-8 characters.
>    .TP
>    \fB-V\fP, \fB--version\fP
>   -Write the version numbers of \fBpcregrep\fP and the PCRE library that is being
>   +Write the version numbers of \fBrpmgrep\fP and the PCRE library that is being
>    used to the standard error stream.
>    .TP
>    \fB-v\fP, \fB--invert-match\fP
>   @@ -346,9 +346,9 @@
>    .SH "NEWLINES"
>    .rs
>    .sp
>   -The \fB-N\fP (\fB--newline\fP) option allows \fBpcregrep\fP to scan files with
>   +The \fB-N\fP (\fB--newline\fP) option allows \fBrpmgrep\fP to scan files with
>    different newline conventions from the default. However, the setting of this
>   -option does not affect the way in which \fBpcregrep\fP writes information to
>   +option does not affect the way in which \fBrpmgrep\fP writes information to
>    the standard error and output streams. It uses the string "\en" in C
>    \fBprintf()\fP calls to indicate newlines, relying on the C I/O library to
>    convert this to an appropriate sequence if the output is sent to a file.
>   @@ -357,11 +357,11 @@
>    .SH "OPTIONS COMPATIBILITY"
>    .rs
>    .sp
>   -The majority of short and long forms of \fBpcregrep\fP's options are the same
>   +The majority of short and long forms of \fBrpmgrep\fP's options are the same
>    as in the GNU \fBgrep\fP program. Any long option of the form
>    \fB--xxx-regexp\fP (GNU terminology) is also available as \fB--xxx-regex\fP
>    (PCRE terminology). However, the \fB--locale\fP, \fB-M\fP, \fB--multiline\fP,
>   -\fB-u\fP, and \fB--utf-8\fP options are specific to \fBpcregrep\fP.
>   +\fB-u\fP, and \fB--utf-8\fP options are specific to \fBrpmgrep\fP.
>    .
>    .
>    .SH "OPTIONS WITH DATA"
>   @@ -399,9 +399,9 @@
>    fail to match certain lines. Such patterns normally involve nested indefinite
>    repeats, for example: (a+)*\ed when matched against a line of a's with no final
>    digit. The PCRE matching function has a resource limit that causes it to abort
>   -in these circumstances. If this happens, \fBpcregrep\fP outputs an error
>   +in these circumstances. If this happens, \fBrpmgrep\fP outputs an error
>    message and the line that caused the problem to the standard error stream. If
>   -there are more than 20 such errors, \fBpcregrep\fP gives up.
>   +there are more than 20 such errors, \fBrpmgrep\fP gives up.
>    .
>    .
>    .SH DIAGNOSTICS
>   @@ .
> ______________________________________________________________________
> RPM Package Manager                                    http://rpm5.org
> CVS Sources Repository                                rpm-cvs@rpm5.org
Received on Wed Feb 13 23:46:09 2008