RPM Community Forums

From: Jeff Johnson <n3npq@mac.com>
Date: Wed 25 Jun 2008 - 17:26:46 CEST
Message-id: <A51E0FE2-8A56-4E25-815A-C97A52F0900B@mac.com>

On Jun 25, 2008, at 10:45 AM, Denis Washington wrote:

> On Wed, 2008-06-25 at 10:12 -0400, Jeff Johnson wrote:
>> On Jun 24, 2008, at 1:38 PM, Denis Washington wrote:
>>
>>>
>>>
>>>> Sound like a plan? My primary goals here are two-fold:
>>>>
>>>> 1) avoiding disasters if bogus headers start to be added to an rpmdb.
>>>>
>>>> 2) exposing rpmdbAdd() (and rpmdbRemove()) methods for use by
>>>> LSB/ISV/whatever applications that wish to register/unregister
>>>> software
>>>> on RPM managed systems.
>>>
>>> Sounds like a good plan, yeah. I'm glad to be able to work with you on
>>> this, as you certainly have a LOT more experience with it than I do.
>>> Thank you very much!
>>>
>>
>> No problem.
>>
>> Enumerating the necessary data elements that need to be present
>> in a RPM header, and choosing _SOME_ representational markup,
>> would seem to be on the critical path.
>>
>> (aside) dpkg: it's really the same fundamental problem, but a different
>> target metadata representation. Ditto _your_package_manager_here
>> for all instances of class.
>>
>> There are several existing representations of "package" manifests,
>> both explicit and/or implicit, that can be used to enumerate the
>> necessary data elements to be included in the target metadata
>> representation (note I did not say "rpmdb").
>>
>> Simplest by far is find(1) output of a tree, i.e. an explicit list
>> of paths to files, with stat(2) and digest (and acls/xattrs and selinux
>> file contexts and whatever else is needed) implicitly derived from
>> the tree.
>
> With "implicitly derived", do you mean "read from the installed files
> instead of being explicitly in the manifest file"?
>

Yes. Basically I mean populating target metadata with stat(2)
info, not with explicitly parsed values.

The advantage is KISS: it don't come any simpler (for ISVs and other lusers)
than providing a file manifest.
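
As a rough illustration of "implicitly derived", here is a minimal Python
sketch (not rpmlib code; the helper name manifest_entries and the record
fields are made up for this example). It takes a find(1)-style path list
and fills in the metadata from lstat(2) plus a SHA-256 digest. It fails as
soon as a listed path is missing from the file system.

```python
import hashlib
import os
import stat


def manifest_entries(paths):
    """Yield one metadata dict per path, derived from the installed file.

    Hypothetical helper: the field names are illustrative only and do not
    correspond to RPM header tags.
    """
    for path in paths:
        path = path.rstrip("\n")
        if not path:
            continue  # skip blank lines in the path list
        st = os.lstat(path)  # raises FileNotFoundError if the file is absent
        entry = {
            "path": path,
            "mode": stat.S_IMODE(st.st_mode),
            "uid": st.st_uid,
            "gid": st.st_gid,
            "size": st.st_size,
            "mtime": int(st.st_mtime),
            "digest": None,  # only regular files get a digest
        }
        if stat.S_ISREG(st.st_mode):
            h = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(65536), b""):
                    h.update(chunk)
            entry["digest"] = h.hexdigest()
        yield entry
```

Feeding it the lines of a saved "find <prefix> -print" listing gives one
record per path, with nothing but the paths supplied by the ISV.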

The disadvantage of a KISS file manifest is that indeed, the files must be
     1) actually present (and available) on a file system
     2) correctly installed. Presumably the ISV (or other installer) is
     functional, or the ISV (or other installer) would not be trying to
     register a "package", would it?

>> Other soft "branding" identification information, like vendor, packager,
>> description, build host, etc. would need to be added to the list of paths.
>> While all of that information may be vitally important to ISVs and LSB and
>> installer GUIs, all that rpmlib needs is NEVRA (N==name, E==epoch, etc.),
>> and mostly for human identification rather than installer functioning
>> purposes.
>
> Not sure if epoch versioning is important for third-party software; if I
> understand correctly it is more of a distro tool for changing package
> names etc. It might be safe to always set the epoch to some default
> value. But maybe it also makes sense; you may know a good use case for
> it.
>
> I ignored the revision and set it to 1, but revisions could be quite
> handy for ISVs too.
>

I speak in RPM NEVRA jargon, apologies.

Whether LSB "version" contains an Epoch: (or not) simply does not matter.

Always having Epoch: 0 (or equivalently, never including RPMTAG_EPOCH)
is all that is needed for identification purposes of "packages" using RPM
target metadata.

In fact, "version" and even "name" can be synthesized if/when/where necessary.
Presumably human lusers need more than "" as an identification tag. The ""
string in RPMTAG_NAME and RPMTAG_VERSION etc is more than adequate to prevent
rpmdb disasters.
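
To make the identification point concrete, here is a small Python sketch of
a NEVRA record (name, epoch, version, release, arch). The class Nevra and
its label() method are invented for illustration and are not rpmlib
structures. Epoch defaults to 0 and every string field defaults to "".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nevra:
    """Illustrative NEVRA record; not an rpmlib structure."""
    name: str = ""
    epoch: int = 0      # Epoch: 0 is treated the same as a missing epoch
    version: str = ""
    release: str = ""
    arch: str = ""

    def label(self):
        """Return a human-readable identification string."""
        evr = f"{self.version}-{self.release}"
        if self.epoch:
            # Only show the epoch when it is non-zero.
            evr = f"{self.epoch}:{evr}"
        return f"{self.name}-{evr}.{self.arch}"
```

With these defaults, Nevra() is a valid if unhelpful record, while
Nevra(name="foo", version="1.0", release="1", arch="noarch").label()
gives something a human can read.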

But clearly something better needs to be specified for "version" and
"upgrade" and ...

>> Is a find(1) path list "gud enuf" as a starting point? Or do you want to
>> establish other, alternative, markup for expressing the necessary data
>> elements?
>
> If you mean what I thought you meant, that would be OK. And another
> question: do you mean to take the _data_ that is in a find(1) path list,
> or also its _format_, abandoning the XML representation? The current
> format is already a path list with some metadata added.
>
>> Other obviously complete and unsurprising candidates to describe necessary
>> data elements to be included in target metadata are "tar tvf" and/or
>> "ls -al". Those formats are explicit: no data is implicitly derived from
>> stat(2) of a file, and the file does not have to exist in order to
>> construct a representation of target metadata.
>
> I would go with the simple path list. With explicit stat data etc, we
> run into the problem that the data in the manifest might run out of sync
> with the installed files (as the files may change during install).
> Implicit stat data also means fewer changes in existing installers, which
> most likely already do chmod's etc.
>

I hear "simple path list".

Yes there are many issues with implicit file metadata, all well known.
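
For comparison, the out-of-sync problem with explicit metadata can be
checked mechanically. The sketch below is a hypothetical Python helper (the
name drifted and the dict layout are assumptions made for this example): it
compares an explicit manifest record against what lstat(2) reports for the
installed file.

```python
import os
import stat


def drifted(recorded):
    """Return names of fields where the installed file differs from the record.

    `recorded` is a hypothetical explicit manifest entry: a dict with "path"
    and any of "mode", "uid", "gid", "size".
    """
    st = os.lstat(recorded["path"])
    actual = {
        "mode": stat.S_IMODE(st.st_mode),
        "uid": st.st_uid,
        "gid": st.st_gid,
        "size": st.st_size,
    }
    # Only fields present in the record are compared.
    return [k for k, v in actual.items() if k in recorded and recorded[k] != v]
```

An installer that does its own chmod after the manifest was written would
show up here as a "mode" mismatch; with implicit metadata there is simply
nothing recorded to disagree with.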

No matter what, a simple file list is the bare minimum expression of target
metadata. Without file paths, one has only disk blocks for ISVs to sell. I
understand that some disk manufacturer had a monopoly selling disk blocks
15 years ago ...

(aside) that's a very dry & obscure joke, don't worry if it makes no sense.

>> But there's lots and lots of other markups that could/should be used
>> instead.
>>
>> What representation of target metadata works for you?
>
> From the content, find(1) path lists would be the best IMHO. We could
> also take its representation (that is, a file with newline-separated
> file paths with somehow marked up metadata in front), but I think XML is
> pretty nice because it is well-defined and relatively easy to parse.
> Note that the backends don't have to deal with the manifest file format
> as they already get the parsed binary representation of the manifest.
>

I change my question(s) to
     What representation(s) work for you?
     Which representation first? Which representation second? etc etc

We can't have the bikeshed discussions about whose metadata is better
without choices, now can we?

Personally, it's easier for me to write a parser for Yet Another Form of
Perfect Spewage than it is to try to understand whatever reason(s) there are
for using same. YMMV.

But if you want to design data structures to separate parsing from packing
with your RPM back-end, that works too. I'm just trying to create a header for
inclusion into an rpmdb. That assumes some content. And a well-defined
explicit markup permits efficient communication of what data is needed,
and where the content will be mapped into the target metadata store.

73 de Jeff

Mailing List Message of <rpm-lsb>

Re: LSB Package API