RPM Community Forums

Mailing List Message of <rpm-users>

Re: high performance computing, HA and RPM5

From: devzero2000 <pinto.elia@gmail.com>
Date: Mon 07 Dec 2009 - 15:37:10 CET
Message-ID: <b086760e0912070637j5af2a33enbdb34ff0354b108c@mail.gmail.com>
On Mon, Dec 7, 2009 at 2:55 PM, Jeff Johnson <n3npq@mac.com> wrote:
>
> On Dec 7, 2009, at 7:47 AM, devzero2000 wrote:
>
>> High performance computing (HPC) systems have been popular for some
>> time, and the problems of High Availability (HA) systems are just as
>> common.
>> The question is how a package management system such as rpm5 can
>> address the problems of such environments. I have not found any
>> reference to these issues for general-purpose package management
>> systems such as rpm5.
>> In general these systems start from a simple assumption: a single
>> system and a single metadata db (dpkg does not have a real db,
>> however). But this assumption is wrong on an HPC system: in general,
>> the applications are not installed from a true package but are
>> installed manually on a network or distributed file system: NFS,
>> GFS2 or Lustre, for example. The problem, from my point of view, is
>> that applications are not installed using a package system like rpm5
>> but by hand. At this point everyone thinks it is sufficient to create
>> a virtual package with only "requires" to deal with issues such as
>> update conflicts, and it is difficult to prove the opposite: why
>> should I install the same package separately on multiple nodes when
>> the package is the same and it is installed in the same place (on a
>> distributed or network filesystem)? In my opinion a distributed
>> system requires a distributed rpm5 metadata database, and the fact
>> that rpm5 includes a database system that follows the relational
>> model (or a sort of one, in the latest incarnation of Berkeley DB)
>> is certainly an advantage. That is what an advocate of the
>> relational model such as Chris Date says about this issue, last time
>> I checked.
>> On the pragmatic side, assuming 100 nodes that are the same (at the
>> same patch level), it should be possible to extend
>> /var/lib/rpm/Packages with a shared rpm5 Packages (by extending
>> _db_path, for example) that acts as a fragment of Packages (a union
>> of Packages, if you like), and if it is unavailable, well, no
>> problem. The preceding is only a personal opinion. Are there other
>> opinions? Am I perhaps missing something?
>>
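The "union of Packages" idea above can be illustrated with a minimal sketch. It is not rpm5 code and does not touch a real rpmdb: package sets are modeled as plain dictionaries, and the names load_optional, union_view and unavailable_share are hypothetical. The only point shown is that a node-local set can be merged with shared fragments, and that an unreachable fragment is skipped instead of causing a failure.

```python
from typing import Callable, Dict, Iterable, Optional

# Hypothetical record type: package name -> version-release string.
PackageSet = Dict[str, str]


def load_optional(loader: Callable[[], PackageSet]) -> Optional[PackageSet]:
    """Run a loader for a shared fragment; return None if it is unreachable."""
    try:
        return loader()
    except OSError:
        # For example, the network filesystem holding the fragment is down.
        return None


def union_view(local: PackageSet,
               fragments: Iterable[Optional[PackageSet]]) -> PackageSet:
    """Merge the node-local set with every reachable shared fragment.

    Local entries are applied last, so on a name clash the node-local
    record overrides the shared one (an assumption of this sketch).
    """
    view: PackageSet = {}
    for fragment in fragments:
        if fragment is not None:
            view.update(fragment)
    view.update(local)
    return view


def unavailable_share() -> PackageSet:
    """Simulate a shared fragment whose storage is not mounted."""
    raise OSError("shared rpmdb not mounted")


# Illustrative data only.
local_db = {"glibc": "2.11-1", "openssh": "5.3-2"}
shared_db = load_optional(lambda: {"openmpi": "1.3.3-4", "fftw": "3.2.2-1"})
missing_db = load_optional(unavailable_share)

print(sorted(union_view(local_db, [shared_db, missing_db])))
```

Letting the local record win on a clash is only one possible policy; a real design would also have to decide how dependencies and conflicts are resolved across fragments.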
>
> HPC is usually focussed on scaling, installing identical software
> on many nodes efficiently.
>
> Distributing system images with modest per-node customization tends to be
> simpler than per-node package management. Package management is useful for
> constructing the system images. But PM cannot compete with system images
> for installation scaling to multiple nodes.
First of all, thanks for your reply. But I disagree on this point: it
would be like saying that cloning is more useful than using Conga and
Puppet (or Kickstart, FWIW).
>
> Doing upgrades of multiple nodes is typically done by creating a new
> system image, and then undertaking a reinstallation of the new system
> image. This isn't as efficient as upgrading a package on a per-node basis
> because new system images will contain redundant already installed
> software. It's very hard to beat a reboot of a new system image located
> on a distributed file system for KISS efficiency.
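The redundancy mentioned in the quoted paragraph can be made concrete by comparing flat package lists of two images. The sketch below assumes a hypothetical manifest format with one "name version-release" pair per line (something like the output of rpm -qa --qf '%{NAME} %{VERSION}-%{RELEASE}\n' run inside each image); the function names and the sample data are illustrative only.

```python
from typing import Dict, List, Tuple


def parse_manifest(text: str) -> Dict[str, str]:
    """Parse a flat manifest with one 'name version-release' pair per line."""
    packages: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, version = line.split(None, 1)
        packages[name] = version
    return packages


def image_delta(old: Dict[str, str],
                new: Dict[str, str]) -> Tuple[List[str], List[str], List[str]]:
    """Return (added, removed, changed) package names between two images."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(n for n in set(old) & set(new) if old[n] != new[n])
    return added, removed, changed


# Illustrative manifests for an old and a new system image.
OLD = """glibc 2.11-1
openssh 5.3-1
openmpi 1.3.3-4
"""
NEW = """glibc 2.11-1
openssh 5.3-2
openmpi 1.3.3-4
fftw 3.2.2-1
"""

added, removed, changed = image_delta(parse_manifest(OLD), parse_manifest(NEW))
print("added:", added, "removed:", removed, "changed:", changed)
```

Everything that appears in neither list is software that a full reinstallation of the image would ship again unchanged, which is the inefficiency being traded against simplicity.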
>
> Tracking what system image is installed back to a specific PM database
> that describes the installed software within the system image could
> be done with a wrapper on rpm to choose 1-of-N rpmdb's to perform
> detailed queries re files in the system image. But a flat file manifest
> of what packages were installed in a system image is likely sufficient
> for most purposes as well.
But THIS makes the role of a package management system, call it RPM5
or anything else, useless or worse.
Are you sure?
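For reference, the "wrapper on rpm to choose 1-of-N rpmdb's" suggested in the quoted text could look roughly like the sketch below. It only builds the command line and relies on rpm's --dbpath option; the image names, directory layout and the function rpm_query_command are assumptions made for illustration, not an existing tool.

```python
import shlex
from typing import Dict, List

# Hypothetical mapping from system image name to the rpmdb directory
# that describes it; these paths are illustrative only.
IMAGE_DBPATHS: Dict[str, str] = {
    "compute-2009.11": "/srv/images/compute-2009.11/var/lib/rpm",
    "compute-2009.12": "/srv/images/compute-2009.12/var/lib/rpm",
}


def rpm_query_command(image: str, query_args: List[str]) -> List[str]:
    """Build an rpm command line that queries the rpmdb of one image."""
    try:
        dbpath = IMAGE_DBPATHS[image]
    except KeyError:
        raise ValueError(f"unknown system image: {image}") from None
    return ["rpm", "--dbpath", dbpath, *query_args]


# List the files of one package as recorded for a given image.
cmd = rpm_query_command("compute-2009.12", ["-ql", "openssh"])
print(shlex.join(cmd))
# To actually run the query: subprocess.run(cmd, check=True)
```

The wrapper keeps the detailed per-image metadata queryable with the normal rpm tooling, which a flat manifest alone does not provide.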
>
> A distributed PM (or system image) database using some RPC transport is
> fairly simple. Since installed software is slowly changing, and mostly
That is an opinion. System security patches are released DAILY.
> readonly after system images are created, the RPC performance
> is likely not critical. Berkeley DB supplied sunrpc until db-4.8.24. Other
> RPC transports onto Berkeley DB are no harder than sunrpc.
>
> The above probably (imho) describes a reasonable architecture that scales efficiently
> for maintaining software on most of the nodes in an HPC "cluster".
>
> There's still a need for fault tolerance on the management server(s)
> where images are resident and where images are produced that need
> more than readonly access to databases. The management servers would
> likely benefit from a replicated database (which Berkeley DB can
> provide).
>
> One can imagine an architecture using replicated databases across
> all nodes, with full ACID transactional properties on not only the
> database, but also with packages and files. But the complexity
> cost, and the scaling to many nodes, likely has combinatorial
> failures. There are other efficiencies, like multicast transport,
> and a reliable message bus (like dbus) that would likely be needed
> as well.
As I replied, your answer seems to reiterate that a package management
system is not useful in an HPC ENVIRONMENT. But I do not agree. This
is because a package management system is a necessary substrate for
software distribution and patch management. But your last point is
interesting, although it deserves further investigation.
>
> hth random opinions from 5 minutes of thought about
HPC, HA, shared storage and RPM probably require further reflection.
IMHO, the fact that they have not been discussed in the past is
probably because many applications (user applications, not system
ones) are installed manually, and nobody has considered the benefits
of using a package management system for those applications.
>
> 73 de Jeff
> ______________________________________________________________________
> RPM Package Manager                                    http://rpm5.org
> User Communication List                             rpm-users@rpm5.org
>
Received on Mon Dec 7 15:37:32 2009