RPM Community Forums

Mailing List Message of <rpm-users>

Re: high performance computing, HA and RPM5

From: Jeff Johnson <n3npq@mac.com>
Date: Mon 07 Dec 2009 - 14:55:34 CET
Message-id: <F125EF2B-89B1-4DB6-A117-5CDF95143451@mac.com>

On Dec 7, 2009, at 7:47 AM, devzero2000 wrote:

> High-performance computing systems have been popular for some time,
> and high-availability systems raise many of the same problems. The
> question is how a package management system such as rpm5 can address
> the problems of these environments. I have not found any reference
> to such issues in package management systems in general, rpm5
> included. These systems all start from a simple assumption: a single
> system and a single metadata database (dpkg does not have a real
> database, however). But that assumption is wrong on an HPC system:
> in general, applications are not installed from a true package but
> manually, onto a shared or network filesystem such as NFS, GFS2, or
> Lustre. The problem, from my point of view, is that because
> applications are installed manually rather than with a package
> system like rpm5, everyone concludes that a virtual package carrying
> only "Requires" is sufficient for issues like updates, conflicts,
> and the like, and it is difficult to prove the opposite: why should
> I install the same package separately on multiple nodes when the
> package is identical and installed in the same place (on a
> distributed or network filesystem)? In my opinion a distributed
> system requires a distributed rpm5 metadata database, and the fact
> that rpm5 includes a relational database system (or a sort of one,
> in the latest incarnation of Berkeley DB) is certainly an advantage;
> this is what an advocate of the relational model such as Chris Date
> says about the issue, last time I checked. On the pragmatic side,
> specifically: assuming 100 nodes at the same patch level, it should
> be possible to extend /var/lib/rpm/Packages with a shared rpm5
> Packages database (by extending %_dbpath, for example) that acts as
> a fragment of Packages (a union of Packages, if you like), and if it
> is unavailable, well, no problem. The preceding is only a personal
> opinion. Are there other opinions? Have I perhaps missed something?
> 
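
The "virtual package" idea above can be sketched as a spec file that
owns no files and only declares dependencies. A minimal sketch, with
hypothetical package names and versions:

    # hpc-apps-virtual.spec -- a metadata-only package
    Name:      hpc-apps-virtual
    Version:   1.0
    Release:   1
    Summary:   Records manually installed HPC applications in the rpmdb
    License:   Public Domain
    BuildArch: noarch
    # these stand in for software installed by hand on the shared filesystem
    Requires:  openmpi >= 1.3
    Requires:  fftw >= 3.2

    %description
    Owns no files; exists only so the rpmdb carries dependency metadata
    for applications installed manually on NFS/GFS2/Lustre.

    %files
    # intentionally empty

Building this with rpmbuild -bb and installing the result on each node
gives dependency and conflict checking without duplicating the payload.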

HPC is usually focused on scaling: installing identical software
on many nodes efficiently.

Distributing system images with modest per-node customization tends to be
simpler than per-node package management. Package management is useful for
constructing the system images, but it cannot compete with system images
when installation must scale to many nodes.
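
For example, rpm can populate such an image tree directly through its
--root option; the image path and package set here are hypothetical:

    # create an empty rpmdb inside the image tree, then install into it
    rpm --root /srv/images/compute-2009.12 --initdb
    rpm --root /srv/images/compute-2009.12 -Uvh kernel-*.rpm openmpi-*.rpm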

Upgrading multiple nodes is typically done by creating a new system
image and then reinstalling that image. This is less efficient than
upgrading a package on a per-node basis, because the new system image
will contain redundant, already-installed software. But it's very hard
to beat rebooting into a new system image located on a distributed
file system for KISS efficiency.
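
A minimal sketch of that upgrade cycle, assuming images live on a
shared filesystem and nodes boot from a well-known path (all names
hypothetical):

    # point the well-known path at the new image, then reboot the nodes
    ln -sfn /srv/images/compute-2009.12 /srv/images/current
    # fan out the reboot with a parallel shell (pdsh shown; any works)
    pdsh -a reboot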

Tracking which system image is installed back to a specific PM database
that describes the software installed within that image could be done
with a wrapper on rpm that chooses 1-of-N rpmdbs to perform detailed
queries about files in the image. But a flat-file manifest of the
packages installed in a system image is likely sufficient for most
purposes as well.
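
Such a 1-of-N wrapper is only a few lines around rpm's --dbpath option;
the image location is hypothetical:

    #!/bin/sh
    # img-rpm: run rpm against the rpmdb of a named system image
    IMAGE=${1:?usage: img-rpm <image> <rpm arguments...>}
    shift
    exec rpm --dbpath "/srv/images/$IMAGE/var/lib/rpm" "$@"

The flat-file manifest falls out of a query against the same rpmdb:

    img-rpm compute-2009.12 -qa > /srv/images/compute-2009.12/MANIFEST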

A distributed PM (or system image) database using some RPC transport is
fairly simple. Since installed software changes slowly, and is mostly
read-only after system images are created, RPC performance is likely
not critical. Berkeley DB supplied sunrpc until db-4.8.24; other RPC
transports onto Berkeley DB are no harder than sunrpc.
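
Historically that meant running Berkeley DB's RPC server over the
environment holding the database, with clients opening it through the
DB_RPCCLIENT flag; the path here is hypothetical, and the utility went
away along with sunrpc support:

    # serve the environment that holds the (read-mostly) image rpmdb;
    # clients connect via db_env_create(..., DB_RPCCLIENT)
    berkeley_db_svc -h /srv/images/compute-2009.12/var/lib/rpm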

The above probably (imho) describes a reasonable architecture that
scales efficiently for maintaining software on most of the nodes in an
HPC "cluster".

There's still a need for fault tolerance on the management server(s)
where images reside and are produced, since those need more than
read-only access to the databases. The management servers would likely
benefit from a replicated database (which Berkeley DB can provide).
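
As a rough sketch, later Berkeley DB releases let the replication
manager be configured from the environment's DB_CONFIG file (the 4.x
releases of this era used the C repmgr API instead); hostnames and
port are hypothetical:

    # DB_CONFIG in the management servers' database environment
    repmgr_site mgmt1.cluster 6000 db_local_site on
    repmgr_site mgmt2.cluster 6000
    rep_set_priority 100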

One can imagine an architecture using replicated databases across all
nodes, with full ACID transactional properties not only on the
database, but also on packages and files. But the complexity cost, and
the scaling to many nodes, likely lead to combinatorial failures.
Other facilities, like multicast transport and a reliable message bus
(like dbus), would likely be needed as well.

hth random opinions from 5 minutes of thought about HPC and RPM

73 de Jeff
Received on Mon Dec 7 14:55:59 2009