
Re: high performance computing, HA and RPM5

From: Michael Jennings <mej@kainx.org>
Date: Mon 07 Dec 2009 - 21:49:09 CET
Message-ID: <20091207204909.GA15960@kainx.org>
On Monday, 07 December 2009, at 13:47:01 (+0100),
devzero2000 wrote:

> Overall, these systems start from a simple assumption: a single
> system and a single metadata db (dpkg doesn't have a real db,
> however).  But this assumption is wrong on an HPC system: in
> general, applications are not installed from a true package but
> are installed manually onto local or network-distributed
> filesystems: NFS, GFS2, or Lustre, for example.

I should start off by saying that my day job is designing, building,
and managing high-performance computational clusters for the US
Department of Energy, so HPC is an area in which I have some
experience.

Package management in a cluster environment is largely no different
from package management on a single system.  We have a single master
server which contains the VNFS (i.e., root filesystem) image which is
provisioned to each node statelessly on boot.  All updates to the
nodes' packages occur against a single image on the master node, and
each node can be subsequently brought up-to-date via either a quick
reboot or a "livesync" (online image update propagation).  All nodes
of a particular class (login/compute, I/O, etc.) run the same
stateless image, and nodes can be repurposed or provisioned from bare
metal in little more than the time it takes to reboot.

But from a package management perspective, the problem is really no
different:  OS updates are installed against a single, local image,
just as if you were managing a chroot jail.  Shared storage is only
used for two purposes (as far as the OS goes):  (1) to save RAM by
"hybridizing" the stateless image, keeping some of it local and some
shared; or (2) to provide a means of transfer during provisioning
(NFS is one of a few possibilities here).  Everything else is
orthogonal to
OS-level package management.
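
To make that concrete, here's roughly what an image update looks
like; the image root and package name below are hypothetical, but
--installroot and --root are standard yum/rpm options:

    # On the master node: update the shared node image in place.
    # /vnfs/compute is a hypothetical image root.
    yum --installroot=/vnfs/compute -y update

    # Or install a single package into the image with rpm directly:
    rpm -Uvh --root=/vnfs/compute some-package-1.0-1.x86_64.rpm

    # Nodes pick up the changes on the next reboot or livesync.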

What you may be referring to is application software.  In most cases
(certainly on our systems), computational libraries, compilers,
etc. are installed manually onto shared storage and managed with a
technique called "environment modules."  The reason for this is
simple:  Managing these packages with RPM simply isn't prudent or
reasonable.

For example, on some of our clusters, we have numerous versions of
OpenMPI installed (1.2.7, 1.2.8, 1.3, 1.3.1, 1.3.2, 1.3.3, and
1.3.4).  Using env-modules, users are able to select which version
they want at any time using very simple commands, and code built with
each version will continue to function no matter what other versions
are eventually installed.  Our scientists rely on this level of
persistence, as jobs can run continuously for days, weeks, even
months.  And scientists are in the business of doing science, not
trying to figure out why the script that worked perfectly yesterday
suddenly crashes today, so "Whoops!  Upgrade!" is not a valid response.
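
To give a feel for those "very simple commands" (the module names
below are illustrative, but avail, load, and switch are the standard
env-modules verbs):

    $ module avail openmpi                 # list installed versions
    $ module load openmpi/1.3.2            # select one for this shell/job
    $ module switch openmpi/1.3.2 openmpi/1.3.4   # swap versions later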

Why not use RPM for this?  Simple.  RPM and its associated tools are
nowhere near as good at maintaining multiple distinct versions of a
package in parallel as they are at tracking a single line of
development.  Even if one
installs multiple versions in parallel, all it takes is one
administrative brain-fart (rpm -Uvh), and it's hasta-la-vista,
binaries!  And even if RPM got better at this, it really adds very
little to the paradigm with which env-modules was designed to work:
./configure --prefix=/path/to/pkg-version
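
A concrete illustration of the brain-fart (package names are
hypothetical, and this assumes the packages were built with
versioned, non-conflicting prefixes so that parallel installs work
at all):

    # Two versions coexist happily...
    rpm -ivh openmpi-1.3.3-1.x86_64.rpm
    rpm -ivh openmpi-1.3.4-1.x86_64.rpm

    # ...until one habitual upgrade erases every older version:
    rpm -Uvh openmpi-1.3.5-1.x86_64.rpm   # hasta la vista, 1.3.3 and 1.3.4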

Also keep in mind that configuration management and cluster management
are radically different problem spaces.  We use a combination of
kickstart/cfengine (which we "borrowed" from LANL, who run some of the
world's largest supercomputers) for our stateful systems (which, in
the case of our clusters, is only the master nodes).  Any decent
cluster management software will provide everything you need for node
(i.e., stateless) configuration management.

In short, you're dealing with 3 entirely orthogonal concepts here:
package management, configuration management, and cluster management.
You're going to find yourself in a world of hurt if you don't
understand and maintain the separation.

Michael

-- 
Michael Jennings (a.k.a. KainX)  http://www.kainx.org/  <mej@kainx.org>
Linux Server/Cluster Admin, LBL.gov       Author, Eterm (www.eterm.org)
-----------------------------------------------------------------------
 "Who are you people?"  "We're writers."  "What are you striking for?"
 "More money."  "How much do you earn?"  "$350,000."
                -- conversation with striking Writer's Guild member as
                   reported by Bernard Weintraub in the New York Times