This project provides provisioned HPC cluster models built on underlying virtualization mechanisms.
The purpose of this project is to provide a common baseline for repeatable HPC experiments. It has been used for education, distributed collaboration, tool development collaboration, failure signature discovery, local HPC debugging, and cluster configuration comparisons, all enabled by the construction and use of short-lived, common-baseline HPC cluster models. In short, it extends the "systems as cattle, not pets" [1] [2] analogy into the realm of "clusters as cattle, not pets."
In effect, this project automates, replaces, and enables customized recipes for manually-executed cluster component construction, installation, configuration, and verification processes.
The initial release requires local enablers: gmake, vagrant, and virtualbox, plus, if specified, libvirt and its accompanying vagrant-libvirt plugin. Graphviz, doxygen, and vmtouch are recommended, but not required. A local copy of the clever bash-doxygen sed filter is included. Lighter-weight and multi-node mechanisms are welcomed and planned.
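As an illustrative sketch only (the authoritative list is in requires/sw/*, and package names vary by distribution), the host-side enablers might be installed on an RPM-based host roughly as follows:

% sudo dnf install -y make vagrant VirtualBox graphviz doxygen vmtouch   # package names are assumptions; consult requires/sw/*
% vagrant plugin install vagrant-libvirt     # only if the libvirt provider will be selected
% vagrant plugin install vagrant-proxyconf   # only if nodes need a proxy for yum connections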
Two representative HPC cluster recipes are provided; they live in the clusters directory. Presently, recipes generate clusters local to the installation host only.
vc is a virtual machine-based cluster, configured with the following service-factored nodes:
- vcfs - provides file storage to other cluster nodes, including common configuration and logs (slurm, rsyslog)
- vcsvc - provides common in-bound services such as DNS, NTP, SYSLOG
- vcbuild - configured with a larger share of RAM and CPU cores; acts as the compilation HPC partition and builds software (slurm, lustre) as it is brought up, if necessary
- vcdb - provides the mysql service and holds the slurm scheduling database daemon
- vcaltdb - provides an alternate mysql database, configured as a replica of the primary database
- vcsched - provides the slurm controller and scheduler service
- vc[1-2] - computational nodes
- vclogin - front-end/login node, provides vc-cluster job submission services
- vcgate - externally-accessible node via bridged 3rd interface
vx is a minimal, conjoined virtual-machine cluster, dependent upon vc-provided services and nodes:
- vxsched - provides the slurm controller and scheduler service, dependent upon vcdb, vcbuild, vcsvc, vcfs
- vx[1-2] - computational nodes
- vxlogin - front-end/login node, provides vx-cluster job submission services
This software constructs models of production clusters which include enabled and enforcing security features. However, since the cluster models are constructed to automate and compare cluster experiments, these cluster recipes are not in themselves secure. Different cluster recipes would be required to construct cluster images with security guarantees.
Set the BASE directory in bin/setpath.{c}sh. The default setting is the output of pwd, often $HOME/hpc-collab or $HOME/hpc-collab-<branch-name>.
% cd hpc-collab
[csh] % source bin/setpath.csh
[bash/zsh] $ . bin/setpath.sh
% make -C clusters/vc Vagrantfile
% make -C clusters/vx Vagrantfile
% ln -s /scratch/tarballs <--- assuming a separate, larger /scratch partition
Consider setting the value of clusters/common/flag/PREFERRED_REPO to your nearest rsync repository. Alternatively, reorder the file requires/ingest/repos; by default, the last line in the file is the preferred repository.
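For example, a hypothetical nearby mirror could be selected as follows; the URL is illustrative only, and the expected value format follows the entries in requires/ingest/repos:

% echo 'rsync://mirror.example.edu/centos' > clusters/common/flag/PREFERRED_REPO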
Then run make prereq to sanity-check that there is sufficient storage to host this set of cluster recipes and to construct the appropriate Vagrantfiles for the local environment. Point hpc-collab/tarballs, $HOME/VirtualBox VMs, and /var/lib/libvirt/images at a separate partition with more storage, if needed. Examine requires/sw/* to determine whether additional software needs to be installed onto the host, such as the vagrant-libvirt plugin. The vagrant-proxyconf plugin may be necessary if individual nodes require a proxy server to establish yum installation connections.
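One way to do this, assuming a larger /scratch partition exists (an assumption for illustration), is to relocate the directories and leave symbolic links behind:

% mkdir -p /scratch/tarballs "/scratch/VirtualBox VMs"
% ln -s /scratch/tarballs tarballs                        # from within the hpc-collab checkout
% mv "$HOME/VirtualBox VMs" /scratch/ && ln -s "/scratch/VirtualBox VMs" "$HOME/VirtualBox VMs"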
The virtualization provider is set in clusters/common/Vagrantfile.d/cfg.vm.providers/default_provider; by default it is virtualbox. The configuration flag clusters/common/flag/NO_NFS is set by default. When changing these settings, it may be necessary to rm clusters/common/{vc,vx}/.regenerated, since the Vagrantfile is dynamically composed from these configuration parameters. Each virtualization provider uses a different range of private IP address space for its cluster-internal private network. For convenience, cat clusters/vc/common/etc/hosts >> /etc/hosts when regenerating the various configuration files for each virtualization provider.
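A sketch of switching to libvirt and forcing regeneration, using the paths named above (the literal value written into default_provider is an assumption):

% echo libvirt > clusters/common/Vagrantfile.d/cfg.vm.providers/default_provider
% rm -f clusters/common/{vc,vx}/.regenerated
% make -C clusters/vc Vagrantfile
% make -C clusters/vx Vagrantfile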
Virtualbox may be slower than libvirt provisioning, especially if NO_NFS is set, although it is more consistent and reliable and does not require administrative privileges for a local NFS server.
The default configuration settings of virtualbox and NO_NFS require no elevated privileges. Generally, this combination has the fewest installation and compatibility issues.
Cluster recipes are driven by configuration stored in skeleton file systems. The Vagrantfile and GNU make rules ingest settings from the cfg/<nodename> directories.
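To inspect what drives a particular node, the skeleton can simply be browsed; for example, for the vc scheduler node (the clusters/vc/cfg path is assumed from the layout described above):

% ls clusters/vc/cfg/
% find clusters/vc/cfg/vcsched -maxdepth 2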
In the interest of documentation that matches actual code, preliminary work has been done with bash-doxygen.sed.
Make systematizes the dependencies and invocations. To avoid typing all of its arguments, convenience aliases are created in the setpath.csh and setpath.sh shell-specific files; a future implementation will convert these to Modulefiles. The most common targets are listed below, with a typical session sketched after the list.
- cd clusters/vc; make Vagrantfile - to construct initial Vagrantfile
- make prereq - simplistic check of underlying prerequisites
- make provision - identical to 'make up'
- make show - shows virtual cluster nodes and their state
- make up - provisions virtual cluster nodes
- make unprovision - destroys virtual clusters, their nodes and underlying resources
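A minimal end-to-end session for the vc cluster, assuming setpath has already been sourced, might look like:

% cd clusters/vc
% make Vagrantfile   # construct the initial Vagrantfile
% make prereq        # sanity-check prerequisites and storage
% make up            # provision all vc nodes
% make show          # list nodes and their state
% make unprovision   # tear everything down when finished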
Aliases are provided by the setpath helper. When using them as recommended, the appropriate Makefile is set so that one need not be in a cluster directory. Usage examples appear after the per-node alias list below.
| alias | action |
| --- | --- |
| provision | make provision |
| show | make show |
| up | make up |
| unprovision | make unprovision |
| savelogs | make savelogs |
| vc | provision all vc cluster nodes |
| vx | provision all vx cluster nodes |
For a single node, <nodename>:
- <nodename> - equivalent to 'cd clusters/<CL>; make <nodename>'; provisions the node as needed
- <nodename>-- - equivalent to 'cd clusters/<CL>; make <nodename>_UNPROVISION'; unprovisions the node
- <nodename>! - equivalent to 'cd clusters/<CL>; make <nodename>_UNPROVISION; make <nodename>'; unprovisions and forces reprovisioning

For all nodes in the cluster, <CL>:
- <CL> - equivalent to 'make up'
- <CL>-- - equivalent to 'make nodename_UNPROVISION'
- <CL>! - equivalent to 'make nodename_UNPROVISION; make nodename'; forces unprovisioning and reprovisioning
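For instance, with the aliases loaded, rebuilding a single node or recycling a whole cluster might look like the following, using node and cluster names from the recipes above:

% vcsched    # provision the vc scheduler node, and whatever it depends upon, as needed
% vcsched!   # unprovision and force reprovisioning of that node
% vx--       # unprovision the entire vx cluster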
Components such as clusters, nodes, and filesystems are standalone. Each includes code and configuration to establish prerequisites, configure, install, and verify itself. Common configuration implementations, such as ansible, are planned, and contributions of them are encouraged.
Configuration of the cluster may be tuned with flag or configuration markers. Flags are located in clusters/common/flag. A brief example follows the flag list below.
- WHICH_DB selects which database to use: mariadb-community (default), community-mysql, or mariadb-enterprise.
- SINC is a numeric factor which, if present, indicates that timeouts need to be adjusted; this is often necessary for WHICH_DB:community-mysql. Timeouts are adjusted by multiplying them by the value of this Slow Internet Coefficient.
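As a hedged illustration (the precise on-disk format of each flag is defined by the recipes; the values below are assumptions based on the descriptions above):

% echo community-mysql > clusters/common/flag/WHICH_DB   # switch from the default mariadb-community
% echo 3 > clusters/common/flag/SINC                      # multiply timeouts by 3 on a slow connection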
Alternate virtualization providers may be selected by changing the contents of the file clusters/common/Vagrantfile.d/cfg.vm.providers.d/default_provider. Changing the virtualization provider will trigger a "recompilation" of the cluster's Vagrantfile.
Virtualbox, in particular, requires substantial RAM (>32 GB) and storage (~36 GB) for the default cluster recipe's run-time. During ingestion of prerequisites, ~20 GB of storage is needed for a temporary local copy of a CentOS repository.
The vc and vx clusters build in ~90 minutes on an Intel Core i5 laptop with 64 GB of RAM, assuming that the initial repository rsync and the creation of tarballs/repos.tgz are complete and successful.
Use make prereq to validate the known storage-space issues. Monitoring the virtual memory footprints of the cluster images is also necessary.
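Beyond make prereq, a quick manual check of the relevant partitions and the running VMs can be made with standard tools (paths as configured earlier):

% df -h . tarballs "$HOME/VirtualBox VMs" /var/lib/libvirt/images
% free -h
% VBoxManage list runningvms   # inspect an individual VM with: VBoxManage showvminfo <name>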
The author wishes to acknowledge, with appreciation, the contributions of time, effort, intellect, care, and code that LANL Supercomputer Summer Institute students and researchers have made to this project.