From our guest blogger Klaus Görgen. In mid-April, the long-awaited dedicated PASCAL computer equipment arrived at IBG-3: one very powerful workstation, which will serve as our “mini supercomputer substitute”, plus 15 similarly powerful notebooks for the course participants. Though only a single machine, the workstation will mimic a (very) small high performance computing cluster, in which a number of so-called compute nodes are usually connected via a dedicated low-latency, high-bandwidth communication network plus special networking software.
Our workstation will be used for running the numerical model systems of the training and will also centrally host the datasets needed. The notebooks, shared by two students each, will be used to connect to the workstation during the hands-on sessions to run the models, and afterwards to do analysis and visualization locally.
Though everything will be happening in the same room, this setup will provide some understanding of how HPC systems function in principle, and give an impression of how it “feels” to log in to a supercomputer. As its core feature, the workstation is equipped with four Intel Xeon Broadwell-EX E7-4850 v4 CPUs, each with 16 cores, resulting in 64 cores plus a total of 256 GB of DDR4 main memory. Using hyper-threading, it can even run 128 compute threads concurrently. Storage is provided by a combination of SAS SSDs for the operating system and SATA3 HDDs for the data.
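As a quick illustration of the numbers above, a few lines of Python reproduce the core and thread arithmetic; on the workstation itself, the standard library could report the logical CPU count (the values hard-coded here are taken from the specification quoted above, not measured):

```python
import os

# Hardware figures quoted above: 4 sockets x 16 cores, 2 threads/core with HT
sockets, cores_per_socket, threads_per_core = 4, 16, 2
physical_cores = sockets * cores_per_socket          # 4 * 16 = 64 physical cores
logical_cpus = physical_cores * threads_per_core     # 64 * 2 = 128 hardware threads

print(physical_cores, logical_cpus)

# Run on the workstation itself, os.cpu_count() would report the number of
# logical CPUs the OS sees (128 with hyper-threading enabled):
print(os.cpu_count())
```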
This workstation allows all participants plus one tutor to run, e.g., the fully coupled multi-physics Terrestrial Systems Modelling Platform (TerrSysMP) concurrently, with one core assigned to the atmospheric code COSMO, one to the land surface model CLM, and one to the hydrology model ParFlow. Many other setups are possible and will be used, e.g., a parallel run with a single model instance using the full machine. Such a setup is not only well suited for trainings, but may also serve in real-world low-cost, robust application scenarios.
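A three-core coupled run of this kind could be launched with Open MPI's MPMD (multiple program, multiple data) syntax, here wrapped in a Slurm batch script. This is only a sketch: the executable names and resource settings are illustrative, not the actual course configuration.

```bash
#!/bin/bash
# Hypothetical Slurm job script for one three-core TerrSysMP run.
#SBATCH --job-name=terrsysmp
#SBATCH --ntasks=3            # one MPI task per model component
#SBATCH --time=01:00:00

# Open MPI MPMD launch: one rank each for atmosphere, land surface, hydrology
mpirun -np 1 ./cosmo.exe : -np 1 ./clm.exe : -np 1 ./parflow.exe
```

With 3 cores per group, roughly sixteen such jobs fit on the 64-core machine at once, which is how all participant pairs plus a tutor can run concurrently.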
The general installation of the machine will commence in the upcoming days. Aside from the latest Ubuntu Server GNU/Linux operating system, such a system needs a communication library (like OpenMPI) to run a model in parallel, and a workload manager (like Slurm) to schedule multiple runs concurrently. Furthermore, some networking services will be installed; in conjunction with the notebooks, we will thus build our own powerful, portable, and autonomous “mini HPC network”. Aside from the compilers needed to build the models (C and Fortran), scientific software libraries such as netCDF or Silo for input/output and numerical libraries such as Hypre will be installed and tested. Like the workstation, the notebooks will run GNU/Linux operating systems and include software such as Python, R, or NCL for data processing, analysis, statistics, and visualization. We will continue to report on the software installation, on tests with the TerrSysMP model system, and on the content of the course components.