[Beowulf] Cluster stack based on Ansible. Looking for feedback.

Dear colleagues,

We are working on a new, fully Ansible-based cluster stack, and since we are reaching a stable state, we think it is a good time to present it to the Beowulf community. I have tried to include as much relevant information as possible in this mail.

*Origin of the need:*

We are building a small HPC center in a FabLab in Auray, France (https://en.wikipedia.org/wiki/Auray). We needed something simple to manage our cluster and our workstations, but also very flexible, as we work with both companies and universities (some want CentOS, others Ubuntu, etc.). We also wanted something with as few scripts as possible, to keep the product easy to maintain and to spend less time managing the cluster, leaving spare time for more interesting things. After a deep search, Ansible was chosen for its simplicity, and we iterated over and over to converge on this stack. The stack is fully open, under the MIT license. We do not sell anything, just free best-effort help :-)

*What it can do:*

In its current state, the stack can deploy on CentOS 7.6 and RHEL 8.0 (and nearly on Ubuntu 18.04):

- An /etc/hosts file
- A DHCP configuration with optional shared networks
- A DNS configuration based on Bind (server/client)
- A time configuration based on Chrony (server/client)
- A full PXE stack, built for simplicity and verbosity:
  - TFTP based on atftp
  - Apache for all repositories/files
  - An advanced iPXE stack with menus to handle all exotic hardware:
    - EFI / Legacy
    - PXE / iPXE native ROM
    - CD or USB boot to PXE when no native PXE is available (or the BIOS/EFI is stupidly made)
- Repositories (server/client)
- NIC configuration (basic for now, based on the nmcli Ansible module)
- Rsyslog (server/client, systemd split files)
- Conman
- NFS (server/client)

And as add-ons:

- Slurm (basic configuration, master/nodes)
- ClusterShell groups
- Basic OpenLDAP with phpLDAPadmin (currently insecure)
- Very basic user management for very small clusters *with a single login node*, to replace LDAP
- Prometheus configuration (Prometheus, Alertmanager, Node Exporter, with a basic configuration and a few alerts already included)

Also, the stack can deploy Ubuntu 18.04 and OpenSUSE Leap 15.1 via PXE, but not all Ansible roles can yet be deployed on them after OS deployment (we are implementing Ubuntu right now for our workstations).

*The stack is fully modular.* Any new role can be created, and any new data can be added to the inventory. We worked hard to make the roles fully independent: you can replace DHCP/DNS/PXE with Cobbler and the other roles do not care; you can replace Slurm with another job scheduler, same thing; etc.

Note to Ansible users: we are using the "merge" hash_behaviour. Thanks to this, the stack can cover simple but also very complex clusters (parameters can be targeted at specific host(s), whether for production configuration or simply to experiment).

The stack also has a few native mechanisms under the hood, ready and tested as proofs of concept, that we are now implementing. These features answer needs of our own, which is why we added them:

- Multi-iceberg (sometimes called multi-island in HPC):
  - When parts of the cluster need to be separated (in our case, to provide dedicated small sub-clusters to some companies), the stack is able to split hosts into icebergs, each iceberg being managed by its own group of management nodes and isolated from the others (but reachable through the interconnect, if one exists and it is requested).
  - Because Ansible relies on groups and ssh, we can achieve this simply.
  - This feature would also allow the stack to manage an old cluster alongside a new one, keeping both in production, or perhaps to scale to very large configurations.
- Accelerated modes:
  - Ansible is famous for its simplicity, but also for being slow, even with Mitogen.
  - Accelerated mode is a proof of concept we designed to heavily accelerate the rendering of critical templates. It is based on some analysis we made of Ansible, and it also reduces memory usage.
  - Accelerated mode is just a simple trick in the inventory, with no modifications to Ansible itself. It is mandatory for us that the stack stays simple.
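To illustrate the "merge" hash_behaviour mentioned above, here is a minimal, hypothetical inventory sketch (the file layout, group name `iceberg2`, and variable `time_server` are our own illustration, not necessarily the stack's actual names). With `hash_behaviour = merge` set in ansible.cfg, a dictionary defined in `group_vars/all` and redefined for a narrower group is merged key by key instead of being replaced wholesale, so a group or host can override one parameter while inheriting the rest:

```yaml
# ansible.cfg (excerpt) -- enables dictionary merging instead of replacement:
#   [defaults]
#   hash_behaviour = merge

# group_vars/all/network.yml -- cluster-wide defaults (hypothetical names)
time_server:
  ip: 10.10.0.1
  pools:
    - 0.pool.ntp.org
    - 1.pool.ntp.org

# group_vars/iceberg2/network.yml -- only the key that differs is redefined;
# with "merge", hosts in iceberg2 get the new "ip" while still
# inheriting "pools" from group_vars/all
time_server:
  ip: 10.20.0.1
```

The same group-based layering is what makes the iceberg mechanism natural in Ansible: each iceberg is just an inventory group, and its specific parameters live in that group's group_vars.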
These features are not activated by default, keeping the Ansible inventory very simple for basic usage.

Last point: the stack does not target HPC in particular. It is generic and can adapt to HPC clusters, but also to enterprise or university IT networks (workstations, laptops, etc.).

*What we are doing with it:*

We are managing our 14 thin (32 GB RAM / 16 cores) + 1 fat (1 TB RAM / 64 cores) Supermicro servers, along with a few workstations. We are a FabLab made of science and technology fans, so we do not have a lot of money, which means we gather "old" equipment (we are on Sandy Bridge right now) for our cluster. This is why we needed such a flexible stack: to be able to handle exotic hardware.

*The future of the stack:*

The main objective for the future is to create a base for a *modular* and *simple* stack, for simple clusters or for testing/development on parts of large clusters. This stack has (we think) a nice PXE mechanism and all the needed basics. It could be used as a base for other things, like a skeleton missing its flesh. Anyone could add new roles/modules/tools to it, keeping in mind simplicity and role independence. We are currently working on the Ubuntu 18.04 implementation, and we hope to release version 1.0 soon.

*Where to find it:*

You can find the stack on GitHub here: https://github.com/oxedions/bluebanquise

Documentation is in resources/documentation/_build/html (the _build/html directory will soon be removed from the repository and hosted elsewhere, as only documentation sources should live there). The few packages are still not online, but we can provide them if someone wishes to test the stack.

The stack is young, so we are looking for any feedback, positive and/or negative. Feel free to have a look. For any details, please do not hesitate to contact us :-)

Thank you for reading this very long and boring mail.

With our best regards,
Oxedions and Johnny Keats