SysAdmin by Chance

During the summer of 2016, my colleague and I used a bunch of unused servers to build a small cluster for our research group at Carnegie Mellon University (Bay Area campus). This is an attempt to document what we learned as we wore SysAdmin hats for the first time and ventured into cluster building.

Cluster Overview

Architecture Diagram of the cluster depicting key components detailed below

The key components of the cluster, as shown in the diagram above, are summarized:

Compute Nodes: 3 HP servers, each with the following specs:

- Processor: 16 cores, Intel Xeon E5530 @ 2.4GHz
- Memory: 24GB RAM
- Local Hard Disk: 500GB HDD
- OS: Red Hat Enterprise Linux 6.4

Storage: Network Attached Storage of 12TB, managed by the FreeNAS OS

User Management: LDAP-based central authentication, with NFS-based home directories mounted upon login via Autofs

Note: Our cluster currently does not provide VMs or any resource isolation between users, as in the case of AWS EC2. Users, once logged into the cluster, request a certain amount of resources in terms of cores and number of nodes, and run their jobs with a job scheduling system like PBS. Therefore, the rest of this article will focus on the Storage and User Management components of the cluster architecture.

Why we Chose What we Chose:

From the diagram above, it is seen that we use LDAP (Lightweight Directory Access Protocol) for user management, NAS (Network Attached Storage) for configuring shared storage and a Gigabit switch connecting the compute nodes to the rest of the components. Before launching into the setup process, we first present reasons for this choice of technologies.

Ease of User Management: Without a directory service, we would have to manually create users on each compute node and make sure their uids, gids, and access levels are all identical. This gets especially cumbersome as the number of compute nodes grows. LDAP solves this, and can be used as the authentication mechanism for other services as well. LDAP is also fast at retrieving information due to its directory structure.

Central Storage: With NAS-based central storage, scaling gets easier, as we can increase storage by simply adding more HDDs to the NAS. Maintaining redundancy and backing up the data also becomes easier. But most importantly, NAS supports the NFS protocol, which can be leveraged to export user home directories as NFS shares. This way, user home directories are centrally stored and can be accessed from anywhere in the network by simply mounting the NFS shares. So, users get a consistent view of their home directory irrespective of the compute node they log into.

Gigabit Switch: With the use of NAS and NFS-mounted network drives, network bandwidth and latency become the performance bottleneck. So, we need a managed switch with high bandwidth between the compute nodes and the other components. Owing to external constraints, we got a 1 Gigabit switch, though a 10 Gigabit switch would have been ideal.
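As a concrete illustration of the NFS idea, a share exported by the NAS can be mounted manually from any compute node. This is only a sketch: the server IP, export path, and username are placeholders, and in our setup autofs performs this mount automatically at login.

```shell
# Manually mount a user's NFS-exported home directory from the NAS
# (placeholders: <NAS-ServerIp>, export path, and user "alice")
mount -t nfs <NAS-ServerIp>:/mnt/BareLab/project/alice /home/alice
```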

Preliminaries

Before we jump into the configuration details of these components, it's important to know a few fundamentals about user authentication in Linux.

Pluggable Authentication Modules (PAM): Read up on PAM if you are not familiar with it; it is the core of authentication on Linux systems. In simple terms, PAM provides a modular framework through which one can customize how each system service (such as ssh, rlogin, passwd, su, etc.) performs user authentication. You will find a directory /etc/pam.d/ containing a number of configuration files, one for each 'PAM-aware' service that wants to define how its users are authenticated.

Name Service Switch (NSS): When applications want to access user information, they read it from the corresponding databases such as passwd, group, or shadow. The source for these databases could be a local file or a remote service such as SSS (System Security Services). This mapping of database to source is provided by the Name Service Switch; the file /etc/nsswitch.conf contains these entries. Local sources such as /etc/passwd and /etc/shadow are referred to as "files", while "sss" is used if SSSD is the source and "ldap" if LDAP is queried directly.

System Security Services Daemon (SSSD): SSSD is a service that provides access to different identity and authentication providers. It connects to an identity store to retrieve authentication information and then uses that to maintain a local cache of users and credentials. So in our case, SSSD acts as an intermediary between the remote LDAP store and local system clients (be it the OpenLDAP client or the SSH daemon). SSSD eases LDAP configuration on RHEL and reduces load on the LDAP server thanks to its caching capability.

PAM, NSS and SSSD work together to facilitate user authentication in the shared cluster. We will now take a closer look at the configuration of each of these components, and finally look at an end-end use case depicting LDAP-based ssh authentication and correspondingly mounting the user’s NFS directory.
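To see NSS source selection in action, glibc's getent accepts a -s flag that forces a lookup against a specific source. A quick sketch, where "alice" is a placeholder username:

```shell
# Resolve a user from local files (/etc/passwd) only
getent -s files passwd alice

# Resolve the same user via SSSD (requires sssd to be running)
getent -s sss passwd alice
```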

Configuration

There are a good number of step-by-step tutorials on configuring LDAP and NAS for Linux, so I will skip that and instead walk through our final configuration files. Our systems were RHEL-based, so our configurations are for RHEL-based systems. Though certain details may change, the overall idea should be similar across other flavors of Linux.

User Management via LDAP

1. Setup OpenLDAP

OpenLDAP is the open source implementation of LDAP; it is what we used as both the LDAP server and client. You can follow this link for step-by-step configuration up to step 12. The server is pretty straightforward to set up. Make sure that you have created the certificate file and that it has appropriate permissions. Our /etc/openldap/slapd.conf file looks like this:

############## /etc/openldap/slapd.conf ##################
rootdn "cn=Manager,dc=summer,dc=sv.cmu.local"

rootpw {SSHA}Thisis22NotReallyShsfdwmsdasYasd

access to attrs=userPassword,shadowLastChange

by anonymous auth

by self write

by * none

The access to attrs line is added so that users can change their LDAP password from the default one; it is not necessary otherwise.
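The {SSHA} hash used for rootpw above can be generated with the slappasswd utility that ships with OpenLDAP. A sketch; replace the example password with your own:

```shell
# Generate a salted SHA-1 hash suitable for slapd.conf's rootpw
slappasswd -h '{SSHA}' -s 'YourRootPasswordHere'
```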

For client configuration, follow the link and make sure the OpenLDAP client and NSS/PAM modules are installed (installing SSSD will automatically take care of this). The command below will help you set up the LDAP server details on RHEL clients. I personally found the official deployment guide very helpful.

On client machines:

$ yum install openldap openldap-clients sssd

$ authconfig-tui

Make sure the certificate file you created on the server is placed on the client as well and has the right permissions. Our /etc/openldap/ldap.conf looks like this, with the certificate directory cacerts under /etc/openldap:

########################### /etc/openldap/ldap.conf ###############
TLS_REQCERT allow

TLS_CACERTDIR /etc/openldap/cacerts

URI ldap://summer.sv.cmu.local

BASE dc=summer,dc=sv.cmu.local

Now, if you want to migrate local Unix users into LDAP, you can use "migrationtools". If instead you wish to add, delete, or update users/groups, refer to this link. Test LDAP with ldapsearch or with getent passwd:

getent passwd <username>
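A couple of quick checks against a running setup. This is a sketch: the base DN matches our configuration above, and "alice" is a placeholder username:

```shell
# Query the directory directly over LDAP
ldapsearch -x -b 'dc=summer,dc=sv.cmu.local' '(uid=alice)'

# Verify that NSS can resolve the LDAP user
getent passwd alice
```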

2. Setup SSSD

Now that we have LDAP covered, we have to configure the system to use LDAP authentication for services such as SSH. For this, we will use SSSD (NSLCD with NSCD provides an alternative, but I recommend SSSD).

Once SSSD is installed, the sssd.conf file needs to be created in the /etc/sssd directory. Our /etc/sssd/sssd.conf looks as below. Make sure you have NSS, PAM, and AutoFS listed under the SSSD services. Here we are basically directing PAM and NSS to use SSSD for authentication. The official Red Hat deployment guide has more information.

############# /etc/sssd/sssd.conf ##############################
[domain/default]
autofs_provider = ldap

cache_credentials = True

ldap_search_base = dc=summer,dc=sv.cmu.local

krb5_realm = EXAMPLE.COM

krb5_server = kerberos.example.com

id_provider = ldap

auth_provider = ldap

chpass_provider = ldap

ldap_uri = ldap://summer.sv.cmu.local

ldap_tls_cacertdir = /etc/openldap/cacerts

ldap_tls_cacert = /etc/openldap/cacerts/summerldap.pem

ldap_tls_reqcert = allow

ldap_id_use_start_tls = False

enumerate = True

[sssd]

services = nss, pam, autofs

config_file_version = 2

domains = default

[nss]

homedir_substring = /home

[pam]
[sudo]
[autofs]
[ssh]
[pac]
[ifp]
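One gotcha worth noting: sssd refuses to start unless sssd.conf is owned by root and not readable by others. After editing the file, something like the following (RHEL 6-style service/chkconfig tooling) applies the change:

```shell
# sssd requires strict ownership and permissions on its config file
chown root:root /etc/sssd/sssd.conf
chmod 600 /etc/sssd/sssd.conf

# Restart sssd and make sure it comes up on boot
service sssd restart
chkconfig sssd on
```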

SSSD-PAM:

Source : Red Hat Deployment Guide

SSSD provides a PAM module, sssd_pam, which instructs the system to use SSSD to retrieve user information. The PAM configuration must include a reference to the SSSD module, and then the SSSD configuration sets how SSSD interacts with PAM. Use authconfig to enable SSSD for system authentication.

# authconfig --enablesssd --enablesssdauth --update

This automatically updates the PAM configuration to reference all of the SSSD modules, as reflected in our /etc/pam.d/system-auth-ac. Note that pam_sss.so entries now appear alongside pam_unix.so.

############### /etc/pam.d/system-auth-ac #############
#%PAM-1.0

# This file is auto-generated.

# User changes will be destroyed the next time authconfig is run.

auth required pam_env.so

auth sufficient pam_unix.so nullok try_first_pass

auth requisite pam_succeed_if.so uid >= 500 quiet

auth sufficient pam_sss.so use_first_pass

auth required pam_deny.so

account required pam_unix.so broken_shadow

account sufficient pam_localuser.so

account sufficient pam_succeed_if.so uid < 500 quiet

account [default=bad success=ok user_unknown=ignore] pam_sss.so

account [default=bad success=ok user_unknown=ignore] pam_ldap.so

account required pam_permit.so

password requisite pam_cracklib.so try_first_pass retry=3 type=

password sufficient pam_unix.so sha512 nullok try_first_pass use_authtok

password sufficient pam_sss.so use_authtok

password required pam_deny.so

session optional pam_keyinit.so revoke

session required pam_limits.so

session [success=1 default=ignore] pam_succeed_if.so service in crond quiet use_uid

session required pam_unix.so

session optional pam_sss.so

SSSD-NSS:

Source: Red Hat Deployment Guide

NSS can use multiple identity and configuration providers for any and all of its service maps. The default is to use system files for services; for SSSD to be included, the nss_sss module has to be included for the desired service type.

Enabling SSSD auth should have already configured the nsswitch.conf file. If not, you can run the same command to enable it:

# authconfig --enablesssd --enablesssdauth --update

This automatically configures the passwd, shadow, group, and netgroup service maps to use the SSSD module. Your nsswitch.conf file should reflect that:

##################### /etc/nsswitch.conf ########################
passwd: files sss

shadow: files sss

group: files sss

#hosts: db files nisplus nis dns

hosts: files dns
bootparams: nisplus [NOTFOUND=return] files
ethers: files

netmasks: files

networks: files

protocols: files

rpc: files

services: files sss
netgroup: files sss
publickey: nisplus
automount: files ldap

aliases: files nisplus

Finally, we have to configure SSH to use PAM as its authentication mechanism, so that all user login attempts through SSH are routed to PAM, which uses SSSD to check against LDAP.

For that, in /etc/ssh/sshd_config, change UsePAM from no to yes:

#UsePAM no

UsePAM yes
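The change takes effect once the SSH daemon is restarted. A sketch in RHEL 6 style, where "alice" is a placeholder LDAP user:

```shell
# Restart sshd so the UsePAM change takes effect
service sshd restart

# From another machine, verify that an LDAP user can now log in
ssh alice@compute-node-1
```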

With these steps, we conclude setting up user authentication.

File System Management

1. Setup NAS

We used FreeNAS, an open source operating system for Network Attached Storage. FreeNAS supports the ZFS file system, which has essential features like protection against data corruption, copy-on-write, and continuous integrity checking.

FreeNAS can be installed on an 8GB USB stick (the preferred method) and requires a minimal amount of RAM for its operation. Installation requirements and steps are well documented in the official FreeNAS guide.

FreeNAS has an easy-to-use web GUI for configuration. Once the basic setup was done, we did the following steps:

1. Create a ZFS volume using the volume manager. Under the volume, create a master dataset.

2. Under this master dataset, create a dataset for each user; this serves as the home directory for that user. Make sure the ownership and permissions of each dataset are set properly. Also, set a quota for each dataset.

Volume and datasets created

3. Once the datasets are created, we need to create shares so that they are accessible to other computers on the network. We created Unix (NFS) shares, which the client computers use to mount the datasets.

2. Setup Autofs on compute nodes.

Autofs uses the auto-mount daemon to manage the end user's mount points by mounting them dynamically only when they are accessed. We can leverage this to mount user home directories upon login and to unmount them when users log out.

Autofs consults the master map configuration file /etc/auto.master to determine which mount points are defined. It then starts an auto-mount process with the appropriate parameters for each mount point. Each line in the master map defines a mount point and a separate map file that defines the file systems to be mounted under this mount point.

For example, the /etc/auto.home file might define mount points in the /home directory; this relationship would be defined in the /etc/auto.master file.

Source: RedHat Guide

/etc/auto.master file looks like this.

#
# Sample auto.master file
# This is an automounter map and it has the following format
# key [ -mount-options-separated-by-comma ] location
# For details of the format look at autofs(5).
#
/misc /etc/auto.misc
/home /etc/auto.home
#
# NOTE: mounts done from a hosts map will be mounted with the
# "nosuid" and "nodev" options unless the "suid" and "dev"
# options are explicitly given.
#
/net -hosts
#
# Include central master map if it can be found using
# nsswitch sources.
#
# Note that if there are entries for /net or /misc (as
# above) in the included master map any keys that are the
# same will not be seen as the first read key seen takes
# precedence.
#
+auto.master

/etc/auto.home file looks like this.

* -fstype=nfs,rw,nosuid,soft <NAS ServerIp>:/mnt/BareLab/project/&

Note: Make sure the /home directory exists and has no other subdirectories under it (autofs will shadow any existing contents). The line above means: when a user logs in, mount the directory /mnt/BareLab/project/<username> shared by the NAS server under /home on the client machine, with the name <username>.

Now we can restart the autofs service.
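On RHEL 6 that looks like the following. A sketch: "alice" is a placeholder user whose dataset exists on the NAS.

```shell
# Reload the automounter so it picks up auto.master and auto.home
service autofs restart

# Simply accessing the path triggers the NFS mount
ls /home/alice

# Confirm the share is now mounted
mount | grep /home/alice
```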

Putting it all together

User Login Sequence Diagram

So, to put everything together, we can look at the sequence of events that takes place when a user tries to log in to the cluster through SSH.

1. The user attempts to open an SSH connection to a compute node with a username and password.
2. SSH, being configured to use PAM, calls the PAM module with the user-supplied information.
3. PAM checks with NSS for the source of databases like passwd to authenticate against.
4. NSS points PAM to SSS as the source for the passwd and shadow databases.
5. PAM calls SSSD for authentication.
6. SSSD, being configured to use LDAP, queries the LDAP server for authentication.
7. The LDAP server sends back the user-related information.
8. SSSD caches the information and checks it. Upon successful authentication, it sends a green signal to PAM.
9. PAM in turn gives a green signal to SSH.
10. Now that the user is authenticated, SSH tries to load the user's home directory along with other user environment details, so autofs kicks in to mount the home directory.
11. Autofs fetches the home directory path from /etc/auto.home and connects to the NAS server for the share.
12. The NAS server responds by providing the share location.
13. Autofs mounts the user directory under /home.
14. The user is dropped into a shell at their home directory.

This completes the user login process into the cluster. Once logged in, users can request resources and submit jobs with the help of a Portable Batch System (PBS).
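For completeness, a typical PBS submission might look like the following sketch; the resource counts, walltime, and script name are placeholder examples:

```shell
# job.pbs - request 2 nodes with 16 cores each for 1 hour
#PBS -l nodes=2:ppn=16
#PBS -l walltime=01:00:00
#PBS -N example-job

# Run from the directory the job was submitted from
cd $PBS_O_WORKDIR
./run_experiment.sh
```

The script would be submitted with `qsub job.pbs` and monitored with `qstat`.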

Appendix:

SSH:

For many distributed applications like Spark, compute nodes need password-less SSH amongst them. This can be done by enabling key-based authentication amongst the compute nodes.

This link provides all the information you need to configure SSH.

Note: All public keys are placed under the corresponding user home directories. Make sure you change the /etc/ssh/sshd_config file to search for authorized keys under the user's home directory; the %h token achieves that.

AuthorizedKeysFile %h/.ssh/authorized_keys
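Because home directories are NFS-shared across all compute nodes, a user can generate one key pair and authorize it against themselves, and every node will see it. A sketch, run as the user:

```shell
# Generate a key pair without a passphrase
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

# Authorize the key; since /home is on NFS, every compute node sees it
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

# SSH insists on strict permissions for these files
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
```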

User Addition/Deletion from LDAP:

For LDAP user addition, modification, and deletion, this link provides the necessary details. Once the users are added through LDIF files into LDAP, we have to create a corresponding dataset in FreeNAS and share the directory. Check the section entitled File System Management for further details.
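As a reference, adding a user typically involves an LDIF like the following. This is a sketch: the ou=People subtree, uid, and ID numbers are made-up examples matching our base DN.

```shell
# user.ldif - example posixAccount entry (all values are placeholders)
cat > user.ldif <<'EOF'
dn: uid=alice,ou=People,dc=summer,dc=sv.cmu.local
objectClass: inetOrgPerson
objectClass: posixAccount
uid: alice
cn: Alice
sn: Example
uidNumber: 10001
gidNumber: 10001
homeDirectory: /home/alice
loginShell: /bin/bash
EOF

# Load it into the directory as the LDAP admin (prompts for rootpw)
ldapadd -x -D 'cn=Manager,dc=summer,dc=sv.cmu.local' -W -f user.ldif
```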

Note: FreeNAS exposes REST APIs as well, which could be used to automate the whole user-addition process with some scripting. Admins could enter the user details through a web-hosted page which, in the backend, triggers the user addition script.
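For instance, with the FreeNAS 9.x v1.0 REST API, creating a new user's dataset could look roughly like this. A sketch: the endpoint path, volume name, credentials, and server address are assumptions to be adjusted for your FreeNAS version.

```shell
# Create a dataset for a new user under the BareLab volume
# (endpoint and payload follow the FreeNAS 9.x v1.0 API; adjust as needed)
curl -u root:freenaspassword \
  -H "Content-Type: application/json" \
  -d '{"name": "project/alice"}' \
  http://<NAS-ServerIp>/api/v1.0/storage/volume/BareLab/datasets/
```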

Network Booting: If you can host a DHCP server, you can configure TFTP and PXE booting as well. This will ease the addition of new servers to the cluster. Since our university network doesn't let us host DHCP servers for security reasons, we would need a router to create a network within the university network. Since we couldn't afford a router and just had a switch, enabling network booting using PXE in this case becomes difficult. But as the cluster grows, this is definitely something that should be considered. (Also see FOG.)

RHEL: If you decide to go with RHEL as the OS for compute nodes, you may find this link useful for USB-stick-based installation. It also helps to have a developer account with Red Hat, which gives you access to forums and their packages. Try this link for a no-cost developer account. It may not be valid anymore, but just pray it is. Make sure the key services on the compute nodes are added to the list of startup services so that the servers come back to the same state upon reboot.

Cluster pictures: