Shell Scripting for HPC Clusters, Part 1

Submitted on October 10, 2009 at 12:59 am

This is the first installment of a multipart guide for beginner Unix sysadmins supporting HPC clusters.

“For” and “While” Loop Constructs

The main challenge of supporting a Linux cluster is ensuring a homogeneous environment. Aside from small differences – primarily in network configuration – cluster nodes must be identical to achieve optimal performance and to simplify troubleshooting. Scripting is an important tool for administering any Unix system and it is particularly valuable for managing clusters.

“While” Loops

In a “while” loop, we set a variable to the number of the first cluster node and increment it by one on every iteration of the loop. This method works well if you need to access a consecutive range of nodes that are numbered without leading zeros (that is, “node1” and not “node01”).
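A minimal sketch of the incrementing loop described above, in plain POSIX shell. The `SSH` variable and the `run_on_range` function name are illustrative assumptions rather than the original listing; setting `SSH=echo` turns the loop into a dry run that only prints what would be executed.

```shell
# Assumption: ssh is installed as /usr/bin/ssh. Override SSH (e.g. SSH=echo) for a dry run.
SSH=${SSH:-/usr/bin/ssh}

# run_on_range FIRST LAST: visit node<FIRST> through node<LAST> in order.
run_on_range() {
    i=$1
    while [ "$i" -le "$2" ]; do
        # Run hostname and date on node$i.
        $SSH "node$i" "hostname; date"
        i=$((i + 1))    # move on to the next node
    done
}

# Example: run_on_range 1 128
```

Wrapping the loop in a function is only a convenience for reuse; the body of the `while` loop is the part that matters.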

In the above example, the variable $i is set to 1, and the script connects to node1 (node$i) and runs the hostname and date commands. The variable $i is then incremented by 1, and the script connects to node2 and repeats the same steps for as long as $i is less than or equal to (-le) 128, which is the total number of nodes in our cluster.

The following method, which reads node names from a text file, can be used when node names use leading zeros or when there are gaps in the sequence.
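A sketch of a “while” loop driven by a text file, assuming one node name per line. The function name `run_on_list` and the example path are hypothetical.

```shell
# Assumption: ssh is installed as /usr/bin/ssh. Override SSH (e.g. SSH=echo) for a dry run.
SSH=${SSH:-/usr/bin/ssh}

# run_on_list FILE: run the commands on every node named in FILE.
run_on_list() {
    while read -r node; do
        # Redirecting stdin from /dev/null keeps ssh from consuming
        # the rest of the node list that the loop is reading.
        $SSH "$node" "hostname; date" < /dev/null
    done < "$1"
}

# Example (hypothetical path): run_on_list /tmp/nodes.txt
```

Because the names come straight from the file, it does not matter whether they are padded, renumbered, or have gaps.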

“For” Loops

This method is best for accessing a small number of nodes, as it requires you to type every node number. This would not be the best way to access all 128 nodes in our test cluster.
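A sketch of the kind of “for” loop described above, with every node number typed out by hand. The node numbers 1, 2, 5 and 17 are arbitrary examples, and `visit_selected` is a hypothetical name.

```shell
# Assumption: ssh is installed as /usr/bin/ssh. Override SSH (e.g. SSH=echo) for a dry run.
SSH=${SSH:-/usr/bin/ssh}

visit_selected() {
    # The list after "in" is the complete set of nodes to visit.
    for i in 1 2 5 17; do
        $SSH "node$i" "hostname; date"
    done
}

# Example: visit_selected
```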

The following method is equivalent to the second “while” loop example above, as it also uses a text file containing node names.
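A sketch of a “for” loop that takes its node names from a text file. It assumes one name per line with no spaces inside a name; `run_on_file` and the example path are hypothetical.

```shell
# Assumption: ssh is installed as /usr/bin/ssh. Override SSH (e.g. SSH=echo) for a dry run.
SSH=${SSH:-/usr/bin/ssh}

# run_on_file FILE: visit every node named in FILE.
run_on_file() {
    # The shell splits the output of cat on whitespace,
    # giving one node name per iteration.
    for node in $(cat "$1"); do
        $SSH "$node" "hostname; date"
    done
}

# Example (hypothetical path): run_on_file /tmp/nodes.txt
```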

It is recommended that you use the full path for ssh, rsh, scp, rcp, and similar commands. The commands to be executed on the remote host should always be enclosed in double quotes, so that the local shell does not act on the semicolons itself. Multiple commands should be separated by semicolons. The ampersand should follow the remote commands, outside the double quotes. Its purpose is to run the commands for each node in the background, so that the script does not hang on a single node that is down or otherwise inaccessible.
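The recommendations above can be put together as follows. This is a sketch: the `run_in_background` name is hypothetical, and the closing `wait` is an addition not mentioned in the text, included so the function returns only after all background jobs have finished.

```shell
# Full path to ssh, as recommended. Override SSH (e.g. SSH=echo) for a dry run.
SSH=${SSH:-/usr/bin/ssh}

# run_in_background NODE...: start the remote commands on all nodes at once.
run_in_background() {
    for node in "$@"; do
        # Remote commands sit inside double quotes, separated by a semicolon;
        # the ampersand is outside the quotes, so it applies to the local ssh.
        $SSH "$node" "hostname; date" &
    done
    wait    # block until every background job has finished
}

# Example: run_in_background node1 node2 node3
```

Since the jobs run in parallel, the output from different nodes can arrive in any order.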

To make it easier to control which nodes are being accessed by the loop, it is recommended to use a while loop that reads the names of the nodes from a text file. You can easily comment out any nodes you don’t want to access, as long as the loop skips lines that begin with #.
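One way to honor commented-out entries is to filter the file before the loop reads it. This is a sketch that assumes a node is commented out by placing `#` at the start of its line; `run_on_active` is a hypothetical name.

```shell
# Assumption: ssh is installed as /usr/bin/ssh. Override SSH (e.g. SSH=echo) for a dry run.
SSH=${SSH:-/usr/bin/ssh}

# run_on_active FILE: visit every node in FILE that is not commented out.
run_on_active() {
    # grep -v drops lines starting with '#' as well as empty lines.
    grep -v -e '^#' -e '^$' "$1" | while read -r node; do
        $SSH "$node" "hostname; date" < /dev/null
    done
}

# Example (hypothetical path): run_on_active /tmp/nodes.txt
```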

If your node names use leading zeros (e.g. node001), you can still use the incremental while loop. However, it gets a bit complicated. The following loop will access nodes node001 through node128.
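One way to build the padded names is with `printf`, which may differ from the original listing. In this sketch, `run_on_padded_range` is a hypothetical name and the three-digit width is taken from the node001 example.

```shell
# Assumption: ssh is installed as /usr/bin/ssh. Override SSH (e.g. SSH=echo) for a dry run.
SSH=${SSH:-/usr/bin/ssh}

# run_on_padded_range FIRST LAST: visit nodes with three-digit numbers.
run_on_padded_range() {
    i=$1
    while [ "$i" -le "$2" ]; do
        # %03d pads the number with zeros to a width of three digits.
        node=$(printf 'node%03d' "$i")
        $SSH "$node" "hostname; date"
        i=$((i + 1))
    done
}

# Example: run_on_padded_range 1 128
```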

In a situation like this it will probably be easier to just generate a list of nodes and save it as a text file to be used as input for the loop.
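Generating such a list can be a one-liner. The sketch below assumes a `seq` that supports the `-f` format option, as GNU coreutils does; `make_node_list` and the output path are hypothetical.

```shell
# make_node_list FIRST LAST FILE: write node names with three-digit numbers to FILE.
make_node_list() {
    # %03g prints each number zero-padded to three digits.
    seq -f 'node%03g' "$1" "$2" > "$3"
}

# Example (hypothetical path): make_node_list 1 128 /tmp/nodes.txt
```

The resulting file can then be edited by hand to remove or comment out individual nodes.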

Practical Loop Examples

When executing complex commands on remote servers, it is a good idea to put all the commands into a script and then put this script into a directory exported via NFS to all the nodes. You can also RCP/SCP or FTP/SFTP the script to each node before running it. This way you can write simple loops that call the script and execute it locally on each node.

Loop Example 1

We need to connect to nodes 1 through 128 to add the new file server IP and hostname to the /etc/hosts file. We also need to add a new NFS mount to each node to be mounted at boot time.

First, create a simple script add_nfs_mount.ksh to add the file server name and IP to /etc/hosts, create a mountpoint, add the NFS mount to /etc/fstab, and mount the new filesystem. Place this script into the shared directory /export/scripts, which is exported via NFS to all nodes.
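A sketch of what add_nfs_mount.ksh might contain, written in plain POSIX syntax that ksh also accepts. The file server name, IP address, export path and mountpoint are all placeholders, not values from this article, and the helper `add_line_once` is a hypothetical addition that makes the script safe to run more than once.

```shell
#!/bin/sh
# Placeholder values: replace with your real file server details.
FS_NAME=${FS_NAME:-fileserver1}          # hypothetical file server hostname
FS_IP=${FS_IP:-192.0.2.10}               # hypothetical IP address
FS_EXPORT=${FS_EXPORT:-/export/data}     # hypothetical exported directory
MOUNT_POINT=${MOUNT_POINT:-/mnt/data}    # hypothetical local mountpoint

# add_line_once LINE FILE: append LINE to FILE unless it is already present.
add_line_once() {
    grep -qxF "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# add_nfs_mount [HOSTS_FILE] [FSTAB_FILE]: defaults are the real system files.
add_nfs_mount() {
    hosts=${1:-/etc/hosts}
    fstab=${2:-/etc/fstab}
    add_line_once "$FS_IP $FS_NAME" "$hosts"      # name resolution for the server
    mkdir -p "$MOUNT_POINT"                       # create the mountpoint
    add_line_once "$FS_NAME:$FS_EXPORT $MOUNT_POINT nfs defaults 0 0" "$fstab"
}

# On a real node, run as root:
#   add_nfs_mount && mount "$MOUNT_POINT"
```

The two optional arguments exist so the function can be tried against scratch copies of the files before it touches /etc.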

Since this script is in a directory accessible from all cluster nodes, all you need to do now is write a simple loop that executes this script on each node. Don’t forget to make the script executable: chmod +x /export/scripts/add_nfs_mount.ksh
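A sketch of such a loop, assuming the nodes are named node1 through node128 as in the earlier examples. `run_script_everywhere` is a hypothetical name, and its optional argument (the highest node number) is an addition for trying the loop on a few nodes first.

```shell
# Assumption: ssh is installed as /usr/bin/ssh. Override SSH (e.g. SSH=echo) for a dry run.
SSH=${SSH:-/usr/bin/ssh}
SCRIPT=/export/scripts/add_nfs_mount.ksh    # shared path named in the article

# run_script_everywhere [LAST]: run the shared script on node1..node<LAST>.
run_script_everywhere() {
    i=1
    while [ "$i" -le "${1:-128}" ]; do
        # Each node runs the script from the NFS share, in the background.
        $SSH "node$i" "$SCRIPT" &
        i=$((i + 1))
    done
    wait    # return once all nodes have finished
}

# Example: run_script_everywhere 128
```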

Loop Example 2

There may be situations when you cannot mount an NFS share on all the nodes. An alternative would be to use SCP or RCP to copy the script to the nodes and then execute it locally on each node. Let’s take a look at how this is done.

In this example we need to configure cluster nodes 1 through 128 to use the US Eastern time zone and NTP. Let’s create the script /scripts/set_timezone.ksh.
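A sketch of the time zone part of such a script, in plain POSIX syntax. It assumes a Linux layout with zone files under /usr/share/zoneinfo and /etc/localtime as the active zone; the variable names are hypothetical. The NTP step is left as a comment because the commands for enabling the service differ between distributions.

```shell
#!/bin/sh
ZONE=${ZONE:-US/Eastern}                      # zone requested in the article
ZONEINFO=${ZONEINFO:-/usr/share/zoneinfo}     # assumed location of zone files
LOCALTIME=${LOCALTIME:-/etc/localtime}        # assumed active-zone file

set_timezone() {
    if [ ! -f "$ZONEINFO/$ZONE" ]; then
        echo "zone file $ZONEINFO/$ZONE not found" >&2
        return 1
    fi
    # Point the system's local time at the chosen zone file.
    ln -sf "$ZONEINFO/$ZONE" "$LOCALTIME"
}

# On a real node, run as root:
#   set_timezone
# then enable and start the NTP service with your distribution's tools.
```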

Now we need to create a loop to scp this script to nodes 1 through 128 and to execute it locally on each node.
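A sketch of the copy-and-run loop. Copying to /tmp on each node is an assumption made here because /scripts may not exist on the nodes; `push_and_run` and the `REMOTE` variable are hypothetical names.

```shell
# Assumption: ssh and scp live in /usr/bin. Override SSH and SCP (e.g. with echo) for a dry run.
SSH=${SSH:-/usr/bin/ssh}
SCP=${SCP:-/usr/bin/scp}
SCRIPT=/scripts/set_timezone.ksh      # local path named in the article
REMOTE=/tmp/set_timezone.ksh          # assumed destination on each node

# push_and_run [LAST]: copy the script to node1..node<LAST> and run it there.
push_and_run() {
    i=1
    while [ "$i" -le "${1:-128}" ]; do
        # Copy first, run only if the copy succeeded; the parentheses group
        # both steps into one background job per node.
        ( $SCP "$SCRIPT" "node$i:$REMOTE" &&
          $SSH "node$i" "chmod +x $REMOTE; $REMOTE" ) &
        i=$((i + 1))
    done
    wait
}

# Example: push_and_run 128
```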

Loop Example 3

Another way of putting a script on the cluster nodes is to use FTP/SFTP. In the following example we need to install an RPM package on each cluster node. The first step is to FTP the /tmp/package.rpm file to all the nodes.
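A sketch of a scripted FTP upload, assuming a classic command-line `ftp` client and an FTP server running on every node. The account name and password are placeholders, and `ftp_commands` and `push_package` are hypothetical names.

```shell
# Assumption: ftp is installed as /usr/bin/ftp.
FTP=${FTP:-/usr/bin/ftp}
FTP_USER=${FTP_USER:-ftpuser}      # hypothetical account
FTP_PASS=${FTP_PASS:-changeme}     # hypothetical password
PACKAGE=/tmp/package.rpm           # file named in the article

# Print the commands for one non-interactive FTP session.
ftp_commands() {
    printf 'user %s %s\n' "$FTP_USER" "$FTP_PASS"
    printf 'binary\n'              # binary mode avoids corrupting the RPM
    printf 'put %s %s\n' "$PACKAGE" "$PACKAGE"
    printf 'bye\n'
}

# push_package [LAST]: upload the package to node1..node<LAST>.
push_package() {
    i=1
    while [ "$i" -le "${1:-128}" ]; do
        # -n turns off auto-login so the "user" line supplies the credentials.
        ftp_commands | $FTP -n "node$i"
        i=$((i + 1))
    done
}

# Example: push_package 128
```

Keeping a password in a script is a real drawback of this approach, so restrict the script's permissions if you use it.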

The final step is easy. All we need to do is to SSH to each node and install the RPM.
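A sketch of the install loop, assuming the package was uploaded to /tmp/package.rpm on every node and that the login used by ssh is allowed to install packages. `install_package` is a hypothetical name.

```shell
# Assumption: ssh is installed as /usr/bin/ssh. Override SSH (e.g. SSH=echo) for a dry run.
SSH=${SSH:-/usr/bin/ssh}

# install_package [LAST]: install the uploaded RPM on node1..node<LAST>.
install_package() {
    i=1
    while [ "$i" -le "${1:-128}" ]; do
        # -i installs, -v is verbose, -h prints a progress bar.
        $SSH "node$i" "rpm -ivh /tmp/package.rpm" &
        i=$((i + 1))
    done
    wait
}

# Example: install_package 128
```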

The second part of this guide – Searching, Replacing, Comparing – will be published next week. Stay tuned.

  • davemc74656

    For a tightly coupled computational fluid dynamics (CFD) code, what parts of cluster should receive priority?

    Hint: You should consider these four parts of a cluster: node speed (i.e. processor speed in a node), memory, network fabric, and storage, and which parts should receive priority.