

Articles in Monitoring

openlava Quick Test

February 22, 2013 – 12:46 pm | 5 Comments

After years of working with PBS and LSF, I ran into Jeff Layton’s “Share the Load” review of the openlava resource manager in the February 2013 issue of Admin Magazine, and nostalgia took over. So I built …

Filesystem Performance Testing Using dd

February 5, 2013 – 3:35 pm | 5 Comments

Below is a simple script to test filesystem read/write performance using dd with a varying block size parameter. This can be useful for testing local filesystems as well as network-mounted filesystems. The end result will be a …
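
As a rough illustration of the idea in this excerpt, the sketch below times sequential writes at several block sizes. It uses plain Python file writes instead of dd, so it is not the script from the article. The function name `write_throughput`, the block sizes, and the 8 MiB test size are assumptions chosen for illustration.

```python
import os
import tempfile
import time


def write_throughput(directory, block_sizes, total_bytes=8 * 1024 * 1024):
    """Return {block_size: MiB/s} for sequential writes into `directory`."""
    results = {}
    for bs in block_sizes:
        block = b"\0" * bs
        count = max(1, total_bytes // bs)      # number of blocks to write
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            start = time.perf_counter()
            with os.fdopen(fd, "wb", buffering=0) as f:
                for _ in range(count):
                    f.write(block)
                # Force the data to disk so the page cache does not hide the cost.
                os.fsync(f.fileno())
            elapsed = time.perf_counter() - start
        finally:
            os.remove(path)                    # always clean up the test file
        results[bs] = (bs * count) / (1024 * 1024) / max(elapsed, 1e-9)
    return results


if __name__ == "__main__":
    sizes = [4096, 65536, 1048576]
    for bs, rate in write_throughput(tempfile.gettempdir(), sizes).items():
        print(f"bs={bs:>8} bytes  {rate:10.1f} MiB/s")
```

Pointing `directory` at a network mount instead of the temporary directory gives a comparable measurement for a network-mounted filesystem. The numbers depend heavily on caching and hardware, so treat them as relative rather than absolute.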

Simple Host Monitoring with SSH

September 23, 2011 – 12:10 am | 6 Comments

Sometimes you just need something very simple to monitor a server or an application on a temporary basis. A basic ping monitor is fine, but it will only tell you whether a server is responding on the network. It will not tell you if there is some other problem on the system. The script below relies on a passwordless SSH setup to periodically log into the monitored nodes and check on their health by executing a local or remote script.
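
The approach described above can be sketched roughly as follows. This is not the article's script: the function names `check_host` and `check_hosts`, the default `uptime` health command, and the timeouts are assumptions for illustration. It presumes key-based (passwordless) SSH is already configured for the monitored hosts.

```python
import subprocess


def check_host(host, command="uptime", timeout=10, runner=subprocess.run):
    """Run `command` on `host` over SSH; return (ok, output)."""
    ssh_cmd = [
        "ssh",
        "-o", "BatchMode=yes",       # fail instead of prompting for a password
        "-o", "ConnectTimeout=5",    # give up quickly on unreachable hosts
        host,
        command,
    ]
    try:
        proc = runner(ssh_cmd, capture_output=True, text=True, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError) as exc:
        # The command hung, or the ssh client could not be started at all.
        return False, str(exc)
    output = proc.stdout.strip() or proc.stderr.strip()
    return proc.returncode == 0, output


def check_hosts(hosts, **kwargs):
    """Return {host: (ok, output)} for every host in `hosts`."""
    return {host: check_host(host, **kwargs) for host in hosts}
```

Calling `check_hosts(["node001", "node002"])` (hypothetical host names) from a loop or a cron job gives the periodic check. Because the remote command's exit status decides `ok`, any health script that exits non-zero on trouble can be substituted for `uptime`.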

Changing Process CPU Affinity on Linux

September 13, 2011 – 4:58 pm | 5 Comments

A common real-life scenario: on a multi-CPU system, Oracle processes have taken over and the system has ground to a crawl. The average system load is in double digits and even logging in takes several minutes. The possible root causes range from inefficient SQL queries (the most common cause) to insufficient system resources. But at this point you just need to make the system a bit more responsive, so you can start troubleshooting.
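
The excerpt does not show the article's own commands, so the following is only a minimal sketch of changing CPU affinity on Linux from Python. The helper name `restrict_to_cpus` is an assumption for illustration. It relies on `os.sched_setaffinity`, which is Linux-only, and changing another user's process requires root privileges.

```python
import os


def restrict_to_cpus(pid, cpus):
    """Pin process `pid` to the CPU ids in `cpus`; return the new affinity set.

    A pid of 0 means the calling process.
    """
    os.sched_setaffinity(pid, set(cpus))
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    allowed = sorted(os.sched_getaffinity(0))   # CPUs this process may use now
    print("before:", allowed)
    # Confine this process to the first CPU it is currently allowed to use.
    print("after: ", sorted(restrict_to_cpus(0, allowed[:1])))
```

Confining the runaway processes to a subset of CPUs leaves the remaining CPUs free for login shells and diagnostic tools, which is the breathing room the scenario above calls for.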

Load-Testing HPC Linux Clusters with “stress”

April 28, 2011 – 8:43 pm | 5 Comments

“stress” is a simple-to-use load generator for POSIX systems that I have found very useful for stress-testing HPC clusters. The current version of the application is 1.0.4, and it was easy to compile and install. Stress can create configurable system load for CPU, memory, I/O, and disks. In the example below we ran “stress” on a SLES 11 HPC cluster with HP CMU 4.2 installed.

Installing Ganglia on RHEL

March 21, 2011 – 2:59 pm | 12 Comments

This is a quick follow-up to my earlier post about installing Ganglia from source on SLES. Here we will install Ganglia from precompiled RPMs on a RHEL server. The basic cluster setup for this example remains the same: two clusters, CLUSTER1 and CLUSTER2, with head nodes head_node1 and head_node2

Server and Network Monitoring with iPhone

February 25, 2010 – 6:53 pm | 8 Comments

What is a Unix sysadmin doing with an iPhone, you ask? It was a birthday present, if that’s all right with you. I know, I should have gotten something odd with a beta version of …

Copying Data: Are We There Yet?

December 27, 2009 – 7:12 pm | 3 Comments

I am sure this will sound familiar: you are copying a large amount of data – either locally or over the network – and you are wondering how long it will take and whether there is a way to make things go faster. You may be surprised, but it does matter what type of files you are copying: 1 GB worth of many small files will take considerably longer to copy than two 500 MB files. The hardware you are using is an important consideration, but it’s not the only factor limiting data transfer speed.

Testing Filesystem Performance with Bonnie++

July 10, 2009 – 4:33 pm | 18 Comments

Bonnie++ is a benchmark utility designed to test the performance of hard drives and filesystems by simulating various types of disk I/O. Bonnie++ may be used to test local disks as well as network-mounted filesystems. It …

Linux and High I/O Wait

December 21, 2008 – 12:07 am | 3 Comments

When you look at the CPU activity of your computer, one of the reported values is iowait. This value shows how much time your CPU spends idle while it is waiting for I/O operations to complete. …
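
To make the iowait figure concrete, here is a minimal sketch that derives it from two samples of the aggregate `cpu` line in `/proc/stat` on Linux. The function names `parse_cpu_times` and `iowait_percent` and the one-second sampling interval are assumptions for illustration, not something taken from the article.

```python
import time


def parse_cpu_times(stat_text):
    """Return the aggregate CPU counters from /proc/stat text as a list of ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(x) for x in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # Only the first eight are summed; guest time is already counted in user.
    total = sum(deltas[:8])
    return 100.0 * deltas[4] / total if total else 0.0


if __name__ == "__main__":
    with open("/proc/stat") as f:
        first = parse_cpu_times(f.read())
    time.sleep(1)
    with open("/proc/stat") as f:
        second = parse_cpu_times(f.read())
    print(f"iowait: {iowait_percent(first, second):.1f}%")
```

The counters in `/proc/stat` are cumulative since boot, which is why the calculation uses the difference between two samples rather than a single reading.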

Linux performance tuning

August 22, 2007 – 2:40 pm | 4 Comments

Linux Performance Tuning
April | May 2007 | by Jaqui Lynch
Note: This is the second article in a two-part series. The first installment was published in the February/March issue.
In last issue’s article I introduced basic Linux* …

Simple network monitoring with ping

April 11, 2006 – 10:12 am | 3 Comments

In the Spring of 2005 Comcast experienced a major DNS outage. Since then many Comcast users have switched to DNS servers that belong to Verizon and other ISPs. Comcast started taking a lot of flak …

Monitoring process CPU and memory usage

December 15, 2005 – 11:24 am |

This article contains examples of using prstat to monitor CPU and memory utilization by individual processes and groups of processes.
Example 1: Show CPU and memory usage by all processes called “*ora_smon_imanax*”
The following prstat command will …