Installing and Configuring Ganglia
Ganglia is a distributed performance monitoring application used primarily for tracking status of high-performance compute (HPC) clusters. Ganglia is a royal pain in the ass to install and configure even for a seasoned Unix sysadmin. Ganglia is a nice tool – very functional and free. It is a bit outdated in its design. It is poorly documented and there isn’t much useful information available online about installing and configuring Ganglia. First, some basics about Ganglia-related pieces of the puzzle:
rrdtool
The glorious Round Robin Database package. This needs to be installed on the Ganglia master server (the one collecting and presenting all the data; see diagram below).
core
When installing from RPM, Ganglia core is a separate package that needs to be installed first. When building from source – don’t worry about it.
gmond
Ganglia monitoring daemon – runs at boot time on all monitored nodes.
gmetad
Ganglia Meta Daemon – runs only on the Ganglia master server. Collects data from gmond instances on the monitored cluster nodes.
web-frontend
A set of PHP files for your Web server to display Ganglia data and graphs.
SAMPLE LAYOUT
ganglia_server – the master Ganglia server which holds all the collected performance data and runs the Web server.
head_node1 – the head node of CLUSTER1.
in1, in2, … , in41 – compute nodes of CLUSTER1.
head_node2 – the head node of CLUSTER2.
node1, node2, … , node60 – compute nodes of CLUSTER2.
In most cases, individual compute nodes are locked into the cluster’s internal network. They cannot communicate directly with the outside network. Only the head node can. As a result, the Ganglia master server, which is located outside of the cluster, cannot access these nodes directly. So, data collected by each compute node is passed on to the head node and from there it goes to the Ganglia server. Nothing to it.
INSTALLING RRDTOOL
RRDTool needs to be installed ONLY on the Ganglia master server (see diagram above). This app is usually installed by default in most Linux distros. Unfortunately, most of these default installations are bastardized and cannot be used by Ganglia. I would strongly recommend downloading rrdtool source and compiling it. You don’t need to uninstall the existing version of rrdtool before compiling a new version.
Let’s say the new version of rrdtool was compiled and installed in /usr/local/rrdtool-1.2.27/. Now you need to relink three RRDTool-related binary files in /usr/bin to point to the new version:
for i in rrdcgi rrdtool rrdupdate
do
  mv /usr/bin/${i} /usr/bin/${i}.orig
  ln -s /usr/local/rrdtool-1.2.27/bin/${i} /usr/bin/${i}
done
That should be it for the RRDTool.
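A quick sanity check after relinking can save a confusing gmetad build later. The function below is only a sketch: the name check_rrd_links and its two arguments (the RRDTool install prefix and the directory holding the links) are made up for illustration. It reports whether each of the three binaries is a symlink into the new installation.

```shell
# Report whether each RRDTool binary in BINDIR is a symlink into PREFIX/bin.
# check_rrd_links is a hypothetical helper, not part of RRDTool or Ganglia.
check_rrd_links() {
    prefix=$1
    bindir=$2
    status=0
    for i in rrdcgi rrdtool rrdupdate; do
        # readlink prints the link target, or nothing if it is not a symlink
        target=$(readlink "$bindir/$i" 2>/dev/null)
        if [ "$target" = "$prefix/bin/$i" ]; then
            echo "ok: $i -> $target"
        else
            echo "MISSING: $bindir/$i does not point to $prefix/bin/$i"
            status=1
        fi
    done
    return $status
}

# Example: check_rrd_links /usr/local/rrdtool-1.2.27 /usr/bin
```

The function returns non-zero if any of the three links is wrong, so it can be used in an "if" before moving on to gmetad.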
INSTALLING GMOND
Gmond needs to be installed on all monitored nodes and on the cluster’s head node, even if you don’t want to monitor the head node. No need to install gmond on the Ganglia master server. Download the latest source code for Ganglia and put it in /tmp. Unzip, untar, cd to the resulting directory and…
sh ./configure
make
make install
which gmond
Repeat on every freaking compute node and head node on both CLUSTER1 and 2. Get rid of the source code directory.
Now, to get gmond started at boot time:
vi /etc/init.d/gmond and paste the following nonsense:
#! /bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
DAEMON=/usr/sbin/gmond
NAME=gmond
DESC="Ganglia Monitor Daemon"
# Do nothing if the daemon binary is not installed
test -x $DAEMON || exit 0
set -e
case "$1" in
  start)
    echo -n "Starting $DESC: "
    start-stop-daemon --start --quiet --pidfile /var/run/$NAME.pid \
      --exec $DAEMON
    echo "$NAME."
    ;;
  stop)
    echo -n "Stopping $DESC: "
    # Discard both stdout and stderr (stdout must be redirected first)
    start-stop-daemon --stop --quiet --oknodo \
      --exec $DAEMON > /dev/null 2>&1
    echo "$NAME."
    ;;
  reload)
    ;;
  restart|force-reload)
    $0 stop
    $0 start
    ;;
  *)
    N=/etc/init.d/$NAME
    # echo "Usage: $N {start|stop|restart|reload|force-reload}" >&2
    echo "Usage: $N {start|stop|restart|force-reload}" >&2
    exit 1
    ;;
esac
exit 0
Save and exit from vi and do the following:
chmod 755 /etc/init.d/gmond
chkconfig --add gmond
CONFIGURING GMOND
This is where the fun begins. First, generate a default configuration file on the head node AND one of the compute nodes in CLUSTER1:
gmond --default_config > /etc/gmond.conf
Now open this file on one of the compute nodes and make changes to the following sections (see the diagram above):
cluster {
  name = "CLUSTER1"
}
udp_send_channel {
  host = head_node1
  port = 8649
}
tcp_accept_channel {
  port = 8649
}
Remove any sections about joining multicast or udp_recv_channel. Put this gmond.conf into /etc/ on every compute node of CLUSTER1.
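Copying the file to 41 nodes by hand gets old quickly. Here is a rough sketch of a loop that does it, assuming you run it from a machine that can reach the compute nodes (most likely the head node), that root can scp to them, and that the node names follow the prefix-plus-number pattern from the sample layout. The function name push_gmond_conf and the RUN variable are invented for this example; by default it only prints the commands so you can look them over first.

```shell
# Print (or, with RUN=1, execute) one scp per compute node.
# PREFIX and COUNT describe the naming scheme, e.g. "in" 41 -> in1..in41.
# push_gmond_conf and RUN are hypothetical names used for this sketch.
push_gmond_conf() {
    prefix=$1
    count=$2
    n=1
    while [ "$n" -le "$count" ]; do
        if [ "${RUN:-0}" = "1" ]; then
            scp /etc/gmond.conf "root@${prefix}${n}:/etc/gmond.conf"
        else
            # Dry run: show what would be executed
            echo "scp /etc/gmond.conf root@${prefix}${n}:/etc/gmond.conf"
        fi
        n=$((n + 1))
    done
}

# Dry run for CLUSTER1:   push_gmond_conf in 41
# Real run for CLUSTER1:  RUN=1 push_gmond_conf in 41
```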
Before you get up for a cup of coffee, edit the /etc/gmond.conf on the head node of CLUSTER1 (head_node1). Make changes to the following sections:
cluster {
  name = "CLUSTER1"
}
udp_recv_channel {
  port = 8649
}
tcp_accept_channel {
  port = 8649
}
Remove any sections about joining multicast or udp_send_channel. Save the file and now you may get that coffee, ’cause there’s much more.
Configure CLUSTER2 in the same way, but replace “CLUSTER1” with “CLUSTER2” and “head_node1” with “head_node2”, correspondingly.
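Since the compute-node sections differ between clusters only in the cluster name and the head node, they can be generated rather than retyped. The sketch below prints the three sections shown earlier for any cluster; gmond_node_sections is a made-up helper name, and the output still has to be merged by hand into the default gmond.conf in place of the matching sections.

```shell
# Print the three compute-node sections for a given cluster name and head node.
# gmond_node_sections is a hypothetical helper, not a Ganglia command.
gmond_node_sections() {
    cluster=$1
    head=$2
    cat <<EOF
cluster {
  name = "$cluster"
}
udp_send_channel {
  host = $head
  port = 8649
}
tcp_accept_channel {
  port = 8649
}
EOF
}

# Example: gmond_node_sections CLUSTER2 head_node2
```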
INSTALLING GMETAD
Once again: install gmetad ONLY on the Ganglia master server (see diagram above). Download the latest source code for Ganglia and put it in /tmp. Unzip, untar, cd to the resulting directory and…
sh ./configure \
  CFLAGS="-I/usr/local/rrdtool-1.2.27/include" \
  CPPFLAGS="-I/usr/local/rrdtool-1.2.27/include" \
  LDFLAGS="-L/usr/local/rrdtool-1.2.27/lib" --with-gmetad
Naturally, you will replace “rrdtool-1.2.27” with the correct version of rrdtool you just installed. And then the usual:
make
make install
which gmetad
Don’t delete the source code directory just yet; proceed to configuring gmetad.
CONFIGURING GMETAD
Edit the default configuration file: vi /etc/gmetad.conf and add the following lines:
data_source "CLUSTER1" head_node1
data_source "CLUSTER2" head_node2
RRAs "RRA:AVERAGE:0.5:1:105408"
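The RRAs line follows the RRDTool archive format RRA:CF:xff:steps:rows, so this one keeps 105408 averaged rows with one primary data point per row. How much history that amounts to depends on the step length, which is not set here. The little calculation below is a sketch with the step as an explicit argument: 15 seconds is my assumption for what gmetad uses unless told otherwise, and 300 seconds is shown for comparison. The name rra_days is made up.

```shell
# Days of history kept by an RRA with ROWS rows, one row per STEP seconds.
# rra_days is a hypothetical helper; integer division rounds down.
rra_days() {
    rows=$1
    step=$2
    echo $(( rows * step / 86400 ))
}

rra_days 105408 15     # assumed 15-second step: prints 18
rra_days 105408 300    # 5-minute step: prints 366
```

If the history in your graphs turns out shorter than you expected, the row count is the number to revisit.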
To get gmetad started at boot time, vi /etc/init.d/gmetad and copy/paste this stuff into it:
#! /bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
DAEMON=/usr/sbin/gmetad
NAME=gmetad
DESC="Ganglia Meta Daemon"
# Do nothing if the daemon binary is not installed
test -x $DAEMON || exit 0
set -e
case "$1" in
  start)
    echo -n "Starting $DESC: "
    start-stop-daemon --start --quiet --pidfile /var/run/$NAME.pid \
      --exec $DAEMON
    echo "$NAME."
    ;;
  stop)
    echo -n "Stopping $DESC: "
    # Discard both stdout and stderr (stdout must be redirected first)
    start-stop-daemon --stop --quiet --oknodo \
      --exec $DAEMON > /dev/null 2>&1
    echo "$NAME."
    ;;
  reload)
    ;;
  restart|force-reload)
    $0 stop
    $0 start
    ;;
  *)
    N=/etc/init.d/$NAME
    # echo "Usage: $N {start|stop|restart|reload|force-reload}" >&2
    echo "Usage: $N {start|stop|restart|force-reload}" >&2
    exit 1
    ;;
esac
exit 0
Save and exit from vi and do the following:
chmod 755 /etc/init.d/gmetad
chkconfig --add gmetad
INSTALLING WEB-FRONTEND
In the source code directory there will be a “web” subfolder. Move it over to the htdocs folder of your apache2 server (oh, you also need to run apache2 and PHP). So, as an example:
mv /tmp/ganglia_3.0.5/web /srv/www/htdocs/ganglia
chown -R nobody:nobody /srv/www/htdocs/ganglia
That should do it.
STARTING EVERYTHING
On both head_node1 and head_node2 do “/etc/init.d/gmond start” and do the same on all compute nodes on both clusters.
On the Ganglia master server do “/etc/init.d/gmetad start”
Open your browser and go to http://ganglia_server.domain/ganglia/
You may or may not see something like this:
If you see nothing, just relax, have a drink, take a three-day weekend and on Tuesday start from the top of this page. And this time pay attention to the stupid details.
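Before starting over, it may be worth checking where the data stops. As far as I know, gmond answers any TCP connection on its tcp_accept_channel port with an XML dump of everything it knows, one HOST element per node. The sketch below counts those elements; count_hosts is an invented name, and the nc usage in the comment assumes netcat is installed and that you run it from a machine that can reach the head node.

```shell
# Count <HOST ...> elements in a gmond XML dump read from stdin.
# count_hosts is a hypothetical helper, not part of Ganglia.
count_hosts() {
    # -o prints each match on its own line; wc -l then counts them
    grep -o '<HOST ' | wc -l | tr -d ' '
}

# Example (assumes nc is available and head_node1 resolves):
#   nc head_node1 8649 | count_hosts
```

If the count on head_node1 is well short of the 41 compute nodes, the problem is between the compute nodes and the head node. If it looks right, look at gmetad and the Web frontend instead.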
hi, thanks,The article was very well written, very helpful to me
Really good article. I was wondering if you could add the config file for gmetad. It looks like there was a stub for it, but it’s not there. Thanks.
[...] is a quick follow-up to my earlier post about installing Ganglia from source on SLES. Here we will install Ganglia from precompiled RPMs on an RHEL server. The basic cluster [...]
[...] do that, I followed this blog, with slight changes. For worker nodes, everything is as KrazyWorks says, that is, [...]