A Simple Process Monitoring Script
Let’s say you are running a data restore. Things are moving along, but the network is congested and the backup server is overloaded. You really don’t feel like staring at the restore status for the next several hours and just want to be notified when the process completes. The simplest way to watch for processes starting or ending on Unix systems is to run “ps” inside a “while” loop.
In the example below we are monitoring a NetBackup restore running on a Linux box. A typical NetBackup restore process on a Unix system looks something like this:
root 5526 1 1 03:14 ? 00:02:57 tar -x -v -Y -p -P -I 1329909118 -U 0 -k -Q -J clnt_lc_messages=C -J clnt_lc_time=C -J clnt_lc_ctype=C -J clnt_lc_collate=C -J clnt_lc_numeric=C -J restoreid=3451740.001 -J job_total=3 -J client=hpc12prod.de.krazyworks.com -J requesting_client=nbs2.de.krazyworks.com -J browse_client=hpc12prod.de.krazyworks.com -J backup_time=1329452235 -L /usr/openv/netbackup/logs/user_ops/netbac/logs/jbp-21746329909118065321000000023-5eaqEQ.log -J spsrestoreoptions=0 -f - -J verbose=0 -J disallow_server_file_writes=0
So when we run “ps”, we need to “grep” for a unique string, such as “restoreid”. The script writes the pattern as [r]estoreid so that the “grep” command does not match its own entry in the process list. As long as there are processes on the system that match this string, the monitoring script sleeps for ten minutes and then checks again. Once the last “restoreid” process disappears, the script sends an email to the sysadmin.
#!/bin/bash
# Notify admin when process is done

# The pattern is quoted so the shell cannot expand the brackets as a filename glob
while [ "$(/bin/ps -ef | grep -c '[r]estoreid')" -gt 0 ]
do
    sleep 600
done

echo "Process finished on $(hostname) as of $(date +'%Y-%m-%d %H:%M')" | mailx -s "Message from $(hostname)" firstname.lastname@example.org
exit 0
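The polling loop above could also be wrapped in a function so that the search pattern and the sleep interval are passed in as arguments. What follows is only a sketch under a few assumptions: the names count_matches and wait_for_exit are made up for illustration, and the code calls ps through PATH rather than /bin/ps.

```shell
#!/bin/bash
# Hypothetical helper: count the lines of a process listing that match a pattern.
# It reads the listing from stdin, so it can be tried out on canned text.
count_matches() {
    grep -c -- "$1"
}

# Hypothetical helper: block until no process matches the pattern.
#   $1 = grep pattern, e.g. '[r]estoreid'
#   $2 = seconds to sleep between checks (defaults to 600, as in the script above)
wait_for_exit() {
    local pattern=$1 interval=${2:-600}
    while [ "$(ps -ef | count_matches "$pattern")" -gt 0 ]; do
        sleep "$interval"
    done
}
```

For example, calling wait_for_exit '[r]estoreid' 600 should block until the last restore process is gone, after which the mailx line from the script can run.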
Save the script as, for example, /root/process_monitor.bash and make it executable. Then you can start this process in the background using nohup:
nohup /root/process_monitor.bash > /dev/null 2>&1 &
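The two steps, making the script executable and launching it with nohup, can be combined into a small helper. This is a sketch, not part of the original script: the function name start_monitor is hypothetical, and it assumes the script path is passed as the first argument.

```shell
#!/bin/bash
# Hypothetical helper: make a script executable, start it detached from the
# terminal with nohup, and print the PID of the background job.
start_monitor() {
    local script=$1
    chmod +x "$script" || return 1
    # Discard all output so nohup does not create a nohup.out file
    nohup "$script" > /dev/null 2>&1 &
    echo "$!"
}
```

Calling start_monitor /root/process_monitor.bash would print a PID that can later be checked with ps to confirm the monitor is still running.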
Simple scripts like this can save you a lot of time and aggravation in the long run. System administration is all about automation, and when it comes to automating tasks, Unix has no equal.