Gnuplot with Bash

Submitted by on April 2, 2019 – 4:08 pm

OK, so both of these things have been around forever and will be around long after we’re gone. It’s worth your time to learn how to use the two together.

Frequency Histogram

I have a DHCP server that logs to /var/log/messages. Relevant lines in the log file look something like this:
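For illustration only, here is the general shape of such lines, assuming ISC dhcpd writing through syslog. The host name, MAC addresses, and IP addresses are invented, and `sample_dhcp_log` is just a hypothetical helper for trying things out without touching a real log:

```shell
# Made-up sample lines in the standard syslog layout (all values are invented).
sample_dhcp_log() {
  cat <<'EOF'
Apr  1 09:15:02 dhcp01 dhcpd: DHCPDISCOVER from 00:11:22:33:44:55 via eth0
Apr  1 09:15:03 dhcp01 dhcpd: DHCPOFFER on 192.0.2.10 to 00:11:22:33:44:55 via eth0
Apr  1 09:15:03 dhcp01 dhcpd: DHCPREQUEST for 192.0.2.10 from 00:11:22:33:44:55 via eth0
Apr  1 09:15:03 dhcp01 dhcpd: DHCPACK on 192.0.2.10 to 00:11:22:33:44:55 via eth0
Apr  2 16:08:41 dhcp01 dhcpd: DHCPREQUEST for 192.0.2.11 from 00:11:22:33:44:66 via eth0
EOF
}

# Count the DHCPREQUEST events in the sample.
sample_dhcp_log | grep -c DHCPREQUEST
```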

Let’s say I want to get a frequency histogram (a bar chart showing the number of occurrences of something during a given period of time) of DHCPREQUEST events every day for the current month. Here’s how we do it:
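A minimal sketch of such a pipeline, assuming GNU grep, sort, and uniq, gnuplot on the PATH, and uncompressed rotated logs. The function names are made up for this example, and the ASCII `dumb` terminal is just a convenient choice for a server without a display:

```shell
#!/usr/bin/env bash
# Sketch of the whole pipeline; the function names are hypothetical.

# Print "count Mon DD" for every day of the given month, oldest day first.
daily_request_counts() {
  local month=$1; shift
  # grep -h drops file names; awk keeps only the month and day fields.
  grep -h 'DHCPREQUEST' "$@" |
    awk -v m="$month" '$1 == m { print $1, $2 }' |
    LC_ALL=C sort -k1,1M -k2,2n |
    uniq -c -w6
}

# Draw the counts as an ASCII bar chart (needs gnuplot).
plot_daily_requests() {
  daily_request_counts "$@" |
    gnuplot -e "set terminal dumb; set xdata time; set timefmt '%b %d'; \
set format x '%b %d'; plot '<cat' using 2:1 with boxes notitle"
}

# Typical call for the current month:
#   plot_daily_requests "$(LC_ALL=C date +%b)" /var/log/messages*
```

Keeping the counting and the plotting in separate functions makes it easy to eyeball the numbers before handing them to gnuplot.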

And the result:

Let’s disassemble the syntax a bit. This first part extracts relevant log lines from /var/log/messages*, prints only the date portion of each line, and sorts the dates chronologically:
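As a sketch, that first stage can be written as a filter that reads log lines on standard input. The name `extract_request_dates` and the input lines are invented for the example, and GNU sort is assumed for the month-aware `M` key:

```shell
# Stage 1 as a stdin filter: keep "Mon DD" from matching lines, oldest first.
extract_request_dates() {
  grep 'DHCPREQUEST' |
    awk '{ print $1, $2 }' |
    LC_ALL=C sort -k1,1M -k2,2n
}

# Example with made-up input, deliberately out of order:
printf '%s\n' \
  'Apr 10 08:00:01 dhcp01 dhcpd: DHCPREQUEST for 192.0.2.10 from 00:11:22:33:44:55 via eth0' \
  'Mar 31 23:59:59 dhcp01 dhcpd: DHCPREQUEST for 192.0.2.11 from 00:11:22:33:44:66 via eth0' \
  'Apr  2 12:30:00 dhcp01 dhcpd: DHCPACK on 192.0.2.11 to 00:11:22:33:44:66 via eth0' \
  'Apr  2 12:29:59 dhcp01 dhcpd: DHCPREQUEST for 192.0.2.11 from 00:11:22:33:44:66 via eth0' |
  extract_request_dates
# Prints:
#   Mar 31
#   Apr 2
#   Apr 10
```

The first key sorts by month name and the second by day number, so Apr 2 correctly lands before Apr 10.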

The next piece is a rarely used but very handy feature of the GNU uniq command that tells it to compare only the first six characters (-w6) of each line and prefix (-c) each line with the number of matched occurrences:
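To see what -w6 buys you, here is a small sketch that feeds uniq full timestamps rather than bare dates. The input lines are invented and GNU uniq is assumed (-w is not available in every uniq):

```shell
# Count lines per day by comparing only the first six characters ("Mon DD").
# The input must already be sorted so that equal days sit next to each other.
per_day=$(printf '%s\n' \
  'Apr  1 09:15:03 first' \
  'Apr  1 17:40:12 second' \
  'Apr  2 08:01:44 third' |
  uniq -c -w6)

printf '%s\n' "$per_day"
# Prints a count of 2 in front of the "first" line and a count of 1 in front
# of the "third" line; uniq keeps the first line of each group.
```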

Finally, we tell gnuplot that our time format is short month name and day number (%b %d) with a space in between, and that we want to generate a timeline chart using field 2 for the date (x-axis) and field 1 for data (y-axis).

Just keep in mind that, since we specified the date/time format as %b %d, the space is part of this specification, so gnuplot reads month and day together as one timestamp that starts at the field you name. This is why it is field 2 and not 2-3. Really, understand this. This is easily the most common mistake people make when doing gnuplot time charts. (Month and day still count as two columns when you number any fields that come after the timestamp.)
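The gnuplot side can be sketched as a small function that prints the command list. `daily_plot_commands` is a hypothetical name, and reading the data through `'<cat'` and drawing on the `dumb` terminal are choices made for this example:

```shell
# Build the gnuplot commands for "count Mon DD" lines arriving on stdin.
# The timestamp starts in column 2, hence "using 2:1". If another value
# followed the date, it would be column 4, because month and day are counted
# separately when numbering the columns after the timestamp.
daily_plot_commands() {
  printf '%s; ' \
    "set terminal dumb" \
    "set xdata time" \
    "set timefmt '%b %d'" \
    "set format x '%b %d'" \
    "plot '<cat' using 2:1 with boxes notitle"
}

# Usage (needs gnuplot):
#   some_pipeline | uniq -c -w6 | gnuplot -e "$(daily_plot_commands)"
```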

Depending on the type of the timestamp used by the log file, you may run into some issues. It just so happens that today is January 30th and I am looking at /var/adm/messages. The standard syslog time format is %b %d %H:%M:%S. There is no year. The sort-by-month option of sort will have some issues with this. Not knowing what year it is, it will logically assume that December comes after January. So, you may need to get creative in how you sort the timestamps.
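One way to get creative, sketched under the assumption that no log entry is older than twelve months: guess the year from the month, sort on the guess, then strip it again. The function name and its two optional arguments (current year and month, mainly for testing) are invented here:

```shell
# Prefix each "Mon DD ..." line with an inferred year and month number,
# sort chronologically, then remove the prefix again.
# Assumption: a month later than the current one belongs to last year.
sort_syslog_across_new_year() {
  local now_year=${1:-$(date +%Y)} now_month=${2:-$(date +%m)}
  awk -v y="$now_year" -v m="$now_month" '
    BEGIN {
      split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
      for (i = 1; i <= 12; i++) num[names[i]] = i
    }
    {
      year = (num[$1] > m + 0) ? y - 1 : y
      printf "%d %02d %s\n", year, num[$1], $0
    }' |
    LC_ALL=C sort -k1,1n -k2,2n -k4,4n -k5,5 |
    cut -d' ' -f3-
}

# Usage: grep -h DHCPREQUEST /var/adm/messages* | sort_syslog_across_new_year
```

Run on January 30th, this puts the December lines ahead of the January ones, which plain month sorting gets backwards.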

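Just a quick departure from the main subject. Regretfully, the traditional file format that many distributions configure rsyslog to write follows RFC 3164 (The BSD Syslog Protocol), which has no provision for a year component in the log timestamp. If this is something you really need, you should probably switch to a format with RFC 3339 timestamps (rsyslog ships one as the RSYSLOG_FileFormat template) or to syslog-ng.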

Now, let’s do the same chart but aggregate the data on an hourly rather than a daily basis:
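A sketch of the hourly variant, again assuming GNU tools. The key grows from six characters to nine ("Mon DD HH") and the gnuplot time format gains %H. `hourly_request_counts` is a made-up name:

```shell
# Print "count Mon DD HH" for every hour that saw a DHCPREQUEST, oldest first.
hourly_request_counts() {
  grep 'DHCPREQUEST' |
    # Pad the day to two characters so every key is exactly nine wide.
    awk '{ printf "%s %2d %s\n", $1, $2, substr($3, 1, 2) }' |
    LC_ALL=C sort -k1,1M -k2,2n -k3,3n |
    uniq -c -w9
}

# Usage (needs gnuplot; the date still starts at field 2):
#   cat /var/log/messages | hourly_request_counts |
#     gnuplot -e "set terminal dumb; set xdata time; set timefmt '%b %d %H'; \
#   set format x '%d %H:00'; plot '<cat' using 2:1 with boxes notitle"
```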

Remembering this syntax every time you want to review a log file is not practical. A better option would be to write a little helper script. As input, the script will take whatever relevant log file lines you throw at it from STDIN, as well as a couple of command-line arguments to help it correctly sort and summarize the data. You can download this script here.
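To give a feel for what such a helper does, here is a stripped-down stand-in. It is not the gtl script itself, just a sketch that relies on the fixed-width syslog timestamp, so the single argument picks the bucket size: 6 characters for a day, 9 for an hour, 12 for a minute. `count_by_prefix` is a hypothetical name:

```shell
# Hypothetical helper, not the author's gtl: bucket syslog lines read from
# stdin by the first WIDTH characters of the timestamp and count each bucket.
count_by_prefix() {
  local width=${1:-6}
  cut -c1-"$width" |
    LC_ALL=C sort -k1,1M -k2,2n -k3,3 |
    uniq -c
}

# Usage:
#   grep -h DHCPREQUEST /var/log/messages* | count_by_prefix 9
```

Cutting the lines down to the key first means uniq can compare whole lines, so no -w option is needed.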

The default options for gtl are -f " " -p 6 -k "1,2,3" -s "-k1M -k2n -k3V", which works fine for your standard syslog format. So now I can use it like so:

But for other common log formats you would need to specify some parameters. For example, the xferlog used by vsftpd has a slightly different timestamp format: Mon Nov 6 17:52:03 2017, adding the day of week to the beginning of the timestamp. In this case the options for the gtl command would change a bit:
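Without guessing at the exact gtl options, the same idea can be sketched with plain tools: skip the day of week in field 1 and, since xferlog does carry a year in field 5, use it as the leading sort key. `xferlog_daily_counts` is an invented name and GNU sort is assumed:

```shell
# Print "count YYYY Mon DD" for xferlog lines read from stdin, oldest first.
# Timestamp layout: DayOfWeek Mon DD HH:MM:SS YYYY ...
xferlog_daily_counts() {
  awk '{ printf "%s %s %2d\n", $5, $2, $3 }' |
    LC_ALL=C sort -k1,1n -k2,2M -k3,3n |
    uniq -c
}

# Usage (needs gnuplot; the timestamp now starts with the year in field 2):
#   cat /var/log/xferlog | xferlog_daily_counts |
#     gnuplot -e "set terminal dumb; set xdata time; set timefmt '%Y %b %d'; \
#   set format x '%b %d'; plot '<cat' using 2:1 with boxes notitle"
```

With the year in the key, the December-before-January problem of plain syslog timestamps goes away.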

 

 
