Searching Log Files
This is a quick example of finding the first and last occurrence of an event in the log file. This can be useful for determining the service outage window and for general troubleshooting. In the example below I am looking at the PostFix maillog file, trying to figure out the duration of the problem I’ve been having with one of the mail relay servers.
[root@postfix01 ]# grep relay01 /var/log/maillog | grep "timed out" | grep "Apr 16 [0-1][0-9]:" | sed -n '1p;$p' Apr 16 00:32:47 postfix01 postfix/smtp: D6E34180E9: conversation with relay01.domain.com[192.168.3.122] timed out while sending end of data -- message may be sent more than once Apr 16 10:36:57 postfix01 postfix/error: D1E786608C0: to=<firstname.lastname@example.org>, relay=none, delay=12649, delays=12649/0/0/0.2, dsn=4.4.2, status=deferred (delivery temporarily suspended: conversation with relay01.krazyworks.com[192.168.3.122] timed out while sending end of data -- message may be sent more than once)
So, from the output I can see the “timed out” errors for mail relay server “relay01” started on April 16 at 00:32 and ended at 10:36 on the same day. With this information, I can go to my network and email admins and ask them to check the logs on their end to see if anything has been going on during this time window. This is much more productive than just saying “my customers received duplicate emails”.