Server monitoring, like data backups, is one of those things you never really know you need until it’s too late.
And then you really wish you had it.
Thankfully, I’ve been hosting apps on the internet long enough that I’ve learned my lessons,
and now consider monitoring an essential part of the process.
One tool that I often use is a small, free and open-source utility called monit,
which you can install on your servers and use to alert you of all kinds of things.
The only problem with monit is that it’s pretty confusing to set up,
and every time I need to install it on a new server it takes ages to remember how the heck it works.
So this time when setting up my latest app server I decided to take some notes
and document them here so that future me (and maybe some of you?) can use this as a guide.
What are we monitoring?
I use StatusCake to monitor my public sites and tell me if something is wrong.
For my SaaS Pegasus apps I combine this with django-health-check,
which helps notify you if a piece of your infrastructure (e.g. database, cache, Celery) is down.
These tools are great because you don’t have to set up anything yourself,
but they are limited by what can be tested via a public endpoint.
Monit helps to plug this gap by tracking things that happen on the machine.
I mostly use it for detecting reboots and when I’m running low on disk space.
Disk space is a very useful thing to monitor proactively, because there is often a huge difference between
“my disk is 80% full” and “my disk is 100% full” from a how-effed-are-you perspective.
Here’s how I set it up.
Setting up disk space monitoring
Before configuring anything, you have to install monit:
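The exact command depends on your distribution. As a minimal sketch, assuming a Debian or Ubuntu server with apt (which the /etc/monit/conf-available layout used below suggests), installation might look like this. The function name install_monit is just an illustrative helper, not something monit provides.

```shell
# Assumption: Debian/Ubuntu with apt. Other distributions use a
# different package manager, so adjust the two commands accordingly.
install_monit() {
  sudo apt-get update
  sudo apt-get install -y monit
}
```

Running install_monit once on a fresh server should be enough. Afterwards, monit -V prints the installed version if everything worked.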
Monit uses a setup similar to Apache or nginx, where a directory of config files gets imported.
It uses a “conf-available”, “conf-enabled” paradigm for turning configurations on and off.
To set up disk monitoring, create a file at /etc/monit/conf-available/disks
with the following contents:
check device var with path /dev/sda
  if SPACE usage > 60% then alert
I don’t have any idea what this file format is but it’s pretty readable!
You may need to replace /dev/sda
with the exact path of your disk, which you can find by listing the disks on your machine
and finding the main drive (usually the largest one).
You can also change the threshold you want to be notified of.
I like to start with 60% as a nice, conservative number.
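One way to list your disks and see how full they are, assuming a Linux machine with the standard df utility, is to run df -h and read the device path from the "Filesystem" column. The small helper below (disk_usage_percent is a hypothetical name) pulls out just the usage number for a given path, which is handy for picking a sensible threshold.

```shell
# Print the usage percentage (digits only) of the filesystem that
# holds the given path. -P forces the portable one-line-per-filesystem
# output format so that awk can rely on the column positions.
disk_usage_percent() {
  df -P "$1" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

# Example: show how full the root filesystem is right now.
echo "Root filesystem usage: $(disk_usage_percent /)%"
```

If the number is already close to 60, you'll probably want a higher threshold to avoid getting an alert right away.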
After creating this file, you can enable the configuration by creating a symlink from the conf-enabled
folder:
sudo ln -s /etc/monit/conf-available/disks /etc/monit/conf-enabled/disks
And restarting monit:
sudo service monit restart
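A typo in a config file can stop monit from starting, so it may be worth checking the syntax before restarting. The sketch below uses monit's -t flag, which runs a syntax check on the control file and the files it includes. The function name check_and_restart_monit is an illustrative helper, and it assumes the service wrapper is available on your machine.

```shell
# Run monit's syntax check first and restart only if it passes.
# "&&" stops the restart from happening when the check fails.
check_and_restart_monit() {
  sudo monit -t && sudo service monit restart
}
```

If the check fails, monit prints the offending file and line, and the running instance is left alone.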
Setting up email alerts with an external email service
The last step is to make sure monit can email you when something goes wrong.
It’s possible to run your own mail server on the machine for this, but I prefer using an external email service
like Mailgun (the one I use), which involves less fiddling.
Email setup involves making a few changes to the /etc/monit/monitrc
file.
The sections below will already be in that file but commented out, so you mostly just have to find
them, uncomment them, and then customize them to your needs.
Here’s the configuration to enable mailgun:
set mailserver smtp.mailgun.org port 587
  username you@example.com password "<a mailgun sending password>"
  using TLSV13 with timeout 30 seconds
One quirk is that you need to update the TLS version to TLSV13
from the default that’s in the file.
Next you can customize the mail-format
section which looks like this:
set mail-format {
from: monit@example.org
subject: monit alert -- $EVENT $SERVICE
message: $EVENT Service $SERVICE
Date: $DATE
Action: $ACTION
Host: $HOST
Description: $DESCRIPTION
Your faithful employee,
Monit
}
Finally, you configure the email address you want it to send alerts to like this:
set alert you@example.com # alerts will be sent to this address
Restart monit again:
sudo service monit restart
And if you set things up properly you should immediately get an email from monit telling you it restarted
(it will say “Monit instance changed,” which is the same thing you’ll see if your machine reboots).
If you want you can also test the disk space usage alert by reducing the threshold below your current usage
and restarting monit again, then setting it back.
Now you’ll know whenever your machine reboots or you’re running low on space!
Monit can monitor a bunch of other things too, including CPU, memory, and various programs.
I haven’t dug too closely into these other options, but you might want to if you have stronger requirements.
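For a flavor of what those other checks look like, here is a rough sketch modeled on the commented-out examples that ship in monitrc. The thresholds are picked purely for illustration, and write_system_check is a hypothetical helper that just writes the config to whatever path you give it.

```shell
# Write an example system-level check to the file given as $1.
# The quoted 'EOF' keeps the shell from expanding $HOST, which monit
# itself replaces with the machine's hostname.
write_system_check() {
  cat > "$1" <<'EOF'
check system $HOST
  if loadavg (5min) > 2 then alert
  if memory usage > 75% then alert
  if cpu usage > 95% for 10 cycles then alert
EOF
}
```

You could call write_system_check /tmp/system, copy the result to /etc/monit/conf-available/system, and then enable it with a symlink the same way as the disks file. Service names in monit have to be unique, so if your monitrc already has an active check system entry, edit that one instead of adding a second.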
Happy monitoring!