I have been playing with monitoring applications and servers for a while now and especially like Prometheus and Grafana.

In this post I’ll explain how I monitor my servers.

Setup

My setup consists of two moderately powerful servers with VirtualBox installed. To manage the VMs, you can either forward the VirtualBox GUI over SSH using ssh -YC (more info on that), or use the VBoxManage utility on the command line.

I will be using vi a lot in this tutorial. If you are not familiar with it, I’d recommend this funny game to quickly learn the basics. Or just use something easier like nano.

When I post configuration file contents or commands, some values may be wrapped in angle brackets. Those have to be replaced with values from your setup that match the description in the brackets.

Networking

To make a VM accessible from the internet, you need to buy an IP address from your hosting provider for every VM you want to add (some hosting providers also offer whole subnets). I have no idea about IPv6, but it seems like you get a subnet for free.

You simply need to set up a bridged network interface in most cases. Check back with your hosting provider if you are unsure about this.

Monitoring

Obviously, it’s a possible security risk to do the monitoring on public-facing network interfaces. But to monitor multiple servers from a single place, the servers have to talk to each other. Multiple solutions exist, and I definitely have not tried them all. To connect the servers with each other, I previously used a separate Host-Only-Network adapter in VirtualBox to which I connected all the monitoring stuff. This is the way I would still recommend, but I recently switched to Wireguard, because it is easier and faster to set up and allows cross-host communication. In the best case, the Wireguard variant does a round trip through the hoster’s network (in worse cases it goes further), while the virtual network’s packets stay in the server. (I’m not a networking expert though, NICs might be smarter or dumber than I think…)

Host-Only-Network

To set up the Host-Only-Network, go to the VirtualBox preferences and create a network before trying to change any VM settings (I run into this almost every time). Then set up the network interface as required by your operating system.

Wireguard

To set up Wireguard, you can either read the documentation on their website or, if you are on a Debian-based OS, follow these steps (all commands as root, run on every server you want to connect):

Install the requirements for adding ppa repos:

apt install apt-transport-https software-properties-common

Add the Wireguard ppa repo:

add-apt-repository ppa:wireguard/wireguard

Install Wireguard:

apt update && apt install wireguard-tools wireguard-dkms

Generate the private key:

wg genkey > /etc/wireguard/private.key && chmod 600 /etc/wireguard/private.key

Generate the public key:

wg pubkey > /etc/wireguard/public.key < /etc/wireguard/private.key
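A slightly more defensive variant of the two key commands above, as a sketch: it sets the umask first, so the private key is never readable by other users, not even for a moment. The function name generate_wg_keys is my own invention; wg genkey and wg pubkey are the same commands as above.

```shell
# Generate a Wireguard key pair in the given directory
# (default: /etc/wireguard). Hypothetical helper, not part of Wireguard.
generate_wg_keys() {
  dir="${1:-/etc/wireguard}"
  (
    # Run in a subshell so the stricter umask does not leak into the caller.
    umask 077   # files created from here on get mode 0600
    wg genkey | tee "$dir/private.key" | wg pubkey > "$dir/public.key"
  )
}

# Usage (as root):
#   generate_wg_keys /etc/wireguard
```

The public key file ends up with the same restrictive permissions, which does no harm since you copy its content into the other server's config anyway.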

Generate the interface configuration file:
```
vi /etc/wireguard/wg0.conf
```
The content should look like the following:

[Interface]
PrivateKey = <The private key>
ListenPort = <The port the server will run on>
SaveConfig = true
Address = 10.0.0.1/24

[Peer]
PublicKey = <The public key of the target server>
AllowedIPs = 10.0.0.2/32
Endpoint = <The host name and port of the target server>

Note that you have to switch the IPs when installing this on the other server!

If you want more than one Wireguard connection on a server, each connection needs its own IPs as well.
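To make the "switch the IPs" part less error-prone, here is a small sketch that prints the config for one side of a two-server tunnel. The function name render_wg_conf and the port 51820 in the usage comment are my own choices, not something Wireguard requires. The sketch uses a /32 for the peer's AllowedIPs, so only that single address is routed to the peer.

```shell
# Print a wg0.conf for one side of a two-server tunnel.
# Arguments: 1 = own tunnel IP, 2 = peer tunnel IP, 3 = listen port,
#            4 = own private key, 5 = peer public key,
#            6 = peer endpoint (host:port)
render_wg_conf() {
  cat <<EOF
[Interface]
PrivateKey = $4
ListenPort = $3
SaveConfig = true
Address = $1/24

[Peer]
PublicKey = $5
AllowedIPs = $2/32
Endpoint = $6
EOF
}

# Usage on the first server (swap the two IPs on the second one):
#   render_wg_conf 10.0.0.1 10.0.0.2 51820 \
#     "$(cat /etc/wireguard/private.key)" \
#     '<public key of the other server>' '<host of the other server>:51820' \
#     > /etc/wireguard/wg0.conf
```

Once the file is in place on both servers, wg-quick up wg0 should bring the interface up, and systemctl enable wg-quick@wg0 should start it on boot. A ping to the other server's tunnel IP is a quick way to check that the connection works.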

Scraping stuff

I already mentioned that I will use a combination of Prometheus (to get the data) and Grafana (to display it) for my monitoring setup. In this case, Prometheus will run on the same server as Grafana and will scrape the individual targets over Wireguard (or the Host-Only-Network if you went down that path). This makes the Grafana dashboard load faster while the scrapes will take a bit longer. You will have to see what fits best in your case.

Installing the Monitor

The official method of installing Grafana using APT has changed. I’ll update this post in the next few days but until then you’ll have to refer to the official documentation.

The monitor in this case means a server/VM dedicated to monitoring. Grafana will run here, and this (virtual) machine should have a proper DNS entry (for TLS). Let’s continue setting it up, assuming that the Grafana repository is set up:

Install Grafana, Prometheus and dependencies:

 apt update && apt install grafana prometheus nginx certbot

Stop all stuff that might be running (ignore “not found” errors):

 service grafana-server stop; service prometheus stop; service nginx stop

Delete the default NGINX config:
```
 rm /etc/nginx/sites-enabled/default
```

Generate a Diffie-Hellman exchange parameter:

 openssl dhparam -out /etc/nginx/dhparam.pem 4096

Generate a TLS certificate:
```
certbot certonly --standalone -d <Monitor Domain Name>
```

Create an NGINX config:

server {
   listen 80;
   server_name <Monitor Domain Name>;
   rewrite ^ https://$server_name$request_uri? permanent;
}
   
server {
   server_name <Monitor Domain Name>;
   listen 443 ssl;
   ssl_certificate /etc/letsencrypt/live/<Monitor Domain Name>/fullchain.pem; # path to your fullchain.pem
   ssl_certificate_key /etc/letsencrypt/live/<Monitor Domain Name>/privkey.pem; # path to your privkey.pem
      
   ssl_protocols TLSv1.3;# Requires nginx >= 1.13.0 else use TLSv1.2
   ssl_prefer_server_ciphers on;
   ssl_dhparam /etc/nginx/dhparam.pem;
   ssl_ciphers ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384;
   ssl_ecdh_curve secp384r1; # Requires nginx >= 1.1.0
   ssl_session_timeout  10m;
   ssl_session_cache shared:SSL:10m;
   ssl_session_tickets off; # Requires nginx >= 1.5.9
   ssl_stapling on; # Requires nginx >= 1.3.7
   ssl_stapling_verify on; # Requires nginx >= 1.3.7
   resolver <DNS IP 1> <DNS IP 2> valid=300s;
   resolver_timeout 5s;
   add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";
   add_header X-Frame-Options DENY;
   add_header X-Content-Type-Options nosniff;
   add_header X-XSS-Protection "1; mode=block";
      
   location / {
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header X-Forwarded-Proto $scheme;
      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header Host $http_host;
      proxy_pass http://localhost:3000;
      client_max_body_size 0;
   }
}

Edit the Grafana config file:
```
vi /etc/grafana/grafana.ini 
```
- This file is very well documented; you just need to uncomment the settings you want to change:
  - instance_name = <Monitor Domain Name>
  - protocol = http
  - http_addr = 127.0.0.1
  - http_port = 3000
  - domain = <Monitor Domain Name>
  - root_url = https://<Monitor Domain Name>
Edit the prometheus defaults file:
```
vi /etc/default/prometheus
```
- There you just need to set the interface:
```
ARGS='--web.listen-address="127.0.0.1:9091"'
```
Edit the prometheus config:
```
vi /etc/prometheus/prometheus.yml
```
- Please refer to their documentation on how to add targets (or just read the file, I think it’s self-explanatory).
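As a rough sketch of what adding targets looks like, the following helper prints one scrape job in the format Prometheus expects under the scrape_configs key. The function name render_scrape_job, the job name node and the example IPs are assumptions of mine; adapt them to your setup.

```shell
# Print a Prometheus scrape job for a list of host:port targets.
# Usage: render_scrape_job <job name> <target> [<target> ...]
# Hypothetical helper; the output belongs under "scrape_configs:".
render_scrape_job() {
  job="$1"; shift
  printf '  - job_name: %s\n' "$job"
  printf '    static_configs:\n'
  printf '      - targets:\n'
  for target in "$@"; do
    printf "          - '%s'\n" "$target"
  done
}

# Usage: print a job that scrapes two node_exporter instances over Wireguard
#   render_scrape_job node 10.0.0.1:9100 10.0.0.2:9100
```

Paste the output below the existing scrape_configs: line in /etc/prometheus/prometheus.yml. If promtool is installed alongside Prometheus, promtool check config /etc/prometheus/prometheus.yml should tell you whether the file is still valid before you start the service.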

Start all the stuff:

service grafana-server start && service prometheus start && service nginx start

Now you should be able to reach your Grafana at https://<Monitor Domain Name>.

Setting up targets

As you can see from the list in the Prometheus documentation, there are lots of exporters, and the list is not even complete. This is why I can’t show every exporter there is in this tutorial, but I will explain three of the ones I use:

Just to be sure: These have to be installed on the system you want to monitor!

node_exporter

The node_exporter service exports machine metrics like CPU, RAM, disk usage, entropy state, temperatures, etc.

To install the current version:

Go to the latest release page and copy the download link for the archive matching your OS and architecture.

Fetch the node_exporter archive:
```
 wget <URL>
```

Extract it:

 tar xvfz node_exporter-<VERSION>.<OS>-<ARCH>.tar.gz

Copy the executable to the bin folder:

 mv -v node_exporter-<VERSION>.<OS>-<ARCH>/node_exporter /usr/local/bin

Remove the archive and the extracted folder:

 rm -rf node_exporter-<VERSION>.<OS>-<ARCH>*
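If you install the exporter on several servers, the download link from the steps above can also be assembled in a script. This is only a sketch: it assumes the archive naming used on the node_exporter GitHub release page at the time of writing and only covers Linux. The function names and the version in the usage comment are examples of mine; check the release page for the current version.

```shell
# Map the output of `uname -m` to the architecture names used in the
# release archives. Unknown values are passed through unchanged.
release_arch() {
  case "$1" in
    x86_64)  echo amd64 ;;
    aarch64) echo arm64 ;;
    armv7l)  echo armv7 ;;
    *)       echo "$1" ;;
  esac
}

# Print the download URL for a node_exporter release (Linux only).
# Usage: node_exporter_url <version> [<uname -m value>]
node_exporter_url() {
  version="$1"
  arch="$(release_arch "${2:-$(uname -m)}")"
  echo "https://github.com/prometheus/node_exporter/releases/download/v${version}/node_exporter-${version}.linux-${arch}.tar.gz"
}

# Usage (1.8.2 is just an example version):
#   wget "$(node_exporter_url 1.8.2)"
```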

Create a systemd service for it:

 vi /lib/systemd/system/node-exporter.service

The content should look like this:

[Unit]
Description=Prometheus node_exporter
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
ExecStart=/usr/local/bin/node_exporter --web.listen-address="<IP of monitor interface>:9100"

[Install]
WantedBy=multi-user.target

Enable and start the systemd service:
```
 systemctl enable --now node-exporter
```

You should now be able to do curl http://<IP of monitor interface>:9100/metrics and get a lot of text output containing the metrics.
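The output is plain text with one sample per line, so you can pick out a single value with standard tools. A minimal sketch (the helper name metric_value is mine, and it only handles samples without labels):

```shell
# Read Prometheus text-format metrics on stdin and print the value of the
# first sample whose name matches exactly. Comment lines start with "#"
# and samples with labels have a "{" in the name, so neither will match.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Usage: print the 1-minute load average reported by node_exporter
#   curl -s http://<IP of monitor interface>:9100/metrics | metric_value node_load1
```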

nginx-lua-prometheus

This is a nice plugin for NGINX that exposes a /metrics endpoint using lua.

This requires an NGINX build with Lua support. On Ubuntu/Debian-based distros you can install it like this:

apt install nginx-extras

After installing NGINX, add the following at the end of the http block in /etc/nginx/nginx.conf (the final closing brace below is the one that closes the http block):

lua_shared_dict prometheus_metrics 10M;
lua_package_path "/home/<username>/nginx-lua-prometheus/?.lua";
init_by_lua '
  prometheus = require("prometheus").init("prometheus_metrics")
  metric_requests = prometheus:counter("nginx_http_requests_total", "Number of HTTP requests", {"host", "status"})
  metric_latency = prometheus:histogram("nginx_http_request_duration_seconds", "HTTP request latency", {"host"})
  metric_connections = prometheus:gauge("nginx_http_connections", "Number of HTTP connections", {"state"})
';
log_by_lua '
  metric_requests:inc(1, {ngx.var.server_name, ngx.var.status})
  metric_latency:observe(tonumber(ngx.var.request_time), {ngx.var.server_name})
';
}

You need to replace <username> with a user of your choice. Then clone the repository containing the lua scripts into that user’s home directory:

git clone https://github.com/knyar/nginx-lua-prometheus

The last thing to do is to activate the endpoint. To do that, create the file /etc/nginx/sites-enabled/prometheus.endpoint and put the following content into it:

server {
  listen 9145;
  allow 127.0.0.1;
  deny all;
  location /metrics {
    content_by_lua '
      metric_connections:set(ngx.var.connections_reading, {"reading"})
      metric_connections:set(ngx.var.connections_waiting, {"waiting"})
      metric_connections:set(ngx.var.connections_writing, {"writing"})
      prometheus:collect()
    ';
  }
}

Now, after running service nginx restart, you should be able to get the metrics using:

curl http://127.0.0.1:9145/metrics

This is also the URL you need to configure in Prometheus. But be careful: as configured above, the endpoint only accepts connections from 127.0.0.1. If you want the NGINX endpoint to be available on the Wireguard interface, change the allowed IP and the URL accordingly!

systemd_exporter

The systemd_exporter service exports metrics about the systemd services running on the system, which allows you, for example, to build alerts for the case that a specific service fails.

To install the current version:

Go to the latest release page and copy the download link for the archive matching your OS and architecture.

Fetch the systemd_exporter archive:
```
 wget <URL>
```

Extract it:

 tar xvfz systemd_exporter-<VERSION>.<OS>-<ARCH>.tar.gz

Copy the executable to the bin folder:

 mv -v systemd_exporter-<VERSION>.<OS>-<ARCH>/systemd_exporter /usr/local/bin

Remove the archive and the extracted folder:

 rm -rf systemd_exporter-<VERSION>.<OS>-<ARCH>*

Create a systemd service for it:

 vi /lib/systemd/system/systemd_exporter.service

The content should look like this:

[Unit]
Description=Prometheus systemd_exporter
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
ExecStart=/usr/local/bin/systemd_exporter --web.listen-address="<IP of monitor interface>:9558"

[Install]
WantedBy=multi-user.target

Enable and start the systemd service:

 systemctl enable --now systemd_exporter

You should now be able to do curl http://<IP of monitor interface>:9558/metrics and get a lot of text output containing the systemd metrics.
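For a quick check without Grafana, you can filter that output for failed units. This sketch assumes the exporter reports unit states as systemd_unit_state samples with name and state labels and a value of 1 for the current state; if your version names things differently, adjust the patterns. The helper name failed_units is my own.

```shell
# Read systemd_exporter metrics on stdin and print the names of all units
# whose "failed" state sample has the value 1.
# Assumes samples of the form:
#   systemd_unit_state{name="...",state="failed",type="..."} 1
failed_units() {
  awk '/^systemd_unit_state\{/ && /state="failed"/ && $NF == 1 {
    # Cut the unit name out of the name="..." label.
    if (match($0, /name="[^"]*"/)) print substr($0, RSTART + 6, RLENGTH - 7)
  }'
}

# Usage:
#   curl -s http://<IP of monitor interface>:9558/metrics | failed_units
```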

Other exporters / More Information

Prometheus has a list on their website where some exporters are listed
There exists an “Awesome Prometheus”-List on GitHub
Prometheus also has documentation on how to write your own exporter

I hope this post was helpful :)

If you have any improvements or comments, please let me know!