I have been playing with monitoring applications and servers for a while now and especially like Prometheus and Grafana.

In this post I’ll explain how I monitor my servers.

Setup

My setup consists of two moderately powerful servers that I installed VirtualBox on. When connecting, you can either forward the VirtualBox GUI over SSH using ssh -YC (more info on that), or you can use the VBoxManage utility on the command line.

I will be using vi a lot in this tutorial. If you are not familiar with it, I’d like to recommend this funny game to quickly learn the basics. Or just use something easier like nano.

When I post configuration file contents or commands, there might be some values wrapped in angle brackets. Those have to be replaced with values from your setup that match the description in the brackets.

Networking

To make a VM accessible from the internet, you need to buy an IP address from your hosting provider for every VM you want to add (some hosting providers also offer whole subnets). I have no idea about IPv6, but it seems like you get a subnet for free.

You simply need to set up a bridged network interface in most cases. Check back with your hosting provider if you are unsure about this.

Monitoring

Obviously, it’s a possible security risk to do the monitoring on public-facing network interfaces. But to monitor multiple servers from a single place, the servers have to be able to talk to each other. Multiple solutions exist, and I definitely have not tried them all. To connect the servers with each other, I previously used a separate Host-Only-Network adapter in VirtualBox to which I connected all the monitoring stuff. This is the way I would still recommend, but I recently switched to using Wireguard, because it is easier and faster to set up and allows cross-host communication. In the best case, the Wireguard variant does a round trip through the hosting provider’s network (in worse cases it goes further), while the virtual network’s packets stay inside the server. (I’m not a networking expert though, NICs might be smarter or dumber than I think…)

Host-Only-Network

To set up the Host-Only-Network, go to the VirtualBox preferences and create a network before trying to change any VM settings (I run into this almost every time). Then set up the network interface as required by your operating system.

Wireguard

To set up Wireguard you can either read the documentation on their website or, if you are on a Debian-based OS, follow these steps (all commands as root, run on every server you want to connect):

  1. Install the requirements for adding ppa repos:
    apt install apt-transport-https software-properties-common
    
  2. Add the Wireguard ppa repo:
    add-apt-repository ppa:wireguard/wireguard
    
  3. Install Wireguard:
    apt update && apt install wireguard-tools wireguard-dkms
    
  4. Generate the private key:
    wg genkey > /etc/wireguard/private.key && chmod 600 /etc/wireguard/private.key
    
  5. Generate the public key:
    wg pubkey > /etc/wireguard/public.key < /etc/wireguard/private.key
    
  6. Generate the interface configuration file:
    vi /etc/wireguard/wg0.conf
    

    The content should look like the following:

[Interface]
PrivateKey = <The private key>
ListenPort = <The port the server will run on>
SaveConfig = true
Address = 10.0.0.1/24

[Peer]
PublicKey = <The public key of the target server>
AllowedIPs = 10.0.0.2/32
Endpoint = <The host name and port of the target server>

Note that you have to switch the IPs when installing this on the other server!

Also, if you want to have multiple Wireguard connections on one server, you also have to change the IPs.
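
To make the IP switch concrete, here is a minimal sketch that renders the mirrored wg0.conf for the second server. The helper name render_wg_conf, the port 51820, the host name server1.example.com and the scratch directory are assumptions for illustration, not part of my actual setup. The peer gets a /32 here so that each server only routes its counterpart's tunnel address.

```shell
#!/bin/sh
# Sketch: render the mirrored wg0.conf for the second server (10.0.0.2).
# render_wg_conf is a hypothetical helper, not a Wireguard tool.
render_wg_conf() {
  # $1 = own tunnel IP, $2 = peer tunnel IP, $3 = listen port, $4 = peer endpoint
  cat <<EOF
[Interface]
PrivateKey = <The private key>
ListenPort = $3
SaveConfig = true
Address = $1/24

[Peer]
PublicKey = <The public key of the target server>
AllowedIPs = $2/32
Endpoint = $4
EOF
}

# Write to a scratch directory so nothing under /etc is touched.
out_dir=$(mktemp -d)
render_wg_conf 10.0.0.2 10.0.0.1 51820 server1.example.com:51820 > "$out_dir/wg0.conf"
cat "$out_dir/wg0.conf"
```

Once the real file is in place as /etc/wireguard/wg0.conf on both servers, wg-quick up wg0 should bring the tunnel up and wg show should list the peer. On a systemd-based system, systemctl enable wg-quick@wg0 should make it come up again after a reboot.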

Scraping stuff

I already mentioned that I will use a combination of Prometheus (to get the data) and Grafana (to display it) for my monitoring setup. In this case, Prometheus will run on the same server as Grafana and will scrape the individual targets over Wireguard (or the Host-Only-Network if you went down that path). This makes the Grafana dashboard load faster while the scrapes will take a bit longer. You will have to see what fits best in your case.

Installing the Monitor

The official method of installing Grafana using APT has changed. I’ll update this post in the next few days but until then you’ll have to refer to the official documentation.

The monitor in this case means a server/VM dedicated to monitoring. Grafana will run here, and this (virtual) machine should have a proper DNS entry (for TLS). Let’s continue setting it up, assuming that the Grafana APT repository is already set up:

  1. Install Grafana, Prometheus and dependencies:
    apt update && apt install grafana prometheus nginx certbot
    
  2. Stop all stuff that might be running (ignore “not found” errors):
    service grafana-server stop && service prometheus stop && service nginx stop 
    
  3. Delete the default NGINX config:
    rm /etc/nginx/sites-enabled/default
    
  4. Generate a Diffie-Hellman exchange parameter:
    openssl dhparam -out /etc/nginx/dhparam.pem 4096
    
  5. Generate a TLS certificate:
    certbot certonly --standalone -d <your-domain>
    
  6. Create a NGINX config:
    vi /etc/nginx/sites-enabled/<domain>.conf
    

Its content should look like the following:

server {
  listen 80;
  server_name <Monitor Domain Name>;
  rewrite ^ https://$server_name$request_uri? permanent;
}


server {
  server_name <Monitor Domain Name>;
  listen 443 ssl; # the ssl parameter replaces the deprecated "ssl on;" directive
  ssl_certificate /etc/letsencrypt/live/<Monitor Domain Name>/fullchain.pem; # path to your cacert.pem
  ssl_certificate_key /etc/letsencrypt/live/<Monitor Domain Name>/privkey.pem; # path to your privkey.pem
  
  ssl_protocols TLSv1.3;# Requires nginx >= 1.13.0 else use TLSv1.2
  ssl_prefer_server_ciphers on;
  ssl_dhparam /etc/nginx/dhparam.pem;
  ssl_ciphers ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384;
  ssl_ecdh_curve secp384r1; # Requires nginx >= 1.1.0
  ssl_session_timeout  10m;
  ssl_session_cache shared:SSL:10m;
  ssl_session_tickets off; # Requires nginx >= 1.5.9
  ssl_stapling on; # Requires nginx >= 1.3.7
  ssl_stapling_verify on; # Requires nginx >= 1.3.7
  resolver <IP of DNS server 1> <IP of DNS server 2> valid=300s;
  resolver_timeout 5s;
  add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";
  add_header X-Frame-Options DENY;
  add_header X-Content-Type-Options nosniff;
  add_header X-XSS-Protection "1; mode=block";

  location / {
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header Host $http_host;
    proxy_pass http://localhost:3000;
    client_max_body_size 0;
  }
}
  7. Edit the Grafana config file:
    vi /etc/grafana/grafana.ini
    

    This file is very well documented; you just need to uncomment the settings you want to change:

    • instance_name = <Monitor Domain Name>
    • protocol = http
    • http_addr = 127.0.0.1
    • http_port = 3000
    • domain = <Monitor Domain Name>
    • root_url = https://<Monitor Domain Name>
  8. Edit the Prometheus defaults file:
    vi /etc/default/prometheus
    

    There you just need to set the listen address:

    ARGS='--web.listen-address="127.0.0.1:9091"'
    
  9. Edit the Prometheus config:
    vi /etc/prometheus/prometheus.yml
    

    Please refer to their documentation on how to add targets (or just read the file, I think it’s self-explanatory).

  10. Start all the stuff:
    service grafana-server start && service prometheus start && service nginx start
    

Now you should be able to reach your Grafana at https://<Monitor Domain Name>.
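
For the Prometheus config step above, here is a minimal sketch of what a prometheus.yml with one remote target could look like. It is written to a scratch file so you can review it first. The scrape interval, the job names and the target 10.0.0.2:9101 (a monitored server reached over the Wireguard tunnel) are assumptions; 127.0.0.1:9091 is the listen address set in /etc/default/prometheus.

```shell
#!/bin/sh
# Sketch: write a small prometheus.yml to a scratch file for review.
conf=$(mktemp)
cat > "$conf" <<'EOF'
global:
  scrape_interval: 15s   # assumed interval

scrape_configs:
  # Prometheus scraping itself on the address from /etc/default/prometheus
  - job_name: prometheus
    static_configs:
      - targets: ['127.0.0.1:9091']

  # A monitored server reached over the Wireguard tunnel (assumed IP and port)
  - job_name: node
    static_configs:
      - targets: ['10.0.0.2:9101']
EOF
cat "$conf"
```

If promtool is installed alongside Prometheus, promtool check config with the path of that file can validate it before you copy the content to /etc/prometheus/prometheus.yml.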

Setting up targets

As you can read from the list in the Prometheus documentation, there are lots of exporters and the list is not even complete. This is why I can’t show every exporter there is in this tutorial, but I will explain two of the ones I use:

Just to be sure: Those have to be installed on the system you want to monitor!

node_exporter

The node_exporter service seems to also be part of prometheus but it’s an older version and seems to crash after running for some time (at least on my system). There even is a default target entry for it in /etc/prometheus/prometheus.yml.

To install the current version:

  1. Install Go
    apt install golang
    
  2. Fetch node_exporter:
    go get github.com/prometheus/node_exporter
    
  3. Build it:
    cd /root/go/src/github.com/prometheus/node_exporter && make
    
  4. Create a systemd service for it:
    vi /etc/systemd/system/node-exporter.service
    

    The content should look like this:

[Unit]
Description=Custom version (current git) of prometheus node_exporter
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
ExecStart=/root/go/bin/node_exporter --web.listen-address="<IP of monitor interface>:9101"

[Install]
WantedBy=multi-user.target
  5. Enable the systemd service:
    systemctl enable node-exporter
    
  6. Start the service:
    systemctl start node-exporter
    

You should now be able to do curl <IP of monitor interface>:9101/metrics and get a lot of text output containing your system’s metrics.
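
Because that output is long, a small filter helps to check a single value. The following is only a sketch: metric_value is a hypothetical helper name, the sample text is made up and stands in for a live exporter, and the filter only handles metrics without labels (such as node_load1).

```shell
#!/bin/sh
# Sketch: print the value of a label-free metric from exposition text on stdin.
# metric_value is a hypothetical helper name.
metric_value() {
  # Comment lines start with "#", so their first field never equals the name.
  awk -v name="$1" '$1 == name { print $2 }'
}

# Made-up sample text instead of a live exporter:
sample='# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42'
printf '%s\n' "$sample" | metric_value node_load1
```

Against a running exporter, the same filter can be used as curl -s <IP of monitor interface>:9101/metrics | metric_value node_load1.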

nginx-lua-prometheus

This is a nice plugin for NGINX that exposes a /metrics endpoint using Lua.

This requires another version of NGINX. In Ubuntu/Debian based distros you can install it like this:

apt install nginx-extras

After installing NGINX, add the following to the bottom of the http block in /etc/nginx/nginx.conf (the final } shown here is the brace that already closes that block):

lua_shared_dict prometheus_metrics 10M;
lua_package_path "/home/<username>/nginx-lua-prometheus/?.lua";
init_by_lua '
  prometheus = require("prometheus").init("prometheus_metrics")
  metric_requests = prometheus:counter("nginx_http_requests_total", "Number of HTTP requests", {"host", "status"})
  metric_latency = prometheus:histogram("nginx_http_request_duration_seconds", "HTTP request latency", {"host"})
  metric_connections = prometheus:gauge("nginx_http_connections", "Number of HTTP connections", {"state"})
';
log_by_lua '
  metric_requests:inc(1, {ngx.var.server_name, ngx.var.status})
  metric_latency:observe(tonumber(ngx.var.request_time), {ngx.var.server_name})
';
}

You need to replace <username> with a user of your choice. Then clone the repository containing the Lua scripts into that user’s home directory:

git clone https://github.com/knyar/nginx-lua-prometheus

The last thing to do is to activate the endpoint. To do that, create the file /etc/nginx/sites-enabled/prometheus.endpoint and put the following content into it:

server {
  listen 9145;
  allow 127.0.0.1;
  deny all;
  location /metrics {
  content_by_lua '
    metric_connections:set(ngx.var.connections_reading, {"reading"})
    metric_connections:set(ngx.var.connections_waiting, {"waiting"})
    metric_connections:set(ngx.var.connections_writing, {"writing"})
    prometheus:collect()
  ';
  }
}

Now, after running service nginx restart, you should be able to get the metrics using:

curl http://127.0.0.1:9145/metrics

This is also the URL you need to configure in Prometheus. But be careful: if Prometheus runs on a different server, you probably want the NGINX endpoint to be available on the Wireguard interface instead. In that case, change the IP and the allow rule accordingly!

Other exporters / More Information

I hope this post was helpful :)