Showing posts with label prometheus. Show all posts

Sunday, 8 November 2020

Prometheus Monitoring - Part 5

My previous articles mainly discussed alerts, the Alert Manager, and notifications.

In this article, we will add Prometheus as a data source in Grafana for data visualisation.
  • Install & Configure Grafana
    • Install & Configure Nginx
    • Configure Nginx reverse proxy
    • Configure SSL
    • Register to DNS
  • Setup Prometheus DataSource
  • Setup Prometheus Dashboards
  • Create dashboards for node exporters

Install & Configure Grafana
I am installing Grafana on Ubuntu 20.04, with a root account.

sudo apt update
sudo apt-get install -y adduser libfontconfig1
wget https://dl.grafana.com/oss/release/grafana_7.2.0_amd64.deb
sudo dpkg -i grafana_7.2.0_amd64.deb
sudo service grafana-server start
sudo systemctl enable grafana-server.service


Your Grafana server will be available at http://[your Grafana server ip]:3000
The default Grafana login is
Username: admin
Password: admin

If you need SSL, install Nginx in front of Grafana and configure it as a reverse proxy.

sudo apt install nginx -y
sudo vim /etc/nginx/sites-enabled/grafana


server {
    listen 80;
    listen [::]:80;
    server_name  grafana.YOUR-DOMAIN-NAME;

    location / {
        proxy_pass           http://localhost:3000/;
    }
}


Save the file and test that the new configuration has no errors:
nginx -t

Once Nginx has been restarted, Grafana is reachable at http://YOUR-DOMAIN-NAME.
Visiting your IP address directly will still show the default Nginx welcome page. You can remove it ( rm /etc/nginx/sites-enabled/default ).

Restart Nginx:
sudo service nginx restart
sudo service nginx status


Add SSL certificates to the Grafana dashboards.

sudo snap install --classic certbot
sudo certbot --nginx

Once the certificates are installed, you can log in from the browser at https://grafana.domainname.com.



Setup Prometheus DataSource

Once you have logged in to Grafana, go to "Configuration" → click "Data Sources" → "Add data source" → select "Prometheus".
Fill in the required settings (at minimum the URL of your Prometheus server), then save the data source.
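
The same data source can also be defined in a file instead of through the UI. The sketch below assumes Grafana's file-based provisioning directory /etc/grafana/provisioning/datasources and a Prometheus server reachable at http://localhost:9090. The function name, file name and URL are illustrative assumptions, not part of the setup described in this post.

```shell
# Write a Grafana data source provisioning file for Prometheus.
# $1: target file (default path is an assumption)
# $2: Prometheus URL (default is an assumption: Prometheus on the same host)
write_prometheus_datasource() {
    target="${1:-/etc/grafana/provisioning/datasources/prometheus.yaml}"
    prom_url="${2:-http://localhost:9090}"
    cat > "$target" <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: ${prom_url}
    isDefault: true
EOF
}
# Usage (as root), then restart Grafana so it reads the file:
#   write_prometheus_datasource && service grafana-server restart
```

Keeping the data source in a file makes the Grafana setup repeatable if the server ever has to be rebuilt.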




Go to the "Explore" tab, where Prometheus is already selected, and run the query "go_threads".



Setup Prometheus Dashboards

Go to Configuration → Data Sources → click the data source you created → select the Dashboards tab → Prometheus 2.0 Stats → click Import.



Create dashboards for node exporters

In the Configuration section choose Plugins → click "Find more plugins on Grafana.com" → select "Dashboards" → pick the node exporter dashboard [ English version ] → copy its ID: 11074

In the Grafana web page, select "Manage" from Dashboards → select "Import" → paste the ID: 11074




Prometheus Monitoring - Part 4

In the previous post we discussed PromQL and alerts:
https://sunlnx.blogspot.com/2020/11/promotheus-monitoring-part-3.html

In this article, we will discuss:

  • Prometheus Alert Manager
  • Configure local SMTP on the Prometheus server
  • Test Alerts

Prometheus Alert Manager

The AlertManager handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integration such as email, PagerDuty, or OpsGenie. It also takes care of silencing and inhibition of alerts.

Install the Prometheus Alert Manager

sudo apt install prometheus-alertmanager
sudo service prometheus-alertmanager status
ps -u prometheus

Note that the service is running on port 9093

Visit http://[your domain name or ip]:9093/


Block port 9093 for external requests

iptables -A INPUT -p tcp -s localhost --dport 9093 -j ACCEPT
iptables -A INPUT -p tcp --dport 9093 -j DROP
iptables -L

iptables-save > /etc/iptables/rules.v4
ip6tables-save > /etc/iptables/rules.v6

Add the Alert Manager as a scrape target in prometheus.yml:
tail -4 prometheus.yml
  - job_name: alertmanager
    static_configs:
      - targets: ['localhost:9093']

Verify the config and restart the service. Once both succeed, the new target is displayed in the Prometheus UI.



Configure SMTP for Alerts

Set up a simple local SMTP server that can only send emails from localhost.

sudo apt install mailutils
sudo vim /etc/postfix/main.cf

Go to the end of the file and set:
inet_interfaces = loopback-only
inet_protocols = ipv4


sudo systemctl restart postfix

Make sure your forward and reverse DNS records are correct; otherwise email providers are likely to treat the message as invalid and you won't receive any emails.
Once verified, run the command below from the Prometheus server:

echo "This is the body" | mail -s "This is the subject" -a "FROM:admin@yourdomainname" your@email-address

Check your mail account; you should have received an email.

Next, configure the Alert Manager process to send emails when the alerting rules fire and resolve.
cd /etc/prometheus
cp alertmanager.yml alertmanager_orig.yml
cat >  alertmanager.yml
Paste the contents below, adjust them to your alerts, and finish with [ctrl-d]

route:

  receiver: smtp-local
receivers:
  - name: 'smtp-local'
    email_configs:
    - to: 'sunlnx@gmail.com'
      from: 'promoalertadmin@devtestlabs.in'
      require_tls: false
      #auth_username: 'alertmanager'
      #auth_password: 'password'
      #auth_secret: 'secret'
      #auth_identity: 'identity'
      smarthost: localhost:25
      send_resolved: true
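
Before relying on the new file, it may be worth validating it and restarting the Alert Manager so the change is picked up. This is a minimal sketch, assuming the amtool utility is installed alongside the Alert Manager and that the service is named prometheus-alertmanager, as in the status command earlier. The function name is hypothetical.

```shell
# Validate an Alertmanager config and restart the service only if it is valid.
# $1: config file (default matches the path used in this post)
apply_alertmanager_config() {
    config="${1:-/etc/prometheus/alertmanager.yml}"
    if amtool check-config "$config"; then
        sudo systemctl restart prometheus-alertmanager
    else
        echo "alertmanager config invalid, not restarting" >&2
        return 1
    fi
}
# Usage:
#   apply_alertmanager_config
```

Checking first avoids restarting the service into a broken configuration.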

Restart the Alert Manager so it loads the new configuration. When an alert reaches the Firing state, you should now receive an email.
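
One way to push an alert into the Firing state for a test is to stop the local node exporter for a while, which should trip the InstanceDown rule (up == 0 for 1m) from Part 3. The sketch below assumes that rule is loaded and that the exporter runs as the prometheus-node-exporter service. The function name and the 120 second default are assumptions.

```shell
# Stop the local node exporter long enough for InstanceDown to fire, then start it again.
# $1: downtime in seconds (default 120 leaves room for scrape and evaluation
#     delays on top of the rule's "for: 1m"; adjust to your intervals)
simulate_instance_down() {
    downtime="${1:-120}"
    sudo systemctl stop prometheus-node-exporter
    sleep "$downtime"
    sudo systemctl start prometheus-node-exporter
}
# Usage:
#   simulate_instance_down
```

With send_resolved enabled in the receiver, a second email is expected once the exporter is back up and the alert resolves.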




The source links in the email point to localhost. To make them point to the Prometheus server instead, set the external URL:
sudo vim /etc/default/prometheus
ARGS="--web.enable-admin-api --web.external-url=https://example.com"


Restart your Prometheus server for the change to take effect.
systemctl restart prometheus

Thanks.

Saturday, 7 November 2020

Prometheus Monitoring - Part 3

Please check my previous posts, where we discussed installing and configuring scrape targets in detail.

In this post we will discuss the following Prometheus topics:
  • PromQL Queries
  • Saving/Recording Rules
  • Alerting Rules

PromQL Queries

The query language used in Prometheus is called PromQL (Prometheus Query Language). The results can be viewed as a graph, as tabular data, or consumed by external systems such as Grafana, Zabbix and others.

Simple examples

node_cpu_seconds_total{}
node_cpu_seconds_total{instance="promonode01.devtestlabs.in:80"} 

Regular Expressions
List only the nodes in a specific domain.

node_cpu_seconds_total{instance=~".*.devtestlabs.*"} 
node_cpu_seconds_total{instance=~".*.devtestlabs.*",mode=~".*irq*"}

Data Types

Scalar
A numeric floating point value

Instant vector
A set of time series containing a single sample for each time series.

Range Vector
A set of time series containing a range of data points over time for each time series.

node_cpu_seconds_total{instance=~".*.devtestlabs.*",mode=~".*irq*"}[1m]
node_cpu_seconds_total{instance=~".*.devtestlabs.*",mode=~".*irq*"}[5m]

Functions/Subfunctions
  • Start with this instant vector: node_netstat_Tcp_InSegs{instance="localhost:9100"}
  • Convert it to a range vector, then back to an instant vector using rate: rate(node_netstat_Tcp_InSegs{instance="localhost:9100"}[1m])
  • Wrap it in the ceiling function: ceil(rate(node_netstat_Tcp_InSegs{instance="localhost:9100"}[1m]))
  • Convert it to a range vector again and get the per-second derivative of the time series: deriv(ceil(rate(node_netstat_Tcp_InSegs{instance="localhost:9100"}[1m]))[1m:])
More Info: https://prometheus.io/docs/prometheus/latest/querying/functions/

Saving Rules
Typing a long query every time is tedious, so we can store the result of a query under a new metric name. Custom rules can be created and saved in a rules file.

Examples
Memory free percentage: 100 * node_memory_MemFree_bytes / node_memory_MemTotal_bytes
Root disk space free percentage: 100 * node_filesystem_free_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}

Log in to your Prometheus server and configure the rules as below:

vim /etc/prometheus/prometheus_rules.yml
groups:
  - name: custom_rules
    rules:
      - record: node_memory_free_percent
        expr: 100 * node_memory_MemFree_bytes / node_memory_MemTotal_bytes
      - record: node_filesystem_free_percent
        expr: 100 * node_filesystem_free_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}


Check the syntax:
promtool check rules prometheus_rules.yml
Checking prometheus_rules.yml
  SUCCESS: 2 rules found


Include these rules in the main Prometheus configuration.
vim /etc/prometheus/prometheus.yml
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
  - "prometheus_rules.yml"


Check the syntax of the main config file:

promtool check config prometheus.yml
Checking prometheus.yml
  SUCCESS: 1 rule files found

Checking prometheus_rules.yml
  SUCCESS: 2 rules found


Since all checks pass, restart the Prometheus server:
systemctl restart prometheus

You should now be able to see the custom rules in the Prometheus UI.
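
The recorded series can also be read back over the Prometheus HTTP API, which is handy for a quick check from the shell. This is a minimal sketch, assuming Prometheus listens on http://localhost:9090 and curl is installed. prom_query and PROM_URL are hypothetical names.

```shell
# Base URL of the Prometheus server (assumption: default local port).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Run an instant query against the /api/v1/query endpoint and print the JSON reply.
# $1: PromQL expression
prom_query() {
    curl -s -G "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}
# Usage:
#   prom_query 'node_filesystem_free_percent'
```

Using --data-urlencode lets curl escape braces and quotes in longer expressions, so the query can be passed exactly as it is typed in the UI.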
 


Alerts
We will create a new group named alert_rules and add it to the same rules file we created earlier.
vim /etc/prometheus/prometheus_rules.yml
  - name: alert_rules
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."

      - alert: DiskSpaceFree10Percent
        expr: node_filesystem_free_percent <= 10
        labels:
          severity: warning
        annotations:
          summary: "Instance {{ $labels.instance }} has 10% or less Free disk space"
          description: "{{ $labels.instance }} has only {{ $value }}% or less free."


Check the syntax of the config file and restart the service once it reports SUCCESS.

promtool check config prometheus.yml
Checking prometheus.yml
  SUCCESS: 1 rule files found

Checking prometheus_rules.yml
  SUCCESS: 3 rules found


sudo service prometheus restart
sudo service prometheus status

 



Prometheus Monitoring - part 2

This is a follow-up post on Prometheus monitoring. In the earlier post we saw how to install and configure Prometheus.
https://sunlnx.blogspot.com/2020/11/prometheus-installations-and.html

In this post, we will cover the topics below:
  • Introduction to Scrape Targets
  • Adding a new Prometheus node exporter on an external server
  • Configuring a reverse proxy for the newly installed Prometheus node exporter
  • Deleting Time Series Database data for unused nodes

Introduction to Scrape Targets

When you install Prometheus, it sets up two metric endpoints.
Prometheus : http://127.0.0.1:9090/metrics
Node Exporter : http://127.0.0.1:9100/metrics

You can see both in the Targets section of the Prometheus UI.

When you query the above endpoints from the local machine, you get all the metrics exposed by Prometheus itself as well as by the node.
curl http://127.0.0.1:9090/metrics
curl http://127.0.0.1:9100/metrics

You can find the same in the UI when you go to the configuration section. 



The scrape targets are defined in the config file /etc/prometheus/prometheus.yml
Further nodes can be added under scrape_configs, and you can define a scrape interval per job.

If no interval is defined for a job, the value from the global section of the Prometheus config is used, i.e. 15s.

tail -20 /etc/prometheus/prometheus.yml
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s
    scrape_timeout: 5s
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090']
  - job_name: node
    # If prometheus-node-exporter is installed, grab stats about the local
    # machine by default.
    static_configs:
      - targets: ['localhost:9100']

 
From the Graph page you can search the many metrics related to Prometheus or to the nodes you have configured.
Searching for "prometheus" or "node" lists the matching metrics.

Let's look at the memory graph for the nodes, first in the Console view and then in the Graph view.



Install New node exporter

Install a Prometheus Node Exporter on a different server.

sudo apt-get update
sudo apt install prometheus-node-exporter -y
sudo service prometheus-node-exporter status


If you have a domain name, add an 'A' record with your domain provider that points to the IP address of the new server.

Verify that the node exporter is running fine - http://[your domain or ip]:9100/metrics

You can now block port 9100 externally, but leave it open internally for localhost
iptables -A INPUT -p tcp -s localhost --dport 9100 -j ACCEPT
iptables -A INPUT -p tcp --dport 9100 -j DROP


With these rules, even your Prometheus server cannot query the new node, as port 9100 is unreachable from outside. There are a couple of solutions:

1) Allow traffic only from your own Prometheus server and restrict all other traffic.

2) Install an Nginx reverse proxy and add a path specific to the node exporter.
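
The first option can be sketched as one extra firewall rule on the node. The sketch assumes the Prometheus server has a fixed IP address; 206.189.139.255 is the address used for it in the Nginx example in this post, and the function name is hypothetical. It uses -I so the rule is inserted at the top of the INPUT chain, ahead of the DROP rule added above.

```shell
# Allow one scraper IP to reach the node exporter port.
# $1: IP address of the Prometheus server
# $2: exporter port (default 9100)
allow_scraper() {
    scraper_ip="$1"
    port="${2:-9100}"
    sudo iptables -I INPUT -p tcp -s "$scraper_ip" --dport "$port" -j ACCEPT
}
# Usage:
#   allow_scraper 206.189.139.255
```

This keeps the traffic on plain http, so the reverse proxy route described next is the better fit when the metrics cross an untrusted network.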

Add the new exporter to the prometheus server for monitoring.

tail -6 /etc/prometheus/prometheus.yml
  - job_name: node_promonode01
    #
    static_configs:
      - targets: ['localhost:9100']
      - targets: ['promonode01.devtestlabs.in:9100']


Validate the configuration file:
# promtool check config /etc/prometheus/prometheus.yml
Checking /etc/prometheus/prometheus.yml
  SUCCESS: 0 rule files found

On success, restart the Prometheus service.

sudo service prometheus restart
sudo service prometheus status

 


Configuring reverse proxy to the newly installed Prometheus node exporter

Refer to the previous post on how to set up an Nginx reverse proxy.

Modify the Nginx config on the new node with the location below, so that only the Prometheus server can query the data.
sudo vim /etc/nginx/sites-enabled/promonode01
    location /metrics {
        allow  206.189.139.255;  # Prometheus server
        deny all;
        proxy_pass           http://localhost:9100/metrics;
    }

Even if the Prometheus server sends a plain http request, Nginx redirects it to https and then serves the metrics.
The redirect below already exists in the server config:

server {
    if ($host = promonode01.devtestlabs.in) {
        return 301 https://$host$request_uri;
    } # managed by Certbot

Try to open the URL from your own machine; you will get a Forbidden error, as Nginx won't let you fetch the metrics.
 



Try the same from the Prometheus server using curl; you will be able to retrieve the results.
curl -X GET https://promonode01.devtestlabs.in/metrics

Modify your Prometheus server config to scrape the target through the reverse proxy we configured:
  - job_name: node_promonode01
    #
    static_configs:
      - targets: ['localhost:9100']
      - targets: ['promonode01.devtestlabs.in']


Validate the config file and restart the Prometheus server.
service prometheus restart
service prometheus status


 


Deleting Time Series Database data for unused nodes

Data is deleted automatically after the storage retention time has passed. By default it is 15 days.
If you want to delete specific data earlier, you can. Since we configured a couple of new job_name entries in the config file, the graphs show a couple of unwanted nodes, which we will delete.

Even though only the 3 nodes below are configured, the time series database still shows a couple more in the graphs.

Console
go_threads{instance="localhost:9090",job="prometheus"} 9
go_threads{instance="localhost:9100",job="node"} 7
go_threads{instance="promonode01.devtestlabs.in:80",job="node"} 7

Graph
 

You need to enable the admin API in Prometheus before you can delete data.
sudo vim /etc/default/prometheus
ARGS="--web.enable-admin-api"

Restart Prometheus and check status
sudo service prometheus restart
sudo service prometheus status


You can now make calls to the admin api.
After the delete calls have been executed from the Prometheus server, you will no longer see those nodes in the graphs.
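
The delete calls themselves are not shown in this post. As a hedged sketch, they could look like the helpers below, which use the delete_series and clean_tombstones endpoints of the Prometheus TSDB admin API. The sketch assumes Prometheus listens on http://localhost:9090 with the admin API enabled. The function names and the example instance value are assumptions, so substitute the stale instance label you actually see.

```shell
# Base URL of the Prometheus server (assumption: default local port).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Build the delete_series URL for one series selector.
# $1: series selector, e.g. {instance="host:9100"}
delete_series_url() {
    printf '%s/api/v1/admin/tsdb/delete_series?match[]=%s\n' "$PROM_URL" "$1"
}

# Delete every series that carries the given instance label.
# -g stops curl from interpreting the brackets and braces in the URL.
delete_instance() {
    curl -s -X POST -g "$(delete_series_url "{instance=\"$1\"}")"
}

# Remove the deleted data from disk instead of waiting for compaction.
clean_tombstones() {
    curl -s -X POST "$PROM_URL/api/v1/admin/tsdb/clean_tombstones"
}
# Usage (example instance value, not taken from a real run):
#   delete_instance "promonode01.devtestlabs.in:9100" && clean_tombstones
```

Deletion only marks the series with tombstones at first, which is why the second call is included.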


After deleting, you should disable the admin API again:
sudo nano /etc/default/prometheus
Remove --web.enable-admin-api from the ARGS variable, e.g.

ARGS=""

Restart Prometheus and check status:

sudo service prometheus restart
sudo service prometheus status


That's all for this tutorial. We will see more in the coming tutorials.



Friday, 6 November 2020

Prometheus Monitoring - part 1

Prometheus Monitoring
We shall be installing and configuring Prometheus on an Ubuntu server. You can either create a local box or use one from any of the cloud providers.

Arch Diagram


Prerequisites
I have set up this monitoring tool on an Ubuntu 20.04 LTS server with root access. You can use other operating systems, such as CentOS; I chose this server because it was already installed for demo purposes.

Installations
The prometheus package installs both Prometheus and the Prometheus Node Exporter.

sudo apt-get update
sudo apt install prometheus -y
sudo service prometheus status
sudo service prometheus-node-exporter status


Prometheus should now be running.
ps -u prometheus

You can visit it at http://[your ip address]:9090

Pointing your domain name 'A' record

If your Prometheus server is accessible from the internet and you want it to look more professional to clients, log in to your domain name provider and add an A record that points to the IP address of the new Prometheus server.

Reverse Proxy Prometheus with Nginx
One option to help secure our Prometheus server is to put it behind a reverse proxy so that we can later add SSL and an Authentication layer over the default unrestricted Prometheus web interface.

sudo apt install nginx -y
sudo vim /etc/nginx/sites-enabled/prometheus

server {
    listen 80;
    listen [::]:80;
    server_name  prometheus.YOUR-DOMAIN-NAME;

    location / {
        proxy_pass           http://localhost:9090/;
    }
}


Save the file and test that the new configuration has no errors:
nginx -t

Once Nginx has been restarted, Prometheus is reachable at http://YOUR-DOMAIN-NAME.
Visiting your IP address directly will still show the default Nginx welcome page. You can remove it:

rm /etc/nginx/sites-enabled/default

Restart Nginx:
sudo service nginx restart
sudo service nginx status


Add SSL to Prometheus Reverse Proxy
We will now add transport encryption to the Prometheus web user interface.
Certbot will install a LetsEncrypt SSL certificate for free. Ensure your domain name has propagated before running CertBot.

sudo snap install --classic certbot
sudo certbot --nginx


<snip>
.
.
Follow the prompts and select the domain name I want to secure.
.
.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Congratulations! You have successfully enabled https://prometheus.YOUR-DOMAIN-NAME
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
<snip>


Add Basic User Authentication to the Prometheus UI
Everything is great so far, but anybody in the world with internet access and the URL can visit my Prometheus server and see my data.
To solve this problem, we will add user authentication.

cd /etc/nginx
sudo apt install apache2-utils
htpasswd -c /etc/nginx/.htpasswd admin


Update the Nginx Prometheus config file:

sudo vim /etc/nginx/sites-enabled/prometheus
server {
    ...

    # additional authentication properties (append these two lines)
    auth_basic  "Protected Area";
    auth_basic_user_file /etc/nginx/.htpasswd;

    location / {
        proxy_pass           http://localhost:9090/;
    }

    ...
}


Restart Nginx:
sudo service nginx restart
sudo service nginx status

When you now open your Prometheus server in the browser, it prompts for basic authentication.
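
The same check can be made from the command line by looking at the HTTP status code. This is a minimal sketch, assuming curl is installed. http_status is a hypothetical helper, and /-/healthy is the Prometheus health endpoint, reached here through the proxy.

```shell
# Print only the HTTP status code of a request; all arguments are passed to curl.
http_status() {
    curl -s -o /dev/null -w '%{http_code}\n' "$@"
}
# Usage (nginx is expected to answer 401 when no credentials are sent):
#   http_status https://prometheus.YOUR-DOMAIN-NAME/-/healthy
#   http_status -u admin:YOUR-PASSWORD https://prometheus.YOUR-DOMAIN-NAME/-/healthy
```

Comparing the two calls shows whether the auth_basic lines are in effect.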

Prometheus is still directly reachable at IP:9090, so we block these ports for external connections.

iptables -A INPUT -p tcp -s localhost --dport 9090 -j ACCEPT
iptables -A INPUT -p tcp --dport 9090 -j DROP
iptables -A INPUT -p tcp -s localhost --dport 9100 -j ACCEPT
iptables -A INPUT -p tcp --dport 9100 -j DROP
iptables -L


To save the rules permanently:
sudo apt install iptables-persistent
iptables-save > /etc/iptables/rules.v4
ip6tables-save > /etc/iptables/rules.v6

You have now successfully installed a Prometheus server on your machine.