Saturday 7 November 2020

Prometheus Monitoring - part 2

This post continues the series on Prometheus monitoring. The earlier post covered how to install and configure it:
https://sunlnx.blogspot.com/2020/11/prometheus-installations-and.html

This post covers the following topics:
  • Introduction to Scrape Targets
  • Adding a new, external Prometheus node exporter
  • Configuring reverse proxy to the newly installed Prometheus node exporter
  • Deleting Time Series Database data for unused nodes

Introduction to Scrape Targets

When you install Prometheus, you get two metrics endpoints:
Prometheus : http://127.0.0.1:9090/metrics
Node Exporter : http://127.0.0.1:9100/metrics

Both endpoints are also listed in the Targets section of the Prometheus UI.


 
Querying these endpoints from the local machine returns all the metrics exposed by Prometheus itself as well as by the node exporter.
curl http://127.0.0.1:9090/metrics
curl http://127.0.0.1:9100/metrics
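The output of these endpoints is long, so it helps to filter it. Below is a minimal sketch of a helper that reads the metrics text on stdin and prints only the samples of one metric. The function name metric_samples is made up for this illustration and is not part of Prometheus.

```shell
# Print the samples of one metric from Prometheus metrics text read on stdin.
# metric_samples is a hypothetical helper name, not a Prometheus tool.
metric_samples() {
    awk -v n="$1" '
        /^#/ { next }                      # skip "# HELP" and "# TYPE" lines
        {
            i = index($1, "{")             # the metric name ends at "{" if labels exist
            m = (i > 0) ? substr($1, 1, i - 1) : $1
            if (m == n) print              # print the whole sample line
        }'
}
```

For example, curl -s http://127.0.0.1:9100/metrics | metric_samples go_threads prints only the go_threads samples of the node exporter.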

The scrape configuration is also visible in the UI, in the configuration section.



Scrape targets are defined in the config file /etc/prometheus/prometheus.yml
Further nodes can be added under scrape_configs, and each job can define its own scrape interval for its targets.

If a job defines no interval, the value from the global section of the Prometheus config is used, i.e. 15s here.
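To see which global values a job falls back to, you can print just the global section of the config. This is only a sketch: show_global is a hypothetical helper name, and it assumes the config starts its global section with a top-level "global:" key, as the stock file does.

```shell
# Print the "global:" section of a Prometheus config file.
# show_global is a hypothetical helper; the default path is the one used in this post.
show_global() {
    conf="${1:-/etc/prometheus/prometheus.yml}"
    awk '
        /^global:/ { p = 1; print; next }  # start printing at the global key
        p && /^[^[:space:]#]/ { exit }     # stop at the next top-level key
        p { print }                        # print indented lines and comments in between
    ' "$conf"
}
```

Running show_global without arguments reads /etc/prometheus/prometheus.yml. The scrape_interval line in its output is the fallback for jobs that set none.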

tail -20 /etc/prometheus/prometheus.yml
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s
    scrape_timeout: 5s
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090']
  - job_name: node
    # If prometheus-node-exporter is installed, grab stats about the local
    # machine by default.
    static_configs:
      - targets: ['localhost:9100']

 
On the Graph page you can search for the many metrics exposed by Prometheus itself or by the nodes.
Typing "prometheus" or "node" in the expression box lists the matching metric names.
 
Let's look at the memory graph for the nodes.

(Screenshots: Console view and Graph view of the memory query.)
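The same kind of query can also be run from the command line through the Prometheus HTTP API. Here is a minimal sketch. The helper name prom_query is made up, the server address is assumed to be the local one from this post, and the metric name in the usage note is an assumption that depends on your node exporter version.

```shell
# Base URL of the Prometheus server (assumption: the local server from this post).
PROM_URL="${PROM_URL:-http://127.0.0.1:9090}"

# Run an instant query through /api/v1/query and print the raw JSON reply.
# prom_query is a hypothetical helper name.
prom_query() {
    curl -s -G "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}
```

For example, prom_query 'node_memory_MemAvailable_bytes' returns the current value for every scraped node as JSON. Older node exporter versions name this metric without the _bytes suffix.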
 


Install New node exporter

Install an external Prometheus Node Exporter on a different server.

sudo apt-get update
sudo apt install prometheus-node-exporter -y
sudo service prometheus-node-exporter status


If you have a domain name, create an 'A' record that points it to this server's IP.

Verify that the node exporter is running fine - http://[your domain or ip]:9100/metrics

You can now block port 9100 externally, but leave it open internally for localhost
iptables -A INPUT -p tcp -s localhost --dport 9100 -j ACCEPT
iptables -A INPUT -p tcp --dport 9100 -j DROP


With these rules even your Prometheus server cannot query the new node, because port 9100 is now unreachable from outside. There are a couple of solutions:

1) Allow traffic only from your own Prometheus server and restrict all other traffic.

2) Install an nginx reverse proxy and add a path specific to the node exporter.
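For option 1, a sketch of the extra firewall rule could look like the following. It assumes the Prometheus server address is 206.189.139.255 (the address used in the nginx config later in this post) and that the two rules above are already loaded, which is why the new rule is inserted at the top of the chain instead of appended after the DROP rule. The block only prints the rule so you can review it first.

```shell
# Address of the Prometheus server (assumption: the IP used later in this post).
PROM_SERVER_IP="206.189.139.255"

# -I INPUT 1 puts the rule at the top of the chain, ahead of the DROP rule for port 9100.
rule="iptables -I INPUT 1 -p tcp -s $PROM_SERVER_IP --dport 9100 -j ACCEPT"

# Print the rule for review; nothing is changed on the system here.
echo "$rule"
```

Once it looks right, the printed rule can be applied with sudo sh -c "$rule".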

Add the new exporter to the prometheus server for monitoring.

tail -6 /etc/prometheus/prometheus.yml
  - job_name: node_promonode01
    #
    static_configs:
      - targets: ['localhost:9100']
      - targets: ['promonode01.devtestlabs.in:9100']


Validate the configuration file:
# promtool check config /etc/prometheus/prometheus.yml
Checking /etc/prometheus/prometheus.yml
  SUCCESS: 0 rule files found

If the check succeeds, restart the Prometheus service.

sudo service prometheus restart
sudo service prometheus status
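Once Prometheus is back up, one way to confirm that the new target is being scraped is the /api/v1/targets endpoint of the HTTP API. The sketch below counts targets by health from that JSON reply. target_health is a hypothetical helper name, and the plain grep is a rough stand-in for a real JSON parser.

```shell
# Count targets by health ("up", "down", "unknown") in the JSON from /api/v1/targets.
# Reads the JSON on stdin; target_health is a hypothetical helper name.
target_health() {
    grep -Eo '"health" *: *"[a-z]+"' | sort | uniq -c
}
```

A possible call, assuming the local server address from this post: curl -s http://127.0.0.1:9090/api/v1/targets | target_health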

 


Configuring reverse proxy to the newly installed Prometheus node exporter

Refer to the previous post on how to set up the nginx reverse proxy.

Modify the nginx conf for the new node and add the location below, so that only the Prometheus server is allowed to query the data.
sudo vim /etc/nginx/sites-enabled/promonode01
    location /metrics {
        allow  206.189.139.255;  # Prometheus server
        deny all;
        proxy_pass           http://localhost:9100/metrics;
    }

Even if the Prometheus server sends a plain http request, nginx redirects it to https and then serves the metric details.
The redirect configuration below already exists on the server.

server {
    if ($host = promonode01.devtestlabs.in) {
        return 301 https://$host$request_uri;
    } # managed by Certbot

Try to open the URL from any other machine and you get a Forbidden error, because nginx does not allow you to read the metric details.
 



From the Prometheus server, the same request with curl retrieves the results.
curl -X GET https://promonode01.devtestlabs.in/metrics
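To compare both cases quickly, you can print only the HTTP status code. This is a small sketch, and http_status is a made-up helper name. With the allow/deny rules above you would expect 403 from any other machine and 200 from the Prometheus server.

```shell
# Print only the HTTP status code returned for a URL.
# http_status is a hypothetical helper name.
http_status() {
    curl -s -o /dev/null -w '%{http_code}\n' "$1"
}
```

Usage: http_status https://promonode01.devtestlabs.in/metrics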

Modify the scrape targets on your Prometheus server to go through the reverse proxy we have configured:
  - job_name: node_promonode01
    #
    static_configs:
      - targets: ['localhost:9100']
      - targets: ['promonode01.devtestlabs.in']


Validate the config file and restart the Prometheus server.
service prometheus restart
service prometheus status


 


Deleting Time Series Database data for unused nodes

Data is deleted automatically once the storage retention time has passed. By default that is 15 days.
You can also delete specific data earlier. Since we configured a couple of new job_name entries in the config file, the graphs now show a couple of unwanted nodes, so we will delete them.

Even though only the 3 nodes below are configured, the time series database still shows a couple more in the graphs.

Console
go_threads{instance="localhost:9090",job="prometheus"} 9
go_threads{instance="localhost:9100",job="node"} 7
go_threads{instance="promonode01.devtestlabs.in:80",job="node"} 7

Graph
 

You need to enable the admin API in Prometheus before you can delete anything.
sudo vim /etc/default/prometheus
ARGS="--web.enable-admin-api"

Restart Prometheus and check status
sudo service prometheus restart
sudo service prometheus status


You can now make calls to the admin API.
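A minimal sketch of such calls is below. It uses the delete_series and clean_tombstones endpoints of the Prometheus TSDB admin API. The helper names, the server address, and the instance label in the usage note are assumptions for illustration, so replace them with the stale instance you actually see in your graphs.

```shell
# Base URL of the Prometheus server (assumption: the local server from this post).
PROM_URL="${PROM_URL:-http://127.0.0.1:9090}"

# Delete every series matching one selector, e.g. '{instance="host:9100"}'.
# delete_series is a hypothetical helper name wrapping the admin API endpoint.
delete_series() {
    curl -s -X POST "$PROM_URL/api/v1/admin/tsdb/delete_series" \
        --data-urlencode "match[]=$1"
}

# Deleted series are only marked with tombstones; this call removes them from disk.
clean_tombstones() {
    curl -s -X POST "$PROM_URL/api/v1/admin/tsdb/clean_tombstones"
}
```

For example, delete_series '{instance="promonode01.devtestlabs.in:9100"}' followed by clean_tombstones would remove the series of that (assumed) stale instance.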
After running the delete queries from the Prometheus server, you no longer see the removed nodes in the graphs.


After deleting, you should disable the admin API again:
sudo nano /etc/default/prometheus
Remove --web.enable-admin-api from the ARGS variable, e.g.

ARGS=""

restart Prometheus and check status,

sudo service prometheus restart
sudo service prometheus status


That's all for this tutorial. We will cover more in the coming tutorials.


