Monitoring Stack
Prometheus is an open-source monitoring and alerting tool designed to collect metrics from various sources, store them efficiently, and enable querying and visualization of these metrics for monitoring purposes.
- Key Features:
- Data Collection
- Metrics Storage
- Query Language
- Alerting
- Service Discovery
- Visualizations
- Exporters
The kube-prometheus-stack is pre-configured to gather metrics from all Kubernetes components, making it ideal for cluster monitoring. It provides a standard set of dashboards and alerting rules, many of which originate from the kubernetes-mixin project.
Components of the kube-prometheus-stack
- Prometheus Operator: Manages Prometheus instances in your DOKS cluster.
- Grafana: Visualizes metrics and plots data using dashboards.
- Alertmanager: Configures notifications (e.g., PagerDuty, Slack, email) based on alerts received from the Prometheus server.
Data Gathering
Prometheus uses a pull model, expecting services to expose a /metrics endpoint for scraping. A time series database stores the data points for each metric that Prometheus retrieves.
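To make the scrape format concrete, here is a small sketch of what a /metrics endpoint might return and how its samples can be read. The metric name and values are invented for illustration, and sample_metrics and sum_metric are hypothetical helper names, not part of Prometheus.

```shell
# Print a hand-written sample of the Prometheus text exposition format.
# The metric name and values are made up for illustration.
sample_metrics() {
  cat <<'EOF'
# HELP http_requests_total Total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="post",code="200"} 3
EOF
}

# Sum every sample of one metric read from stdin, skipping comment lines.
sum_metric() {
  awk -v metric="$1" '
    /^#/ { next }
    {
      name = $1
      sub(/[{].*/, "", name)      # strip the {label="value"} part
      if (name == metric) total += $NF
    }
    END { print total + 0 }'
}

sample_metrics | sum_metric http_requests_total   # prints 1030
```

Against a real service, the same helper could be fed from curl, for example `curl -s http://<host>:<port>/metrics | sum_metric http_requests_total`, where the host and port depend on your application.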
Grafana reads data from the Prometheus time series database and lets you build graphs organized into dashboards. You can also run queries using the PromQL language. To persist all data (metrics and settings), make sure to allocate block storage for both the Prometheus and Grafana instances using Persistent Volumes (PVs).
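PromQL queries can also be sent from the command line through the Prometheus HTTP API. The sketch below assumes Prometheus is reachable at http://localhost:9090 (for example through the port forward described in the "Accessing Prometheus Web Panel" section); PROM_URL and prom_query are hypothetical names introduced here.

```shell
# Run an instant PromQL query through the Prometheus HTTP API.
# PROM_URL is a hypothetical override; the default assumes a local port forward.
prom_query() {
  curl -sG "${PROM_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=$1"
}
```

For example, `prom_query 'up'` should return a JSON document listing each scrape target with a value of 1 (reachable) or 0 (unreachable).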
Alerting
Alerts sent by client programs like the Prometheus server are managed by the Alertmanager component. It handles deduplication, grouping, and routing to the appropriate receiver integration, such as email, PagerDuty, or Slack. Additionally, it manages alert inhibition and silencing.
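As an illustration of grouping, routing, and inhibition, the sketch below writes a minimal standalone Alertmanager configuration to a file. The receiver names, Slack channel, and placeholder keys are assumptions, and write_alertmanager_example is a hypothetical helper; it is not the configuration shipped with the stack.

```shell
# Write a minimal Alertmanager configuration sketch to a file.
# Receiver names and placeholder credentials are illustrative only.
write_alertmanager_example() {
  cat > "${1:-alertmanager-example.yml}" <<'EOF'
route:
  receiver: team-slack                    # default receiver
  group_by: ['alertname', 'namespace']    # alerts sharing these labels are grouped
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  routes:
    - matchers:
        - severity="critical"
      receiver: team-pager                # critical alerts are routed to PagerDuty
receivers:
  - name: team-slack
    slack_configs:
      - api_url: '<YOUR_SLACK_WEBHOOK_URL>'
        channel: '#alerts'
  - name: team-pager
    pagerduty_configs:
      - routing_key: '<YOUR_PAGERDUTY_ROUTING_KEY>'
inhibit_rules:
  # While a critical alert fires, mute matching warning alerts.
  - source_matchers:
      - severity="critical"
    target_matchers:
      - severity="warning"
    equal: ['alertname', 'namespace']
EOF
}
```

Calling `write_alertmanager_example` creates alertmanager-example.yml in the current directory; the placeholders must be replaced before the file is usable.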
Documentation
For more information, please refer to the official documentation for each component:
- Prometheus: Overview of features and configuration options.
- Prometheus Operator: Useful information on using the operator.
- Alertmanager: Learn about Alertmanager and its integrations with various notification platforms.
Getting Started after Deploying the Kubernetes Monitoring Stack
Please refer to this document to connect your Kubernetes cluster:
Once you have set up your cluster using kubectl, proceed with executing the following commands.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
How to check if Prometheus monitoring stack is running
helm ls -n kube-prometheus-stack
Verify that the Prometheus stack pods are up and running:
kubectl get pods -n kube-prometheus-stack
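Instead of polling the pod list by hand, you could block until the pods are ready. This is a sketch using kubectl wait; wait_for_stack is a hypothetical helper name and the 300s timeout is an arbitrary assumption. Note that pods left over from completed Jobs never become Ready and would cause the wait to time out.

```shell
# Block until every pod in the namespace reports Ready, or time out.
# Arguments: namespace (default kube-prometheus-stack), timeout (default 300s).
wait_for_stack() {
  kubectl wait --for=condition=Ready pods --all \
    --namespace "${1:-kube-prometheus-stack}" --timeout="${2:-300s}"
}
```

Run `wait_for_stack` with no arguments to use the defaults.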
Accessing Prometheus Web Panel:
You can access the Prometheus web console by port forwarding the kube-prometheus-stack-prometheus service:
kubectl port-forward svc/kube-prometheus-stack-prometheus -n kube-prometheus-stack 9090
- Next, launch a web browser of your choice, and enter the following URL: http://localhost:9090. To see what targets were discovered by Prometheus, please navigate to http://localhost:9090/targets
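The same target information is available from the Prometheus HTTP API. The following sketch counts active targets by health state while the port forward is running; count_targets and PROM_URL are hypothetical names, and the grep pattern assumes the compact JSON that the API normally returns.

```shell
# Count active scrape targets with a given health state ("up" by default).
# Assumes Prometheus is reachable at PROM_URL (default: the local port forward).
count_targets() {
  curl -s "${PROM_URL:-http://localhost:9090}/api/v1/targets?state=active" \
    | grep -o "\"health\":\"${1:-up}\"" | wc -l | tr -d ' '
}
```

For instance, `count_targets down` gives a quick indication of how many targets are currently failing to scrape.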
Accessing Grafana web panel
- You can connect to Grafana by port forwarding the prometheus-grafana service:
kubectl port-forward svc/prometheus-grafana 3000:80 -n kube-prometheus-stack
Default Credentials:
- User: admin
- Password: prom-operator
- Please change your password after first login (Recommended).
Next, launch a web browser of your choice, and enter the following URL: http://localhost:3000. You can take a look around and see what dashboards are available for you to use from the kubernetes-mixin project as an example, by navigating to the following URL: http://localhost:3000/dashboards?tag=kubernetes-mixin.
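If the default password has been overridden, the current admin password can usually be read from the Grafana Secret. This sketch assumes the Secret is named prometheus-grafana (matching the service name above) and stores the password under the admin-password key, which is the Grafana chart's usual convention; grafana_admin_password is a hypothetical helper name.

```shell
# Print the Grafana admin password stored in the chart's Secret.
# Arguments: secret name (assumed prometheus-grafana), namespace.
grafana_admin_password() {
  kubectl get secret "${1:-prometheus-grafana}" \
    --namespace "${2:-kube-prometheus-stack}" \
    -o jsonpath='{.data.admin-password}' | base64 --decode
}
```

If the Secret has a different name in your cluster, pass it as the first argument.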
Tweaking Helm Chart Values
The kube-prometheus-stack provides some custom values to start with. Please have a look at the values file from the main GitHub repository (explanations are provided inside, where necessary).
You can always inspect all the available options, as well as the default values for the kube-prometheus-stack Helm chart, by running the following command:
helm show values prometheus-community/kube-prometheus-stack
- After tweaking the Helm values file (values.yml) according to your needs, you can apply the changes via the helm upgrade command, as shown below:
helm upgrade prometheus prometheus-community/kube-prometheus-stack \
--version 48.6.0 --namespace kube-prometheus-stack --values values.yml
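As one example of a values override, the persistent storage recommended in the "Data Gathering" section could be requested along the following lines. This is a sketch: the 5Gi sizes are assumptions, the cluster's default storage class is used implicitly, and write_storage_values and values-storage.yml are hypothetical names.

```shell
# Write a values override that requests persistent volumes for
# Prometheus and Grafana. Sizes are illustrative assumptions.
write_storage_values() {
  cat > "${1:-values-storage.yml}" <<'EOF'
prometheus:
  prometheusSpec:
    storageSpec:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 5Gi   # assumed size
grafana:
  persistence:
    enabled: true
    size: 5Gi                # assumed size
EOF
}
```

Helm accepts several values files, so the override can be layered on top of the main file by adding `--values values-storage.yml` after `--values values.yml` in the upgrade command.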
- If you want to expose a service on a public IP, you can edit that specific service with the command below:
kubectl edit svc {service_name} -n kube-prometheus-stack
- Change the type from ‘ClusterIP’ to ‘LoadBalancer’; an available public IP is then attached to the service automatically. You can use that IP and the service's port to access the Prometheus web panel publicly.
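The same change can be made without opening an editor by patching the service. This is a sketch using kubectl patch; expose_service is a hypothetical helper name, and the service name is whatever you would otherwise pass to kubectl edit.

```shell
# Switch a service to type LoadBalancer non-interactively.
# Arguments: service name, namespace (default kube-prometheus-stack).
expose_service() {
  kubectl patch svc "$1" --namespace "${2:-kube-prometheus-stack}" \
    --type merge -p '{"spec":{"type":"LoadBalancer"}}'
}
```

After running `expose_service <service_name>`, the assigned address should appear in the EXTERNAL-IP column of `kubectl get svc -n kube-prometheus-stack` once it has been provisioned.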
Configuring Service Monitors for Prometheus
To monitor applications in your cluster, you typically define a ServiceMonitor. This is a custom resource whose definition (CRD) is provided by the Prometheus Operator, and it lets you add new services that need to be monitored.
A typical ServiceMonitor configuration looks like this:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
    - port: web
Explanations for the above configuration:
- spec.selector.matchLabels.app: Tells the ServiceMonitor which Service objects to monitor, based on a label.
- spec.endpoints.port: The name of the Service port that exposes the metrics of the application to monitor.
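To show how the two fields line up, the sketch below writes a Service that the ServiceMonitor above would select: its app label matches the selector, and its port is named web. The port number 8080 and the helper name write_example_service are assumptions. Depending on the chart's selector settings, the ServiceMonitor itself may also need a label matching the Helm release before Prometheus picks it up.

```shell
# Write a Service manifest that the example ServiceMonitor would match.
# Port 8080 is an assumed value for illustration.
write_example_service() {
  cat > "${1:-example-app-service.yml}" <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: example-app
  labels:
    app: example-app      # matched by spec.selector.matchLabels.app
spec:
  selector:
    app: example-app
  ports:
    - name: web           # referenced by spec.endpoints[].port
      port: 8080
EOF
}
```

The resulting file could then be applied with `kubectl apply -f example-app-service.yml` in the namespace where the application runs.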
The kube-prometheus-stack Helm values file provided in the GitHub marketplace repository contains a dedicated section (named additionalServiceMonitors) where you can define a list of additional services to monitor. The snippet below sets up Nginx Ingress Controller monitoring as an example:
additionalServiceMonitors:
  - name: "ingress-nginx-monitor"
    selector:
      matchLabels:
        app.kubernetes.io/name: ingress-nginx
    namespaceSelector:
      matchNames:
        - ingress-nginx
    endpoints:
      - port: "metrics"
- After adding required services to monitor, you need to upgrade the stack via the helm upgrade command, in order to apply the changes:
helm upgrade prometheus prometheus-community/kube-prometheus-stack \
--version 48.6.0 --namespace kube-prometheus-stack --values values.yml
You can also check the full list of available `CRDs <https://github.com/prometheus-operator/prometheus-operator#customresourcedefinitions>`_, which you can use to control the Prometheus Operator, by visiting the official GitHub documentation page.
Upgrading Kubernetes Prometheus Stack
- You can check which versions are available to upgrade to by navigating to the kube-prometheus-stack official releases page on GitHub. Alternatively, you can use ArtifactHUB, which provides a richer and more user-friendly interface. Then upgrade to the chosen version:
helm upgrade prometheus prometheus-community/kube-prometheus-stack \
--version <KUBE_PROMETHEUS_STACK_NEW_VERSION> --namespace kube-prometheus-stack \
--values <YOUR_HELM_VALUES_FILE>
Replace <KUBE_PROMETHEUS_STACK_NEW_VERSION> with the version you want to upgrade to.
Replace <YOUR_HELM_VALUES_FILE> with the path to your values file.
See `helm upgrade <https://helm.sh/docs/helm/helm_upgrade/>`_ for command documentation.
Also, please make sure to check the official recommendations for various `upgrade paths <https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack#upgrading-chart>`_, from an existing release to a new major version of the Prometheus stack.
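The available chart versions can also be listed from the command line once the repository has been added and updated. This is a sketch; list_stack_versions is a hypothetical helper name, and the default of 10 lines (including the header row) is arbitrary.

```shell
# List the most recent chart versions known to the local Helm repo cache.
# Argument: number of output lines to show (default 10, header included).
list_stack_versions() {
  helm search repo prometheus-community/kube-prometheus-stack --versions \
    | head -n "${1:-10}"
}
```

Run `helm repo update` first so that the list reflects the latest published releases.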
Uninstalling Kubernetes Prometheus Stack
helm uninstall prometheus -n kube-prometheus-stack
- To delete the namespace, use the command below:
kubectl delete ns kube-prometheus-stack
- CRDs created by this chart are not removed by default and should be manually cleaned up:
kubectl delete crd alertmanagerconfigs.monitoring.coreos.com
kubectl delete crd alertmanagers.monitoring.coreos.com
kubectl delete crd podmonitors.monitoring.coreos.com
kubectl delete crd probes.monitoring.coreos.com
kubectl delete crd prometheusagents.monitoring.coreos.com
kubectl delete crd prometheuses.monitoring.coreos.com
kubectl delete crd prometheusrules.monitoring.coreos.com
kubectl delete crd scrapeconfigs.monitoring.coreos.com
kubectl delete crd servicemonitors.monitoring.coreos.com
kubectl delete crd thanosrulers.monitoring.coreos.com
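The individual commands above could also be expressed as a loop over every CRD in the monitoring.coreos.com group. This is a sketch under the assumption that no other installation in the cluster relies on those CRDs; deleting a CRD also removes all custom resources of that kind. delete_operator_crds is a hypothetical helper name.

```shell
# Delete every CRD whose name ends in monitoring.coreos.com.
# Warning: this also removes all custom resources of those kinds.
delete_operator_crds() {
  kubectl get crd -o name | grep 'monitoring\.coreos\.com$' |
    while read -r crd; do
      kubectl delete "$crd"
    done
}
```

Listing the matches first with `kubectl get crd -o name | grep 'monitoring\.coreos\.com$'` is a reasonable precaution before running the deletion.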