# Monitoring Stack

[Prometheus](https://prometheus.io/) is an open-source monitoring and alerting tool that collects metrics from various sources, stores them efficiently, and lets you query and visualize them.

- **Key Features:**
  - Data Collection
  - Metrics Storage
  - Query Language
  - Alerting
  - Service Discovery
  - Visualizations
  - Exporters

The [kube-prometheus-stack](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/) is pre-configured to gather metrics from all Kubernetes components, making it ideal for cluster monitoring. It provides a standard set of dashboards and alerting rules, many of which originate from the [kubernetes-mixin](https://github.com/kubernetes-monitoring/kubernetes-mixin/) project.

## Components of the kube-prometheus-stack

1. **Prometheus Operator**: Manages Prometheus instances in your cluster.
2. **Grafana**: Visualizes metrics and plots data using dashboards.
3. **Alertmanager**: Sends notifications (e.g., PagerDuty, Slack, email) based on alerts received from the Prometheus server.

### Data Gathering

Prometheus uses a pull model: it expects services to expose a `/metrics` endpoint, which it scrapes at regular intervals. Each data point it retrieves is stored in its time series database.

Grafana queries the Prometheus time series database and lets you build graphs organized into dashboards. You can also run ad hoc queries using the PromQL language.

### Alerting

Alerts sent by client programs such as the Prometheus server are managed by the [Alertmanager](https://github.com/prometheus/alertmanager/) component. It handles deduplication, grouping, and routing to the appropriate receiver integration, such as email, PagerDuty, or Slack.
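To illustrate what "a client sending an alert" looks like, the sketch below builds a JSON payload in the shape accepted by Alertmanager's v2 API (`POST /api/v2/alerts`) and shows, in comments, how it could be posted with `curl`. The helper name `build_test_alert`, the alert values, the service name, and the `localhost:9093` address are assumptions for illustration. In a running stack, Prometheus sends these requests itself whenever an alerting rule fires.

```bash
# Build a one-element alert list in the JSON shape Alertmanager's v2 API accepts.
# Arguments: alert name, severity, summary (illustrative values only).
# Note: values are inserted verbatim, so they must not contain double quotes.
build_test_alert() {
  printf '[{"labels":{"alertname":"%s","severity":"%s"},"annotations":{"summary":"%s"}}]\n' \
    "$1" "$2" "$3"
}

# Print a sample payload.
build_test_alert "DemoAlert" "warning" "Manually injected test alert"

# To deliver it, forward the Alertmanager port and POST the payload.
# The service name below is an assumption; confirm it with `kubectl get svc -n monitoring`.
#   kubectl port-forward svc/monitoring-kube-prometheus-alertmanager -n monitoring 9093:9093
#   build_test_alert "DemoAlert" "warning" "Manually injected test alert" |
#     curl -s -X POST -H 'Content-Type: application/json' --data @- \
#       http://localhost:9093/api/v2/alerts
```

Posting the same payload repeatedly should not produce duplicate notifications, since alerts with an identical label set are deduplicated before the grouping and routing rules select a receiver.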
### Documentation

For more information, refer to the official documentation for each component:

- [Prometheus](https://prometheus.io/docs/introduction/overview/): Overview of features and configuration options.
- [Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/getting-started.md): Useful information on using the operator.
- [Alertmanager](https://prometheus.io/docs/alerting/latest/alertmanager/): Learn about Alertmanager and its integrations with various notification platforms.

---

## Setup Guide

This guide walks you through deploying Prometheus and Grafana on your E2E Kubernetes cluster with secure HTTPS access via NGINX Ingress and Let's Encrypt certificates.

**Please refer to this document to connect your Kubernetes cluster first:**

* [How to Download kubeconfig.yaml File](/docs/myaccount/kubernetes/#how-to-download-kubeconfigyaml-file)

### Step 1: Install Ingress Controller

The Ingress controller allows external access to services running inside your cluster.

```bash
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update

helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace
```

After installation, obtain the public IP (the `EXTERNAL-IP` of the controller service):

```bash
kubectl get svc -n ingress-nginx
```

> **Note:** The NGINX Ingress Controller is scheduled for retirement in March 2026.
> For new production deployments, consider using the
> [Kubernetes Gateway API](./kubernetes_gateway_api) instead.

### Step 2: Configure DNS Records

Create DNS A records pointing to the Ingress public IP:

| Type | Name | Value |
|------|------|-------|
| A | prometheus | `<INGRESS_PUBLIC_IP>` |
| A | grafana | `<INGRESS_PUBLIC_IP>` |

Replace `<INGRESS_PUBLIC_IP>` with the public IP obtained in Step 1. DNS changes may take a few minutes to propagate.
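Because propagation can take a while, it may help to confirm that both records resolve to the Ingress IP before certificates are requested. The following is a minimal sketch: `resolve_ipv4` and `dns_points_to` are hypothetical helper names, `getent` is assumed to be available (as on most Linux systems), and the hostnames and IP address in the usage comments are placeholders.

```bash
# Resolve a hostname to its first IPv4 address using the system resolver.
resolve_ipv4() {
  getent ahostsv4 "$1" | awk 'NR == 1 { print $1 }'
}

# Report whether a hostname currently resolves to the expected IP.
# Returns 0 on a match, 1 otherwise.
dns_points_to() {
  local host="$1" expected="$2" actual
  actual="$(resolve_ipv4 "$host")"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $host -> $actual"
  else
    echo "PENDING: $host -> ${actual:-unresolved} (expected $expected)"
    return 1
  fi
}

# Example usage (placeholder values):
#   dns_points_to prometheus.yourdomain.com 203.0.113.10
#   dns_points_to grafana.yourdomain.com 203.0.113.10
```

A `PENDING` result usually just means the record has not propagated yet; re-running the check after a few minutes is typically enough.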
### Step 3: Install Prometheus and Grafana

```bash
kubectl create namespace monitoring

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

helm install monitoring prometheus-community/kube-prometheus-stack \
  -n monitoring
```

Verify pods are running:

```bash
kubectl get pods -n monitoring
```

### Step 4: Enable SSL (HTTPS)

To secure access, SSL certificates are automatically issued using Let's Encrypt.

**Install cert-manager:**

```bash
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/latest/download/cert-manager.crds.yaml

helm repo add jetstack https://charts.jetstack.io
helm repo update

helm install cert-manager jetstack/cert-manager \
  -n cert-manager \
  --create-namespace
```

Verify:

```bash
kubectl get pods -n cert-manager
```

### Step 5: Create SSL Issuer

Create `cluster-issuer.yaml`:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: your-email@yourdomain.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
```

Apply the configuration:

```bash
kubectl apply -f cluster-issuer.yaml
kubectl get clusterissuer
```

### Step 6: Expose Monitoring Using a Single Ingress

Create one Ingress resource for both Prometheus and Grafana.
Create `monitoring-ingress.yaml`:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: monitoring-ingress
  namespace: monitoring
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts:
        - prometheus.yourdomain.com
        - grafana.yourdomain.com
      secretName: monitoring-tls
  rules:
    - host: prometheus.yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: monitoring-kube-prometheus-prometheus
                port:
                  number: 9090
    - host: grafana.yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: monitoring-grafana
                port:
                  number: 80
```

Apply the Ingress:

```bash
kubectl apply -f monitoring-ingress.yaml
```

Verify the Ingress:

```bash
kubectl get ingress -n monitoring
```

### Step 7: Access Monitoring Services

Once the SSL certificate is issued, access the services using your browser:

- **Prometheus:** `https://prometheus.yourdomain.com`
- **Grafana:** `https://grafana.yourdomain.com`

**Grafana default login:**

- **Username:** `admin`
- **Password:** Retrieve using the command below:

```bash
kubectl --namespace monitoring get secrets monitoring-grafana \
  -o jsonpath="{.data.admin-password}" | base64 -d ; echo
```

### Step 8: Import a Sample Dashboard

Grafana provides ready-made dashboards for Kubernetes.

1. Log in to Grafana
2. Select **Dashboards → Import**
3. Enter **Dashboard ID:** `15661`
4. Select **Prometheus** as the data source
5. Click **Import**

You will now see cluster-level metrics including node resource overview, CPU/memory usage, and network traffic.

---

## Advanced Configuration

### Port Forwarding Access (Alternative)

If you prefer to access monitoring without Ingress, use port forwarding:

```bash
# List services
kubectl get svc -n monitoring

# Access Prometheus. Replace the placeholder with the Prometheus service name
# from the list above (Step 6 uses monitoring-kube-prometheus-prometheus).
kubectl port-forward svc/<PROMETHEUS_SERVICE_NAME> -n monitoring 9090:9090
```

Navigate to `http://localhost:9090`.
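With the port-forward active, PromQL queries can also be sent to the Prometheus HTTP API instead of the web UI. The sketch below wraps `curl` in a hypothetical `prom_query` helper. `PROM_URL` is an assumed variable that defaults to the forwarded address, and the `up` query is only an example.

```bash
# Base URL of the forwarded Prometheus instance (override if needed).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Run an instant PromQL query via GET /api/v1/query.
# -G makes curl send the URL-encoded data as query-string parameters.
prom_query() {
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# Example usage: show each scrape target and whether it is up (1) or down (0).
#   prom_query 'up'
```

The response is JSON, so it can be piped into a tool such as `jq` if one is installed.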
To see discovered targets: `http://localhost:9090/targets`

```bash
# Access Grafana
kubectl port-forward svc/monitoring-grafana 3000:80 -n monitoring
```

Navigate to `http://localhost:3000`.

### Configuring ServiceMonitors for Prometheus

To monitor applications in your cluster, define a `ServiceMonitor` resource. This custom resource (CRD) is provided by the Prometheus Operator and allows you to add new services for monitoring.

A typical ServiceMonitor configuration:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
    - port: web
```

- **`spec.selector.matchLabels.app`**: Tells the ServiceMonitor which Services to monitor, based on a label.
- **`spec.endpoints.port`**: The name of the Service port that exposes the application's metrics.

By default, the Prometheus instance deployed by this chart only picks up ServiceMonitors labeled `release: <Helm release name>` (`release: monitoring` in this guide), unless `prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues` is set to `false`.

The `kube-prometheus-stack` Helm values file contains an `additionalServiceMonitors` section under the `prometheus` key where you can define additional services. Example for NGINX Ingress Controller monitoring (this assumes the controller exposes its metrics on a Service port named `metrics`):

```yaml
prometheus:
  additionalServiceMonitors:
    - name: "ingress-nginx-monitor"
      selector:
        matchLabels:
          app.kubernetes.io/name: ingress-nginx
      namespaceSelector:
        matchNames:
          - ingress-nginx
      endpoints:
        - port: "metrics"
```

After adding services to monitor, upgrade the stack to apply changes:

```bash
helm upgrade monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  -f values.yaml
```

### Tweaking Helm Chart Values

Inspect all available options and default values for the kube-prometheus-stack Helm chart:

```bash
helm show values prometheus-community/kube-prometheus-stack
```

After tweaking the Helm values file (`values.yaml`) according to your needs, apply the changes. Set `--version` to the chart version you are running (`48.6.0` below is only an example); a different version also upgrades or downgrades the chart:

```bash
helm upgrade monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --version 48.6.0 \
  -f values.yaml
```

> **Security Note:** Exposing monitoring services publicly via LoadBalancer is not recommended without proper access
> controls. Ensure the service is protected using authentication, an ingress controller with TLS, or restrict access to trusted IP ranges. Prefer private networking or VPN-based access where possible.

---

## Upgrading Kubernetes Prometheus Stack

Check available versions on the [`kube-prometheus-stack`](https://github.com/prometheus-community/helm-charts/releases) releases page or on [ArtifactHUB](https://artifacthub.io/packages/helm/prometheus-community/kube-prometheus-stack).

```bash
helm upgrade monitoring prometheus-community/kube-prometheus-stack \
  --version <KUBE_PROMETHEUS_STACK_NEW_VERSION> \
  --namespace monitoring \
  --values <YOUR_HELM_VALUES_FILE>
```

Replace `<KUBE_PROMETHEUS_STACK_NEW_VERSION>` with the target version and `<YOUR_HELM_VALUES_FILE>` with your values file path. For command documentation, see [helm upgrade](https://helm.sh/docs/helm/helm_upgrade/).

Before moving an existing release to a new major version, check the official recommendations for the various [upgrade paths](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack#upgrading-chart).

---

## Uninstalling Kubernetes Prometheus Stack

```bash
helm uninstall monitoring -n monitoring
```

Delete the namespace:

```bash
kubectl delete ns monitoring
```

CRDs created by this chart are not removed by default and should be manually cleaned up:

```bash
kubectl delete crd alertmanagerconfigs.monitoring.coreos.com
kubectl delete crd alertmanagers.monitoring.coreos.com
kubectl delete crd podmonitors.monitoring.coreos.com
kubectl delete crd probes.monitoring.coreos.com
kubectl delete crd prometheusagents.monitoring.coreos.com
kubectl delete crd prometheuses.monitoring.coreos.com
kubectl delete crd prometheusrules.monitoring.coreos.com
kubectl delete crd scrapeconfigs.monitoring.coreos.com
kubectl delete crd servicemonitors.monitoring.coreos.com
kubectl delete crd thanosrulers.monitoring.coreos.com
```

---