k3s-cluster/monitoring/grafana-deployment.yaml
Roger Oriol bf1387dc3e monitoring: add Grafana dashboards + kube-state-metrics & node-exporter
Dashboards (provisioned via ConfigMaps into Grafana pod, 'K3s Cluster' folder):
- Cluster Overview: per-namespace CPU/mem/net/fs, pod counts, pod health (KSM)
- Pods & Services: per-pod CPU/mem/net/fs, throttling, pod status, restarts, PVCs
- Nodes: per-node CPU%/mem%, load average, disk usage, network (node-exporter)
- Control Plane & API Server: request rate, latency p95, 5xx, kubelet/PLEG
- Prometheus Self-Monitoring: ingestion, series, scrape duration, memory

Exporters (auto-scraped via existing kubernetes-service-endpoints job):
- kube-state-metrics: pod/deployment/PVC/replica state (kube_pod_status_phase,
  kube_pod_container_status_restarts_total, kube_persistentvolumeclaim_*)
- node-exporter (DaemonSet, hostNetwork): node_cpu_seconds_total,
  node_memory_*, node_filesystem_*, node_load*, node_network_*
2026-06-26 19:48:17 +02:00
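
The commit message says both exporters are picked up by the existing kubernetes-service-endpoints job. A minimal sketch of how a Service might opt in, assuming that job follows the common annotation-based relabeling convention (`prometheus.io/scrape`, `prometheus.io/port`). The Service name, labels, namespace, and port 8080 are illustrative assumptions and are not taken from this repository:

```yaml
# Hypothetical Service for kube-state-metrics; name, labels, and port are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: kube-state-metrics
  namespace: monitoring
  annotations:
    # Only honored if the scrape job's relabel_configs key on these annotations.
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
spec:
  selector:
    app: kube-state-metrics
  ports:
    - name: http-metrics
      port: 8080
      targetPort: 8080
```

If the job in this cluster selects targets some other way (for example by label), the annotations above would have no effect and the Service would need to match that selector instead.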

apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: monitoring
  labels:
    app: grafana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
        - name: grafana
          image: grafana/grafana:10.2.3
          ports:
            - containerPort: 3000
              name: http
          env:
            - name: GF_SECURITY_ADMIN_USER
              value: "admin"
            - name: GF_SECURITY_ADMIN_PASSWORD
              value: "admin"
            - name: GF_INSTALL_PLUGINS
              value: ""
          volumeMounts:
            - name: grafana-storage
              mountPath: /var/lib/grafana
            - name: grafana-datasources
              mountPath: /etc/grafana/provisioning/datasources
            - name: grafana-dashboard-provider
              mountPath: /etc/grafana/provisioning/dashboards
            - name: dashboards-cluster-overview
              mountPath: /var/lib/grafana/dashboards/cluster-overview
            - name: dashboards-pods
              mountPath: /var/lib/grafana/dashboards/pods
            - name: dashboards-nodes
              mountPath: /var/lib/grafana/dashboards/nodes
            - name: dashboards-control-plane
              mountPath: /var/lib/grafana/dashboards/control-plane
            - name: dashboards-prometheus
              mountPath: /var/lib/grafana/dashboards/prometheus
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
      volumes:
        - name: grafana-storage
          persistentVolumeClaim:
            claimName: grafana-storage
        - name: grafana-datasources
          configMap:
            name: grafana-datasources
        - name: grafana-dashboard-provider
          configMap:
            name: grafana-dashboard-provider
        - name: dashboards-cluster-overview
          configMap:
            name: grafana-dashboard-cluster-overview
        - name: dashboards-pods
          configMap:
            name: grafana-dashboard-pods
        - name: dashboards-nodes
          configMap:
            name: grafana-dashboard-nodes
        - name: dashboards-control-plane
          configMap:
            name: grafana-dashboard-control-plane
        - name: dashboards-prometheus
          configMap:
            name: grafana-dashboard-prometheus
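
The Deployment mounts a `grafana-dashboard-provider` ConfigMap at `/etc/grafana/provisioning/dashboards` and one ConfigMap per dashboard under `/var/lib/grafana/dashboards/`. The provider ConfigMap itself is not shown in this file, so the following is only a sketch of what it might contain, assuming Grafana's standard file-based dashboard provisioning. The data key `dashboards.yaml`, the provider name, and the update interval are assumptions; the folder name 'K3s Cluster' comes from the commit message:

```yaml
# Hypothetical contents of the grafana-dashboard-provider ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboard-provider
  namespace: monitoring
data:
  dashboards.yaml: |
    apiVersion: 1
    providers:
      - name: k3s-cluster            # assumed provider name
        orgId: 1
        folder: 'K3s Cluster'        # folder named in the commit message
        type: file
        disableDeletion: false
        updateIntervalSeconds: 30    # assumed rescan interval
        options:
          # Parent of the per-dashboard mount paths used by the Deployment.
          path: /var/lib/grafana/dashboards
```

Pointing `path` at the parent directory lets one provider cover all five mounted subdirectories, so adding a dashboard would only need a new ConfigMap plus a matching volume and volumeMount in the Deployment.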