memory fixes
195
RASPBERRY_PI_SCHEDULING_FIX.md
Normal file
@@ -0,0 +1,195 @@
# Raspberry Pi Node Scheduling Fix - Implementation Guide

## Problem Summary

Your Raspberry Pi node (4GB RAM) keeps crashing because high-resource applications are scheduling on it instead of on nodes with more capacity.

## Root Causes Identified

1. **High-memory applications without node targeting:**
   - n8n PostgreSQL: 2-4Gi memory requirements
   - Minecraft server: 1-4Gi memory requirements
   - OpenWebUI: 1-2Gi memory requirements
   - Phoenix services: 512Mi-2Gi memory requirements
   - Jellyfin: 512Mi-2Gi memory requirements

2. **Missing node selectors:** Only Gitea services target ARM64 architecture
3. **No taints/tolerations:** Raspberry Pi node isn't protected from heavy workloads
4. **Resource limits missing:** Some applications can consume unlimited resources

## Solution Applied

### Modified Files with Node Selectors (Prevent RPi Scheduling)

✅ **Updated these manifests to include a `nodeSelector` of `hardware: high-memory`:**

1. `/n8n/postgres-deployment.yaml` - PostgreSQL (2-4Gi memory)
2. `/minecraft-server/ss.yaml` - Minecraft server (1-4Gi memory)
3. `/openwebui/openwebui.yaml` - OpenWebUI (1-2Gi memory)
4. `/phoenix/phoenix-statefulset.yaml` - Phoenix app (512Mi-2Gi memory)
5. `/phoenix/postgres-statefulset.yaml` - Phoenix PostgreSQL (256Mi-1Gi memory)
6. `/jellyfin/jellyfin.yaml` - Jellyfin media server (512Mi-2Gi memory)
7. `/monitoring/prometheus-deployment.yaml` - Prometheus (512Mi-1Gi memory)
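
If you want to try the selector on a single workload before editing its manifest, the same change can be expressed as a JSON merge patch. This is only a sketch and not part of this commit: `node_selector_patch` is a hypothetical helper name, and a patched workload will differ from its manifest until the file is updated and reapplied.

```bash
# Hypothetical helper: print a JSON merge patch that sets one nodeSelector
# entry on the pod template of a Deployment or StatefulSet.
node_selector_patch() {
  printf '{"spec":{"template":{"spec":{"nodeSelector":{"%s":"%s"}}}}}\n' "$1" "$2"
}

# Print the patch for the label used in this guide.
node_selector_patch hardware high-memory

# Example of applying it (fill in the placeholders yourself):
#   kubectl patch deployment <name> -n <namespace> --type merge \
#     -p "$(node_selector_patch hardware high-memory)"
```

Because the patch only touches `spec.template.spec.nodeSelector`, the controller rolls the pods and the scheduler places the replacements on a labeled node, provided Step 1 below has already been done.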

### Implementation Steps

#### Step 1: Label and Taint Your Nodes

```bash
# 1. Identify your nodes
kubectl get nodes -o wide

# 2. Label your powerful nodes
kubectl label nodes <powerful-node-1> hardware=high-memory
kubectl label nodes <powerful-node-2> hardware=high-memory

# 3. Label your Raspberry Pi node
kubectl label nodes <raspberry-pi-node> hardware=low-memory
kubectl label nodes <raspberry-pi-node> node-type=raspberry-pi

# 4. Taint the Raspberry Pi to prevent most workloads
#    (NoSchedule only blocks new pods; pods already on the Pi stay until deleted in Step 3)
kubectl taint nodes <raspberry-pi-node> node-type=raspberry-pi:NoSchedule
```
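
Before moving on, it can help to confirm that the taint actually landed on the Pi. The sketch below assumes you feed it one taint per line in `key=value:effect` form; `has_taint` is a hypothetical helper, and the jsonpath expression in the comment is one way to produce that format.

```bash
# Hypothetical helper: succeed if stdin (one "key=value:effect" taint per line)
# contains exactly the taint passed as the first argument.
has_taint() {
  grep -qxF -- "$1"
}

# Example (fill in the node name yourself):
#   kubectl get node <raspberry-pi-node> \
#     -o jsonpath='{range .spec.taints[*]}{.key}={.value}:{.effect}{"\n"}{end}' \
#     | has_taint node-type=raspberry-pi:NoSchedule && echo "taint present"
```

The match is on the whole line, so a taint with the same key but a different value or effect does not count as present.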

#### Step 2: Apply Updated Manifests

```bash
# Apply all updated manifests
kubectl apply -f n8n/postgres-deployment.yaml
kubectl apply -f minecraft-server/ss.yaml
kubectl apply -f openwebui/openwebui.yaml
kubectl apply -f phoenix/phoenix-statefulset.yaml
kubectl apply -f phoenix/postgres-statefulset.yaml
kubectl apply -f jellyfin/jellyfin.yaml
kubectl apply -f monitoring/prometheus-deployment.yaml
```

#### Step 3: Force Reschedule Existing Pods

```bash
# Delete existing pods to force rescheduling on correct nodes
kubectl delete pods -n n8n -l service=postgres-n8n
kubectl delete pods -n minecraft -l app=minecraft-server
kubectl delete pods -l app=open-webui
kubectl delete pods -n phoenix -l app=phoenix
kubectl delete pods -n phoenix -l app=postgres
kubectl delete pods -n jellyfin -l app=jellyfin
kubectl delete pods -n monitoring -l app=prometheus
```

#### Step 4: Verify Pod Scheduling

```bash
# Check where pods are scheduled (the OpenWebUI pods are named open-webui)
kubectl get pods -o wide --all-namespaces | grep -E "(n8n|minecraft|open-webui|phoenix|jellyfin|prometheus)"

# Verify node resource usage
kubectl top nodes

# Check events for scheduling issues in every namespace, not only the default one
kubectl get events --all-namespaces --sort-by='.lastTimestamp' | tail -20
```
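
The `grep` in Step 4 shows where the heavy pods are, but you still have to read the node column yourself. A small filter can flag only the pods that are both heavy and on the Pi. This is a sketch: `heavy_on_node` is a hypothetical helper, and the name patterns are assumptions taken from the grep patterns used in this commit, so adjust them to your actual pod names.

```bash
# Hypothetical helper: read "POD NODE" lines on stdin and print the pods that
# run on the node given as $1 and whose name matches a heavy workload.
heavy_on_node() {
  awk -v node="$1" '
    $2 == node && $1 ~ /postgres|minecraft|open-webui|phoenix|jellyfin|prometheus/ { print $1 }
  '
}

# Example (fill in the node name yourself):
#   kubectl get pods --all-namespaces --no-headers \
#     -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName \
#     | heavy_on_node <raspberry-pi-node>
```

An empty result means none of the listed workloads is on that node. Using `custom-columns` keeps the input to two fields per line, which avoids the column shifts that `-o wide` output can have.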

### Optional: Add Tolerations for Lightweight Services

For services that CAN run on Raspberry Pi, add tolerations:

```yaml
# Example for Pi-hole (good candidate for RPi)
spec:
  template:
    spec:
      tolerations:
      - key: "node-type"
        operator: "Equal"
        value: "raspberry-pi"
        effect: "NoSchedule"
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
              - key: node-type
                operator: In
                values: ["raspberry-pi"]
```
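
A toleration only permits scheduling onto a tainted node; it does not attract the pod there, which is why the example above also adds a node affinity preference. The matching rule itself is simple enough to sketch in shell. This is an illustration of the rule as documented for Kubernetes, not code the scheduler runs, and `tolerates` is a hypothetical helper name.

```bash
# Hypothetical helper illustrating toleration matching.
# Usage: tolerates T_KEY T_OPERATOR T_VALUE T_EFFECT TAINT_KEY TAINT_VALUE TAINT_EFFECT
tolerates() {
  local t_key="$1"
  local t_op="${2:-Equal}"      # Equal is the default operator
  local t_value="$3"
  local t_effect="$4"
  local taint_key="$5"
  local taint_value="$6"
  local taint_effect="$7"

  # An empty toleration effect matches every taint effect.
  if [ -n "$t_effect" ] && [ "$t_effect" != "$taint_effect" ]; then
    return 1
  fi
  # An empty key is only valid with Exists and then matches every taint key.
  if [ -z "$t_key" ]; then
    [ "$t_op" = "Exists" ]
    return
  fi
  [ "$t_key" = "$taint_key" ] || return 1
  case "$t_op" in
    Exists) return 0 ;;
    Equal)  [ "$t_value" = "$taint_value" ] ;;
    *)      return 1 ;;
  esac
}

# The toleration from the example above against the taint from Step 1.
if tolerates node-type Equal raspberry-pi NoSchedule node-type raspberry-pi NoSchedule; then
  echo "pod may be scheduled onto the tainted node"
fi
```

Note that the value must match exactly with `Equal`: a pod tolerating `node-type=worker` is still kept off a node tainted `node-type=raspberry-pi`.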

**Good candidates for Raspberry Pi:**
- Pi-hole (DNS filtering)
- Home Assistant (IoT hub)
- Fava (lightweight accounting)
- Vaultwarden (password manager)
- Glance (dashboard)

### Monitoring and Validation

#### Check Resource Usage
```bash
# Monitor node resource consumption
kubectl top nodes
kubectl top pods --all-namespaces --sort-by=memory

# Check pod distribution across nodes
# (custom-columns avoids counting the header row and the column shifts of -o wide output)
kubectl get pods --all-namespaces --no-headers -o custom-columns=NODE:.spec.nodeName | sort | uniq -c
```
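
To turn the `kubectl top nodes` output into a quick alert, you can filter it by memory percentage. The sketch below assumes the usual five-column layout (name, CPU, CPU%, memory, memory%); `nodes_over_mem` is a hypothetical helper and the 80 percent threshold in the example is an arbitrary choice.

```bash
# Hypothetical helper: read `kubectl top nodes --no-headers` output on stdin
# and print the nodes whose memory percentage is above the limit given as $1.
nodes_over_mem() {
  awk -v limit="$1" '{
    pct = $5
    sub(/%$/, "", pct)                      # strip the trailing percent sign
    if (pct + 0 > limit + 0) print $1, $5   # non-numeric values count as 0
  }'
}

# Example:
#   kubectl top nodes --no-headers | nodes_over_mem 80
```

If the Pi shows up here regularly even after the heavy workloads have moved, the remaining lightweight services may need tighter memory limits.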

#### Verify Scheduling Constraints
```bash
# Check node labels and taints
kubectl get nodes --show-labels
kubectl describe nodes | grep -E "(Name:|Taints:|Labels:)"

# Verify no high-memory pods on RPi
# (a field selector matches the node exactly instead of grepping the text output)
kubectl get pods -o wide --all-namespaces --field-selector spec.nodeName=<raspberry-pi-node-name>
```

## Troubleshooting

### If Pods Stay Pending
```bash
# Check why pods can't be scheduled
kubectl describe pod <pending-pod-name> -n <namespace>

# Common issues:
# - No node carries the label required by the pod's nodeSelector
# - Resource requests too high for the nodes the pod is allowed on
# - Pod lacks a toleration for the taint on the only node that would otherwise fit
```
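
The `Events` section of `kubectl describe pod` usually contains a `FailedScheduling` message that points at one of the causes above. The sketch below maps such a message to a likely cause. The matched phrases follow common kube-scheduler wording, which is an assumption here because it can differ between Kubernetes versions, and `pending_reason` is a hypothetical helper.

```bash
# Hypothetical helper: map a FailedScheduling message to a likely cause.
# Only the first matching phrase is reported, even if the message lists several.
pending_reason() {
  case "$1" in
    *"node affinity/selector"*)
      echo "no node carries the label required by nodeSelector" ;;
    *"untolerated taint"*)
      echo "pod lacks a toleration for a node taint" ;;
    *"Insufficient memory"*|*"Insufficient cpu"*)
      echo "resource requests exceed free capacity" ;;
    *)
      echo "unrecognized; read the full event text" ;;
  esac
}

# Example: list the raw messages first, then classify one by hand.
#   kubectl get events -n <namespace> --field-selector reason=FailedScheduling
#   pending_reason "<message text copied from the event>"
```

A selector mismatch right after applying this guide most often means Step 1 was skipped on one of the powerful nodes.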

### If You Need to Rollback
```bash
# Remove node selectors from manifests and reapply
# Remove taints from Raspberry Pi
kubectl taint nodes <raspberry-pi-node> node-type=raspberry-pi:NoSchedule-

# Remove labels if needed
kubectl label nodes <node-name> hardware-
kubectl label nodes <node-name> node-type-
```

## Expected Results

After implementation:
1. **High-resource applications** will only schedule on powerful nodes
2. **Raspberry Pi node** will be protected from resource-heavy workloads
3. **Cluster stability** will improve with proper resource distribution
4. **Pi node crashes** should stop occurring
5. **Lightweight services** can still run on Pi (with tolerations)

## Architecture Summary

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Powerful      │    │   Powerful      │    │  Raspberry Pi   │
│   Node 1        │    │   Node 2        │    │  Node (4GB)     │
│                 │    │                 │    │                 │
│ • n8n Postgres  │    │ • Minecraft     │    │ • Pi-hole       │
│ • Phoenix       │    │ • OpenWebUI     │    │ • Glance        │
│ • Jellyfin      │    │ • Prometheus    │    │ • Fava          │
│ • Grafana       │    │ • Other apps    │    │ • Vaultwarden   │
│                 │    │                 │    │ • Home Asst     │
└─────────────────┘    └─────────────────┘    └─────────────────┘
hardware=high-memory   hardware=high-memory   hardware=low-memory
                                              TAINTED (protected)
```

The taint keeps heavy workloads off the Raspberry Pi, while lightweight services that carry a matching toleration can still run there.
@@ -38,6 +38,9 @@ spec:
       labels:
         app: jellyfin
     spec:
+      # Prevent scheduling on Raspberry Pi due to high resource requirements (512Mi-2Gi memory, 500m-2000m CPU)
+      nodeSelector:
+        hardware: high-memory
       containers:
       - name: jellyfin
         image: jellyfin/jellyfin:latest
@@ -13,6 +13,9 @@ spec:
       labels:
         app: minecraft-server
     spec:
+      # Prevent scheduling on Raspberry Pi due to high resource requirements (1Gi-4Gi memory, 1-2 CPU)
+      nodeSelector:
+        hardware: high-memory
       containers:
       - name: minecraft-server
         image: itzg/minecraft-server:latest # Or specific version if needed
@@ -15,6 +15,9 @@ spec:
       labels:
         app: prometheus
     spec:
+      # Prevent scheduling on Raspberry Pi due to resource requirements (512Mi-1Gi memory, 500m-1000m CPU)
+      nodeSelector:
+        hardware: high-memory
       serviceAccountName: prometheus
       containers:
       - name: prometheus
@@ -20,6 +20,9 @@ spec:
       labels:
         service: postgres-n8n
     spec:
+      # Prevent scheduling on Raspberry Pi due to high memory requirements (2-4Gi)
+      nodeSelector:
+        hardware: high-memory
       containers:
       - image: postgres:18
         name: postgres
45
node-management-commands.md
Normal file
@@ -0,0 +1,45 @@
# Node Management Commands for Raspberry Pi Scheduling Issues

## 1. Taint the Raspberry Pi Node (Recommended Approach)

```bash
# Find your Raspberry Pi node name
kubectl get nodes -o wide

# Taint the Raspberry Pi node to prevent scheduling (except for tolerating pods)
kubectl taint nodes <raspberry-pi-node-name> node-type=raspberry-pi:NoSchedule

# Alternative: Use a more descriptive taint (instead of the one above, not in addition to it;
# tolerations must then use the key "hardware" and the value "low-memory")
kubectl taint nodes <raspberry-pi-node-name> hardware=low-memory:NoSchedule
```

## 2. Label Nodes for Better Targeting

```bash
# Label your Raspberry Pi node
kubectl label nodes <raspberry-pi-node-name> node-type=raspberry-pi
kubectl label nodes <raspberry-pi-node-name> hardware=low-memory

# Label your more powerful nodes
kubectl label nodes <powerful-node-1> node-type=worker
kubectl label nodes <powerful-node-1> hardware=high-memory
kubectl label nodes <powerful-node-2> node-type=worker
kubectl label nodes <powerful-node-2> hardware=high-memory
```

## 3. Verify Node Configuration

```bash
# Check node labels and taints
kubectl describe nodes

# See which nodes have what resources available
kubectl describe nodes | grep -A 5 "Allocatable"
```
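
When deciding which nodes deserve the `hardware=high-memory` label, a compact view of allocatable memory is easier to compare than the `describe` output. The sketch below assumes the value is reported in `Ki`, which is common but not guaranteed, and passes any other unit through unchanged; `alloc_mib` is a hypothetical helper.

```bash
# Hypothetical helper: read "NODE MEMORY" lines on stdin and convert
# values ending in Ki to whole MiB; other units are printed as they are.
alloc_mib() {
  awk '{
    mem = $2
    if (mem ~ /Ki$/) {
      sub(/Ki$/, "", mem)
      printf "%s %dMi\n", $1, mem / 1024
    } else {
      print $1, $2
    }
  }'
}

# Example:
#   kubectl get nodes --no-headers \
#     -o custom-columns=NAME:.metadata.name,MEM:.status.allocatable.memory \
#     | alloc_mib
```

Allocatable memory is what the scheduler works with, and it is lower than the physical RAM of the node because the system and kubelet reserve part of it.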

## 4. Remove Taint if Needed

```bash
# Remove the taint if you need to rollback
kubectl taint nodes <raspberry-pi-node-name> node-type=raspberry-pi:NoSchedule-
```
@@ -25,6 +25,9 @@ spec:
       labels:
         app: open-webui
     spec:
+      # Prevent scheduling on Raspberry Pi due to high resource requirements (1Gi-2Gi memory, 1-2 CPU)
+      nodeSelector:
+        hardware: high-memory
       volumes:
       - name: webui-data
         persistentVolumeClaim:
@@ -42,6 +42,9 @@ spec:
       labels:
         app: phoenix
     spec:
+      # Prevent scheduling on Raspberry Pi due to high resource requirements (512Mi-2Gi memory, 500m-2000m CPU)
+      nodeSelector:
+        hardware: high-memory
       initContainers:
       - name: wait-for-postgres
         image: busybox:1.36
@@ -33,6 +33,9 @@ spec:
       labels:
         app: postgres
     spec:
+      # Prevent scheduling on Raspberry Pi due to resource requirements (256Mi-1Gi memory, 250m-1000m CPU)
+      nodeSelector:
+        hardware: high-memory
       containers:
       - name: postgres
         image: postgres:16
79
raspberry-pi-tolerations-examples.yaml
Normal file
@@ -0,0 +1,79 @@
# Examples of tolerations for services that SHOULD run on Raspberry Pi
# These services have low resource requirements and can benefit from Pi-specific features

# 1. Pi-hole - Perfect for Raspberry Pi (DNS filtering, network service)
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pihole
spec:
  template:
    spec:
      # Allow scheduling on Raspberry Pi
      tolerations:
      - key: "node-type"
        operator: "Equal"
        value: "raspberry-pi"
        effect: "NoSchedule"
      # Prefer Raspberry Pi for network services
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
              - key: node-type
                operator: In
                values: ["raspberry-pi"]

# 2. Home Assistant - May benefit from running on Pi for local device access
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: home-assistant
  namespace: home-assistant
spec:
  template:
    spec:
      # Allow scheduling on Raspberry Pi (good for IoT hub role)
      tolerations:
      - key: "node-type"
        operator: "Equal"
        value: "raspberry-pi"
        effect: "NoSchedule"
      # Prefer Raspberry Pi for home automation
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 80
            preference:
              matchExpressions:
              - key: node-type
                operator: In
                values: ["raspberry-pi"]

# 3. Lightweight services (Fava, Vaultwarden, Glance)
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lightweight-service-example
spec:
  template:
    spec:
      # Allow scheduling on Raspberry Pi for lightweight workloads
      tolerations:
      - key: "node-type"
        operator: "Equal"
        value: "raspberry-pi"
        effect: "NoSchedule"
      # No preference - let scheduler decide based on resource availability
      resources:  # in a full manifest this block belongs under each containers[] entry
        requests:
          memory: "64Mi"
          cpu: "50m"
        limits:
          memory: "512Mi"
          cpu: "500m"
118
validate-scheduling.sh
Executable file
@@ -0,0 +1,118 @@
#!/bin/bash

# Raspberry Pi K3s Scheduling Validation Script
# Run this to check your cluster configuration and pod distribution

echo "=== Kubernetes Node Analysis ==="
echo

echo "1. Node Overview:"
kubectl get nodes -o wide
echo

echo "2. Node Resource Capacity:"
kubectl describe nodes | grep -A 5 "Allocatable:"
echo

echo "3. Node Labels and Taints:"
kubectl get nodes --show-labels
echo
kubectl describe nodes | grep -E "(Name:|Taints:)" | grep -A 1 "Name:"
echo

echo "=== Pod Distribution Analysis ==="
echo

echo "4. High-Resource Pods Location:"
echo "Checking where memory-intensive applications are scheduled..."
echo

echo "n8n PostgreSQL pods:"
kubectl get pods -n n8n -o wide | grep postgres || echo "No n8n postgres pods found"
echo

echo "Minecraft server pods:"
kubectl get pods -n minecraft -o wide || echo "No minecraft pods found"
echo

echo "OpenWebUI pods:"
kubectl get pods -o wide | grep open-webui || echo "No OpenWebUI pods found"
echo

echo "Phoenix pods:"
kubectl get pods -n phoenix -o wide || echo "No Phoenix pods found"
echo

echo "Jellyfin pods:"
kubectl get pods -n jellyfin -o wide || echo "No Jellyfin pods found"
echo

echo "Prometheus pods:"
kubectl get pods -n monitoring -o wide | grep prometheus || echo "No Prometheus pods found"
echo

echo "=== Resource Usage ==="
echo

echo "5. Current Node Resource Usage:"
kubectl top nodes 2>/dev/null || echo "Metrics server not available - install with: kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml"
echo

echo "6. Top Memory-Consuming Pods:"
# Capture the output first: in "kubectl ... | head || echo", the fallback
# would test head's exit status and never fire when kubectl fails.
if TOP_PODS=$(kubectl top pods --all-namespaces --sort-by=memory 2>/dev/null); then
    echo "$TOP_PODS" | head -10
else
    echo "Metrics server not available"
fi
echo

echo "=== Pod Events (Recent Issues) ==="
echo

echo "7. Recent Pod Scheduling Events:"
kubectl get events --all-namespaces --sort-by='.lastTimestamp' | grep -E "(Failed|Error|Warning)" | tail -10
echo

echo "=== Validation Summary ==="
echo

# Count pods per node (custom-columns avoids the column shifts of -o wide output)
echo "8. Pod Distribution Per Node:"
echo "Node                    | Pod Count"
echo "------------------------|---------"
kubectl get pods --all-namespaces --no-headers -o custom-columns=NODE:.spec.nodeName | sort | uniq -c | awk '{printf "%-24s| %s\n", $2, $1}'
echo

echo "=== Recommendations ==="
echo

# Check if any high-resource pods are on wrong nodes
echo "9. Checking for Potential Issues:"

# Get Raspberry Pi node name (assumes the first ARM64 node is the Raspberry Pi).
# jsonpath prints all matches space-separated on one line, so take the first field.
RPI_NODE=$(kubectl get nodes -o jsonpath='{.items[?(@.status.nodeInfo.architecture=="arm64")].metadata.name}' | awk '{print $1}')

if [ -n "$RPI_NODE" ]; then
    echo "Detected Raspberry Pi node: $RPI_NODE"

    # Check if high-resource pods are on RPi (OpenWebUI pods are named open-webui)
    HIGH_MEM_PODS=$(kubectl get pods --all-namespaces -o wide --no-headers --field-selector spec.nodeName="$RPI_NODE" | grep -E "(postgres|minecraft|phoenix|jellyfin|prometheus|open-webui)")

    if [ -n "$HIGH_MEM_PODS" ]; then
        echo "⚠️ WARNING: High-resource pods found on Raspberry Pi node:"
        echo "$HIGH_MEM_PODS"
        echo
        echo "These pods should be moved to more powerful nodes."
    else
        echo "✅ Good: No high-resource pods detected on Raspberry Pi node."
    fi
else
    echo "ℹ️ Could not auto-detect Raspberry Pi node. Please check manually."
fi

echo
echo "=== Next Steps ==="
echo
echo "If you see high-resource pods on your Raspberry Pi node:"
echo "1. Apply the node labels: kubectl label nodes <powerful-node> hardware=high-memory"
echo "2. Apply the taint: kubectl taint nodes <rpi-node> node-type=raspberry-pi:NoSchedule"
echo "3. Apply updated manifests with nodeSelectors"
echo "4. Delete problematic pods to force rescheduling"
echo
echo "See RASPBERRY_PI_SCHEDULING_FIX.md for detailed instructions."