Homelab Kubernetes Platform (RKE2 + Rancher)

A Proxmox-hosted RKE2 cluster operated via Rancher, exposing internal services through ingress-nginx + Pi-hole DNS and increasingly managed via GitOps (Kustomize + Argo CD). Built to practice real-world platform ops: networking, ingress, storage topology, rollouts, backups, and runbook documentation.

Proxmox VE, Linux, Kubernetes (RKE2), Rancher, ingress-nginx, Argo CD, Kustomize, Pi-hole, Homepage, Wiki.js, Uptime Kuma

Overview

I run a production-style homelab Kubernetes platform on Proxmox using RKE2, with Rancher as the primary operator interface. Internal services are exposed via ingress-nginx and resolved through Pi-hole (*.homelab) so apps have stable hostnames instead of NodePorts.
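To make the "stable hostname" idea concrete, here is a minimal sketch of the kind of Ingress object that maps a `*.homelab` name to a Service. The app name `wiki`, port `3000`, the `default` namespace, and the `nginx` ingress class are illustrative assumptions, not values taken from this cluster; only the `networking.k8s.io/v1` Ingress schema itself is standard.

```python
import json


def ingress_manifest(app, service_port, domain="homelab",
                     namespace="default", ingress_class="nginx"):
    """Build a networking.k8s.io/v1 Ingress for <app>.<domain>.

    All argument values used below are hypothetical examples.
    """
    host = f"{app}.{domain}"
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": app, "namespace": namespace},
        "spec": {
            # Assumed class name; check the class your controller registers.
            "ingressClassName": ingress_class,
            "rules": [{
                "host": host,
                "http": {"paths": [{
                    "path": "/",
                    "pathType": "Prefix",
                    "backend": {
                        "service": {"name": app,
                                    "port": {"number": service_port}}
                    },
                }]},
            }],
        },
    }


if __name__ == "__main__":
    # Print the manifest as JSON, which kubectl accepts alongside YAML.
    print(json.dumps(ingress_manifest("wiki", 3000), indent=2))
```

For the hostname to work end to end, the name still has to resolve (via Pi-hole) to the address where ingress-nginx listens; the manifest only covers the routing half.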

Workloads and infrastructure are managed as code with Kustomize, and I’ve introduced Argo CD to move toward a true GitOps workflow: desired state lives in Git, changes are reviewed and reproducible, and cluster drift becomes visible and correctable.
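The drift idea can be illustrated with a toy comparison of "desired" and "live" state. This is a conceptual sketch only and not how Argo CD computes diffs; the manifests are hypothetical, and the function checks only the fields declared in Git so that server-added fields (such as `status`) are not reported.

```python
def find_drift(desired, live, path=""):
    """Return dotted paths where live state differs from desired state.

    Only keys present in `desired` are checked; extra keys in `live`
    (for example defaults or status added by the cluster) are ignored.
    """
    drift = []
    for key, want in desired.items():
        here = f"{path}.{key}" if path else key
        if key not in live:
            drift.append(here)
        elif isinstance(want, dict) and isinstance(live[key], dict):
            # Recurse into nested objects and keep the full path.
            drift.extend(find_drift(want, live[key], here))
        elif live[key] != want:
            drift.append(here)
    return drift


if __name__ == "__main__":
    # Hypothetical example: someone scaled the workload by hand.
    desired = {"spec": {"replicas": 2, "template": {"image": "app:1.0"}}}
    live = {"spec": {"replicas": 3, "template": {"image": "app:1.0"}},
            "status": {"readyReplicas": 3}}
    print(find_drift(desired, live))
```

In a GitOps workflow, a non-empty result like this is what shows up as an out-of-sync application, and correcting it means either syncing back to Git or committing the change.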

Stateful services use PVC-backed persistence, with storage constraints (node-local affinity under local-path) explicitly documented. I validate platform patterns with small, real deployments (including a Train → Store → Serve “MLOps lab” workload) and capture runbooks and troubleshooting notes in an internal wiki.
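The node-local constraint can be modelled in a few lines. This is a simplified sketch, not scheduler code: it assumes a local-path volume with `WaitForFirstConsumer` binding, and the node names are hypothetical. Before the first pod is scheduled the volume does not exist yet, so any ready node is a candidate; once provisioned, the volume pins the workload to one node.

```python
def schedulable_nodes(ready_nodes, pv_node=None):
    """Nodes where a pod using a node-local PVC could run.

    pv_node is None until the first consumer pod has been scheduled
    and the volume has been provisioned on that pod's node.
    """
    if pv_node is None:
        return sorted(ready_nodes)
    # After provisioning, the data lives on exactly one node.
    return [pv_node] if pv_node in ready_nodes else []


if __name__ == "__main__":
    print(schedulable_nodes({"worker-1", "worker-2"}))
    print(schedulable_nodes({"worker-1", "worker-2"}, pv_node="worker-1"))
    # If the pinned node is down, the pod has nowhere to go.
    print(schedulable_nodes({"worker-2"}, pv_node="worker-1"))
```

The empty result in the last case is the situation worth documenting: the pod stays unschedulable until its node returns or the data is restored elsewhere.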

What this demonstrates

Practical Kubernetes operations: deployments, services, ingress, PVCs, probes, rollouts
“Platform plumbing”: DNS + ingress as the stable interface for internal services
Storage topology awareness: designing around node-local persistence (local-path, WaitForFirstConsumer)
Rancher-first visibility while keeping infra as code
GitOps foundations: Kustomize structure + Argo CD reconciliation, diffs, and controlled sync
Operability: backups, verification steps, rollback thinking, documentation discipline
Isolated testing and sandboxing (namespaces and scoped projects)

(I keep workloads small and reproducible. This isn't a "datacenter at home"; it's a controlled learning environment.)

Operational Practices

Reliability & recovery

Proxmox snapshots before risky changes
A clear rollback plan before upgrading
Backups for configuration and critical data (plus restore testing)
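Restore testing can be as simple as comparing checksums between the original data and a restored copy. The sketch below is one possible approach using only the Python standard library; the directory layout and function names are hypothetical and not tied to any particular backup tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir, restored_dir):
    """Return relative paths that are missing or differ after a restore."""
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        dst = restored_dir / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            problems.append(str(rel))
    return problems
```

An empty list means every source file was restored byte for byte; anything else is a concrete list of files to investigate before trusting the backup.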

Observability

Track host + VM resource usage (CPU/RAM/disk)
Use dashboards as "truth surfaces" for debugging instead of guessing

