Homelab Kubernetes Platform (RKE2 + Rancher)

A Proxmox-hosted RKE2 cluster operated via Rancher, exposing internal services through ingress-nginx + Pi-hole DNS and increasingly managed via GitOps (Kustomize + Argo CD). Built to practice real-world platform ops: networking, ingress, storage topology, rollouts, backups, and runbook documentation.

Proxmox VE · Linux · Kubernetes (RKE2) · Rancher · ingress-nginx · Argo CD · Kustomize · Pi-hole · Homepage · Wiki.js · Uptime Kuma

Overview

I run a production-style homelab Kubernetes platform on Proxmox using RKE2, with Rancher as the primary operator interface. Internal services are exposed via ingress-nginx and resolved through Pi-hole (*.homelab) so apps have stable hostnames instead of NodePorts.
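To make the "stable hostnames" idea concrete, here is a minimal sketch of generating one DNS record per service, all pointing at the single ingress address. The IP, the service names, and the helper name `dns_records` are hypothetical placeholders, not values from the actual cluster. The output uses hosts-file-style `IP hostname` lines; how such records are loaded into Pi-hole depends on the Pi-hole version and is not shown.

```python
from ipaddress import ip_address

# Hypothetical values for illustration only; not taken from the real cluster.
INGRESS_IP = "192.168.1.50"
SERVICES = ["homepage", "wiki", "uptime"]
DOMAIN = "homelab"


def dns_records(ingress_ip: str, services: list[str], domain: str) -> list[str]:
    """Return hosts-file-style lines mapping each service hostname to the ingress IP."""
    ip_address(ingress_ip)  # raises ValueError if the address is malformed
    return [f"{ingress_ip} {name}.{domain}" for name in sorted(services)]


if __name__ == "__main__":
    print("\n".join(dns_records(INGRESS_IP, SERVICES, DOMAIN)))
```

Every hostname resolves to the same address, and ingress-nginx then routes on the requested host name. That is why adding a service only needs a new DNS record and an ingress rule rather than a new NodePort.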

Workloads and infrastructure are managed as code with Kustomize, and I’ve introduced Argo CD to move toward a true GitOps workflow: desired state lives in Git, changes are reviewed and reproducible, and cluster drift becomes visible and correctable.
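The notion of drift can be illustrated with a toy comparison between a desired manifest (as it would live in Git) and the live object. This is a simplified sketch, not how Argo CD computes its diffs, and the function name `find_drift` and the sample fields are assumptions made for the example.

```python
from typing import Any


def find_drift(desired: dict[str, Any], live: dict[str, Any], path: str = "") -> list[str]:
    """List fields where the live state differs from the desired state.

    Fields that exist only in the live state (for example status or
    defaults filled in by the cluster) are ignored on purpose.
    """
    drift = []
    for key, want in desired.items():
        here = f"{path}.{key}" if path else key
        if key not in live:
            drift.append(f"{here}: missing in live state")
        elif isinstance(want, dict) and isinstance(live[key], dict):
            drift.extend(find_drift(want, live[key], here))
        elif want != live[key]:
            drift.append(f"{here}: desired {want!r}, live {live[key]!r}")
    return drift


if __name__ == "__main__":
    desired = {"spec": {"replicas": 2, "image": "app:1.0"}}
    live = {"spec": {"replicas": 3, "image": "app:1.0"}, "status": {"ready": 3}}
    for line in find_drift(desired, live):
        print(line)
```

In this example someone scaled the workload by hand, so the only reported difference is the replica count. A GitOps controller surfaces that kind of mismatch and can sync the cluster back to what Git declares.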

Stateful services use PVC-backed persistence, with storage constraints (node-local affinity under local-path) explicitly documented. I validate platform patterns with small, real deployments (including a Train → Store → Serve “MLOps lab” workload) and capture runbooks and troubleshooting notes in an internal wiki.
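The node-local constraint can be sketched as a small model of where a pod with a local volume is allowed to run. This is a deliberately simplified illustration (the real scheduler weighs many more factors), and `candidate_nodes` and the node names are hypothetical.

```python
from typing import Optional


def candidate_nodes(nodes: list[str], volume_node: Optional[str]) -> list[str]:
    """Nodes a pod may be scheduled on, given where its local volume lives.

    volume_node is None while the PVC is still unbound: with
    WaitForFirstConsumer, provisioning waits for the first pod, so any
    node is still a candidate. Once the volume exists on a node, the pod
    can only run there.
    """
    if volume_node is None:
        return list(nodes)
    return [n for n in nodes if n == volume_node]


if __name__ == "__main__":
    print(candidate_nodes(["node-a", "node-b", "node-c"], None))
    print(candidate_nodes(["node-a", "node-b", "node-c"], "node-b"))
    print(candidate_nodes(["node-a", "node-c"], "node-b"))
```

The last call returns an empty list: if the node that holds the data is unavailable, the pod has nowhere to run. That is the trade-off of node-local storage and the reason it is worth documenting explicitly.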

[Screenshot: Homepage view]

What this demonstrates

  • Practical Kubernetes operations: deployments, services, ingress, PVCs, probes, rollouts
  • “Platform plumbing”: DNS + ingress as the stable interface for internal services
  • Storage topology awareness: designing around node-local persistence (local-path, WaitForFirstConsumer)
  • Rancher-first visibility while keeping infra as code
  • GitOps foundations: Kustomize structure + Argo CD reconciliation, diffs, and controlled sync
  • Operability: backups, verification steps, rollback thinking, documentation discipline
  • Isolated testing and sandboxing (namespaces and scoped projects)

(I keep workloads small and reproducible—this isn't a "datacenter at home," it's a controlled learning environment.)

Operational Practices

Reliability & recovery

  • Proxmox snapshots before risky changes
  • Upgrades planned with a clear rollback path
  • Backups for configuration and critical data (plus restore testing)
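One simple form of restore testing is to restore a file to a scratch location and confirm it is byte-identical to the source. The sketch below does that with SHA-256 checksums. The function names and the idea of comparing single files are assumptions for illustration, not a description of the backup tooling actually used here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """True if the restored file has exactly the same content as the original."""
    return sha256_of(original) == sha256_of(restored)
```

A check like this only proves the bytes came back intact. For a database or an application config, a fuller restore test would also start the service against the restored data.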

Observability

  • Track host + VM resource usage (CPU/RAM/disk)
  • Use dashboards as "truth surfaces" for debugging instead of guessing
