There are many arguments for and against building a homelab; a quick search on Reddit or Medium will surface a swarm of other people’s setups and experiences.
But in my case, I have 3 compelling reasons:
1. Having a “real” K8s cluster to play with
Yes, you can spin up a local cluster with minikube or k3d, but I see those mainly for CI or local-testing use cases. To me, a “cluster” is supposed to have multiple nodes, connected over a local network, where any one of them can fail at any time. If it is too simple, you don’t call it K8s, right?
2. Not breaking the bank when it comes to computing power
Cloud is not cheap. Yes, cloud vendors always tell you how building your own data center can be 100x more expensive than signing a multi-year VM deal. But for hobby use cases like a homelab, the total cost of running a handful of $50 Raspberry Pis 24/7 is equivalent to running EC2 instances of similar spec for… 2 or 3 months? Of course, with the cloud we can spin up and tear down clusters on demand (Terraform FTW!), freeing everything up when we’re done playing. But waiting 30 minutes to spin up a K8s cluster whenever you need one is just damn annoying 😅.
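As a back-of-the-envelope sketch of that comparison (all the prices and power figures below are rough assumptions I picked for illustration, not real quotes):

```python
# Back-of-envelope: Raspberry Pi at home vs. a small cloud VM.
# Every constant here is an assumed ballpark figure, not a quote.
PI_PRICE = 50.0          # one Raspberry Pi, USD (assumed)
PI_POWER_KW = 0.005      # ~5 W average draw (assumed)
KWH_PRICE = 0.15         # electricity price, USD/kWh (assumed)
VM_HOURLY = 0.0168       # a small ARM cloud instance, USD/h (assumed)
HOURS_PER_MONTH = 730

pi_monthly = PI_POWER_KW * HOURS_PER_MONTH * KWH_PRICE   # electricity only
vm_monthly = VM_HOURLY * HOURS_PER_MONTH
break_even_months = PI_PRICE / (vm_monthly - pi_monthly)

print(f"Pi electricity: ${pi_monthly:.2f}/month")
print(f"Cloud VM:       ${vm_monthly:.2f}/month")
print(f"Pi pays for itself after ~{break_even_months:.1f} months")
```

With these assumed numbers the hardware pays for itself within a few months of 24/7 use; tweak the constants for your local electricity price and whichever instance type you would actually rent.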
3. Playing around with latest technologies
Work is supposed to be boring. I mean the tech stacks, not the actual work, of course :). When it comes to cluster networking, who would choose the shiniest eBPF-based solution when you can just pick the “battle-tested” Istio, right? Troubleshooting and operating eBPF-based solutions is non-trivial at the moment, but I believe they will mature at some point, and that’s one of the reasons you want a homelab: to experiment with the latest tech.
Many subtle details bubbled up
As engineers, we are supposed to pay attention to details when building systems; I think no one will ever disagree. However, at work we tend to focus on the business problem rather than the inner details of the technologies. When your company is new to Kafka and needs to deploy a Kafka cluster, will they give you a week to study how Kafka’s internals work? Very likely not, because “running Kafka” is not really part of the business, so anything that “works” is fine, and you move on to the next task.
When you are packing so many applications into a handful of Raspberry Pis, many issues can appear; CPU throttling is just one of them. If you encountered this at work, your first instinct would probably be to “add more workers” and call it a day. If you aren’t running at Netflix or Google’s scale, throwing money at the problem makes perfect sense. But you don’t have that option at home: you can’t buy one more Pi for every single problem, so instead you work on improving resource efficiency, dive deeper into kernel internals, and understand the system better. This is not something we can always afford at work, but in the process of running a homelab I find myself exposed to subtle details I had never encountered at work.
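CPU throttling is a good example of a detail worth digging into: the CFS scheduler records it in the cgroup’s `cpu.stat` file. Here is a minimal sketch of reading those counters; the sample text is made up for illustration (on a real node you would read the file under `/sys/fs/cgroup/` for the pod’s slice):

```python
# Sketch: parse cgroup v2 cpu.stat counters to spot CFS throttling.
# The sample below uses made-up numbers; on a node, read the real
# cpu.stat file from the pod's cgroup under /sys/fs/cgroup/.
SAMPLE_CPU_STAT = """\
usage_usec 1234567
user_usec 1000000
system_usec 234567
nr_periods 5000
nr_throttled 420
throttled_usec 9876543
"""

def parse_cpu_stat(text: str) -> dict:
    """Turn cpu.stat's 'key value' lines into a {key: int} dict."""
    stats = {}
    for line in text.splitlines():
        key, value = line.split()
        stats[key] = int(value)
    return stats

stats = parse_cpu_stat(SAMPLE_CPU_STAT)
throttle_ratio = stats["nr_throttled"] / stats["nr_periods"]
print(f"Throttled in {throttle_ratio:.0%} of CFS periods")
```

A consistently non-zero `nr_throttled` usually means the container keeps hitting its CPU limit, which is the cue to revisit resource requests/limits instead of just adding nodes.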
Build a multi-architecture cluster
ARM has really become the industry trend over the past 2 years, from AWS’s Graviton to Apple’s M1. ARM-based processors are not only “cheaper”; they are sometimes also “faster” because of their simpler architecture.
However, not every application is “ARM-ready”. Yes, you can build ARM container images yourself or even re-compile applications to support ARM64, but some applications contain native C++ code that would require a rewrite or refactor (e.g. Falco and some eBPF-based applications).
While ARM64 should be the default (Raspberry Pi FTW!!), it makes sense to add 1–2 x86 SBCs to the cluster to run the applications that lack ARM64 support (for now). I used Seeed Studio’s Odyssey SBC, but basically any board with a power-friendly Intel CPU will do. Having a multi-arch cluster gives you more flexibility: when a workload is ARM-ready, just run it on the Pis (hint: use a K8s nodeSelector!), and when it’s not, you still have the option of running it on one of your x86 nodes!
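The nodeSelector trick relies on the standard `kubernetes.io/arch` label that the kubelet sets on every node. A minimal sketch (the pod name and image are made up):

```yaml
# Sketch: pin an ARM-ready workload to the Pis via the standard
# kubernetes.io/arch node label (pod name and image are made up).
apiVersion: v1
kind: Pod
metadata:
  name: arm-ready-app
spec:
  nodeSelector:
    kubernetes.io/arch: arm64   # use "amd64" for x86-only workloads
  containers:
    - name: app
      image: example/arm-ready-app:latest
```

Flip the label value to `amd64` for the workloads that only ship x86 images, and the scheduler keeps each kind of workload on the right nodes.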
Don’t use an SD card as storage
The Raspberry Pi uses an SD card as its main storage by default, but SD cards are pretty darn slow when it comes to running K8s. If you are “serious”, you should consider booting from an SSD instead (it’s not hard; I will write a detailed post about it later).
You don’t have to run everything
Yes, we build a homelab cluster to “run stuff”, but sometimes using a SaaS is much better than wasting our precious computing capacity. For example, if you need a distributed tracing solution, you can of course run Jaeger yourself, but that means setting up Elasticsearch as well, which is quite resource-intensive; a better solution may be a SaaS like honeycomb.io (their free tier is quite generous!). The question of “whether to run something” should follow your role and interests: “Is it something I need to know in detail for my career?” As a DevOps engineer, perhaps I am less concerned about “how to run Prometheus”, as long as I get monitoring done.
Homelab doesn’t mean no cloud
You are still gonna need the cloud even if you have a homelab. At the very least you’ll need S3 for backups, GPU instances for ML jobs, or maybe even another cluster in the cloud (ArgoCD for multi-cluster!). All that Terraform stuff is still relevant! Of course, when using the cloud, pay attention to data transfer costs: design your cluster’s dataflow so that you don’t routinely send gigabytes of data from the cloud to your home.
So is it a time+money sink?
Honestly speaking, it takes quite an investment of money and time. It’s not just the hardware costs (the servers and SSDs, a bunch of network equipment, etc.), but also the time to troubleshoot issues (I still remember spending nights fixing a broken ingress 😭). But after all, I think running a homelab solidifies my systems knowledge; at least in interviews I can tell people “how I fit xxxx into a bunch of Raspberry Pis”.
Ciao and good luck with your homelab journey!