Summary

For some time I have needed not only to study for my CKA, but also to find an easier way to deploy new clusters and test new applications. Before this endeavour the best option was K3d, which worked well for trying out applications, but if I wanted to play with a new CNI I was out of luck. The next option was to deploy a cluster to VirtualBox, which again works, but my desktop only has so many resources available, and the setup, even with Vagrant, was far from bulletproof. As a result it was time to find a solution to my cluster-building woes. I wanted to get a cluster up and running with as little friction as possible, but I didn’t want a cluster with a bunch of caveats; this needed to be a full cluster with all the bells and whistles.

For a short while my lab has been running a bare metal RKE2 cluster, and I have switched back and forth between bare metal Rocky + RKE2 and Proxmox + Rocky + RKE2. Both of these deployment models have their pros and cons. Bare metal deployments have the advantage of being very light on resources and simple to run. The advantage of only maintaining 5-7 nodes can’t be overstated; the lab is supposed to be a fun place to learn new skills and host some useful programs, and a slew of servers and VMs all needing maintenance can take the fun out of things real quick. The disadvantage is that things are terribly rigid: I can play with Kubernetes and Kubernetes only, maybe some VMs with KubeVirt, and that’s about it. Want to try out a new k8s distro, or a new CNI? Where do I intend on putting said new cluster? There’s also the potential for making mistakes, and now I have downtime... well, that’s not much fun.

The other choice is to fall back to a hypervisor. The benefits are obvious to many: want a new cluster? Make one! Did you learn what you set out to learn? Cool, kill it, and everything in production stays clean and safe. The obvious loss is that we are back to maintaining a slew of servers and VMs, and with all of that comes increased overhead. One option to avoid all this is Kamaji, and while I like the idea, its purpose is to deploy the control plane into an existing cluster, not the entire cluster. Even then I would still need KubeVirt or vCluster, and once again the complexity begins to grow.

Potential Solutions

The main goal is to rebuild the lab into an environment more flexible than what I have currently. I want the ability to deploy clusters simply and quickly, and they must be capable of being long-lived, fully featured clusters with storage and load balancers. Integration with Rancher is not a requirement, but would be nice.

Proxmox

I began looking into potential solutions. Proxmox was a natural choice as I have used it off and on for nearly a decade; it’s tried and true and a good hypervisor, without much risk of a takeover... cough VMWare cough. In the past Proxmox worked well, however the method of deploying clusters was slow and laborious. The process worked but had a fair few moving pieces; the workflow was roughly:

  1. Create template VMs via Packer
  2. Deploy cluster VMs via Terraform (provisioned via cloud-init)
  3. Template an Ansible inventory file from Terraform (sketched below)
  4. Provision the cluster with the RKE2 Ansible play/roles
  5. Determine if data will be long lived
    1. If so: Cut a pool from Ceph for cluster access
    2. If not: Install the Longhorn prerequisites
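
As a rough illustration of step 3, here is a minimal sketch of rendering the inventory with Terraform’s templatefile() function. The resource names, the IP attribute, and the templates/inventory.tftpl path are hypothetical placeholders for whatever steps 1-2 actually produce in your config:

  # Sketch: render an Ansible inventory from the VMs Terraform created.
  # "proxmox_vm_qemu.server" / ".agent" are hypothetical resources from
  # step 2; adjust the names and attributes to match your own config.
  resource "local_file" "ansible_inventory" {
    filename = "${path.module}/inventory.ini"
    content = templatefile("${path.module}/templates/inventory.tftpl", {
      servers = proxmox_vm_qemu.server[*].default_ipv4_address
      agents  = proxmox_vm_qemu.agent[*].default_ipv4_address
    })
  }

A matching inventory.tftpl just loops over servers and agents with %{ for } directives, and step 4’s Ansible play can then consume the resulting inventory.ini directly.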

Again, the process worked and was fine overall, but I want to find something a little easier and a little different while also learning something new. Proxmox is a fine option, but it’s worth seeing what else is out there.

VMWare

No. Great product, but the cheap lab subscription they offer(ed?) can’t be trusted, and I don’t need my home lab held hostage by a subscription change. The cost-benefit analysis simply does not make sense. I do wish I could give a better reason, but that risk alone cuts VMWare out.

Harvester

As you can tell from the title, this is the route I ended up going. There are a fair few advantages to Harvester: it offers cluster creation via the (external) Rancher UI or Terraform, and it can deploy the Harvester Cloud Provider, which gives you the option to take advantage of Harvester’s load balancer (kube-vip) and Harvester’s built-in CSI (Longhorn). This gets me essentially everything I needed. I can deploy and destroy clusters, have all the bells and whistles, play with new CNIs, practice for my CKA, everything. Overall this was a somewhat obvious choice.
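
As a taste of what that integration buys you: once the cloud provider and CSI driver are installed in a guest cluster, consuming Harvester’s storage is just a StorageClass pointed at the Harvester CSI provisioner. Below is a minimal sketch using the Terraform kubernetes provider; the provisioner name driver.harvesterhci.io is taken from Harvester’s documentation, and the class name is an arbitrary example:

  # Sketch: a StorageClass backed by the Harvester CSI driver inside a
  # guest cluster. Assumes the Harvester Cloud Provider and CSI driver
  # are already deployed; provisioner name per Harvester's docs.
  resource "kubernetes_storage_class" "harvester" {
    metadata {
      name = "harvester-longhorn"
    }
    storage_provisioner    = "driver.harvesterhci.io"
    reclaim_policy         = "Delete"
    allow_volume_expansion = true
  }

Load balancing works the same way: a plain Service of type LoadBalancer in the guest cluster should get its VIP through Harvester’s kube-vip integration, with no extra controller to install.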

There are some drawbacks, however: it is without a doubt the least mature option available, and I still want to have a life, so there is some apprehension about whether the platform is stable enough to actually use. The good part is that the platform is built from other open source projects: the Kubernetes distribution is RKE2, VMs are KubeVirt, the load balancer is kube-vip, the storage is Longhorn (a Rancher project), with a base OS of SLE-Micro (SUSE). Some of these are really SUSE’s projects (remember, Rancher is owned by SUSE). Nearly all of these I already have extensive experience with, RKE2 and Longhorn especially. My primary concern is KubeVirt, as I have no experience with VM workloads in Kubernetes, so there may be a learning curve when issues come up (hint, hint, they do).

Runners-Up

  • Bare Metal
  • XCP-NG