Current: 2025-03-01
Summary
The primary purpose of this page is to detail the infrastructure currently in use. It serves as a reference for when a post glosses over details about the underlying environment. Every post should describe any relevant infrastructure, but when something is missed, this page should fill in the gaps.
Hardware
Network Hardware
Only relevant devices are displayed; “leaf” switches and APs are excluded.
| Device | Role |
|---|---|
| Unifi UXG Pro | Firewall |
| Unifi USW Pro 24 PoE | Switch |
| Unifi USW Aggregation | 10 Gig Switch |
Server Hardware
| Host | CPU | RAM | HDDs | SSDs | NICs |
|---|---|---|---|---|---|
| pve-01 | i5-10400 | 128GB | 3 x 8TB | 1 x 250GB, 1 x 1TB | 1 x 1Gb, 1 x 10Gb |
| pve-02 | i5-10400 | 128GB | 3 x 8TB | 1 x 250GB, 1 x 1TB | 1 x 1Gb, 1 x 10Gb |
| pve-03 | i5-10400 | 128GB | 3 x 8TB | 1 x 250GB, 1 x 1TB | 1 x 1Gb, 1 x 10Gb |
| pve-04 | i5-10400 | 128GB | 3 x 8TB | 1 x 250GB, 1 x 1TB | 1 x 1Gb, 1 x 10Gb |
| Minio | i3-14100 | 128GB | 8 x 10TB | 1 x 1TB | 1 x 10Gb |
Network
Physical Layout (L1)
Logical Layout (L2/3)
Servers
Proxmox
PVE-01 through PVE-04 all run Proxmox 8.x. Each Proxmox server uses the single 250GB NVMe SSD for the OS and the 1TB NVMe SSD for VM storage, while the 8TB HDDs are all consumed by Ceph as OSDs. This leaves me with very limited local storage for VMs, but the trade-off is lots of distributed storage via Ceph. Nearly all data lives in my Kubernetes clusters, so Ceph is what backs each cluster's storage; this gives me ~96TB of available Ceph storage.
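As a rough sketch of how each node's disks end up allocated (the device names below are assumptions for illustration, not my actual layout):

```bash
# 250GB NVMe -> Proxmox OS (selected at install time)
# 1TB NVMe   -> local VM storage
# The three 8TB HDDs each become a Ceph OSD, e.g.:
pveceph osd create /dev/sda
pveceph osd create /dev/sdb
pveceph osd create /dev/sdc
```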
Minio
The Minio server is a single node running a rootless Podman Minio container on top of Rocky 9, with all data backed by a ZFS RAIDZ-2 pool. More info on how I built this server can be found here: Backup Server Woes. In total I am left with 54TB of available backup storage.
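The full build is covered in that post; the sketch below only shows the general shape of the setup, with pool names, paths, and credentials as placeholders rather than my real values:

```bash
# ZFS RAIDZ-2 pool across the eight 10TB disks (device names are examples)
zpool create backup raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd \
                           /dev/sde /dev/sdf /dev/sdg /dev/sdh
zfs create backup/minio

# Rootless Podman container serving that dataset
podman run -d --name minio \
  -p 9000:9000 -p 9001:9001 \
  -v /backup/minio:/data:Z \
  -e MINIO_ROOT_USER=changeme \
  -e MINIO_ROOT_PASSWORD=changeme \
  quay.io/minio/minio server /data --console-address ":9001"
```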
Application Clusters
The lab consists of three RKE2 clusters:
- Rancher
- Internal
- DMZ
Each cluster is built via Terraform from VM templates generated by Packer. Each VM is a minimal Rocky 9 install with `qemu-guest-agent` and `cloud-init` installed. Each cluster gets access to its own RBD pool namespace in Ceph, while the Internal and DMZ clusters each get an additional CephFS subvolume group of their own, like so:
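On the Ceph side, that separation can be carved out along these lines (a sketch only; the pool, filesystem, namespace, and subvolume group names are placeholders):

```bash
# One shared RBD pool with a namespace per cluster
ceph osd pool create rbd-k8s
rbd pool init rbd-k8s
rbd namespace create --pool rbd-k8s --namespace rancher
rbd namespace create --pool rbd-k8s --namespace internal
rbd namespace create --pool rbd-k8s --namespace dmz

# A CephFS subvolume group each for the Internal and DMZ clusters
ceph fs subvolumegroup create cephfs internal
ceph fs subvolumegroup create cephfs dmz
```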
The previous design had a dedicated Ceph RBD pool and CephFS for each cluster; since a CephFS needs both a data and a metadata pool, that worked out to three Ceph pools per cluster, so this redeployment simplifies the Ceph cluster significantly. On top of this, a fresh deployment allowed me to add the missing image features to the RBD Ceph storage class. This time I enabled `layering,exclusive-lock,object-map,fast-diff`, so now RBD images will show storage usage, like so:
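Assuming a ceph-csi RBD StorageClass, those features map to its `imageFeatures` parameter; the commands below are just a sketch of doing the equivalent by hand on an existing image and then querying usage (the pool and image names are made up):

```bash
# layering is set at creation time; the rest can be enabled afterwards
rbd feature enable rbd-k8s/csi-vol-example exclusive-lock object-map fast-diff
rbd object-map rebuild rbd-k8s/csi-vol-example

# With object-map/fast-diff in place, per-image usage is cheap to report
rbd du rbd-k8s/csi-vol-example
rbd du --pool rbd-k8s
```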
Rancher Cluster
The Rancher cluster is effectively a command-and-control (C&C) cluster; if you are familiar with Rancher MCM, this is not surprising. I have added a couple of extra deployments to the Rancher cluster though, mainly:
- Harbor
- ArgoCD
I like using Harbor as an image proxy for the most part; while I no longer have an ISP with a data cap, it does help prevent getting rate limited by Docker Hub from time to time. In addition, I generally opt for ArgoCD as my GitOps tool of choice, so I have once again opted for it here, and this single ArgoCD instance is responsible for all three clusters.
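As a rough sketch of what that looks like day to day (the hostnames, proxy project, and kubeconfig context names here are examples, not my real ones):

```bash
# Pull Docker Hub images through a Harbor proxy-cache project
podman pull harbor.lab.example/dockerhub-proxy/library/nginx:1.27

# Register the other clusters with the ArgoCD instance on the Rancher cluster
argocd login argocd.lab.example
argocd cluster add internal
argocd cluster add dmz
```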
Internal Cluster
This is a smaller cluster consisting mostly of applications that are of dubious security, have no need to be exposed, or are simply more sensitive.
DMZ Cluster
Finally, the DMZ cluster is the primary cluster for the lab. As one would expect, it lives in its own network.