Seeing the final picture made me think of something: one of my "life hacks" is to not accept cables that are too long. I used to think the longer the better, and just coil it up or something. But long cables just get in the way and collect dust etc.
If something is going to be considered permanent, cut that cable down to length. Either buy shorter moulded cables, or learn to wire cables yourself. Too often have I left a cable long "just in case" only for it to get in the way for years to come.
For patch cables it's easiest and best to buy moulded cables that fit your rack. For things like power cables (extension leads etc.) it's easiest to wire them yourself (at least, in the UK where our plugs are designed to be wired by anyone).
I am very much on board with this line of thinking. Because things are still somewhat in flux, it was much easier to plan for excess cabling and have a place for that cabling to live in the rack so that things can be moved if needed. I'll probably re-cable it with cut-to-length cables in the future.
One thing that I haven't found a solution for, though: I have a lot of USB and HDMI cable coiled up behind the Beelink boxes (for KVM connectivity). I've found the normal-length cables (1', 3', 6', etc.), but I haven't been able to find custom-length cables for those specific connections. Do you happen to know anywhere I can find those?
On the other hand, I did this, cut my cables, and when I later needed to reorganise things slightly, it was very difficult; even a centimetre was a luxury. Also, when I need to move a computer for some reason, there's no room at all. These days, I try to leave at least some extra length (usually an inch or two, depending on location). I'd do a very tight cable cut only when I'm certain nothing will ever move, and even then I'd rather leave an extra inch, just in case.
For those not keeping count, total hardware spend is in the 13k-20k USD ballpark, by my count.
The thing that I like about this post is that it touches on many of the difficulties of running a homelab with many physical hosts. You might not need all or most of this setup, but at least you have an idea of how this particular design (a decent one!) scales after reading this.
- Array of off-the-shelf compute servers with console access + networking + power
- ArgoCD + GitOps makes the K8s cluster declarative
- Talos makes the physical hosts that provide the K8s cluster declarative
- Dedicated machines for storage, control plane, and networking isolate the infrequently-changing stateful parts
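To make the "declarative" bullets concrete, here's a minimal sketch of the GitOps idea: desired state lives in a Git repo, and an Argo CD `Application` object points the cluster at it. The repo URL, app name, and path below are hypothetical placeholders; only the Argo CD `Application` schema is real. Building the manifest in Python and emitting JSON (which kubectl accepts just like YAML):

```python
# Minimal GitOps sketch: an Argo CD Application that continually syncs the
# cluster to whatever the Git repo says. Names and URLs are made up.
import json

def argocd_application(name: str, repo_url: str, path: str,
                       dest_namespace: str) -> dict:
    """Build a minimal Argo CD Application manifest as a plain dict."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,
                "path": path,
                "targetRevision": "HEAD",
            },
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": dest_namespace,
            },
            # Auto-sync: Argo CD reconciles cluster state back to Git,
            # pruning removed resources and undoing manual drift.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }

if __name__ == "__main__":
    app = argocd_application("homelab-apps",
                             "https://example.com/homelab.git",
                             "clusters/prod", "default")
    print(json.dumps(app, indent=2))
```

With `selfHeal` on, any change made by hand on the cluster gets reverted to what Git declares, which is the property that makes the stack reproducible.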
This homelab does seem compute-focused, which might be right for OP, but it's a mistake people commonly make when building their first homelab. I'm wondering what OP's home internet bandwidth is. It seems odd to have so much compute behind a network bottleneck unless OP has high-compute, small-output workloads (ML training, data searching/analysis, NOT video encoding).
A single machine with a lot of CPU and a LOT of memory/storage is typically what people want, so that projects they're setting up are fast and having lots of old/idling projects is fine. My old recommendation was a mini-ITX build with 128 GB of RAM and a modern AMD CPU; that should take most people very far. Perhaps a smaller NUC/Beelink machine if you're not storage-hungry.
However, keep in mind that a single machine will make it hard to tinker with large parts of the stack. It's harder to play with kernel module settings if you need to constantly reboot all your projects and possibly nuke your lab. It's harder to test podman vs docker if it involves turning off all your containers. A more complicated homelab gives you more surface area to tinker with. That's both more fun and makes you a better engineer. Of course, you can get most of this experience for far less money if your budget isn't quite as generous.
I personally prefer a digital nomad aesthetic, so I focus on small and simple on-prem hosts paired with cloud stacks. I'm willing to pay a premium on compute to have less commitment and physical footprint. I've been considering setting up a K8s cluster on Hetzner dedicated machines. In my case, that mini-ITX box is actually a storage-optimized ATX build for backing up my laptop (daily driver) and media server.
> This homelab does seem compute focused, which might be right for OP but is normally a mistake that people make when they build their first homelab.
I kept waiting for the description of what it would be used for, but there was only a passing reference to learning how to run AI workloads.
For some people, buying and assembling hardware is the hobby. This gets old fast, especially when you add up how much was spent on hardware that's now sitting idle while it becomes more outdated year over year.
I agree that for typical learning cases the best solution is a single, cheap consumer CPU paired with a lot of RAM. For the $8000 spent on those 8 mini PCs, you could build a 256GB-RAM box with 2 or even 3 Nvidia 5090 GPUs and be in a different league of performance. It's also much easier to resell big Nvidia consumer GPUs and recoup some of your money for the next upgrade.
It does look fun to assemble all of this into a rack and make it all work together. However, it's an extremely expensive means to an end. If you just want to experiment with distributed systems you can pair 128GB of RAM with a 16-core consumer CPU and run dozens or even 100 small VMs without issue. If you want to do GPU work you can even use PCIe passthrough to assign GPUs to VMs.
> For those not keeping count, total hardware spend is in the 13k-20k USD ballpark, by my count.
Yep! Right around the $13.5k mark.
> This homelab does seem compute focused, which might be right for OP but is normally a mistake that people make when they build their first homelab.
Very compute focused for the specific projects that I intend to work on.
> I'm wondering what OP's home internet bandwidth is. It seems odd to have so much compute behind a network bottleneck unless OP has high-compute-small-output workloads (ml training, data searching/analysis, NOT video encoding)
1gbps symmetric + a failover Starlink connection. Not a ton of large output workloads at the moment.
> However, keep in mind that a single machine will make it hard to tinker with large parts of the stack.
Very much in agreement here. This is one of the reasons I went with multiple machines.
> I'm willing to pay a premium on compute to have less commitment and physical footprint.
I also like this mindset, but my other hobbies are piano (and I'm very sensitive to the way that the keys are weighted, so I prefer playing a real grand piano vs a portable/mini electronic piano) and woodworking (even more bulky equipment), so I'm already pretty committed to putting down roots.
In my experience, yes and no. Homelabs seem to work because they let you experiment while striving towards goals (normally hosting something). That means that the organic problems that arise are the best teachers, since you're honestly motivated to overcome them.
Getting your homelab to boot up and run containers is an honest problem to solve. Figuring out the kernel modules that let you pass through your GPU or run ZFS on root is an actual blocker to hosting a GitLab instance.
Running GitLab on multiple nodes to get high availability is an honest problem to solve. Trying to do it in multiple VMs just to see how it works might teach you something, but it can feel pointless unless it's serving a real goal. I think that choosing a more complicated setup is good because it is hard, and it forces you to learn more to achieve the same goal (and ultimately, some of those skills will hopefully be useful).
Additionally, there used to be some limitations on consumer CPUs around nested virtualization, which made it difficult to run VMs in VMs. When I was hosting apps in VMs on a machine, I would want to play around with the machine's configs, but risked disrupting my hosted apps. If I broke something, I didn't have my hosted apps until I got around to fixing my machine. Having multiple machines ensured that I could tinker with one while relying on the other. The process, by accident, gave me an intuition for planning upgrades to infrastructure. This intuition bubbles back into the design process.
I don't often own infrastructure for multiple upgrade cycles professionally, so it is a good way to earn some wisdom on my own.
With the way Ubiquiti has treated their software stack on the network side in past years (major bugs, regressions, and updates that had to be reissued multiple times), I wouldn't trust them with all my data. Ubiquiti's QA was outsourced to the customers, and a NAS is the last place where I want to risk bad updates, no matter how many backups I have.
Synology has had shady business practices recently and an outdated tech stack since forever, but I haven't heard anything particularly bad about reliability. For a NAS, the safety of the data is the highest priority. Anything that endangers the data isn't just a drawback or a "con"; it's an instant elimination. Right now I wouldn't trust a Ubiquiti NAS to store my recycle bin. I need to see a long track record of reliability and a commitment to quality.
This is definitely one of the purchasing decisions that I regret. My backups are robust and trustworthy enough that I don't have data loss concerns, but the software is atrocious and the customizability is extremely limited.
E.g. I wanted to serve TFTP directly from the NAS. I can log in and `apt install tftpd-hpa`, but that package has to be reinstalled every time the NAS updates.
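A cron-able workaround sketch for that reinstall chore; this is my own assumption about a Debian-ish NAS userland (dpkg/apt available as root), not anything from the vendor's docs:

```python
# Sketch: probe dpkg for a package and reinstall it if a firmware update
# wiped it. Intended to be run from root's crontab on a Debian-ish system.
import subprocess

def is_installed(dpkg_status: str) -> bool:
    """Return True if `dpkg -s <pkg>` output reports the package installed."""
    return "Status: install ok installed" in dpkg_status

def ensure_package(pkg: str = "tftpd-hpa") -> None:
    # dpkg -s exits non-zero when the package is unknown/not installed.
    probe = subprocess.run(["dpkg", "-s", pkg],
                           capture_output=True, text=True)
    if probe.returncode != 0 or not is_installed(probe.stdout):
        # Reinstall non-interactively.
        subprocess.run(["apt-get", "install", "-y", pkg], check=True)

if __name__ == "__main__":
    ensure_package()
```

It doesn't fix the underlying problem (the update still blows the package away), but it shrinks the outage to one cron interval.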
I'll be replacing this in the medium term, but I'm not buying more hardware for a little while lol
Yea, but, like owning a car you soup up in your garage, it isn't about what you _need_, it's about what's fun and what's enough to give you something to do in your free time.
If we're going with the car analogy, this is like buying 8 Miatas to keep in your driveway instead of combining all of the money into a single, much faster car.
If the goal is to have a lot of something so you can play with many different things, this gets the job done. If the goal is high performance and maximum efficiency for learning or compute, a setup with dozens of smaller computers like this is usually not the optimal choice.
It now consists only of an Intel N100 with a big SSD and 32GB RAM running Proxmox. These Chinese TopTon boxes with their 5x Intel i226-V network cards are great and can be passively cooled.
Every night, Proxmox makes a backup onto a Raspberry Pi running Proxmox Backup Server.
I've read that PBS requires fast (NVMe-fast) storage and a decent CPU to handle incremental backups efficiently. What speeds do you get when restoring backups?
Yup. Mine idles at 80W, but it has _two_ NAS boxes, a pair of N150s, a Ryzen APU Steam server, and a beefy i7 with 128GB RAM and a 3060 (which, of course, is the one burning the most watts in use). Most of it runs Proxmox and backs up VMs to the Synology (which I've demoted to dumb storage with a few Docker containers since they started burning bridges with their customers).
Space and energy costs are a factor for me too. My current Mac Mini SMB setup is really good, but the DAS consumes a lot of power. Ideally, I'd love the next iteration of MikroTik's RB5009UPr+S+IN to have antennas and merge in a 4-bay Rose Data Server: gateway, switch, access point, PoE for CCTV, and NAS, all in one.
This is awesome and I wish more of this were happening. Hardware home labs are the best way to learn. I gained most of my Linux/FreeBSD skills at home.
It feels like, with cloud computing, a generation of computer scientists kind of missed out on the experience.
Enjoyed this. Lots of large homelabs like this are built to stream video or run local LLM models, and that usually leaves me feeling a bit left out, because I've been building my own but have no interest in either of those things.
Some services I am interested in are hosting my own RSS feed reader, an ebook library, and a password manager. But I'm always looking for more if there are any suggestions.
Well, there's loads: Nextcloud/Syncthing to replace Dropbox, Forgejo to replace GitHub, Joplin on the back of Nextcloud (or similar) for note-taking, a personal wiki, a todo list, email with webmail (Roundcube), home video camera management, etc.
> I am interested in are hosting my own RSS feed reader, an ebook library, and a password manager
You can do that on a Raspberry Pi Zero for $15, and for $12 you can get a 128GB microSD card, plenty of storage. It'll take up minimal power and fit in an Altoids tin.
> I kept waiting for the description of what it would be used for, but there was only a passing reference to learning how to run AI workloads.
Future posts will address some of this. :)
Cool, but nothing a single compute machine wouldn't get done with a bunch of VMs, if learning and home workloads are the focus.
This thing probably idles at a couple hundred watts.
It should be obvious to the reader that this is very much overkill, even for the stated goals of expandability and learning.
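For anyone who wants to put a number on that idle-power guess, a quick back-of-envelope in Python; the 200 W draw and $0.15/kWh rate are assumptions for illustration, not measured figures:

```python
# Rough annual energy cost for "a couple hundred watts" of idle draw.
# Both inputs (200 W, $0.15/kWh) are assumed, not from the post.
HOURS_PER_YEAR = 24 * 365  # 8760

def annual_kwh(watts: float) -> float:
    """kWh consumed per year at a constant draw of `watts`."""
    return watts * HOURS_PER_YEAR / 1000

def annual_cost_usd(watts: float, usd_per_kwh: float = 0.15) -> float:
    return annual_kwh(watts) * usd_per_kwh

if __name__ == "__main__":
    print(f"{annual_kwh(200):.0f} kWh/yr")    # 1752 kWh/yr
    print(f"${annual_cost_usd(200):.0f}/yr")  # $263/yr at $0.15/kWh
```

So the rack's standing cost is on the order of a few hundred dollars a year before any real workloads run, which is worth weighing against a single bigger box.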
> What speeds do you get when restoring backups?
Just reading off the logs:
- Backup duration: 3.21GiB in 36s
- Restoring a snapshot: feels like <<1 minute
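For context, the effective throughput implied by that backup line works out to roughly 91 MiB/s, in the neighbourhood of what a gigabit link can sustain:

```python
# Effective average throughput implied by "3.21GiB in 36s" from the log above.
def throughput_mib_s(gib: float, seconds: float) -> float:
    """Average transfer rate in MiB/s for `gib` gibibytes moved in `seconds`."""
    return gib * 1024 / seconds

if __name__ == "__main__":
    print(f"{throughput_mib_s(3.21, 36):.1f} MiB/s")  # 91.3 MiB/s
```

Note this is an average over the whole job, so it blends network, disk, and any dedup/compression work on the PBS side.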
The next step will most likely be a VDSL modem + something from Ubiquiti, as the new Fritz! product portfolio is... a weird mess.
> Some services I am interested in are hosting my own RSS feed reader, an ebook library, and a password manager.
Check out https://github.com/awesome-selfhosted/awesome-selfhosted
> You can do that on a Raspberry Pi Zero for $15...
Yet. :)