I cannot overstate the performance improvement of deploying onto bare metal. We typically see a doubling of performance, as well as extremely predictable baseline performance.
This is down to several things:
- Latency – having your own local network, rather than sharing some larger datacenter network fabric, gives around an order of magnitude lower latency
- Caches – right-sizing a deployment for the underlying hardware, and so actually allowing a modern CPU to do its job, makes a huge difference
- Disk IO – Dedicated NVMe access is _fast_.
And with it comes a whole bunch of other benefits:
- Auto-scalers become less important: partly because you have 10x the hardware for the same price, partly because everything runs 2x the speed anyway, and partly because you have a fixed pool of hardware. This makes the whole system more stable and easier to reason about.
- No more sweating the S3 bill. Put a 15TB NVMe drive in each server and run your own MinIO/Garage cluster (alongside your other workloads). We're doing about 20GiB/s sustained on a 10-node cluster and 50k API calls per second (at S3's published request prices, that's roughly $0.02–$0.25 _per second_ in request charges alone, i.e. ~$50k–$650k a month; see the worked figures below).
- You get the same bill every month.
- UPDATE: more benefits – cheap fast storage, huge PostgreSQL instances at minimal cost, less engineering time spent working around hardware limitations and cloud vagaries.
And, if you choose to invest in the above, it all costs 10x less than AWS.
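To make the request-charge point concrete, here is the arithmetic at S3 Standard's published request prices (assuming the us-east-1 tiers; where you land in the range depends on your GET/PUT mix):

```python
# S3 Standard request prices (us-east-1): GET $0.0004 per 1k requests,
# PUT/COPY/POST/LIST $0.005 per 1k. Call rate taken from the figure above.
GET_PER_1K, PUT_PER_1K = 0.0004, 0.005
CALLS_PER_SECOND = 50_000
SECONDS_PER_MONTH = 30 * 24 * 3600

for mix, price in (("all GET", GET_PER_1K), ("all PUT/LIST", PUT_PER_1K)):
    per_s = CALLS_PER_SECOND / 1000 * price
    print(f"{mix}: ${per_s:.2f}/s  (~${per_s * SECONDS_PER_MONTH:,.0f}/month)")
# all GET:      $0.02/s  (~$51,840/month)
# all PUT/LIST: $0.25/s  (~$648,000/month)
```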
Pitch: If you don't want to do this yourself, then we'll do it for you for half the price of AWS (and we'll be your DevOps team too):
https://lithus.eu
Email: adam@ above domain
Yup, I hope to god we are moving past the 'everything's fast if you have enough machines' and 'money is not real' era of software development.
I remember the point in my career when I moved from a cranky old .NET company, where we handled millions of users from a single cabinet's worth of beefy servers, to a cloud-based shop where we used every cloud buzzword tech under the sun (but mainly everything was containerized Node microservices).
I shudder thinking back to the eldritch horrors I saw on the cloud billing side, and the funny thing is, we were constantly fighting performance problems.
Tangential point, but why is it that so often these leaving-the-cloud posts use the word "beefy" to describe the servers? It's always "you don't need cloud because beefy servers handle pretty much any..." bla bla bla.
My conspiracy theory is that "cloud scaling" was entirely driven by people who grew up watching sites get slashdotted and thought it was the absolute most important thing in the world to be able to quickly scale up to infinity billion requests/second.
https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
If anyone from oxide computer or similar is reading, maybe you should rebrand to BEEFY server inc...
>to a cloud based shop where we used every cloud buzzword tech under the sun
This is me currently, not so much "tech" as repackaged cloud services as SaaS and middleware. The department is amortizing hours and effort against projects, the eldritch horrors you're referring to are starting to manifest, and I want out.
However old, .NET Framework was still using a decent JIT compiler with a statically typed language, powerful GC and a fully multi-threaded environment :)
Of course Node could not compete, and the cost had to be paid for each thinly sliced microservice carrying heavy runtime alongside it.
My employer is so conservative and slow that they are forerunning this Local Cloud Edge Our Basement thing by just not doing anything.
Over the years I tried occasionally to look into cloud, but it never made sense. A lot of complexity and significantly higher cost, for very low performance and a promise of "scalability". You virtually never need scalability so fast that you don't have time to add another server - and at baremetal costs, you're usually about a year ahead of the curve anyways.
As an infrastructure engineer (amongst other things), hard disagree here. I realize you might be joking, but a bit of context here: a big chunk of the success of Cloud in more traditional organizations is the agility that comes with it: (almost) no need to ask permission to anyone, ownership of your resources, etc. There is no reason that baremetal shouldn't provide the same customer-oriented service, at least for the low-level IaaS, give-me-a-VM-now needs. I'd even argue this type of self-service (and accounting!) should be done by any team providing internal software services.
I think there is a generational part as well. The ones of us that are now deep in our 40s or 50s grew up professionally in a self-hosted world, and some of us are now in decision-making positions, so we don't necessarily have to take the cloud pill anymore :)
Half-joking, half-serious.
As a career security guy, I've lost count of the battles I've lost in the race to the cloud...now it's 'we have to up the budget $250k a year to cover costs' and you just shrug.
The cost for your first on-prem datacenter server is pretty steep...the cost for the second one? Not so much.
It's not really. It just happens that when there is a huge bullshit hype out there, people that fall for it regret and come back to normal after a while.
Better things are still better. And this one was clearly only better for a few use-cases that most people shouldn't care about since the beginning.
My only “yes, but…” is that this:
> 50k API calls per second (roughly $0.02–$0.25 _per second_ in S3 request charges)
kind of smells like abuse of S3. Without knowing the use case, maybe a different AWS service is a better answer?
Not advocating for AWS, just saying that maybe this is the wrong comparison.
Though I do want to learn about Hetzner.
I've left a job because it was impossible to explain this to an ex-Googler on the board who just couldn't stop himself from trying to be a CTO and clownmaker at the company.
The rough part was that we had made hardware investments and spent almost a year setting up the system for HA and immediate (i.e. 'low-hanging fruit') performance tuning and should have turned to architectural and more subtle improvements. This was a huge achievement for a very small team that had neither the use nor the wish to go full clown.
Ya, but then you need to pay for a team to maintain the network and continually secure, monitor, and update/patch the servers. The salaries of those professionals really only make sense for a certain-sized organization.
I still think small-midsized orgs may be better off in cloud for security / operations cost optimization.
This implies cloud infrastructure experts are cheaper than bare metal Linux/networking/etc experts. Probably in most smaller organizations, you have the people writing the code manage the infra, so it's an "invisible cost", but ime, it's easy to outgrow this and need someone to keep cloud costs in check within a couple of years, assuming you are growing as fast as an average start-up.
I very much understand this, and that is why we do what we do. Lots of companies feel exactly as you say. I.e. Sure it is cheaper and 'better', but we'll pay for it in salaries and additional incurred risk (what happens if we invest all this time and fail to successfully migrate?)
This is why we decided to bundle engineering time with the infrastructure. We'll maintain the cluster as you say, and with the time left over (the majority) we'll help you with all your other DevOps needs too (CI/CD pipelines, containerising software, deploying HA Valkey, etc). And even after all that, it still costs less than AWS.
Edit: We also take on risk with the migration – our billing cycle doesn't start until we complete the migration. This keeps our incentives aligned.
That used to be the case until recently. As much as neither I nor you want to admit it -- the truth is ChatGPT can handle 99% of what you would pay for "a team to maintain network and continually secure and monitor the server and update/patch." In fact, ChatGPT surpasses them as it is all-encompassing. Any company now can simply pay for OpenAI's services and save the majority of the money they would have spent on the "salaries of those professionals." BTW, ChatGPT Pro is only $200 a month ... who do you think they would rather pay?
If you haven't had to fight network configuration, monitoring, and security in a cloud provider you must have a very simple product. We deploy our product both in colos and on a cloud provider, and in our experience, bare-metal network maintenance and network maintenance in a PaaS consumes about the same number of hours.
There is a graph database which does disk I/O for database startup, backup, and restore as single-threaded, sequential, 8KB operations.
On EBS it does at most 200MB/s of disk I/O, simply because EBS operation latency, even on io2, is about 0.5ms. The disk itself can go much faster: benchmarks easily do multi-GB/s on nodes with enough EBS throughput.
On the instance-local SSD of the same EC2 instance it will happily saturate whatever the instance can do (~2GB/s in my case).
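Back-of-envelope arithmetic for that access pattern (the 8 KiB and 0.5 ms figures are taken from the comment above; real throughput also depends on readahead and how many operations the engine actually keeps in flight):

```python
# Latency-bound I/O: for serial operations, throughput = block_size / latency.
block = 8 * 1024      # bytes per operation (assumption from the comment)
latency = 0.5e-3      # seconds per EBS round trip (assumption from the comment)

serial = block / latency                    # strictly one op in flight
print(f"strictly serial ceiling: {serial / 1e6:.1f} MB/s")   # ~16.4 MB/s

for target in (200e6, 2e9):   # 200 MB/s observed on EBS, ~2 GB/s on local SSD
    print(f"{target / 1e6:>6.0f} MB/s needs ~{target / serial:.0f} ops in flight")
```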
I do not disagree, but just for the record, that's not what the article is about. They migrated to Hetzner cloud offering.
If they had migrated to a bare metal solution they would certainly have enjoyed an even larger increase in perf and decrease in costs, but it makes sense that they opted for the cloud offering instead given where they started from.
The AWS documents clarify this. When you get 1 vCPU in a Lambda you're only going to get up to 50% of the cycles. It improves as you move up the RAM:CPU tree but it's never the case that you get 100% of the vCPU cycles.
However, that's not due to AWS overhead or oversubscription but to the x86 architecture, where a vCPU is a hyperthread rather than a full core. For production workloads, 2 vCPUs should be the minimum recommendation.
On ARM, where 1 vCPU = 1 core, it is more straightforward.
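For reference, a tiny sketch of the documented memory-to-CPU scaling in Lambda: AWS states a function gets the equivalent of one full vCPU at 1,769 MB, with CPU share scaling roughly linearly below that (treat the linear interpolation as an approximation):

```python
def approx_vcpus(memory_mb: int) -> float:
    # AWS documents one full vCPU at 1,769 MB; CPU share scales with memory.
    return memory_mb / 1769

for mb in (128, 512, 1024, 1769, 3538):
    print(f"{mb:>5} MB -> ~{approx_vcpus(mb):.2f} vCPU")
```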
> at S3's published request prices, that's roughly $0.02–$0.25 _per second_ in request charges alone
It is worth pointing out that if you look beyond the nickel-and-diming US cloud providers, you will very quickly find many S3 providers who don't charge for API calls at all, just for the actual data-shifting.
Ironically, I think one of them is Hetzner's very own S3 service. :)
Other names IIRC include Upcloud and Exoscale ... but it's not hard to find with the help of Mr Google; most results for "EU S3 provider" will likely have a similar pricing model.
P.S. Please play nicely and remove the spam from the end of your post.
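For what it's worth, moving between S3-compatible providers is usually just an endpoint swap in the client. A minimal sketch with boto3 (the endpoint URL, credentials, and bucket name are placeholders, not any real provider's values):

```python
# Point a standard S3 client at an S3-compatible provider via endpoint_url.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example-eu-provider.com",  # hypothetical endpoint
    aws_access_key_id="YOUR_KEY",
    aws_secret_access_key="YOUR_SECRET",
)
s3.upload_file("backup.tar.gz", "my-bucket", "backups/backup.tar.gz")
```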
How do you deprogram your devs and ops people from the learned helplessness of cloud native ideology?
I've found that it's almost impossible to even hire people who aren't terrified of the idea of self-hosting. This is deeply bizarre for someone who installed Linux from floppy disks in 1994, but most modern devs have fully swallowed the idea that cloud handles things for them that mere mortals cannot handle.
This, in turn, is a big reason why companies use cloud in spite of the insane markup: it's hard to staff for anything else. Cloud has utterly dominated the developer and IT mindset.
>I've found that it's almost impossible to even hire people who aren't terrified of the idea of self-hosting
Are y'all hiring? [1]
I did 15 months at AWS and consider it the worst career move of my life. I much prefer working with self-hosting, where I can actually optimize the entire hardware stack I'm working with. Infrastructure is fun to tinker with. Cloud hosting feels like a miserable black box that you dump your software into and "hope".
[1] https://cursedsilicon.net/resume.pdf
>I've found that it's almost impossible to even hire people who aren't terrified of the idea of self-hosting
Funny, I couldn't find a new job for a while because I had no cloud experience, finally and ironically I got hired at AWS. Every now and then these days I get headhunters unsure about my actual AWS experience because of my lack of certifications.
So you'd rather self host a database as well? How do you prevent data loss? Do you run a whole database cluster in multiple physical locations with automatic failover? Who will spend time monitoring replication lag? Where do you store backups? Who is responsible for tuning performance settings?
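At least the replication-lag question is mostly a small scheduled query rather than a person watching a dashboard. A minimal sketch, assuming Postgres streaming replication and psycopg2 (the host, database, user, and threshold are placeholders):

```python
# Check how far a hot-standby replica's replay lags behind, in seconds.
import psycopg2

with psycopg2.connect(host="replica.internal", dbname="app", user="monitor") as conn:
    with conn.cursor() as cur:
        cur.execute(
            "SELECT EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp())"
        )
        lag = cur.fetchone()[0]  # None if this node is not replaying WAL
        if lag is not None and lag > 30:
            print(f"ALERT: replica is {lag:.0f}s behind")  # wire to your alerting
```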
The cloud is also not fulfilling its end of the promise anymore - capacity on-demand. We’ve been struggling for over a month to get a single (1) non-beefy non-GPU instance on Azure, since they’ve been having just insane capacity issues, where even paying for “provisioned” capacity doesn’t make it available.
We will soon have 256 Zen 6c cores per socket, so at least 512 cores per server, multiple PCIe 5.0 SSDs at 14GB/s with up to half a petabyte of storage, and TBs of memory.
And now Nvidia is in the game for server CPUs: much faster time-to-market for PCIe in the future, and better x86 CPU implementations as well as ARM variants.
I love that you're not just preaching - you're offering the service at a lower cost. (I'm not affiliated and don't claim anything about their ability/reliability).
We moved DemandSphere from AWS to Hetzner for many of the same reasons back in 2011 and never looked back. We can do things that competitors can’t because of it.
> If you don't want to do this yourself, then we'll do it for you for half the price of AWS (and we'll be your DevOps team too
You might not realize but you are actually increasing the business case for AWS :-) Also those hardware savings will be eaten away by two days of your hourly bill. I like to look at my project costs across all verticals...
I understand the concern for sure. But we don't bill hourly in that way, as one thing our clients really appreciate is predictable costs. The fixed monthly price already includes engineering time to support your team.
Doubt it. I've personally seen AWS bills in the tens of thousands; he's probably not that costly for a day.
And you are still charging half of AWS, in which case I would just do this work myself if I really think AWS is too expensive.
I can't speak to Linode but in my experience the Digital Ocean VM performance is quite bad compared to bare metal offerings like Hetzner, OVH, etc. It's basically comparable to AWS, only a bit cheaper.
It's essentially the same product, but you do get lower disk latency. Best performance is always going to be a dedicated server, which in the US seems to start around $80-100/month (just checking on serversearcher.com); DO and so on do provide a "dedicated CPU" product if that's too much.
They still use VMs, but as far as I know they offer simple reserved instances, not "cloud"-like elasticity?
Is the performance better and more predictable on large VPSes?
(edit: I guess a big difference is that a VPS can have local NVMe that is persistent, whereas EC2 local disk is ephemeral?)
the devil is in the details, as they say.
> In reality, there is significant latency between Hetzner locations that make running multi-location workloads challenging, and potentially harmful to performance as we discovered through our post-deployment monitoring.
haha this reminds me of when I used to manage a Solaris system consisting of 2 servers: Sparc T7, 1 box in one state and 1 box in another. No load balancer.
Thousands and thousands of users depending on that hardware.
Extremely robust hardware.
What do you recommend for configuration management? I've had a fairly good experience with Ansible, but that was a long time ago... anything new in that space?
it also handled some databases and webservers on FreeBSD and Windows, I considered it better than Ansible.
I think you can get much farther with dedicated servers. I run a couple of nodes on Hetzner. The performance you get from a dedicated machine, even a 3-year-old one from the server auction, is absolutely bonkers and cannot be compared to VMs. The thing is that most server hardware is focused on high-core-count, low-clock-speed processors that optimize for I/O rather than compute, and it is overprovisioned by all cloud providers. Even the disk I/O side is crazy: all sorts of shenanigans to make a drive sitting on a NAS emulate a local disk. Most startups do not need the hyper-virtualized, NAS-based drive. You can go much farther and much more cost-effectively with dedicated server rentals from Hetzner. I would love to know if there are any North American (particularly Canadian) companies that can compete with Hetzner's price and quality of service. I know of OVH but I would love to know others in the same space.
As mentioned multiple times in other comments and places, people think that doing what Google or FB is doing should be what everyone else is doing.
We are running modest operations on a European VPS provider where I work, and whenever we get a new hire (business or technical, does not matter) it is like Groundhog Day - I have to explain — WE ALREADY ARE IN THE CLOUD, NO YOU WILL NOT START A "MIGRATING TO CLOUD" PROJECT ON MY WATCH SO YOU CAN PAD YOUR CV AND MOVE TO ANOTHER COMPANY TO RUIN THEIR INFRA — or something along those lines, after asking ChatGPT to make the tone friendlier.
Google doesn't even deploy most of its own code to run on VMs. Containers yes but not VMs.
The number of times I have seen fresh "architects" come in with an architectural proposal for a 10-user internal LoB app that they got from a Meta or Microsoft worldscale B2C service blueprint ...
https://dillonshook.com/postgres-cloud-benchmarks-for-indie-...
Too cool to not share, most of the providers listed there have dedicated servers too.
Did you "preheat" during those tests? It is very common for cloud instances to have "burstable" vCPUs. That is - after boot (or long idle), you get decent performance for first few minutes, then performance gradually tanks to a mere fraction of the initial burst.
Great website, but what a blunder to display the results as "cards" rather than a good old table so you can scan the results rather than having to actually read it. Makes it really hard to quickly find what you're looking for...
Edit: Ironically, that website doesn't have Hetzner in their index.
excellent website, thanks.
> I would love to know if they are any north-american (particularly canadian) companies that can compete with price and the quality of service like Hetzner
FWIW, Hetzner has two data centers in the US, in case you're just looking for "Hetzner quality but in the US", not for "American/Canadian companies similar to Hetzner".
https://www.hostpapa.ca/
https://www.cacloud.com/
https://www.keepsec.ca/
https://www.canspace.ca/
Yeah, no dedicated servers in the US sadly. I'm not aware of anyone who can quite match Hetzner's pricing in the US (but if someone does, I'd love to know!). https://www.serversearcher.com throws up Clouvider and Latitude at good pricing, but... not Hetzner levels by any means.
Clouvider is available in a lot of US DCs, 4GB RAM/2 CPU/80GB NVMe and a 10Gb port for like $6 a month.
For self hosted / cohosting my own kit, I buy refurbed servers from https://www.etb-tech.com/ because I can spec exactly what I want and see how the cost varies, what the delivery time is, etc.
Years ago Broadberry had a similar thing with Supermicro, but not any more. Now you have to talk to a salesperson about how they can rip you off. Then they don't give you what you specced anyway -- I spec 8x8G sticks of RAM, they provide 2x32G, etc.
Be warned though that, when renting dedicated servers, there are certain issues you might have to deal with that usually aren't a factor when renting a VPS.
For example, I got a dedicated server from Hetzner earlier this year with a consumer Ryzen CPU that had unstable SIMD (ZFS checksums would randomly fail, and mprime also reported errors). Opened a ticket about it and they basically told me it wasn't an issue because their diagnostics couldn't detect it.
Yeah, their support, for better or worse, is really technical and you need to send all the evidence of any faults to convince them. But when I've had random issues happening, I've sent them all the troubleshooting and evidence I came across, and a couple of hours later they had provisioned a new host for me with the same specs.
And based on our different experiences, the quality of care you receive could differ too :)
> I would love to know if they are any north-american (particularly canadian) companies that can compete with price and the quality of service like Hetzner.
In a thread two days ago https://ioflood.com/ was recommended as a US-based alternative.
VMs are a middle ground between AWS and dedicated hardware. With hardware you need to monitor it, report problems/failures to the provider, and make necessary configuration changes (add/remove a node to/from a cluster, etc.). If a team is coming from AWS it may have no experience with monitoring/troubleshooting problems caused by imperfect hardware.
One thing that frustrates me when estimating performance on AWS is that I have to adjust dramatically down from the performance of my dev laptop (M2 MBP). I've noticed tradeoffs of around 15x slower when deployed to AWS. I realize that's an anecdotal number, but it's a fairly consistent trend I've seen at different companies running on cloud hosting services. One of the biggest performance hits is latency between servers: if your server is making hundreds or thousands of DB queries per second, you're going to feel that pain on the cloud even if your cloud DB server's CPU is relatively bored. It is network latency. I look at the costs of AWS and it is easy to spend >$100,000 a month.
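To put rough numbers on that (the round-trip figures below are illustrative assumptions, not measurements):

```python
# Time spent purely waiting on the network when a request issues many
# sequential DB queries, at two assumed round-trip latencies.
queries_per_request = 500
for label, rtt_ms in (("same rack", 0.05), ("cloud AZ fabric", 0.6)):
    added = queries_per_request * rtt_ms
    print(f"{label}: {added:,.0f} ms of network wait per request")
# same rack:       25 ms
# cloud AZ fabric: 300 ms
```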
I have run services on bare metal and on VPSs, and I always got far better performance than I can get from AWS or GCP, for a small fraction of the cost. To me "cloud" means vendor lock-in, terrible performance, and wild costs.
People do not realize that the fancy infinite storage scaling means AWS etc. run network-based storage. And that, e.g. on a DB, can be a 10x performance hit.
I have found Hetzner's cloud block storage to be quite slow; it soon became a bottleneck on our Timescale databases. Now we're testing netcup.com's "root servers", which are VPSs with dedicated CPU cores and lots of very fast storage.
> I know of OVH but I would love to know others in the same space.
Hetzner, OVH, Leaseweb, and Scaleway (EU locations only).
I've used other providers as well, but I won't mention them because they were either too small or had issues.
When I've needed dedicated servers in the US I've used Vultr in the past; relatively nice pricing, only missing unmetered bandwidth for it to be my go-to. But in all those US-specific cases someone else was paying, so it hasn't bothered me, compared to the personal/community stuff I host at Hetzner and pay for myself.
I've been eyeing Vultr for dedicated metal in Canada (Toronto datacenter). How do they measure up to Hetzner? I'm not looking to get the best possible deal, just better value than EC2 (which costs me a fair amount in egress).
Virtualization has a crazy overhead - when we moved to metal instances in AWS, we gained like 20-25% performance. I thought that since AWS has the smartest folks in the business, and Intel & co. have been at this for decades, it'd be like a couple percent overhead at most, but no.
We’ve been seeing the same trend. Lots of teams moving to Hetzner for the price/performance, but then realizing they have to rebuild all the Postgres ops pieces (backups, failover, monitoring, etc.).
We ended up building a managed Postgres that runs directly on Hetzner. Same setup, but with HA, backups, and PITR handled for you. It’s open-source, runs close to the metal, and avoids the egress/I/O gotchas you get on AWS.
If anyone's curious, I've added some notes about our take [1], [2]. Always happy to talk about it if you have any questions.
[1] https://www.ubicloud.com/blog/difference-between-running-pos... [2] https://www.ubicloud.com/use-cases/postgresql
This is one key draw to Big Cloud and especially PaaS and managed SQL for me (and dev teams I advise).
Not having an ops background I am nervous about:
* database backup+restore
* applying security patches on time (at OS and runtime levels)
* other security issues like making sure access to prod machines is restricted correctly, access is logged, ports are locked down, abnormal access patterns are detected
* DoS and similar protections are not my responsibility
It feels like picking a popular cloud provider gives a lot of cover for these things - sometimes technically, and otherwise at least politically...
Applying security patches on time is not much of a problem. Patches you need to apply ASAP are rare, and you never put a DB engine on public access anyway; most of the time the exploit is not disclosed publicly, and PoC code for a patched RCE is not available on the day of the patch release.
Most of the time you are good if you follow version updates for major releases as they come: you do regression testing and put it on prod in your own planned time.
Most problems come from not updating at all and running 2- or 3-year-old versions, because that's what automated scanners will be looking for, and after that much time it is much more likely that someone has written and shared exploit code.
There must be SaaS services offering managed databases on different providers - like you buy the servers, they put the software on them and host backups for you. Anyone got any tips?
to be fair, AWS's database restore support is generally only a small part of the picture - the only option available is to spin up an entirely new DB cluster from the backup, so if your data recovery strategy isn't "roll back all data to before the incident", you have to build out all your own functionality for merging the backup and live data...
Exactly this. For a small team that's focused on feature development and customer retention, I tend to gladly outsource this stuff and sleep easy at night. It's not even a cost or performance issue for me. It's about if I start focusing on this stuff, what about my actual business am I neglecting. It's a tradeoff.
I can attest to that. At Cloud 66 a lot of customers tell us that while the PaaS experience on Hetzner is great, they benefit from our managed DBs the most.
While I'm sure it's a great project, a few issues in the README gave me pause about how well it's kept up to date. Around half of the links in the list of dependencies are either out of date or just plain don't work, and it references Vagrant with no mention of Docker.
I love how few comments on this and similar posts give much context along with their advice. Are you hosting a church newsletter in your spare time or a resource intensive web app with millions of paying enterprise customers and a dedicated dev ops team in 3 continents?
Any advice on price / performance / availability is meaningless unless you explain where you're coming from. The reason we see people overcomplicating everything to do with the web is that they follow advice from people with radically different requirements.
> The reason we see people overcomplicating everything to do with the web is that they follow advice from people with radically different requirements.
Or they've had cloud account managers sneaking into your C-suite's lunchtime meetings.
Other comments in this thread say they get directives to use AWS from the top.
Strangely that directive often comes with AWS's own architects embedded into your team and even more strangely they seem to recommend the most expensive server-less options available.
What they don't tell you is that you'll be rebuilding and redeploying your containerised app daily with new Docker OS base images to keep up with the security scanners, just like patching an OS on a bare metal server.
you don't need that in 99.9999% of cases.
TL;DR: Think of hosting providers like a pricing grid (DIY, Get Started, Pro, Team, Enterprise) and if YAGNI, don't choose it.
Different requirements, different skillsets, different costs, different challenges. AWS is only topically the same product as Hetzner, coming from someone who has used both quite a bit.
I've been running my SaaS on Hetzner servers for over 10 years now. Dedicated hardware, clusters in DE and FI, managed through ansible. I use vpncloud to set up a private VPN between the servers (excellent software, btw).
My hosting bill is a fraction of what people pay at AWS or other similar providers, and my servers are much faster. This lets me use a simpler architecture and fewer servers.
When I need to scale, I can always add servers. The only difference is that with physical servers you don't scale up/down on demand within minutes, you have to plan for hours/days. But that's perfectly fine.
I use a distributed database (RethinkDB, switching to FoundationDB) for fault tolerance.
Similar setup to me (including RethinkDB). Why choose FoundationDB? RethinkDB is still maintained and features are added occasionally (I'm on the RethinkDB Slack and maintain an async PHP driver). It is just one guy though, working on it part time.
RethinkDB is somewhat maintained, and while it is a very good database and works quite well, it is not future-proof. But the bigger reason is that I need better performance, and by now (after 10 years) I know my data access patterns well, so I can make really good use of FoundationDB.
The reason for FoundationDB specifically is mostly correctness, it is pretty much the only distributed database out there that gives you strict serializability and delivers on that promise. Performance is #2 on the list.
You use vpncloud to connect across different Hetzner data centers (DE + FI)? I thought/assumed Hetzner provided services to do this at little-to-no cost.
Not the GP, but I also use Hetzner, but use Tailscale to connect securely across different Hetzner regions (and indeed other VPS providers).
Hetzner does provide free Private Networks, but they only work within a single region - I'm not aware of them providing anything (yet) to securely connect between regions.
No, I use vpncloud for a local (within a datacenter) VPN. This lets me move more configuration into ansible (out of the provider's web interfaces), avoid additional fees, and have the same setup usable for any hosting provider, including virtual clouds. Very flexible.
We've helped quite a few teams move from AWS to Hetzner (and Netcup) lately, and I think the biggest surprise for people isn't the cost or the raw performance, it’s how simple things become when you remove 15 layers of managed abstractions.
You stop worrying about S3 vs EFS vs FSx, or Lambda cold starts, or EBS burst credits. You just deploy a Docker stack on a fast NVMe box and it flies. The trade-off is you need a bit more DevOps discipline: monitoring, backups, patching, etc. But that's the kind of stuff that's easy to automate and doesn't really change week to week.
At Elestio we leaned into that simplicity: we provide fully managed open-source stacks for nearly 400 applications, and we also cover CI/CD (from Git push to production) on any provider, including Hetzner.
More info here if you're curious: https://elest.io
(Disclosure: I work at Elestio, where we run managed open-source services on any cloud provider, including your own infra.)
We support Postgres but also MySQL, Redis, OpenSearch, ClickHouse, and many more.
About backups: we offer differential snapshots and regular dumps that you can send to your own S3 bucket.
https://docs.elest.io/books/databases/page/deploy-a-new-clus...
The best feature of (some of) the dedicated servers Hetzner offers is the unmetered bandwidth. I'm hosting a couple of image-heavy websites (mostly modding related), and since moving to Hetzner I sleep much better knowing I'll pay the same price every single month, as I have for the ~3 years I've been a happy Hetzner customer.
Hetzner is really great until you try to scale with them. We started building our service on top of Hetzner and had a couple hundred VMs running; during peak times we had to scale to over 1000 VMs. And here a couple of problems started: you quite often get IPs which are blacklisted, so if you try to connect to services hosted by Google or AWS (S3, etc.) you can't reach them. Also, at one point there were no VMs available anymore in our region, which caused a lot of issues.
But in general if you don't need to scale crazy Hetzner is amazing, we still have a lot of stuff running on Hetzner but fan out to other services when we need to scale.
> Also at one point there were no VMs available anymore in our region, which caused a lot of issues.
I'm not sure if this is a difference between other clouds, at least a few years ago this was a weekly or even daily problem in GCP; my experience is if you request hundreds of VMs rapidly during peak hours, all the clouds struggle.
Right now, we can’t request even a single (1) non-beefy non-GPU VM in us-east on Azure. That’s been going on for over a month now, and that’s after being a customer for 2 years :(
We launch 30k+ VMs a day on GCP, regularly launching hundreds at a time when scheduled jobs are running. That’s one of the most stable aspects of our operation - in the last 5 years I’ve never seen GCP “struggle” with that except during major outages.
At the scale of providers like AWS and even the smaller GCP, “hundreds of VMs” is not a large amount.
Note that we might be talking about two different things here: some of us use physical servers from Hetzner, which are crazy fast, and a great value. And some of us prefer virtual servers, which (IMHO) are not that revolutionary, even though still much less expensive than the competition.
https://www.linkedin.com/posts/jeroen-jacobs-8209391_somethi...
I didn't know AWS and GCP also did it. Not surprised.
The problem is that European regulators do nothing about such anti-competitive dirty tricks. The big clouds hide behind "lots of spam coming from them", which is not true.
First comment on that post claims that according to Mimecast, 37% of EU-based spam originates from Hetzner and Digital Ocean. People have been asking for 3 days for a link to the source (I can't find it either).
On the other hand, someone linked a report from last year[0]:
> 72% of BEC attacks in Q2 2024 used free webmail domains; within those, 72.4% used Gmail. Roughly ~52% of all BEC messages were sent from Gmail accounts that quarter.
[0] https://docs.apwg.org/reports/apwg_trends_report_q2_2024.pdf
Worth noting that this seems to be about Hetzner's cloud product, not the dedicated servers. The cloud product is relatively new, and most of the people who move to Hetzner do so because of the dedicated instances, not to use their cloud.
So you have approx 1MM concurrent customers? That's a big number. You should definitely be able to get preferred pricing from AWS at that scale.
I've run into the IP deny-list problem too, but for Windows VMs - you spin them up, only to realise that you can't get Windows Updates, can't reach the PowerShell Gallery, etc.
And just deleting it and starting again is just going to give you the exact same IP again!
I ended up having to buy a dozen or so IPs until I found one that wasn't blocked, and then I could delete all the blocked ones.
This sounds really intriguing, and I am really curious: what kind of service do you run where you need hundreds of VMs? Was there a reason for not going dedicated? Looking at their offering, their biggest VM is (48 CPU, 192 GB RAM, 960 GB SSD). I can't even imagine using that much. Again, I'm really curious.
We have extremely processing-heavy jobs where users upload large collections of files (audio, PDFs, videos, etc.) and expect fast processing. It's just that we need to fan out sometimes, since a lot of our users are sensitive to processing times.
I think they're great but it's unfortunate they don't have more locations which would at least enable you to spin VMs up in different locations during a shortage. If you rely on them it might be wise to have a second cloud provider that you can use in a pinch, there's many options.
AWS at least maintains IP lists of bots, active exploiters, ddos attackers, etc, that you can use to filter/rate limit traffic in WAF. Not so much AWS that blocks you but customers that decide to use these lists.