xmodem · a year ago
> No reverse proxies required!

This is one that has always baffled me. If there's no specific reason that a reverse proxy is helpful, I will often hang an app with an embedded Jetty out on the internet without one. This has never led to any problems.

Infra or security people will see this and ask why I don't have an nginx instance in front of it. When I ask why I need one, the answers are all hand-wavy security or performance, lacking any specifics. The most specific answer I received once was Slowloris, which hasn't been an issue for years.

Is reverse proxying something we've collectively decided to cargo cult, or is there some reason why it's a good idea that applies in the general case that I'm missing?

codegeek · a year ago
For me, a reverse proxy keeps my origin server to one purpose: serving the application. Everything else I can handle with the reverse proxy, including TLS termination, load balancing, URL rewrites, and security (WAF etc.) if needed. Separation of duties for me.

Overall, the benefit is that you can keep your origin server protected and only serve relevant traffic. Also, let's say you offer custom domains to your own customers; in that case, you could always swap out the origin server (if needed) without worrying about DNS changes for your customers, as they are pointing to the reverse proxy and not your origin server directly.

TZubiri · a year ago
TLS should be done with proxies, yes. The Stunnel approach is Gospel.

Similarly if you start load balancing, you can put some server in the middle yes. But the ideal solution is at the DNS level I think, unless there's some serious compute going on (which a website loading a page from disk is not).

URL rewrites should not be a thing unless you have a clusterfuck, and Security is best accomplished in my experience by removing, rather than by adding.

dartos · a year ago
I run many server programs on my homelab.

Each is running on a different port, but I want them all accessible publicly from different URLs and I only want to expose port 443 to the internet.

I also want to have TLS autorefresh for each domain.

I need a reverse proxy for the former, and Caddy does both.

If you’re running a single server and that server does TLS termination then you don’t really need a reverse proxy.

com2kid · a year ago
Every page off of my (static HTML file!) home page[1] is actually a distinct microservice sitting behind a reverse proxy. I can throw some new experiment together, build it with whatever tooling I want, give it a port number, and let nginx route to it.

It removes a lot of friction from "I wonder if making this service is a good idea?" and because I am self hosting I am not tying myself down to any of the "all in one" hosting platforms.
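
A hypothetical nginx sketch of that setup (hostname, paths, and ports are all made up): the static home page comes straight off disk, and each path is proxied to its own process.

```nginx
server {
    listen 443 ssl;
    server_name example.com;

    # static home page served straight from disk
    root /var/www/home;

    # each experiment is its own process on its own port
    location /story-app/ {
        proxy_pass http://127.0.0.1:8081/;
    }
    location /other-experiment/ {
        proxy_pass http://127.0.0.1:8082/;
    }
}
```

Adding a new experiment is then one more `location` block plus a reload.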

[1] https://www.generativestorytelling.ai/

tnolet · a year ago
E.g. virtual hosting, as we called it in the Apache days
MayeulC · a year ago
You forgot the original need: share a single IPv4 among different services.

If going IPv6-only, the need for a reverse proxy is seriously lowered. You could spin multiple servers up (even on different machines), listening to 443. Have each service handle its certificate renewal, etc.

cybrox · a year ago
For most of my deployments, the performance impact of a reverse proxy is negligible. I have the configs pre-prepared, and it allows me to add TLS termination, URL rewrites, or other shenanigans without much effort in the future. So for me, it's mostly a habit that has paid off so far.
cbm-vic-20 · a year ago
IME, using an Nginx or WAF layer lets the "ops people" make changes to the things you mention (TLS config, URL rewrites, etc.) without getting the "app people" involved. There's a bit of "Conway's Law" going on here, depending on the reporting structure and political makeup of the organization.
nickpsecurity · a year ago
My answer applies to a number of types of servers that sit in front of web applications. You asked about security and performance. I’ll give you a few ways that an extra box can help in those areas.

For security, you want a strong OS with as little code as possible in your overall system. Proxy-style apps can be very simple compared to web application servers. They can filter incoming traffic, validate the input, or even change it to something safer (or faster) to parse. They can also run on OSes that are harder to attack: OpenBSD, GenodeOS, INTEGRITY-178B. On availability, putting load balancing, monitoring, and recovery in these systems is often safer, since app servers are more likely to crash.

On performance, the first benefit is that the simple, focused app can have a highly optimized implementation. From there, one can use hardware accelerators (CPU or PCI) to speed up compression or encryption, also called offloading. The most cost-effective setup has many commodity servers benefiting from a few high-cost servers capable of offloading. Some have load balancing to route incoming traffic to the servers able to handle it best, to minimize use of costly resources.

So, there are a few ways that proxy-type servers can help with security and performance.

dartos · a year ago
I don’t really think there is a general case for all servers.

For the minimal case you don’t need it, but in production (with a single host) it allows for rolling releases, compression, TLS, fast static file serving, potentially A/B testing capabilities.

The layer of indirection between the request and your server can be very useful.

lnenad · a year ago
> but in production (with a single host) it allows for rolling releases

I mean, for me this is pretty much already enough of a reason to always put an RP ahead of my apps. It requires minimal setup, and most of the tools are fire-and-forget, so I see no real downsides. But having the ability to just point it somewhere else, or to split traffic across app replicas, is more than enough.

mistrial9 · a year ago
caching -- google changed the expectations of millions
arielcostas · a year ago
I think people do it out of habit at this time. In many cases it makes sense to handle TLS termination and compression, but in other instances it really is there for no reason.

Proxying always performs worse than serving directly, since you add another layer in between, right? Or am I missing something?

xmodem · a year ago
Jetty implements both TLS and compression, though in environments where I don't already have automated certificate issuance infrastructure in place I have occasionally deployed caddy as a reverse proxy just for the TLS termination.
okasaki · a year ago
You're missing vhosts, TLS, caching, logging, and log analysis, access control, rate limiting, custom error messages, metrics, etc.
01HNNWZ0MV43FF · a year ago
At one job, Nginx facilitated blue-green deployments. I would spin up a 2nd app server and have Nginx cut-over to it with <1 second of downtime. If anything went wrong, the rollback plan was to only roll back the Nginx config.

I automated all that with a few scripts that included sanity checks with `nginx -t`. After the update looked good I would shut down the old app server without any time crunch. Only the Nginx config was time-sensitive.

I'm not sure if you can do that without some kind of reverse proxy as an abstraction layer. At least a TCP-level proxy.

And as everyone said, virtual hosting.
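
A sketch of that kind of cut-over (upstream name and ports are hypothetical): define both backends in one upstream, keep one commented out, and flip which is live with a validated reload.

```nginx
# blue-green: exactly one backend is live at a time
upstream app {
    server 127.0.0.1:8001;      # blue  (live)
    # server 127.0.0.1:8002;    # green (standby) -- swap the comments to cut over
}

server {
    listen 80;
    location / {
        proxy_pass http://app;
    }
}
```

The cut-over itself is then `nginx -t && nginx -s reload`; rollback is swapping the comments back and reloading again.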

MayeulC · a year ago
In theory, you can do even better with no reverse proxy: hand down the open sockets to the new version of your application, zero downtime at all. (Nothing prevents you from having a reverse proxy in front while doing that).
sophacles · a year ago
> Is reverse proxying something we've collectively decided to cargo cult, or is there some reason why it's a good idea that applies in the general case that I'm missing?

It's a matter of risk management. On the one hand is your service that speaks http. Maybe it uses a good library for it, maybe not - but even if the library is good are we sure you used it correctly? Even if you used it correctly, has it been as thoroughly tested and proven as nginx?

On the other hand you have nginx - a deeply understood technology that has served trillions and trillions of web requests, has proven itself resilient against attacks again and again, and has been reviewed with a fine-toothed comb by security engineers for years.

So just from the starting point, your software is riskier. Even if you're the best software engineer who's ever lived, it's a higher risk profile to deploy new unproven software than the one that's been battle tested for decades.

It's also a matter of mitigation - if your software does have a vuln, are you going to notice it? Even if you do notice it, how long til you understand the problem and fix it? What to do in the time between discovery and deploying the fix? On the other hand if there's an nginx vuln, there are almost certainly juicier targets than your software to exploit first, and the bug and the fix are far more likely to be found and deployed long before someone even tries it for your site.

pengaru · a year ago
It's a lot easier to isolate and de-privilege your reverse proxy that needs to do nothing more than speak http/https with the outside world and some local listeners.

The url-specific web servers you're proxying tend to need a whole lot more, at least filesystem access to serve html content, at most program execution like CGIs and interpreters.

Separating these concerns makes a lot of sense, and brings little to no overhead by modern standards.

jasonjayr · a year ago
Reverse proxy allows some operational flexibility:

1) You can share multiple apps or sites with one server listening on ports 443/80. 2) You can redirect to another backend on your infrastructure. 3) You can enforce certain login/SSO/restrictions. 4) You can configure all these things in one place.

Of course, if you don't need all that, then it's somewhat moot.

Klonoar · a year ago
Amusingly, slowloris is still an issue for some Rust (hyper) based servers. There’s been some movement on it lately - and I’m typing this in a free moment, so maybe it’s finally fixed and someone can correct me - but it’s kind of lurking there and throwing Nginx in front of an e.g Axum deploy is still somewhat necessary.
paxys · a year ago
> I will often hang an app with an embedded Jetty out on the internet

So you are using a proxy server, just an embedded one. Most simply prefer not to bundle their application with one.

didip · a year ago
Reverse proxy is the OG sidecar. You get any number of useful functionalities that don't need to live in your primary app, for example: TLS cert handling.
mp05 · a year ago
> Is reverse proxying something we've collectively decided to cargo cult

Yeah, that’s ridiculous. “Cargo culting” is when people imitate processes without understanding the underlying purpose, but reverse proxying is widely used for valid reasons—like security, load balancing, caching, SSL termination, etc. It’s not just mindless mimicry. Dismissing a best practice as “cargo culting” because they don’t understand it is lazy. Just because it’s common doesn’t mean it’s done without purpose. Worst case? You get people following a pretty good practice.

worik · a year ago
> slow loris,

Really? I am curious.

You are not talking of monkeys?

rwmj · a year ago
Cool! I also wrote my own C web server (sources linked below) which ran a commercial website for a while. It's amazing how small and light you can make an HTTP/1.1 webserver. The commercial site ran on a machine with 128MB of RAM and 1 CPU (sic) and routinely served a large proportion of schools in the UK with a closed source interactive, web-based chat system. However that was 20 years ago when the internet was a slightly less hostile place.

He mentions bots make great fuzzers, but I think he should also do a bit of actual fuzzing.

http://git.annexia.org/?p=rws.git;a=tree Requires: http://git.annexia.org/?p=c2lib.git;a=tree and http://git.annexia.org/?p=pthrlib.git;a=tree

nicoburns · a year ago
Rust is a good choice for a webserver that will run in this footprint without having to worry so much about the hostile internet. My website https://blessed.rs runs on a VM with 256MB of RAM because that was the smallest I could find, but it typically uses ~60MB.
kragen · a year ago
this looks much more practical than my own small and lightweight http/1.0 webserver, but i'm guessing that rws is not nearly as small and lightweight: http://canonical.org/~kragen/sw/dev3/server.s http://canonical.org/~kragen/sw/dev3/httpdito-readme

the really surprising thing about that was that when your memory map only has five 4k pages in it, linux gets really fast at forking

rwmj · a year ago
It operated in the real world (of 20 years ago), and supported in-process dlopened modules which is how the web-chat was implemented, so it was somewhat non-trivial.
cozis · a year ago
httpdito looks incredible
cozis · a year ago
Hey, the code looks really good! Thanks for sharing. I'll probably go through it a bit later :)

P.S. Love the indentation

cozis · a year ago
Hello everyone! This is a fun little project I started in my spare time and thought you'd appreciate :)
sim7c00 · a year ago
I find it an interesting exercise to read through really old bugs and CVEs for HTTP servers to see what might affect my code too, and see how to fix it. Nice going though =) fun to roll this kind of stuff yourself!
yazzku · a year ago
Appreciated indeed. I happened to want to mess around with the C11 concurrency API and write a server of sorts, mostly as a curiosity of how those constructs work out in C coming from C++.
theideaofcoffee · a year ago
Awesome! I used to think (well, I still do) that getting a barebones service up and running using the system APIs at the lowest level like this is so satisfying. It's sort of magical, really. And to see it serve real traffic! I'm kind of surprised that the vanilla poll() can put up numbers like you were seeing, but I guess it's been a while since I've had to do anything event related/benchmark at that level.

I love the connection-specific functions and related structs and arrays for your connection bookkeeping, as well as the poll fd arrays. It's very reminiscent of how it's done in lots of other open source packages known for high throughput numbers, like nginx, redis, memcached.

Great work!

yard2010 · a year ago
Working with C/C++ in uni blew my mind. It's such a specific, humbling experience that has a bit of everything I love - engineering, history, culture, linguistics, etc.

It made me think that everyone should know and try every possible language (programming or otherwise) - "thinking" in a language is such a unique experience. The different contexts make everything feel different, even though it's more of the same. The perspective changes, and that changes the subjective experience.

For example - to really understand the nature of Linux or git, you have to speak its language and understand the nuances that are usually lost in translation. Tangibly, to understand the true subjective meaning of the word "forest" in Russian, one has to speak and understand Russian.

The context changes the perspective, so sometimes it changes everything.

ryandrake · a year ago
It’s kind of sad how C has gotten the reputation as this dangerous and scary dark art that only wizards can successfully wield. C was my first love, it’s what we used throughout university, it’s what our operating systems and basic tools are all written in... If you go to your favorite language and step down into the actual implementation of, for example, your network calls, you’re eventually going to get to poll() and write() written in C. It’s useful to know and be fluent in regardless of whether you intend to work on large projects in C.
ggliv · a year ago
This is a neat perspective. I’ve heard conversation on how working with different programming languages affects how you code (“learn Haskell, it’ll make you think more functionally!”) but for some reason I never connected it to the linguistic side of things.

I remember learning about the effects of language on cognition in a psychology course I took a while ago, it’s interesting to think about how that could apply more broadly.

cozis · a year ago
> I used to think (well, I still do) that getting a barebones service up and running using the system APIs at the lowest level like this is so satisfying. It's sort of magical, really

Totally agree. And actually using them is even more satisfying. I'm starting to get curious about email protocols..

> I'm kind of surprised that the vanilla poll() can put up numbers like you were seeing

Me too. I assumed I was going to go with epoll at some point, but poll() is working great.

pdp11ty · a year ago
People seem to forget that all of their amazing, wonderful abstractions are, at their core, doing exactly this: opening sockets, reading from them, writing to them, etc. There is nothing new under the sun.
litbear2022 · a year ago
You may be interested in this https://news.ycombinator.com/item?id=27431910

> As of 2024, the althttpd instance for sqlite.org answers more than 500,000 HTTP requests per day (about 5 or 6 per second) delivering about 200GB of content per day (about 18 megabits/second) on a $40/month Linode. The load average on this machine normally stays around 0.5. About 19% of the HTTP requests are CGI to various Fossil source-code repositories.

cozis · a year ago
This post was of great inspiration! It made me realize something like this was doable
petee · a year ago
Aside, if you want to write C apps but aren't comfortable writing the public facing parts, 'Kore' is a great framework with some handy builtins like ACME cert management, Pgsql, curl, websockets, etc.

Essentially build and run modules, and they can be combined (including mixing Lua/Python + C.)

https://kore.io/

greenavocado · a year ago
Finally a website that doesn't crash when it shows up on the front page
afavour · a year ago
Any site with a CDN in front of it can do that.

Don’t get me wrong this is an awesome project but if you really care about this kind of thing in a production scenario and you’re serving mostly static content… just use a CDN. It’ll pretty much always outperform just about anything you write. It’s just boring.

chrismorgan · a year ago
Even caching is normally unnecessary.

Honestly, HN front page traffic isn’t much. For most, it probably peaks at about one page load¹ per second², and if your web server software can’t cope with that, it’s bad.

Even if your site uses PHP and MySQL and queries the database to handle every request, hopefully static resources bypass all that and are served straight from disk. CPU and memory usage will be negligible, and a 100Mbps uplink will handle it all easily. So then, hopefully you’re only left with one request that’s actually doing database work, and if it can’t answer in one whole, entire second, it’s bad.

(I’m talking about general web pages here, not web apps, which have a somewhat different balance; but still for most things HN traffic shouldn’t cause a sweat, even if you’ve completely ignored caching.)

Seriously, a not-too-awful WordPress installation on a Raspberry Pi could probably cope with HN traffic.

—⁂—

¹ Note this metric: page loads, not requests. Requests per second will scale with first-party requests per page.

² From a quick search, two sources from this year: https://marcotm.com/articles/stats-of-being-on-the-hacker-ne..., https://harrisonbroadbent.com/blog/hacker-news-traffic-spike.... Both use JS tracking, but even doubling the number to generously account for we sensible people who use content blockers has the hourly average under one load per second.

tazjin · a year ago
> Any site with a CDN in front of it can do that.

You are vastly overestimating HN front page traffic. Any reasonable system on any reasonable machine with any reasonable link can do this. And I really do mean reasonable: I've served front-page traffic from a dedicated server in a DC, and from a small NUC in a closet at home, and both handled it completely fine.

theideaofcoffee · a year ago
This sort of trivializes the effort and the fun of a project like this, doesn't it? Yes, you'll want to put all of your ducks in a row when you go to full production and you've reached full virality and your project is taking 5 million RPS globally and offloading all of that onto a CDN and making sure your clients' requests are well respected in terms of cache control and making it secure and putting requests through a WAF and and and and and. Yes, we know. Lighten up. The comment you're replying to was meant to be lighthearted.
kqr · a year ago
Any site that consists of static files served by a professional-grade web server like nginx on a small VPS can also trivially do that.
interroboink · a year ago
If you're hosting static data, shouldn't HTTP cache flags be enough in most cases? Read-only cacheable data shouldn't be toppling even a modest server. Even without an explicit CDN, various nodes along the chain will be caching it.

(though I confess it's been some years since I've worked in this area)
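
For instance, a static asset served with headers along these lines (values are purely illustrative) can be cached by browsers and intermediaries for a year without re-contacting the origin:

```http
HTTP/1.1 200 OK
Content-Type: text/css
Content-Length: 1280
Cache-Control: public, max-age=31536000, immutable
ETag: "5e1f-a1b2c3"
```

The usual trick is to put a content hash in the filename so `immutable` is safe, and let the HTML page itself carry a short or zero max-age.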

nicoburns · a year ago
Pretty much anything that isn't Wordpress is ok these days I think.
rubyn00bie · a year ago
Uhh… doesn’t the link go to GitHub? I’m a little confused by this comment. I mean the project is neat and cool. But I imagine most folks go to GitHub and don’t go to the link showing the webpage. Am I missing something?
wilkystyle · a year ago
Link to the actual site is at the top of the GitHub page.
seumars · a year ago
>I enjoy making my own tools and I'm a bit tired of hearing that everything needs to be "battle-tested." So what it will crash? Bugs can be fixed :^)

I love it
