What Is a Reverse Proxy, and How Does It Work?:
A regular proxy, called a forward proxy, is a server through which a user's connection is routed. In many ways, it's like a simple VPN, which sits in front of your internet connection. VPNs are a common example, but forward proxies also include things like school firewalls, which may block access to certain content.
A reverse proxy works a little differently. It's a backend tool used by system administrators. Instead of users connecting directly to the server hosting a website, a reverse proxy like NGINX can sit in the middle. When it receives a request from a user, it will forward, or "proxy," that request to the final server. This server is called the "origin server" since it's what will actually be responding to requests.
While a user will probably know if they're being routed through a forward proxy like a VPN or firewall, reverse proxies are backend tools. As far as the user knows, they're just connecting to a website. Everything behind the reverse proxy is hidden, and this has numerous benefits as well.
This effect also happens in reverse, though. The origin server does not have a direct connection to the user and will only see a bunch of requests coming from the reverse proxy's IP. This can be a problem, but most reverse proxies, NGINX included, can be configured to add headers like X-Forwarded-For to the request. These headers inform the origin server of the client's actual IP address.
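As a rough illustration of how an origin server might read that header, here is a minimal Python sketch. The function name `client_ip` and the `trusted_proxies` parameter are hypothetical, and the sketch assumes the common convention that each proxy appends the address it received the connection from, so the list is walked from the right until an address that is not one of our own proxies turns up.

```python
import ipaddress


def client_ip(remote_addr, x_forwarded_for, trusted_proxies):
    """Best-guess client address for a request that may have been proxied.

    remote_addr: address of the direct TCP peer (the proxy, if there is one).
    x_forwarded_for: raw header value, or None if the header is absent.
    trusted_proxies: set of proxy addresses we control (an assumption of
    this sketch; real deployments often match network ranges instead).
    """
    # Only believe the header when the direct peer is one of our proxies;
    # otherwise any client could spoof it.
    if remote_addr not in trusted_proxies or not x_forwarded_for:
        return remote_addr

    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # Walk right to left: the rightmost entries were added by our own proxies.
    for hop in reversed(hops):
        try:
            ipaddress.ip_address(hop)
        except ValueError:
            break  # malformed entry: stop trusting the header here
        if hop not in trusted_proxies:
            return hop
    return remote_addr
```

The point of the trust check is that the header is just text supplied by whoever sent the request, so it only means something when it arrives via a proxy you run yourself.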
Reverse proxies are pretty simple in concept but prove to be a surprisingly useful tool with many unexpected use cases. One of the main benefits of reverse proxies is how lightweight they can be. Since they just forward requests, they don't have to do a ton of processing, unlike an origin server, which may need to query a database to build a response.
[...] Since a reverse proxy is often much faster at responding than the origin server, a technique called caching is commonly used to speed up requests on common routes. Caching is when the page data is stored on the reverse proxy, and only requested from the origin server once every few seconds/minutes. This reduces the strain on the origin server dramatically.
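The caching idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `TTLCache`, the `fetch` callback standing in for a request to the origin server, and the injectable `clock` are all hypothetical, and a real proxy would also take response headers into account before storing anything.

```python
import time


class TTLCache:
    """Serve stored pages and refetch from the origin once they expire."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch        # callable: path -> page body (the origin)
        self._ttl = ttl_seconds    # how long a stored page stays fresh
        self._clock = clock        # injectable so the sketch can be tested
        self._entries = {}         # path -> (expires_at, body)

    def get(self, path):
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]        # still fresh: the origin is not contacted
        body = self._fetch(path)   # missing or stale: ask the origin
        self._entries[path] = (now + self._ttl, body)
        return body
```

With a TTL of 30 seconds, a route that receives a thousand requests a minute would reach the origin only about twice a minute, which is where the reduced strain comes from.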
I have been specifically asked to include topics for discussion that are suitable for those earlier on in their studies or careers. More experienced community members can expand the topic further if they so wish. [JR]
(Score: 3, Informative) by bradley13 on Tuesday August 09 2022, @08:37AM
Or for those of us who work in other areas of IT. It's nice to have a simple explanation of something we all use, every day, probably without even knowing it...
Everyone is somebody else's weirdo.
(Score: 1, Redundant) by c0lo on Tuesday August 09 2022, @08:38AM (12 children)
1. Even I, a software engineer with little interest in web/sysadmin work, know what a reverse proxy/cache is
2. the Wikipedia entry [wikipedia.org] has a lot more info than the linked FA - granted, fewer pictures, but what would be so hard to understand?
https://www.youtube.com/watch?v=aoFiw2jMy-0
(Score: 4, Insightful) by janrinok on Tuesday August 09 2022, @08:49AM (4 children)
You don't have to read every story.
(Score: 3, Funny) by c0lo on Tuesday August 09 2022, @09:08AM (3 children)
Indeed, I don't. As I don't need to do many things in this life. One of them is to actually make any use of S/N.
Thanks for pointing this out, I'm going to correct my behavior tout de suite.
Until the next time I need a reminder, take care, and all the best that life can offer you.
(Score: 2) by janrinok on Tuesday August 09 2022, @09:31AM (1 child)
I wasn't trying to offend you, and if I have then I will willingly apologise. I thought that I had made the purpose of this particular story clear, and the comment before you had a completely different view of it. Not every story will excite or even interest every member of our community.
(Score: 3, Interesting) by RS3 on Tuesday August 09 2022, @12:55PM
I don't think you need to apologize. We're able to skim over article titles and decide which articles we want to read, or maybe just skim.
I appreciate _all_ of the stories posted, and thank you for the uptick in these kinds of tech articles. As I've mentioned before, I'm a server admin (the only one for the particular server site) and I wasn't totally sure what a "reverse-proxy" is. Or I read up on it and didn't retain it. I tend to focus on things I need to know and do; being an engineer, I'm focused on solving problems.
The one I'm most familiar with is "Squid"; I just didn't know it was called a "reverse proxy". I just thought of it as an intermediary cache. My servers have enough caching and other optimizations that an intermediary is unnecessary.
Thanks for all you do here!
(Score: 3, Funny) by Mykl on Tuesday August 09 2022, @10:43PM
Bye. So sorry that you're leaving. See you here tomorrow.
(Score: 5, Interesting) by Rich on Tuesday August 09 2022, @10:56AM (6 children)
Some have forgotten what a reverse proxy is. During the dotcom bubble I held the lovely title of "Chief Scientist" for a web catalog, where most of my work was dedicated to automatically categorizing pages. But, IIRC, I also was in charge of the squid. It's SO long ago that my memory is fuzzy, but they had their backend on Windows in Delphi and didn't want to directly expose that to the wild wild web. I think I was the only one among the crew who could get hands-on with a Linux box, so yes, I probably have actual reverse proxy work experience. But if it hadn't been for the article, I would have forgotten. Nice reminder of what happened in the meantime, like Apache and NGINX apparently having one built in these days.
Initially, I was confused, because right now I'm supposed to come up with a scheme to VNC (from outside) into otherwise air-gapped machines in areas where the only net supply is heavily locked-down NAT, without compromising too much security. The machines get really chatty and fuck up their precious validated state once they sense the slightest bit of internet (read: major vendor modern desktop OS...). That will probably need a little box which also might be called a "reverse proxy", but is something entirely different.
(Score: 2) by DannyB on Tuesday August 09 2022, @03:07PM (2 children)
I love that description.
Let me guess . . . managers.
Can VNC be technically considered to be a RAT or used as such?
How often should I have my memory checked? I used to know but...
(Score: 2) by Rich on Tuesday August 09 2022, @04:20PM (1 child)
Using VNC as a verb?
Well, for some situations like the described one, antigens might be tested more rapidly, if support has a tool to administrate remotely :)
Not this time. Actual real-life requirements. Well, sort of. The pain hasn't been big enough for the managers to move the entire system to Linux or something where we would be in control of the entire machine and all updates. If you have a controlled environment, where every setup has to undergo validation tests for a year and then has to be frozen in that state, you can't let the system screw it up with weekly mandatory updates. The KISS solution is to simply not let the device access the network, which is what is being done now. However, if it weren't for the stupid "push every new marketing shit on the users" behaviour of the OSs, nothing would be in the way of opening a reasonably secure (certificates) SSH connection and doing a bit of remote servicing. When you have to factor every on-site support call at a grand including all expenses, a bit of remote assistance sounds like a good idea.
(Score: 2) by DannyB on Wednesday August 10 2022, @05:40PM
Alexander Haig, secretary of state under Reagan, introduced and frequently demonstrated the technique of verbing a noun.
(Score: 5, Interesting) by dx3bydt3 on Tuesday August 09 2022, @10:14PM (2 children)
Two bits of software which may or may not be of use to you, but which I have found really handy, are tsocks and autossh. I'm using Starlink internet, which is carrier-grade NAT, so I can't directly tunnel in to my home network anymore. When I need some home access I keep a dynamic port forward to another server outside my network, which allows me to use tsocks from there to handle whatever other sorts of tunnel I want to do, in reverse through the dynamic port.
Now, I've been asked, in my case by more than one person "why don't you just use a VPN?", but I like my way because it's all on hardware that I control, costs nothing and only routes the specific traffic I want where I want. Disclaimer: I'm not a networking security expert, and my methods may be both inadvisable and irrational, but I've found they worked well enough for me so far.
(Score: 2) by Rich on Wednesday August 10 2022, @12:13AM (1 child)
Thanks for the leads, I'll keep those in mind. The "carrier grade NAT" is pretty much similar to my situation, I might not even be able to poke UPnP or even UDP holes, so it's got to be a single callout to 443 and everything in reverse from there.
The feasibility study is upcoming and I haven't dug in too deep. I had my eyes on plain ssh voodoo plus a bit of port forwarding magic, which, going back and forth through a completely whitelist-based process, would allow a remote technician to pick from a list of online machines and then initiate what amounts to a local connect on port 5900. We'll have to see. I don't have the full requirements on my desk yet.
(Score: 2) by dx3bydt3 on Wednesday August 10 2022, @10:45AM
Based on your sketch of a plan, another option you might consider is RustDesk, in lieu of VNC. You would set up your own server (pretty simple to do) and that server relays the remote control traffic. The interface looks identical to TeamViewer. What you're thinking will work too; I've done much the same, and it works fine, but it's good to have options.
(Score: 2) by Opportunist on Tuesday August 09 2022, @11:13AM (1 child)
Kinda like what the reverse tachyon impulse is for the Star Trek writers.
(Score: 2) by DannyB on Tuesday August 09 2022, @03:09PM
I remember an April Fools' joke prior to Babylon 5 season 3. A fake announcement that Babylon 5 was being acquired by Star Trek and would now be named Star Trek: Babylon 5. The first two seasons would be explained as a hallucination suffered by Data after passing through an inverse vertiron particle displacement field.
(Score: 2) by ElizabethGreene on Tuesday August 09 2022, @11:56AM
I worked at a company that used these as a "dmz" tier for a web service. They didn't provide caching or other functions, they just proxied requests. The only benefit I could see from it, beyond what could be accomplished with firewall port blocking, was that it would have made submitting a malformed request harder for an attacker.
IIRC, Apache httpd was the engine underlying the reverse proxy.
(Score: 3, Interesting) by hopdevil on Tuesday August 09 2022, @01:58PM (4 children)
Even though reverse proxies are used extensively, for a variety of reasons, in organizations large and small, you should avoid them if possible. Creating more "bumps" on the wire increases latency and makes security and resiliency more difficult. They will generally cause you grief while you attempt to trace the network breakage by unraveling the 10 reverse proxies that have made their way between your users and your server because you allowed one. Yes, they are like cockroaches; treat them as such.
(Score: 5, Funny) by DECbot on Tuesday August 09 2022, @02:21PM
Your description of a reverse proxy seems to diverge from the BOFH manual. The users are the cockroaches that must be kept clear of the actual servers to maintain the server's uptime and availability. The reverse proxy is a clever, disposable, Rube Goldberg device designed to minimize the unwashed masses' exposure to the valuable production servers.
cats~$ sudo chown -R us /home/base
(Score: 4, Interesting) by isostatic on Tuesday August 09 2022, @02:58PM (1 child)
I proxy 305 internal services through a reverse proxy (well, a pair of them). My internal services don't need to worry about SSL or user authentication. Given that half the things I expose via a proxy don't have the ability to do either, there's no choice, and the overhead of dealing with the rest (including managing 100+ SSL certs with no ability to use ACME) would be excessive. The proxy handles authentication and auditing (via x509 certs or OIDC), and passes the username to the internal service as a header, which they can thus use for any authorization that's required (beyond simply being a member of our company).
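For anyone wondering what "passes the username as a header" looks like, here is a minimal Python sketch of the idea, not the actual setup described above. The header name `x-authenticated-user` and the function `headers_for_backend` are made up for illustration. The one detail worth showing is that the proxy should drop any copy of that header sent by the client before adding its own, since otherwise a client could simply claim to be someone else.

```python
# Hypothetical header name; pick whatever your internal services agree on.
IDENTITY_HEADER = "x-authenticated-user"


def headers_for_backend(client_headers, authenticated_user):
    """Build the header set the proxy forwards to an internal service.

    client_headers: dict of headers as received from the client.
    authenticated_user: username the proxy itself verified (cert or OIDC).
    """
    # Strip any client-supplied identity header, whatever its letter case.
    forwarded = {
        name: value
        for name, value in client_headers.items()
        if name.lower() != IDENTITY_HEADER
    }
    # Add the identity that the proxy actually verified.
    forwarded[IDENTITY_HEADER] = authenticated_user
    return forwarded
```

This only works if the internal services are reachable solely through the proxy; anything that can talk to them directly can set the header itself.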
(Score: 2) by Booga1 on Wednesday August 10 2022, @06:00PM
The SSL issue is part of why I run a reverse proxy for some services. I currently host a couple of services that pre-package their own NodeJS webserver and don't natively support SSL. After two weeks of trying to shoehorn SSL in through hacking at the code, I gave up and put them behind a reverse proxy. Headache solved.
The second problem was simplifying URLs. Now links can just point to https://monitor.example.com [example.com] instead of https://example.com:8081/status/monitor [example.com]
The third problem the reverse proxy solved was the associated firewall rules. No more custom open ports on the firewall, just port 443.
Reverse proxies can reduce downtime as well. Need to do a complex system upgrade or complete overhaul? Do it on a spare machine at your own pace. Leave the original server up until the new one is ready, then just update the reverse proxy to point to the new IP. It's like magic with nearly zero downtime. Users have to log in again, but that's about it for the stuff I run.
Smart use of reverse proxies can actually eliminate system administration hassles instead of adding to them.
(Score: 2) by maxwell demon on Tuesday August 09 2022, @06:06PM
Unless you're running a gaming or stock exchange trading server, I don't see how that latency would matter.
The Tao of math: The numbers you can count are not the real numbers.
(Score: 3, Interesting) by gnuman on Tuesday August 09 2022, @07:25PM
HAProxy is probably a better choice as a proxy than Nginx. Nginx is more like Apache Web Server, which also has a proxy mod, by the way.
https://cloudinfrastructureservices.co.uk/haproxy-vs-nginx-whats-the-difference/ [cloudinfrastructureservices.co.uk]
I prefer HAProxy over the rest for reverse proxying because that's what it was designed for. It doesn't really have other parts and things to go wrong.
(Score: 3, Interesting) by VLM on Tuesday August 09 2022, @07:47PM
Some I can think of that were unmentioned include:
An "http" level router. We're gonna send those URLs to the static webserver, those URLs to the legacy Scala "thing", those URLs to the Play framework app, etc. Also SLA-level routing, like logged-in users get more resources than DDoS attacks from the internet.
logging at "http" level. Now yeah technically you can do special logging on the reverse proxy sure but similar to line above you can not only forward to special destination but log certain destinations. So all the "login" traffic goes to the special accounting server for billing purposes but normal boring traffic is not logged.
If you have a cluster: You can do unintelligent load balancing. Intelligent load balancing is usually called load balancing. Also you can implement A/B testing for rollouts, send 1% of incoming traffic to the next version of the software and then track user behavior on that 1% and decide if the new version is better or worse or you broke something you never considered testing, etc.
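The URL routing and the 1% rollout described above could be sketched roughly like this in Python. Everything concrete here is an assumption for illustration: the backend addresses, the `ROUTES` table, and the names `bucket` and `pick_backend` are made up, and hashing a client identifier is just one way to keep a given user on the same version between requests.

```python
import hashlib

# Placeholder backends: every name and address here is hypothetical.
ROUTES = [
    ("/static/", "http://static.internal:8080"),
    ("/legacy/", "http://legacy.internal:9000"),
    ("/", "http://app.internal:9001"),
]
DEFAULT_APP = "http://app.internal:9001"
CANARY_APP = "http://app-next.internal:9001"


def bucket(client_id):
    """Map a client identifier to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(client_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def pick_backend(path, client_id, canary_percent=1):
    """Choose a backend by longest matching URL prefix, then apply the split."""
    by_length = sorted(ROUTES, key=lambda route: len(route[0]), reverse=True)
    for prefix, backend in by_length:
        if path.startswith(prefix):
            break
    else:
        raise LookupError("no route for " + path)
    # Only the main app has a next version in this sketch.
    if backend == DEFAULT_APP and bucket(client_id) < canary_percent:
        return CANARY_APP
    return backend
```

Because the bucket depends only on the client identifier, the same user keeps landing on the same version, which makes it easier to compare behavior between the two groups.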
(Score: 4, Interesting) by Ken_g6 on Tuesday August 09 2022, @09:00PM
Reverse proxies can cause security problems if not set up correctly. If your application has a system for logging in and displaying user-specific pages, you need to make sure the application or the proxy or both are configured so that the proxy does not cache such pages. Otherwise more than one user might see your private data! Or you might see another user's private data when looking for your own.
The application can, in theory, prevent this by sending certain headers to the reverse proxy, which is probably a good idea anyway to prevent caching by any other proxy or browser cache. Of course, those headers have to be set up properly, and the proxy has to respect those headers, or the page might get cached anyway. So redundant prevention of caching private pages isn't a bad idea.
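A rough Python sketch of that "redundant prevention" on the proxy side might look like the following. The function name `may_store_in_shared_cache` is hypothetical, the headers are assumed to arrive as dicts with lowercase names, and the check is deliberately stricter than the HTTP caching rules require: when in doubt, it refuses to cache.

```python
def may_store_in_shared_cache(request_headers, response_headers):
    """Conservative check before a proxy stores a response for other users.

    Both arguments are dicts keyed by lowercase header name (an assumption
    of this sketch).
    """
    cache_control = response_headers.get("cache-control", "")
    directives = {
        part.strip().lower().split("=", 1)[0]
        for part in cache_control.split(",")
        if part.strip()
    }
    # The application said not to share or keep this response.
    if directives & {"private", "no-store", "no-cache"}:
        return False
    # Belt and braces: anything tied to a login is treated as private,
    # even if the application forgot to send the right headers.
    if "authorization" in request_headers or "cookie" in request_headers:
        return False
    if "set-cookie" in response_headers:
        return False
    return True
```

The cost of being this cautious is a lower cache hit rate for logged-in users, which is usually a better failure mode than showing one user another user's page.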
(Score: 2) by ledow on Wednesday August 10 2022, @06:16PM
I use reverse proxies all the time.
It's the best way to use, say, LetsEncrypt - you can have an Apache reverse proxy on a server that can be publicly connected. It will answer all ACME requests for new certificates and "prove" that it owns the necessary domain. Then just have it configured to reverse-proxy back to the site you want, wherever that may be.
For instance that could be a local site on the same machine, another site entirely running on another IP anywhere else (don't even need DNS), or it could even be over a VPN connected to that same machine into the internal network.
This lets you filter all responses and sanity-check the HTTP requests on the Apache at the boundary, reverse proxy to any destination, and yet claim to all be coming from one machine that handles the DNS, LetsEncrypt SSL, etc. for you so you don't have to open all those other systems up to the world.
I also used to use a domain name host who would let you buy your domain, and then they would reverse-proxy it to anywhere (e.g. at the time, my Geocities account!). So you could have a big flash domain name and yet the content was being pulled from elsewhere where you couldn't manage the DNS etc. at all, or hide the actual origin of the server content.
It would work even for legacy systems and those things that couldn't do TLS/SSL themselves for whatever reason, communicating with them and then wrapping the whole connection in a valid modern SSL encryption.
It can also do other stuff along the way such as caching and modifying content, reducing load on your internal servers (and bandwidth use in limited scenarios), and providing a "This service is offline" kind of page automatically if there was a failure of the backend.
Reverse proxy is a very cool usage of the system and I've done it in both Squid and Apache.
I still have certain servers that VPN out to an external server, and are exposed to the world only through a reverse proxy that talks back to the internal system over the VPN. Caching, proxying, TLS-wrapping, and somewhat sanity-checking everything before it goes back over the back-end. Also acts as a kind of middle man that if the outside server is compromised it's not automatic compromise of the data on the backend, and the company running the reverse proxy server have no way to just read the data off the reverse proxy's hard drive even after the event.
There is at least one major brand of firewall/webfilter (Smoothwall) that actually lets you use reverse proxy basically in replacement of things like direct port-forwards, very simply, logging and sanitising the data along the way to the back-end if you so wish.
Reverse proxies are now part of my bread and butter, to the point where I can set them up almost blind. They are running my Plex and other home servers (which run off things like Raspberry Pis and Netgear NAS boxes in my house), wrapping them in an up-to-date, verified LetsEncrypt TLS cert for my domain (so the Pi/Netgear doesn't have to do the ACME verification or the SSL overhead). All my domains just point to the same external server, and the backend/origin can change as often as necessary, even to dynamic IPs and the like, without having to keep changing DNS or certs.