Post

Proxies

Commonly utilized by malicious hackers to mask their identity and disguise their location, these distinct proxy servers also serve a variety of essential real-world purposes, including caching, managing access control, and bypassing censorship, among other uses.

Let’s examine the potential applications of proxies in the realm of system design. It is crucial to note that there are two types of proxies: forward proxy and reverse proxy. In the industry, people tend to use the term “proxy” rather loosely, and when they refer to proxies, they often mean the forward proxy without explicitly mentioning the word ‘forward’. Nonetheless, understanding the distinction between the two is essential.


Forward Proxy

In simple terms, a forward proxy is a server that positions itself between a client or a set of clients and another server or multiple servers. The forward proxy essentially represents the client or clients, and can be considered part of the client’s team or the client side during interactions between the client and the server.

In essence, the client sends a request intended for the server to the forward proxy, almost as if the client is saying, “Hello, forward proxy. Please communicate with the server for me.” The forward proxy conveys the request to the server, which receives it from the forward proxy rather than the client directly. When the server sends a response, it addresses it to the forward proxy, which then passes the response back to the client.

As you can envision, a forward proxy can function as a method to conceal the identity of a client requesting information from a server. This occurs because when a client sends a request to the forward proxy, the forward proxy forwards the request to the server using its own IP address (‘P’) as the source IP address. In other words, the original client source IP address (‘A’) is replaced with the forward proxy’s IP address (‘P’) in the new request.

It is important to note that certain types of forward proxies may still render the client’s source IP address accessible or visible to the server in some form. However, in most cases, the original source IP address will not be visible. This basic principle is the foundation of how VPNs operate.

Forward proxies serve several practical purposes beyond anonymity:

  • Corporate firewalls and access control: Organizations route all employee internet traffic through a forward proxy to enforce security policies, block malicious websites, and log activity for compliance.
  • Content filtering: Schools and libraries use forward proxies to restrict access to certain categories of websites.
  • Bypassing geographic restrictions: Users can route their traffic through a forward proxy located in a different region to access content that is otherwise geo-blocked.

img


Reverse Proxy

Reverse proxies can be somewhat more complex. While forward proxies act on behalf of clients in an interaction between a client and a server, reverse proxies act on behalf of a server in the same interaction. The best way to think about it is that when a client wants to interact with a server and send a request, if the reverse proxy has been properly configured by the server or the entity owning the server, the client’s request will actually go to the reverse proxy, and the client won’t know that its request is going to the reverse proxy.

Although the visualization of the process would appear similar to the forward proxy, the underlying meaning is quite different, which is what makes it a bit tricky.

img

When the client issues a request to the server and the reverse proxy has been configured properly, the request will actually go to the reverse proxy. The reverse proxy then forwards the request to the server, and the server returns a response to the reverse proxy. Finally, the reverse proxy returns the response to the client. The key point here is that, in the context of a forward proxy, the server has no idea that the client and the forward proxy are essentially one and the same. However, in the case of a reverse proxy, it’s the client that has no idea that the reverse proxy and the server are effectively one and the same.

In other words, from the client’s perspective, there are no two separate entities (the reverse proxy and the server); they are considered one and the same, with the client assuming it is communicating directly with the server.

For example, if a website has properly configured a reverse proxy and you want to access the site by simply typing the URL in the browser (e.g., google.com), the browser will make a DNS query to obtain the IP address of the site, and it will return the IP address of the reverse proxy.


Use cases

Reverse proxies are incredibly useful, especially when designing complex systems. They offer various benefits and are a valuable tool to have in your toolkit.

For instance, you can configure your reverse proxy to filter out requests you want to ignore, preventing your servers from dealing with certain types of requests. You can also use your reverse proxy for logging, gathering metrics, or caching items such as HTML pages, reducing the burden on your server.

Perhaps one of the best use cases for a reverse proxy is using it as a load balancer. This can also have security implications. For example, imagine a malicious client attempting to bring down a server by issuing a large number of requests. The reverse proxy can act as a shield, distributing the requests evenly among various servers as a load balancer, ensuring no single server is overwhelmed by the malicious client. While this is a simplified example, it demonstrates the numerous use cases for reverse proxies.


Transparent Proxy

A transparent proxy intercepts network traffic without the client being aware of its existence. Unlike a traditional forward proxy, the client does not need to be configured to use it – the proxy sits inline on the network path and intercepts requests automatically. Transparent proxies are commonly deployed by ISPs for caching popular content, by organizations for monitoring network usage, and in CDN architectures to route requests to the nearest edge server.

Proxy Software Examples

Several widely used software solutions implement proxy functionality:

  • Nginx: A high-performance web server frequently used as a reverse proxy and load balancer. It excels at serving static content and terminating TLS connections.
  • HAProxy: A dedicated, high-performance TCP/HTTP load balancer and reverse proxy known for its reliability in production environments.
  • Squid: A caching and forwarding HTTP proxy commonly used for web content caching and access control.
  • Envoy: A modern, cloud-native proxy originally built at Lyft. It is widely used as a sidecar proxy in service mesh architectures (e.g., Istio) and provides advanced observability, traffic management, and security features.


In summary, proxies are both very simple and powerful tools that offer a range of benefits and use cases in various contexts.

This post is licensed under CC BY 4.0 by the author.