Introduction to Nginx Reverse Proxy and Load Balancing
Why Reverse Proxy and Load Balancing?
Imagine you run a small restaurant with just one chef. As more customers come, you need to hire more chefs (add servers), but it would be chaotic if customers approached each chef directly. Instead, you need a receptionist (reverse proxy server) to take orders and distribute them to different chefs (backend servers). This is the core role of reverse proxy and load balancing: hide backend servers while distributing request load across multiple servers.
1. What is a Reverse Proxy?
A reverse proxy acts as a “receptionist” that receives user requests, forwards them to backend servers for processing, and returns the results to the user. Users only interact with the “receptionist” and don’t need to know about the backend servers.
Example:
When a user visits https://example.com, Nginx (the receptionist) receives the request, forwards it to a backend web server (e.g., 192.168.1.101), and returns the processed page to the user. The user never knows the exact backend server behind Nginx.
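As a minimal sketch of this idea (assuming a single backend at 192.168.1.101, matching the example above), a bare-bones reverse proxy server block might look like this:

```nginx
# Minimal reverse proxy: forward all requests to a single backend server
server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://192.168.1.101; # The backend from the example above
    }
}
```

Load balancing extends this same pattern: the single IP in proxy_pass is replaced with a named group of servers, as shown in the steps below.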
2. What is Load Balancing?
As user traffic grows, a single backend server may become overloaded (e.g., full CPU/memory). Load balancing solves this by having the “receptionist” (Nginx) distribute requests evenly across multiple backend servers to prevent overload.
How Nginx Implements Load Balancing:
Nginx uses the upstream module to define backend server groups and forwards requests to this group. By default, Nginx uses a round-robin strategy (requests are distributed in rotation).
3. Step-by-Step: Configure Nginx Reverse Proxy + Load Balancing
Assume Nginx is already installed (Ubuntu/Debian: sudo apt install nginx; CentOS: sudo yum install nginx). We’ll set up a simple load-balanced environment.
Step 1: Define the Backend Server Group
Suppose you have two backend web servers with IPs 192.168.1.101 and 192.168.1.102 (ensure they’re running and accessible).
Open the Nginx configuration file (e.g., /etc/nginx/conf.d/default.conf or /etc/nginx/nginx.conf) and add:
# Define a backend server group named "backend_servers"
upstream backend_servers {
    server 192.168.1.101; # Backend server 1
    server 192.168.1.102; # Backend server 2
}
Step 2: Configure Reverse Proxy Rules
Add reverse proxy rules to forward user requests to the backend group:
server {
    listen 80; # Port Nginx listens on (the port users access)
    server_name example.com; # Your domain or server IP

    location / { # Match all requests
        proxy_pass http://backend_servers; # Forward to the backend group
        proxy_set_header Host $host; # Pass the original Host header (backend sees the real domain)
        proxy_set_header X-Real-IP $remote_addr; # Pass the client IP (backend sees the real user IP)
    }
}
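If the backend also needs the full client address chain or the original request scheme, two more widely used headers can be added. This is an optional extension of the location block above, using Nginx's built-in $proxy_add_x_forwarded_for and $scheme variables:

```nginx
location / {
    proxy_pass http://backend_servers;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    # Append the client IP to any existing X-Forwarded-For chain
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    # Tell the backend whether the original request was http or https
    proxy_set_header X-Forwarded-Proto $scheme;
}
```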
Step 3: Test the Configuration
- Check the syntax:
Run sudo nginx -t to verify there are no errors. The configuration is valid if the output includes:
syntax is ok
test is successful
- Restart Nginx:
Apply the changes with sudo systemctl restart nginx (or sudo systemctl reload nginx to apply them without dropping active connections).
- Test load balancing:
Access Nginx's IP (e.g., http://192.168.1.100) in a browser. Requests will alternate between 192.168.1.101 and 192.168.1.102.
4. Advanced: Adjust Load Balancing Strategies
Nginx uses round-robin by default, but you can customize distribution:
1. Weighted Round-Robin (Distribute by Weight)
For servers with different performance (e.g., one is stronger), assign weights; each server receives requests in proportion to its weight:
upstream backend_servers {
    server 192.168.1.101 weight=5; # Receives 5/8 (62.5%) of requests
    server 192.168.1.102 weight=3; # Receives 3/8 (37.5%) of requests
}
2. IP Hash (Consistent User Session)
To ensure a user always connects to the same server (e.g., for shopping cart data):
upstream backend_servers {
    ip_hash; # Distribute by client IP hash (same user → same backend)
    server 192.168.1.101;
    server 192.168.1.102;
}
5. Summary
- Reverse Proxy: Hides backend servers, centralizes access, and improves security and manageability.
- Load Balancing: Distributes traffic across servers to prevent overload and ensure stability.
- Nginx Core: Use upstream to define backend groups and proxy_pass to forward requests.
For beginners, start with round-robin and basic configuration. As needs grow, explore advanced strategies like health checks or URL hashing, but the core principle remains: distribute requests efficiently.
Now, try setting up your own load-balanced environment on Linux!