NGINX Production Guides
Most NGINX problems don't announce themselves. A 502 Bad Gateway that only fires under load. An NGINX upstream keepalive misconfiguration that passes every test in staging, then exhausts ephemeral ports at 400 req/s in production. A proxy_read_timeout left at the 60-second default — invisible until a single slow endpoint causes cascading 504s.
NGINX configuration covers a deceptively wide surface area. As a reverse proxy, it handles connection pooling, header forwarding, and timeout negotiation between clients and upstreams. As a TLS terminator, it controls protocol versions, cipher selection, and session resumption. As a rate limiter, it enforces access policy at the edge — before requests touch your application. Getting any of these wrong has consequences that rarely show up in synthetic tests.
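All three roles can be sketched in a single server block. This is a minimal illustration, not a hardened config: the server name, certificate paths, backend address, and zone sizes are placeholders.

```nginx
# Rate limiting at the edge: 10 req/s per client IP, keyed in a 10 MB shared zone
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

server {
    listen 443 ssl;
    server_name api.example.com;          # placeholder hostname

    # TLS termination: protocol and cipher policy live here, not in the app
    ssl_certificate     /etc/nginx/certs/example.crt;
    ssl_certificate_key /etc/nginx/certs/example.key;
    ssl_protocols       TLSv1.2 TLSv1.3;

    location / {
        limit_req zone=api_limit burst=20 nodelay;   # enforce policy before proxying

        # Reverse proxy: forward to the upstream, preserving client identity
        proxy_pass http://127.0.0.1:3000;            # placeholder backend
        proxy_set_header Host              $host;
        proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```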
These guides focus on NGINX performance tuning and debugging for real production deployments: a reverse proxy in front of Node.js, Django, Rails, or microservice backends. Every guide includes the specific directives involved, the failure mode it addresses, and the commands to verify the configuration is behaving as expected. No intro-level explanations of what a web server is. No copy-paste configs without explanation. Just the failure modes, the root causes, and the fixes.
All articles (5)
Nginx Rate Limiting Configuration: Practical Guide With Examples
Configure nginx rate limiting with limit_req_zone and limit_req — with real examples for API protection, login endpoints, and burst handling in production.
Nginx 502 Bad Gateway Fix: Root Causes and Solutions
Fix nginx 502 bad gateway errors — identify the root cause from the error log, fix upstream connection issues, keepalive config, and timeout problems in production.
NGINX Troubleshooting Guide: Complete Production Reference
The complete NGINX troubleshooting reference — 502/504 errors, upstream failures, SSL issues, connection limits, keepalive misconfiguration, and production debugging workflows with real commands.
NGINX Upstream Keepalive Explained: Why Missing It Causes 502 Errors
Missing keepalive in your NGINX upstream block silently kills connections under load. Here's exactly what keepalive does, how TCP connection reuse works, and the production-ready config that stops 502s before they start.
NGINX 502 Bad Gateway Under Load: Causes, Debugging, and Fixes
NGINX returning 502 Bad Gateway only under high load? This guide covers every root cause — ephemeral port exhaustion, missing keepalive, proxy timeouts, worker limits — with step-by-step debugging commands and production-ready config fixes.
Core Concepts
NGINX uses an asynchronous, event-driven architecture — a single worker process handles thousands of concurrent connections without blocking. Understanding the worker model, connection handling, and how upstreams are managed is the prerequisite for diagnosing anything that goes wrong at scale.
- Worker processes and the event-driven connection model
- How NGINX handles upstream connections vs client connections
- The request processing phases: rewrite, access, content, log
- Upstream blocks — defining backend pools and load balancing
- Location matching order: exact, prefix, regex
- Connection limits: worker_connections, worker_rlimit_nofile, and OS ulimits
- How keep-alive differs between client-side and upstream-side
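The concepts above fit in a minimal config sketch. The pool name and backend ports are illustrative, and the numbers are examples, not recommendations:

```nginx
worker_processes auto;               # one worker per CPU core
worker_rlimit_nofile 65535;          # must fit within the OS open-file limit

events {
    worker_connections 8192;         # per worker; client and upstream connections both count
}

http {
    # Upstream block: a named pool of backends, round-robin by default
    upstream app_backend {
        server 127.0.0.1:3000;
        server 127.0.0.1:3001;
        keepalive 32;                # upstream-side reuse, distinct from client keep-alive
    }

    server {
        listen 80;

        location = /healthz { return 200; }   # exact match wins over prefix matches
        location / {
            proxy_pass http://app_backend;
            proxy_http_version 1.1;           # required for upstream keepalive
            proxy_set_header Connection "";
        }
    }
}
```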
Performance Tuning
The default NGINX configuration is conservative. Under production load — especially in reverse proxy deployments — several defaults become bottlenecks: upstream connection reuse is disabled, buffer sizes assume small responses, and worker limits are often set lower than the system can handle. These are the settings worth tuning.
- upstream keepalive — connection reuse to prevent ephemeral port exhaustion
- proxy_read_timeout / proxy_send_timeout / proxy_connect_timeout — tuning for real backend latency
- worker_processes auto and worker_cpu_affinity for multi-core systems
- sendfile, tcp_nopush, tcp_nodelay — reducing syscall overhead for static assets
- Buffer sizing: proxy_buffers, proxy_buffer_size, proxy_busy_buffers_size
- gzip and Brotli compression — when to enable, what to compress
- open_file_cache — reducing stat() calls under high request volume
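A sketch of the tuned directives above, with illustrative starting values — the right timeouts and buffer sizes depend on your backend's real latency and response sizes:

```nginx
http {
    upstream app_backend {
        server 127.0.0.1:3000;               # placeholder backend
        keepalive 64;                        # idle upstream connections kept open per worker
    }

    sendfile    on;                          # kernel-space file transfer for static assets
    tcp_nopush  on;                          # fill packets before sending (with sendfile)
    tcp_nodelay on;                          # don't delay small writes on keep-alive connections

    gzip on;
    gzip_types text/css application/json application/javascript;

    open_file_cache max=10000 inactive=30s;  # cache fds and stat() results for hot files

    server {
        listen 80;
        location / {
            proxy_pass http://app_backend;
            proxy_http_version 1.1;
            proxy_set_header Connection "";  # strip "Connection: close" so keepalive works

            proxy_connect_timeout 5s;        # fail fast if the backend port is dead
            proxy_read_timeout    30s;       # must exceed the backend's slowest real endpoint
            proxy_send_timeout    30s;

            proxy_buffer_size 8k;            # buffer for the upstream response headers
            proxy_buffers   8 16k;           # body buffering before spooling to disk
        }
    }
}
```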
Troubleshooting & Debugging
NGINX error log messages are specific about what failed but rarely obvious about why. A 502 under load looks identical whether the cause is a crashed upstream process, an exhausted keepalive pool, or a misconfigured proxy_pass. The diagnostic workflow matters as much as knowing the fixes.
- 502 Bad Gateway — upstream process down, keepalive exhausted, or proxy_pass misconfigured
- 504 Gateway Timeout — proxy_read_timeout too short for slow upstream responses
- Connection reset by peer — upstream closed the connection before NGINX finished reading
- 499 Client Closed Request — client gave up before the upstream finished responding
- upstream timed out (110: Connection timed out) — what the error log is actually telling you
- connect() failed (111: Connection refused) — nothing listening on the port, or a firewall actively rejecting
- no live upstreams while connecting to upstream — every backend marked failed via max_fails/fail_timeout
- SSL handshake errors — certificate chain issues and TLS version mismatches
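A useful first step in the diagnostic workflow is grouping error-log entries by errno so the dominant failure mode stands out. The log lines below are fabricated samples for illustration; in production the same one-liner runs against /var/log/nginx/error.log:

```shell
# Fabricated sample of NGINX error-log lines (real path: /var/log/nginx/error.log)
cat > /tmp/nginx_error_sample.log <<'EOF'
2024/05/01 12:00:01 [error] 1234#0: *1 connect() failed (111: Connection refused) while connecting to upstream, upstream: "http://127.0.0.1:3000/"
2024/05/01 12:00:02 [error] 1234#0: *2 upstream timed out (110: Connection timed out) while reading response header from upstream, upstream: "http://127.0.0.1:3000/"
2024/05/01 12:00:03 [error] 1234#0: *3 connect() failed (111: Connection refused) while connecting to upstream, upstream: "http://127.0.0.1:3001/"
EOF

# Count each (errno: message) pair, most frequent first
grep -oE '\([0-9]+: [^)]+\)' /tmp/nginx_error_sample.log | sort | uniq -c | sort -rn
```

Here the output shows two connection refusals against one timeout, which points at a dead or unbound backend port rather than a slow endpoint.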
About these guides
These guides are written by a senior L3 engineer with production experience across high-traffic deployments running NGINX as a reverse proxy, SSL terminator, and API gateway. The content comes from real incidents — debugging 502s under sustained load, auditing NGINX configurations against PCI-DSS requirements, tuning upstream keepalive on systems handling tens of thousands of concurrent connections, and writing the runbooks that actually get used during incidents. Not documentation rewrites. Not synthetic examples.
Paste any NGINX configuration — server block, reverse proxy config, or full nginx.conf — and get an instant scored report: missing security headers, TLS issues, proxy header gaps, rate limiting coverage, and timeout settings. Everything runs in the browser; nothing is sent to a server.