Security Ops & Infrastructure Engineering
Documenting the real world — NGINX misconfigs, Docker log floods, midnight incident responses, and everything in between.
Featured Posts
NGINX Upstream Keepalive Explained: Why Missing It Causes 502 Errors
Missing keepalive in your NGINX upstream block silently kills connections under load. Here's exactly what keepalive does, how TCP connection reuse works, and the production-ready config that stops 502s before they start.
November 28, 2024
NGINX 502 Bad Gateway Under Load: Causes, Debugging, and Fixes
NGINX returning 502 Bad Gateway only under high load? This guide covers every root cause — ephemeral port exhaustion, missing keepalive, proxy timeouts, worker limits — with step-by-step debugging commands and production-ready config fixes.
November 14, 2024
Docker Ate My Disk: Fixing Log Rotation Before It Kills Production
How a single verbose container filled a 500GB disk in 72 hours, and the exact daemon.json config that stops it from ever happening again.
October 3, 2024
NGINX SSL Hardening: From C Grade to A+ on SSL Labs
A step-by-step walkthrough of the NGINX TLS configuration changes that take you from a mediocre SSL rating to a perfect score — without breaking compatibility.
September 20, 2024
Recent Posts
Linux TIME_WAIT Explained: Why It Causes Connection Failures and How to Fix It
Linux TIME_WAIT exhausts ephemeral ports and causes ECONNREFUSED under load — even when your app is healthy. Learn what TIME_WAIT is, how to detect port exhaustion with ss and netstat, and the exact sysctl fixes that resolve it.
Reading Logs Like a Detective: A Field Guide to Incident Triage
The exact commands and mental models I use to go from 'something is wrong' to 'I know exactly what happened' in under 15 minutes.