Docker Ate My Disk: Fixing Log Rotation Before It Kills Production

How a single verbose container filled a 500GB disk in 72 hours, and the exact daemon.json config that stops it from ever happening again.

October 3, 2024 · 6 min read · Damon

3am, Disk Full, Everything Down

The alert came in at 3:17am: disk utilization at 100%, multiple services failing. SSH into the host:

df -h
# Filesystem      Size  Used Avail Use% Mounted on
# /dev/sda1       500G  500G     0 100% /

Find the culprit:

du -sh /* 2>/dev/null | sort -rh | head -10
# 487G  /var
du -sh /var/* 2>/dev/null | sort -rh | head -5
# 487G  /var/lib/docker
du -sh /var/lib/docker/* 2>/dev/null | sort -rh | head -5
# 484G  /var/lib/docker/containers

There it was. Docker container logs: unrotated, unbounded, growing forever.

ls -lah /var/lib/docker/containers/a3f9b1c.../
# -rw-r----- 1 root root 484G Nov 3 03:17 a3f9b1c...-json.log

484 gigabytes. One log file. One container. 72 hours of verbose output with no rotation configured.
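The drill-down above can be collapsed into one step. The following is a minimal sketch, assuming GNU find and the standard json-file layout; the function name and its defaults are illustrative, not part of Docker.

```shell
#!/bin/bash
# Print the N largest container log files (size in bytes, then path).
# Both arguments are optional; the defaults assume the standard json-file layout.
largest_container_logs() {
  local root="${1:-/var/lib/docker/containers}"
  local count="${2:-5}"
  # -printf '%s %p\n' emits "<bytes> <path>" per file (GNU find only).
  find "$root" -type f -name '*-json.log' -printf '%s %p\n' 2>/dev/null \
    | sort -rn \
    | head -n "$count"
}

# Usage (needs root on a real host): largest_container_logs
```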

The Emergency Fix

You can't just rm the file while Docker holds it open: the directory entry disappears, but the inode stays allocated and the space isn't freed until Docker closes its handle. The right move:

# Truncate the file (Docker keeps the handle, disk space is freed immediately)
truncate -s 0 /var/lib/docker/containers/<container-id>/<container-id>-json.log

Services came back up within seconds. But this was a symptom, not the problem.
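To see why truncation is enough, here is a small self-contained sketch that needs no Docker. File descriptor 9 stands in for a long-lived writer and is assumed to be opened in append mode; the function name is made up for the demo.

```shell
#!/bin/bash
# Demo: truncating a file in place does not break a process that has it open.
demo_truncate_open_file() {
  local f
  f=$(mktemp) || return 1
  exec 9>>"$f"             # stand-in writer holding the file open (append mode)
  printf 'old line\n' >&9
  truncate -s 0 "$f"       # same call as the emergency fix
  printf 'new line\n' >&9  # the writer carries on without reopening the file
  exec 9>&-                # close the stand-in writer
  cat "$f"                 # only the post-truncate line is left
  rm -f "$f"
}

demo_truncate_open_file
```

The old content is gone and the writer never had to reopen anything, which is why services recover without a container restart.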

Why This Happens

By default, Docker's json-file logging driver has no size limit and no rotation. Every byte your container writes to stdout/stderr goes into that file and stays there forever. On a verbose app, that's a disaster.

The dangerous default:

{
  "log-driver": "json-file"
}

That's it. No max-size. No max-file. No expiry. Pure chaos at scale.
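A quick way to see whether a host is exposed is to ask the daemon for its default driver and look for a size limit in the config file. This is a rough sketch: the function name is hypothetical, and the grep only inspects daemon.json, so it will miss limits passed as dockerd command-line flags.

```shell
#!/bin/bash
# Report the daemon's default log driver and warn if json-file has no max-size.
check_log_defaults() {
  local config="${1:-/etc/docker/daemon.json}"
  local driver
  driver=$(docker info --format '{{.LoggingDriver}}') || return 1
  echo "default driver: $driver"
  if [ "$driver" = "json-file" ] && ! grep -q '"max-size"' "$config" 2>/dev/null; then
    echo "WARNING: no max-size found in $config; json-file logs are unbounded"
    return 2
  fi
}

# Usage: check_log_defaults            (or pass an alternate config path)
```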

The Permanent Fix: daemon.json

Edit /etc/docker/daemon.json:

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "50m",
    "max-file": "5",
    "compress": "true"
  }
}

This sets a global default for all containers on the host:

  • max-size: 50m (rotate when the log hits 50MB)
  • max-file: 5 (keep at most 5 log files per container, so 250MB worst case before compression)
  • compress: true (gzip rotated files to save space)
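The 250MB figure is just max-size times max-file, and the same arithmetic gives a host-wide budget. A tiny sketch; the helper name and the 40-container host are made-up examples.

```shell
#!/bin/bash
# Worst-case uncompressed log footprint in MB: max-size x max-file x containers.
log_budget_mb() {
  local max_size_mb="$1"
  local max_file="$2"
  local containers="${3:-1}"
  echo $(( max_size_mb * max_file * containers ))
}

log_budget_mb 50 5      # one container with the settings above: 250
log_budget_mb 50 5 40   # hypothetical host running 40 containers: 10000
```

Compare that host-wide number with the free space on the Docker data volume before settling on limits.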

Restart Docker to apply:

systemctl restart docker

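Warning: Unless live-restore is enabled in daemon.json, this stops all running containers (those with a restart policy are started again). Do this during a maintenance window or roll it out carefully. The new defaults apply only to containers created afterwards; see the FAQ.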
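Because the daemon will not start with a malformed daemon.json, it is worth checking the file before the restart. A minimal sketch, assuming python3 is installed on the host; the function name is illustrative, and this only checks JSON syntax, not whether the option names are valid. Recent Docker releases also offer dockerd --validate for a fuller check.

```shell
#!/bin/bash
# Syntax-check daemon.json before restarting Docker.
validate_daemon_json() {
  local config="${1:-/etc/docker/daemon.json}"
  if python3 -m json.tool "$config" >/dev/null 2>&1; then
    echo "OK: $config is valid JSON"
  else
    echo "ERROR: $config is not valid JSON; fix it before restarting Docker" >&2
    return 1
  fi
}

# Usage: validate_daemon_json && systemctl restart docker
```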

Per-Container Override in Compose

For containers that need different limits, set it per-service in docker-compose.yml:

services:
  app:
    image: my-app:latest
    logging:
      driver: json-file
      options:
        max-size: "100m"
        max-file: "10"
        compress: "true"

  debug-service:
    image: my-debug:latest
    logging:
      driver: json-file
      options:
        max-size: "200m"   # verbose in dev, generous limit
        max-file: "3"

Monitoring Disk Usage by Container

Add this to your monitoring toolkit:

#!/bin/bash
# check-docker-logs.sh
# Alerts when any container log exceeds threshold

THRESHOLD_GB=5

find /var/lib/docker/containers -type f -name '*-json.log' | while IFS= read -r logfile; do
  size_gb=$(du -BG "$logfile" | awk '{print $1}' | tr -d 'G')
  container_id=$(echo "$logfile" | cut -d'/' -f6 | cut -c1-12)
  
  if [ "$size_gb" -gt "$THRESHOLD_GB" ]; then
    echo "ALERT: Container $container_id log is ${size_gb}GB"
    docker inspect --format='{{.Name}}' "$container_id" 2>/dev/null
  fi
done

Run it from cron every 30 minutes until you have proper observability set up.

Consider a Centralized Logging Driver

For production, json-file with rotation is a band-aid. The real solution is shipping logs off the host, for example with the fluentd driver, or the loki driver if you're in the Grafana ecosystem; configs for both appear under "Centralized Log Collection" below. Central log aggregation means:

  • No local disk pressure from logs
  • Queryable log history across all containers
  • Retention policies enforced centrally

Quick Reference

Problem                         Fix
Log file too large right now    truncate -s 0 /path/to/container.log
Global log limits               Edit /etc/docker/daemon.json
Per-container limits            Use logging: block in compose
Ongoing monitoring              Script + cron or Prometheus node exporter

Don't wait for 3am to learn this one.


Centralized Log Collection: The Right Long-Term Fix

json-file with rotation is a band-aid. For production Docker fleets, you want logs shipped off the host entirely: no local disk pressure, queryable history, retention policies enforced centrally.

Option 1: Fluentd / Fluent Bit

{
  "log-driver": "fluentd",
  "log-opts": {
    "fluentd-address": "localhost:24224",
    "fluentd-async": "true",
    "tag": "docker.{{.Name}}"
  }
}

Fluent Bit is lighter than Fluentd. Run it as a DaemonSet on Kubernetes or as an agent on each Docker host; its forward input accepts what the fluentd log driver sends.

Option 2: Grafana Loki

{
  "log-driver": "loki",
  "log-opts": {
    "loki-url": "http://loki:3100/loki/api/v1/push",
    "loki-external-labels": "host={{.Host}},container={{.Name}}"
  }
}

Requires the Loki Docker driver plugin:

docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions

Option 3: AWS CloudWatch (for ECS/EC2)

{
  "log-driver": "awslogs",
  "log-opts": {
    "awslogs-group": "/docker/myapp",
    "awslogs-region": "us-east-1",
    "tag": "{{.Name}}"
  }
}

Monitoring Log Size in Production

Add this to your monitoring regardless of which driver you use (containers without a local log file are skipped by the -f check):

#!/bin/bash
# /usr/local/bin/docker-log-monitor.sh
THRESHOLD_GB=2

docker ps -q | while IFS= read -r cid; do
  NAME=$(docker inspect --format='{{.Name}}' "$cid" | tr -d '/')
  LOG_PATH=$(docker inspect --format='{{.LogPath}}' "$cid")
  if [ -f "$LOG_PATH" ]; then
    SIZE_MB=$(du -m "$LOG_PATH" | cut -f1)
    if [ "$SIZE_MB" -gt $((THRESHOLD_GB * 1024)) ]; then
      echo "ALERT: Container $NAME log is ${SIZE_MB}MB"
    fi
  fi
done

Run from cron every 15 minutes. Alert before the disk fills, not after.


FAQ

Does max-size apply to existing containers? No. Log driver options apply at container creation time. You must recreate existing containers for the new limits to take effect. For docker-compose, run docker-compose down && docker-compose up -d after updating logging: config.
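To find out which containers still need recreating, you can list every json-file container whose own log config carries no max-size. A hedged sketch: the function name is made up, and it assumes a missing option renders as an empty string in the inspect template.

```shell
#!/bin/bash
# List containers that use json-file without a max-size, i.e. still unbounded.
list_unbounded_containers() {
  local cid driver max_size
  docker ps -aq | while IFS= read -r cid; do
    driver=$(docker inspect --format '{{.HostConfig.LogConfig.Type}}' "$cid")
    max_size=$(docker inspect --format '{{index .HostConfig.LogConfig.Config "max-size"}}' "$cid")
    if [ "$driver" = "json-file" ] && [ -z "$max_size" ]; then
      docker inspect --format '{{.Name}}' "$cid"
    fi
  done
}

# Usage: list_unbounded_containers
```

Anything it prints was created before the limits were in place and will keep growing until it is recreated.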

What happens if I truncate a log file while Docker is running? truncate -s 0 /path/to/container.log frees disk space immediately. Docker keeps the file handle open and continues writing. The next write goes to offset 0. This is safe as an emergency fix. The permanent fix is rotation configuration.

Can I set different log limits for different services in Compose? Yes. Per-service logging: blocks override the global daemon.json defaults. See the "Per-Container Override in Compose" section above.

What is the difference between the json-file and local log drivers? local is a newer driver that uses a more efficient binary format, rotates by default (20MB per file, 5 files), and compresses rotated files automatically. docker logs works with it, but the on-disk format is internal to Docker, so external tools that read the log files directly can't parse it. Use json-file with rotation when something on the host reads those files; it is more portable and compatible with log shipping tools.


Related reading: Linux Log Analysis: How to Debug Issues Like a Senior Engineer (once logs are centralized, how to query them effectively). Linux Debugging Tools Every Engineer Should Know (a broader debugging toolkit).