Docker Ate My Disk: Fixing Log Rotation Before It Kills Production

How a single verbose container filled a 500GB disk in 72 hours, and the exact daemon.json config that stops it from ever happening again.

October 3, 2024·4 min read·Damon

3am, Disk Full, Everything Down

The alert came in at 3:17am: disk utilization at 100%, multiple services failing. SSH into the host:

df -h
# Filesystem      Size  Used Avail Use% Mounted on
# /dev/sda1       500G  500G     0 100% /

Find the culprit:

du -sh /* 2>/dev/null | sort -rh | head -10
# 487G  /var
du -sh /var/* 2>/dev/null | sort -rh | head -5
# 487G  /var/lib/docker
du -sh /var/lib/docker/* 2>/dev/null | sort -rh | head -5
# 484G  /var/lib/docker/containers

There it was. Docker container logs — unrotated, unbounded, growing forever.

ls -lah /var/lib/docker/containers/a3f9b1c.../
# -rw-r----- 1 root root 484G Oct 3 03:17 a3f9b1c...-json.log

484 gigabytes. One log file. One container. 72 hours of verbose output with no rotation configured.
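
That's a remarkable sustained write rate. A quick back-of-envelope check in shell (integer arithmetic, so the result is rounded down):

```shell
#!/bin/sh
# Rough growth rate: 484 GB written over 72 hours
BYTES=$((484 * 1024 * 1024 * 1024))
HOURS=72
echo "$((BYTES / HOURS / 1024 / 1024)) MB/hour"   # ~6.7 GB/hour, just under 2 MB/s sustained
```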

The Emergency Fix

You can't just rm the file while Docker holds it open — the name disappears, but the daemon keeps writing to the open file handle, so the space isn't freed until the container stops. The right move:

# Truncate the file (Docker keeps the handle, disk space is freed immediately)
truncate -s 0 /var/lib/docker/containers/<container-id>/<container-id>-json.log

Services came back up within seconds. But this was a symptom, not the problem.
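
If it isn't obvious which container is the offender, you can rank every container log by size before reaching for truncate. A minimal sketch — it assumes Docker's default data root, and accepts an alternate root as the first argument:

```shell
#!/bin/sh
# Rank container log files by size, largest first.
# Defaults to Docker's standard data root; pass an alternate root as $1.
LOG_ROOT="${1:-/var/lib/docker/containers}"
find "$LOG_ROOT" -name '*-json.log' -exec du -h {} + 2>/dev/null | sort -rh | head -5
```

The directory name each log lives in is the full container ID, which you can feed straight to docker inspect.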

Why This Happens

By default, Docker's json-file logging driver has no size limit and no rotation. Every byte your container writes to stdout/stderr goes into that file and stays there forever. On a verbose app, that's a disaster.

The dangerous default:

{
  "log-driver": "json-file"
}

That's it. No max-size. No max-file. No expiry. Pure chaos at scale.

The Permanent Fix: daemon.json

Edit /etc/docker/daemon.json:

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "50m",
    "max-file": "5",
    "compress": "true"
  }
}

This sets the default for every container created on the host from this point on:

  • max-size: 50m — rotate when the log hits 50MB
  • max-file: 5 — keep 5 rotated files (250MB max total per container)
  • compress: true — gzip rotated files to save space

Restart Docker to apply:

systemctl restart docker

Warning: This restarts all running containers (unless live-restore is enabled), and the new defaults only apply to containers created after the restart — existing containers keep their old logging config until they're recreated. Do this during a maintenance window or roll it out carefully.
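
A malformed daemon.json stops dockerd from starting at all, so it's worth a syntax check before the restart. A small sketch, assuming python3 is available on the host:

```shell
#!/bin/sh
# Validate a daemon.json file before restarting the daemon.
# Returns nonzero (and prints a warning) when the JSON is malformed.
check_daemon_json() {
  file="${1:-/etc/docker/daemon.json}"
  if python3 -m json.tool "$file" > /dev/null 2>&1; then
    echo "daemon.json OK"
  else
    echo "daemon.json invalid; not restarting" >&2
    return 1
  fi
}

# Usage: check_daemon_json && systemctl restart docker
```

Chaining with && means the restart only happens when the file actually parses.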

Per-Container Override in Compose

For containers that need different limits, set it per-service in docker-compose.yml:

services:
  app:
    image: my-app:latest
    logging:
      driver: json-file
      options:
        max-size: "100m"
        max-file: "10"
        compress: "true"

  debug-service:
    image: my-debug:latest
    logging:
      driver: json-file
      options:
        max-size: "200m"   # verbose in dev, generous limit
        max-file: "3"
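
If most services share the same limits, Compose's extension fields plus a YAML anchor keep the logging block in one place (service names here are illustrative):

```yaml
x-logging: &default-logging
  driver: json-file
  options:
    max-size: "50m"
    max-file: "5"

services:
  app:
    image: my-app:latest
    logging: *default-logging
  worker:
    image: my-worker:latest
    logging: *default-logging
```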

Monitoring Disk Usage by Container

Add this to your monitoring toolkit:

#!/bin/bash
# check-docker-logs.sh
# Alerts when any container log exceeds threshold

THRESHOLD_GB=5

find /var/lib/docker/containers -name '*-json.log' | while read -r logfile; do
  # du -BG rounds up to whole gigabytes, which is fine for alerting
  size_gb=$(du -BG "$logfile" | awk '{print $1}' | tr -d 'G')
  container_id=$(echo "$logfile" | cut -d'/' -f6 | cut -c1-12)

  if [ "$size_gb" -gt "$THRESHOLD_GB" ]; then
    echo "ALERT: Container $container_id log is ${size_gb}GB"
    docker inspect --format='{{.Name}}' "$container_id" 2>/dev/null
  fi
done

Run it from cron every 30 minutes until you have proper observability set up.
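
Wiring it into cron is one line. The /etc/cron.d path and the script's install location below are assumptions; adjust to wherever you put the script:

```
# /etc/cron.d/check-docker-logs — runs every 30 minutes, tags output for syslog
*/30 * * * * root /usr/local/bin/check-docker-logs.sh 2>&1 | logger -t docker-log-check
```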

Consider a Centralized Logging Driver

For production, json-file with rotation is a band-aid. The real solution is shipping logs somewhere:

{
  "log-driver": "fluentd",
  "log-opts": {
    "fluentd-address": "localhost:24224",
    "fluentd-async": "true",
    "tag": "docker.{{.Name}}"
  }
}

Or use the loki driver if you're in the Grafana ecosystem. Central log aggregation means:

  • No local disk pressure from logs
  • Queryable log history across all containers
  • Retention policies enforced centrally
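
If you go the Loki route, the daemon.json shape is similar. This assumes the Grafana Loki driver plugin has already been installed (e.g. docker plugin install grafana/loki-docker-driver --alias loki) and that Loki is listening on its default port:

```json
{
  "log-driver": "loki",
  "log-opts": {
    "loki-url": "http://localhost:3100/loki/api/v1/push"
  }
}
```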

Quick Reference

Problem                          Fix
Log file too large right now     truncate -s 0 /path/to/container.log
Global log limits                Edit /etc/docker/daemon.json
Per-container limits             Use the logging: block in compose
Ongoing monitoring               Script + cron, or Prometheus node exporter

Don't wait for 3am to learn this one.