Getting NVIDIA GPU access inside Docker containers is more finicky than it should be. The configuration has changed across Docker versions, NVIDIA driver versions, and Compose file formats — and the old ways of doing it still show up in tutorials that are now wrong. Here's the current correct approach.
Prerequisites — check these first
```bash
# Verify NVIDIA drivers are installed on the host:
nvidia-smi

# Verify the NVIDIA Container Toolkit is installed:
nvidia-ctk --version

# If nvidia-ctk is not installed:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update && sudo apt install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```
If nvidia-smi fails — the problem is your host drivers, not Docker. Fix the host drivers first before debugging Docker GPU config.
The current correct config (Docker Compose v3+)
```yaml
# docker-compose.yml — correct approach for 2024+:
services:
  inference:
    image: ollama/ollama:latest
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "127.0.0.1:11434:11434"

volumes:
  ollama_data:
```
The deploy.resources.reservations.devices block is the current correct way to request GPU access. Use count: all to use all available GPUs, count: 1 to reserve exactly one, or device_ids: ["0"] for a specific GPU by index. Note that count and device_ids are mutually exclusive — set one or the other, not both.
The old way — deprecated but still everywhere in tutorials
```yaml
# OLD — do not use this:
services:
  inference:
    image: ollama/ollama:latest
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
```
The runtime: nvidia approach worked in older Docker versions but is deprecated. The NVIDIA_VISIBLE_DEVICES environment variable approach still works, but it is the legacy method. Tutorials that use these were most likely written before 2022.
Conflict warning: Do not combine runtime: nvidia with deploy.resources.reservations.devices — they conflict. Pick one approach and use it consistently. The deploy.resources approach is preferred.
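As a quick sanity check, you can scan a compose file for the conflicting combination before deploying. This is a minimal sketch — the function name and the line-based matching are my own; a real auditor would parse the YAML properly and check that `runtime: nvidia` and the device reservation sit under the same service:

```python
import re

def find_gpu_config_conflict(compose_text: str) -> bool:
    """Return True if a compose file mixes the deprecated `runtime: nvidia`
    with the modern deploy.resources.reservations.devices block."""
    # Deprecated per-service runtime selection:
    has_legacy_runtime = bool(
        re.search(r"^\s*runtime:\s*nvidia\s*$", compose_text, re.MULTILINE)
    )
    # Rough signal for the modern device-reservation block:
    has_device_reservation = (
        "deploy:" in compose_text and "driver: nvidia" in compose_text
    )
    return has_legacy_runtime and has_device_reservation

conflicting = """\
services:
  inference:
    image: ollama/ollama:latest
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
"""
print(find_gpu_config_conflict(conflicting))  # True
```

If this returns True for your file, delete the `runtime: nvidia` line and keep the deploy.resources block.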
Verify GPU access inside the container
```bash
# Test GPU access:
docker run --rm --gpus all nvidia/cuda:12.0-base nvidia-smi

# If using Compose, exec into the running container:
docker compose exec inference nvidia-smi
```
If nvidia-smi works inside the container — GPU access is configured correctly. If it fails with "no devices found" — check the Container Toolkit installation. If it fails with "command not found" — the container image doesn't have NVIDIA tools installed (that's fine for inference images that use CUDA without the CLI tools).
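The triage above can be captured in a small helper that maps failure output to a likely cause. The function name and return labels are mine, and the substring matching is intentionally loose — treat it as a sketch of the decision logic, not a robust parser of Docker or driver error messages:

```python
def diagnose_nvidia_smi_failure(output: str) -> str:
    """Map common `nvidia-smi` failure text (captured from inside
    a container) to the likely cause described above."""
    text = output.lower()
    if "no devices" in text or "could not select device driver" in text:
        return "container-toolkit"   # toolkit missing or not configured
    if "command not found" in text or "executable file not found" in text:
        return "image"               # image lacks nvidia-smi; often fine
    if "couldn't communicate with the nvidia driver" in text:
        return "host-driver"         # fix host drivers before Docker
    return "unknown"

print(diagnose_nvidia_smi_failure("bash: nvidia-smi: command not found"))  # image
```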
Common errors and fixes
Error: could not select device driver with capabilities: [[gpu]]
```bash
# The NVIDIA Container Toolkit is not configured for Docker:
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Verify the runtime is registered:
docker info | grep -i runtime
```
Error: unknown flag: --gpus
```bash
# Docker version is too old — update Docker:
docker --version   # need 19.03+

# Install the latest Docker:
curl -fsSL https://get.docker.com | sh
```
Error: NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver
```bash
# Host driver issue — check driver status:
sudo dkms status
sudo nvidia-smi

# If host nvidia-smi fails, reinstall the host drivers first
```
Specific GPU selection
```yaml
# Use all GPUs:
devices:
  - driver: nvidia
    count: all
    capabilities: [gpu]

# Use exactly 1 GPU:
devices:
  - driver: nvidia
    count: 1
    capabilities: [gpu]

# Use a specific GPU by ID:
devices:
  - driver: nvidia
    device_ids: ["0"]
    capabilities: [gpu]

# Use a specific GPU by UUID:
devices:
  - driver: nvidia
    device_ids: ["GPU-abc123"]
    capabilities: [gpu]
```
```bash
# Find your GPU IDs:
nvidia-smi -L
```
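If you want to feed those IDs into tooling rather than copy them by hand, the output of nvidia-smi -L can be parsed programmatically. A minimal sketch, assuming the usual `GPU <index>: <name> (UUID: GPU-...)` line format — the function name and regex are my own, and the exact format can vary across driver versions:

```python
import re

# `nvidia-smi -L` typically prints lines like:
#   GPU 0: NVIDIA GeForce RTX 3090 (UUID: GPU-abc123)
LINE = re.compile(r"^GPU (\d+): (.+) \(UUID: (GPU-[0-9a-fA-F-]+)\)$")

def parse_gpu_list(output: str):
    """Return (index, name, uuid) tuples parsed from `nvidia-smi -L` output."""
    gpus = []
    for line in output.strip().splitlines():
        m = LINE.match(line.strip())
        if m:
            gpus.append((int(m.group(1)), m.group(2), m.group(3)))
    return gpus

sample = "GPU 0: NVIDIA GeForce RTX 3090 (UUID: GPU-abc123)\n"
print(parse_gpu_list(sample))  # [(0, 'NVIDIA GeForce RTX 3090', 'GPU-abc123')]
```

The UUID (third field) is what you'd drop into device_ids for a pinned, index-stable GPU assignment.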
Ollama-specific config
```yaml
services:
  ollama:
    image: ollama/ollama:latest
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "127.0.0.1:11434:11434"
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s

volumes:
  ollama_data:
```