A DevOps Technical Deep Dive
The Problem Statement
We encountered an issue with the Open WebUI Docker container (ghcr.io/open-webui/open-webui:main) failing to automatically start after system reboots, despite having the --restart always flag configured. The container would show a “health: starting” status but fail to fully initialize.
Initial Container State
Listing the containers showed it running but still in the health check's starting phase:
docker container ls
CONTAINER ID IMAGE COMMAND STATUS PORTS NAMES
a0755c947063 ghcr.io/open-webui/open-webui:main "bash start.sh" Up Less than a second (health: starting) 0.0.0.0:3000->8080/tcp open-webui
Diagnostic Process
1. Health Check Analysis
docker inspect open-webui | grep -A 10 "Healthcheck"
Output revealed the health check configuration:
"Healthcheck": {
"Test": [
"CMD-SHELL",
"curl --silent --fail http://localhost:${PORT:-8080}/health | jq -ne 'input.status == true' || exit 1"
],
"Interval": 30000000000,
"Timeout": 10000000000,
"StartPeriod": 30000000000,
"Retries": 3
}
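The Interval, Timeout, and StartPeriod values in this output are durations expressed in nanoseconds, so 30000000000 corresponds to 30 seconds. As a small illustrative sketch (the helper name ns_to_seconds is made up here and is not part of Docker), the conversion can be done with shell integer arithmetic:

```shell
# Hypothetical helper: convert a nanosecond duration from `docker inspect`
# into whole seconds using integer arithmetic.
ns_to_seconds() {
  echo $(( $1 / 1000000000 ))
}

ns_to_seconds 30000000000   # Interval and StartPeriod
ns_to_seconds 10000000000   # Timeout
```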
2. Network Connectivity Check
docker exec open-webui netstat -tulpn | grep 8080
Output confirmed process binding:
tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN 1/python
3. Container Configuration Analysis
Full inspection output revealed crucial details:
docker inspect open-webui
Key findings:
{
"RestartPolicy": {
"Name": "always",
"MaximumRetryCount": 0
},
"Mounts": [],
"Binds": null,
"NetworkMode": "bridge",
"PortBindings": {
"8080/tcp": [
{
"HostIp": "",
"HostPort": "3000"
}
]
}
}
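The two findings that matter most here are the restart policy and the empty mount list. Assuming the container is still named open-webui, docker inspect --format can pull out just those fields instead of the full JSON. The sketch below wraps the mount check in a hypothetical helper, mounts_state, which classifies the JSON printed for .Mounts:

```shell
# Hypothetical helper: classify the JSON that
#   docker inspect --format '{{json .Mounts}}' open-webui
# prints. An empty array (or null) means nothing is mounted.
mounts_state() {
  case "$1" in
    "[]" | "null" | "") echo "no-mounts" ;;
    *) echo "has-mounts" ;;
  esac
}

# Example usage on a host with Docker (not executed here):
#   docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' open-webui
#   mounts_state "$(docker inspect --format '{{json .Mounts}}' open-webui)"
mounts_state "[]"
```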
4. Environment Variables Check
"Env": [
"PORT=8080",
"ENV=prod",
"USE_OLLAMA_DOCKER=false",
"USE_CUDA_DOCKER=false",
"OLLAMA_BASE_URL=/ollama",
"WEBUI_BUILD_VERSION=7228b39064ac28e1240bf8998f2a35535c6f7ef5",
"DOCKER=true"
]
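The health check command builds its URL with ${PORT:-8080}, which is standard shell parameter expansion: it uses the value of PORT when the variable is set and non-empty, and falls back to 8080 otherwise. With PORT=8080 in the environment above, both paths resolve to the same URL. A small sketch of that behavior (the function name health_url is illustrative only):

```shell
# Illustrative helper: build the health check URL the same way the
# container's health check does.
health_url() {
  # ${PORT:-8080} expands to $PORT if set and non-empty, else to 8080
  echo "http://localhost:${PORT:-8080}/health"
}

unset PORT
health_url        # falls back to port 8080
PORT=9090
health_url        # uses the value of PORT
```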
Technical Analysis
- Network Configuration
- Bridge network mode with IP: 172.17.0.2
- Port mapping: 3000->8080 (both IPv4 and IPv6)
- Health check endpoint: localhost:8080/health
- Container Health Metrics
- 30s health check interval
- 10s timeout
- 30s start period
- 3 retry attempts
- Critical Issues Identified
- No persistent volume mounts
- Potential race condition during system startup
- Data stored only in the container's writable layer, so it is lost when the container is removed or recreated
- No defined startup dependencies
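The health check settings listed above also bound how long Docker waits before it marks the container unhealthy. The following is only a rough estimate, assuming each failing probe runs to its full timeout and that failures during the start period do not count toward the retry limit; the function name unhealthy_after is hypothetical:

```shell
# Values taken from the health check configuration, in seconds.
START_PERIOD=30
INTERVAL=30
TIMEOUT=10
RETRIES=3

# Rough upper estimate of the time until the container is marked unhealthy,
# assuming every probe runs to its full timeout.
unhealthy_after() {
  echo $(( START_PERIOD + RETRIES * (INTERVAL + TIMEOUT) ))
}

unhealthy_after   # 30 + 3 * (30 + 10) = 150
```

Under these assumptions, a container that never becomes ready can sit in “health: starting” for a couple of minutes before its status changes.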
Implementation of Solution
1. Create Persistent Storage
docker volume create open-webui-data
2. Container Recreation with Enhanced Configuration
# Stop and remove existing container
docker stop open-webui
docker rm open-webui
# Deploy with persistent storage and optimized configuration
docker run -d \
--name open-webui \
--restart always \
-p 3000:8080 \
-e PORT=8080 \
-v open-webui-data:/app/backend/data \
--health-start-period=30s \
--health-interval=30s \
--health-timeout=10s \
--health-retries=3 \
ghcr.io/open-webui/open-webui:main
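After recreating the container, it can be useful to wait until the health check actually passes rather than checking by hand. The sketch below polls the health status that docker inspect reports; the helper names get_health and wait_healthy, the default of 20 attempts, and the 5 second delay are all assumptions made for illustration.

```shell
# Hypothetical helpers, not part of Docker or Open WebUI.

# Print the container's health status ("starting", "healthy", "unhealthy").
get_health() {
  docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null
}

# Poll until the container is healthy, or give up after N attempts.
# Usage: wait_healthy NAME [ATTEMPTS] [DELAY_SECONDS]
wait_healthy() {
  name="$1"
  attempts="${2:-20}"
  delay="${3:-5}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if [ "$(get_health "$name")" = "healthy" ]; then
      return 0
    fi
    sleep "$delay"
    i=$(( i + 1 ))
  done
  return 1
}

# Example (on a host with Docker):
#   wait_healthy open-webui 20 5 && echo "open-webui is healthy"
```

Keeping the status lookup in its own function makes the polling loop easy to exercise without a running container.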
3. System-Level Configuration
# Enable Docker service
sudo systemctl enable docker
# Verify service dependencies
systemctl list-dependencies docker.service
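# Optional: Add startup delay
# Note: docker update has no --restart-delay option (that flag belongs to
# docker service in Swarm mode). For a standalone container, add the delay
# at the systemd level instead (see Startup Management under DevOps Best Practices).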
Technical Considerations
Volume Management
# Verify volume creation
docker volume ls | grep open-webui-data
# Inspect volume details
docker volume inspect open-webui-data
# Monitor volume usage
docker system df -v
Network Diagnostics
# Check container networking
docker network inspect bridge
# Verify port bindings
docker port open-webui
# Monitor container network metrics
docker stats open-webui --no-stream
Health Check Monitoring
# Monitor health check logs
docker inspect --format "{{json .State.Health }}" open-webui | jq .
# Watch container events
docker events --filter container=open-webui --filter type=container
DevOps Best Practices
1. Container Configuration
- Use named volumes for data persistence
- Implement comprehensive health checks
- Define appropriate resource limits
- Configure logging drivers
2. Monitoring Setup
# Set up container logging
docker run ... --log-driver json-file --log-opt max-size=10m --log-opt max-file=3
# Monitor container metrics
docker stats open-webui
# Watch container events
docker events --filter container=open-webui
3. Startup Management
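# Create systemd override
sudo systemctl edit docker.service
# Content (ExecStartPre runs before dockerd starts, so the delay takes effect
# before any container is restarted; ExecStartPost would only run afterwards):
[Service]
ExecStartPre=/bin/sleep 10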
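systemctl edit opens an interactive editor, which is awkward in provisioning scripts. The same override can be written directly as a drop-in file; systemctl edit docker.service normally stores it at /etc/systemd/system/docker.service.d/override.conf. The helper below is a hypothetical sketch that takes the drop-in directory and the directive line as arguments:

```shell
# Hypothetical helper: write a [Service] override as a systemd drop-in.
# Usage: write_override DROPIN_DIR DIRECTIVE
write_override() {
  mkdir -p "$1"
  printf '[Service]\n%s\n' "$2" > "$1/override.conf"
}

# Example (as root; DIRECTIVE holds the line from the override above):
#   write_override /etc/systemd/system/docker.service.d "$DIRECTIVE"
#   systemctl daemon-reload
```

The daemon-reload step is needed so that systemd picks up the new drop-in before the next boot or service restart.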
4. Backup Considerations
# Volume backup
docker run --rm -v open-webui-data:/data -v /backup:/backup \
ubuntu tar czf /backup/open-webui-backup.tar.gz -C /data .
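The backup command relies on tar czf ARCHIVE -C DIR ., which archives the contents of DIR with relative paths, so the matching restore is tar xzf ARCHIVE -C DIR. The sketch below shows that pair as two hypothetical helpers operating on plain directories. Inside the throwaway container used above, the source directory would be /data and the archive would live under /backup.

```shell
# Hypothetical helpers showing the tar pair behind the volume backup.

# Archive the contents of a directory (relative paths, gzip-compressed).
# Usage: backup_dir SOURCE_DIR ARCHIVE
backup_dir() {
  tar czf "$2" -C "$1" .
}

# Unpack an archive created by backup_dir into a target directory.
# Usage: restore_dir ARCHIVE TARGET_DIR
restore_dir() {
  mkdir -p "$2"
  tar xzf "$1" -C "$2"
}

# Restore into the volume with the same container pattern (not run here):
#   docker run --rm -v open-webui-data:/data -v /backup:/backup \
#     ubuntu tar xzf /backup/open-webui-backup.tar.gz -C /data
```

Stopping the container before a restore is a sensible precaution, since the application may otherwise be writing to the same files.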
Troubleshooting Checklist
- [x] Verify container status and health check
- [x] Check network connectivity and port bindings
- [x] Inspect volume mounts and data persistence
- [x] Review Docker service configuration
- [x] Monitor system logs for startup issues
- [x] Validate environment variables
- [x] Test application endpoints
- [x] Verify resource availability
Monitoring Recommendations
- Set up container monitoring (Prometheus/Grafana)
- Configure alerting for health check failures
- Monitor volume usage and cleanup
- Track container restart counts
- Log aggregation setup
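For tracking restart counts without a full monitoring stack, docker inspect exposes a RestartCount field per container. The sketch below turns that number into a simple ok/alert signal; the helper name restart_status and the default threshold of 3 are arbitrary choices for illustration.

```shell
# Hypothetical helper: decide whether a restart count warrants attention.
# Usage: restart_status COUNT [THRESHOLD]
restart_status() {
  count="$1"
  threshold="${2:-3}"   # example threshold, adjust to taste
  if [ "$count" -ge "$threshold" ]; then
    echo "alert"
  else
    echo "ok"
  fi
}

# Example (on a host with Docker):
#   restart_status "$(docker inspect --format '{{.RestartCount}}' open-webui)"
```

A cron job or systemd timer could run this check periodically and forward any "alert" result to whatever notification channel is already in place.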
Conclusion
Through systematic debugging and implementation of DevOps best practices, we resolved the Open WebUI container auto-start issues. The solution combines a named volume for persistent data, explicit health check settings, and system-level configuration that ensures the Docker service is enabled at boot.
This case demonstrates the importance of thorough diagnostics and the implementation of Docker best practices in a production environment. For mission-critical applications, consider implementing additional monitoring and alerting solutions to proactively identify similar issues.