# Production Deployment Guide

Complete guide for running the Movie Scheduler in production environments.

## Table of Contents

1. [Prerequisites](#prerequisites)
2. [Deployment Options](#deployment-options)
3. [Installation Methods](#installation-methods)
4. [Configuration](#configuration)
5. [Security](#security)
6. [Monitoring](#monitoring)
7. [Backup & Recovery](#backup--recovery)
8. [Maintenance](#maintenance)
9. [Troubleshooting](#troubleshooting)
10. [Performance Optimization](#performance-optimization)
11. [Production Checklist](#production-checklist)
12. [Support & Updates](#support--updates)

---

## Prerequisites

### Hardware Requirements

**Minimum:**

- CPU: 4 cores (for whisper.cpp and encoding)
- RAM: 4GB
- Storage: 100GB+ (depends on video library size)
- GPU: Intel/AMD with VAAPI support (optional but recommended)

**Recommended:**

- CPU: 8+ cores
- RAM: 8GB+
- Storage: 500GB+ SSD
- GPU: Modern Intel/AMD GPU with VAAPI

### Software Requirements

- **OS**: Linux (Ubuntu 20.04+, Debian 11+, RHEL 8+, or compatible)
- **Python**: 3.7+
- **FFmpeg**: With VAAPI support
- **whisper.cpp**: Compiled and in PATH
- **Network**: Stable connection to NocoDB and RTMP server

---

## Deployment Options

### Option 1: Systemd Service (Recommended for bare metal)

✅ Direct hardware access (best VAAPI performance)
✅ Low overhead
✅ System integration
❌ Manual dependency management

### Option 2: Docker Container (Recommended for most users)

✅ Isolated environment
✅ Easy updates
✅ Portable configuration
⚠️ Slight performance overhead
⚠️ Requires GPU passthrough for VAAPI

### Option 3: Kubernetes/Orchestration

✅ High availability
✅ Auto-scaling
✅ Cloud-native
❌ Complex setup
❌ Overkill for single-instance deployment

---

## Installation Methods

### Method 1: Systemd Service Installation

#### 1. Create Scheduler User

```bash
# Create dedicated user for security
sudo useradd -r -s /bin/bash -d /opt/scheduler -m scheduler

# Add to video group for GPU access
sudo usermod -aG video,render scheduler
```
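Group changes only take effect in new sessions, so it is worth confirming the membership stuck before continuing. A minimal check (the fallback message is only illustrative):

```bash
# Confirm the scheduler user picked up the GPU group
id -nG scheduler 2>/dev/null | tr ' ' '\n' | grep -qx video \
    && echo "scheduler is in the video group" \
    || echo "video group missing -- re-run usermod"
```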
#### 2. Install Dependencies

```bash
# Install system packages
sudo apt-get update
sudo apt-get install -y python3 python3-pip python3-venv ffmpeg git build-essential

# Install whisper.cpp
sudo -u scheduler git clone https://github.com/ggerganov/whisper.cpp.git /tmp/whisper.cpp
cd /tmp/whisper.cpp
make
sudo cp main /usr/local/bin/whisper.cpp
sudo chmod +x /usr/local/bin/whisper.cpp

# Download whisper model
sudo mkdir -p /opt/models
cd /opt/models
sudo wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin
# Rename to match the WHISPER_MODEL path configured below
sudo mv ggml-base.en.bin ggml-base.bin
sudo chown -R scheduler:scheduler /opt/models
```

#### 3. Deploy Application

```bash
# Create application directory
sudo mkdir -p /opt/scheduler
sudo chown scheduler:scheduler /opt/scheduler

# Copy application files
sudo -u scheduler cp agent.py /opt/scheduler/
sudo -u scheduler cp requirements.txt /opt/scheduler/

# Create Python virtual environment
sudo -u scheduler python3 -m venv /opt/scheduler/venv
sudo -u scheduler /opt/scheduler/venv/bin/pip install -r /opt/scheduler/requirements.txt
```

#### 4. Configure Storage

```bash
# Create storage directories (adjust paths as needed)
sudo mkdir -p /mnt/storage/raw_movies
sudo mkdir -p /mnt/storage/final_movies
sudo chown -R scheduler:scheduler /mnt/storage
```

#### 5. Configure Service

```bash
# Copy service file
sudo cp scheduler.service /etc/systemd/system/

# Create environment file with secrets
sudo mkdir -p /etc/scheduler
sudo nano /etc/scheduler/scheduler.env
```

Edit `/etc/scheduler/scheduler.env`:

```bash
NOCODB_URL=https://your-nocodb.com/api/v2/tables/YOUR_TABLE_ID/records
NOCODB_TOKEN=your_production_token
RTMP_SERVER=rtmp://your-rtmp-server.com/live/stream
RAW_DIR=/mnt/storage/raw_movies
FINAL_DIR=/mnt/storage/final_movies
WHISPER_MODEL=/opt/models/ggml-base.bin
```

Update `scheduler.service` to use the environment file:

```ini
# Replace Environment= lines with:
EnvironmentFile=/etc/scheduler/scheduler.env
```
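Before the first start, it is worth checking that the environment file is complete. A small sketch (the variable list matches the file above; adjust it if your build uses different names):

```bash
# check_env FILE -- report any required variable missing from FILE
check_env() {
    rc=0
    for var in NOCODB_URL NOCODB_TOKEN RTMP_SERVER RAW_DIR FINAL_DIR WHISPER_MODEL; do
        grep -q "^${var}=" "$1" 2>/dev/null || { echo "MISSING: $var"; rc=1; }
    done
    return $rc
}

# check_env /etc/scheduler/scheduler.env && echo "env file looks complete"
```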
#### 6. Enable and Start Service

```bash
# Reload systemd
sudo systemctl daemon-reload

# Enable service (start on boot)
sudo systemctl enable scheduler

# Start service
sudo systemctl start scheduler

# Check status
sudo systemctl status scheduler

# View logs
sudo journalctl -u scheduler -f
```

---

### Method 2: Docker Deployment

#### 1. Prepare Environment

```bash
# Create project directory
mkdir -p /opt/scheduler
cd /opt/scheduler

# Copy application files
cp agent.py requirements.txt Dockerfile docker-compose.prod.yml ./

# Create production environment file
cp .env.production.example .env.production
nano .env.production  # Fill in your values
```

#### 2. Configure Storage

```bash
# Ensure storage directories exist
mkdir -p /mnt/storage/raw_movies
mkdir -p /mnt/storage/final_movies

# Download whisper model
mkdir -p /opt/models
cd /opt/models
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin
mv ggml-base.en.bin ggml-base.bin
```

#### 3. Deploy Container

```bash
cd /opt/scheduler

# Build image
docker compose -f docker-compose.prod.yml build

# Start service
docker compose -f docker-compose.prod.yml up -d

# Check logs
docker compose -f docker-compose.prod.yml logs -f

# Check status
docker compose -f docker-compose.prod.yml ps
```
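Since the container exposes a healthcheck (see the `State.Health` query under Monitoring), a short poll loop can gate follow-up steps on the container actually coming up. A sketch, assuming the container name `movie_scheduler` used elsewhere in this guide:

```bash
# wait_healthy NAME [TRIES] -- poll until the container reports healthy
wait_healthy() {
    tries=${2:-12}
    i=1
    while [ "$i" -le "$tries" ]; do
        s=$(docker inspect -f '{{.State.Health.Status}}' "$1" 2>/dev/null)
        [ "$s" = "healthy" ] && { echo healthy; return 0; }
        [ "$i" -lt "$tries" ] && sleep 5
        i=$((i + 1))
    done
    echo "${s:-unknown}"
    return 1
}

# wait_healthy movie_scheduler
```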
#### 4. Enable Auto-Start

```bash
# Create systemd service for docker compose
sudo nano /etc/systemd/system/scheduler-docker.service
```

```ini
[Unit]
Description=Movie Scheduler (Docker)
Requires=docker.service
After=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=/opt/scheduler
ExecStart=/usr/bin/docker compose -f docker-compose.prod.yml up -d
ExecStop=/usr/bin/docker compose -f docker-compose.prod.yml down
TimeoutStartSec=0

[Install]
WantedBy=multi-user.target
```

```bash
sudo systemctl daemon-reload
sudo systemctl enable scheduler-docker
```

---

## Configuration

### Essential Configuration

#### NocoDB Connection

```bash
# Get your table ID from NocoDB URL
# https://nocodb.com/nc/YOUR_BASE_ID/table_NAME
# API endpoint: https://nocodb.com/api/v2/tables/TABLE_ID/records

# Generate API token in NocoDB:
# Account Settings → Tokens → Create Token
```

#### RTMP Server

```bash
# For nginx-rtmp:
RTMP_SERVER=rtmp://your-server.com:1935/live/stream

# For other RTMP servers, use their endpoint format
```

#### Storage Paths

```bash
# Use separate fast storage for final videos (streaming)
RAW_DIR=/mnt/storage/raw_movies     # Can be slower storage
FINAL_DIR=/mnt/fast-storage/final   # Should be fast SSD

# Ensure proper permissions
chown -R scheduler:scheduler /mnt/storage
chmod 755 /mnt/storage/raw_movies
chmod 755 /mnt/fast-storage/final
```

### Performance Tuning

#### Sync Intervals

```bash
# High-load scenario (many jobs, frequent updates)
NOCODB_SYNC_INTERVAL_SECONDS=120     # Check less often
WATCHDOG_CHECK_INTERVAL_SECONDS=15   # Check streams less often

# Low-latency scenario (need fast response)
NOCODB_SYNC_INTERVAL_SECONDS=30
WATCHDOG_CHECK_INTERVAL_SECONDS=5

# Default (balanced)
NOCODB_SYNC_INTERVAL_SECONDS=60
WATCHDOG_CHECK_INTERVAL_SECONDS=10
```

#### FFmpeg VAAPI

```bash
# Find your VAAPI device
ls -la /dev/dri/

# Common devices:
# renderD128 - Primary GPU
# renderD129 - Secondary GPU
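# If no render devices show up, install your GPU's VAAPI driver package
# (e.g. intel-media-va-driver or mesa-va-drivers), then check what the
# driver actually exposes with vainfo (vainfo / libva-utils package):
vainfo 2>/dev/null | grep -i entrypoint || echo "no VAAPI entrypoints found"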
# Test VAAPI
ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -i test.mp4 -f null -

# Set in config
VAAPI_DEVICE=/dev/dri/renderD128
```

---

## Security

### Secrets Management

**DO NOT hardcode secrets in files tracked by git!**

#### Option 1: Environment Files (Simple)

```bash
# Store secrets in protected file
sudo nano /etc/scheduler/scheduler.env
sudo chmod 600 /etc/scheduler/scheduler.env
sudo chown scheduler:scheduler /etc/scheduler/scheduler.env
```

#### Option 2: Secrets Management Tools

```bash
# Using Vault
export NOCODB_TOKEN=$(vault kv get -field=token secret/scheduler/nocodb)

# Using AWS Secrets Manager
export NOCODB_TOKEN=$(aws secretsmanager get-secret-value --secret-id scheduler/nocodb --query SecretString --output text)

# Using Docker Secrets (Swarm/Kubernetes)
# Mount secrets as files and read in application
```

### Filesystem Permissions

```bash
# Application directory
chown -R scheduler:scheduler /opt/scheduler
chmod 750 /opt/scheduler

# Storage directories
chown -R scheduler:scheduler /mnt/storage
chmod 755 /mnt/storage/raw_movies     # Scheduler only reads from here
chmod 755 /mnt/storage/final_movies   # Scheduler writes processed videos here

# Database file
chmod 600 /opt/scheduler/scheduler.db
chown scheduler:scheduler /opt/scheduler/scheduler.db
```

### Network Security

```bash
# Firewall rules (if scheduler runs on separate server)
# Only allow outbound connections to NocoDB and RTMP
sudo ufw allow out to YOUR_NOCODB_IP port 443    # HTTPS
sudo ufw allow out to YOUR_RTMP_IP port 1935     # RTMP
sudo ufw default deny outgoing                   # Deny all other outbound (optional)
```

### Regular Updates

```bash
# Update system packages weekly
sudo apt-get update && sudo apt-get upgrade

# Update Python dependencies
sudo -u scheduler /opt/scheduler/venv/bin/pip install --upgrade -r /opt/scheduler/requirements.txt

# Rebuild whisper.cpp quarterly (for performance improvements)
```

---

## Monitoring

### Service Health

#### Systemd Monitoring

```bash
# Check service status
systemctl status scheduler

# View recent logs
journalctl -u scheduler -n 100

# Follow logs in real-time
journalctl -u scheduler -f
# Check for errors in last hour
journalctl -u scheduler --since "1 hour ago" | grep ERROR

# Service restart count
systemctl show scheduler | grep NRestarts
```

#### Docker Monitoring

```bash
# Container status
docker compose -f docker-compose.prod.yml ps

# Resource usage
docker stats movie_scheduler

# Logs
docker compose -f docker-compose.prod.yml logs --tail=100 -f

# Health check status
docker inspect movie_scheduler | jq '.[0].State.Health'
```

### Database Monitoring

```bash
# Check job status
sqlite3 /opt/scheduler/scheduler.db "SELECT prep_status, play_status, COUNT(*) FROM jobs GROUP BY prep_status, play_status;"

# Active streams
sqlite3 /opt/scheduler/scheduler.db "SELECT nocodb_id, title, play_status, stream_retry_count FROM jobs WHERE play_status='streaming';"

# Failed jobs
sqlite3 /opt/scheduler/scheduler.db "SELECT nocodb_id, title, prep_status, play_status, log FROM jobs WHERE prep_status='failed' OR play_status='failed';"

# Database size
ls -lh /opt/scheduler/scheduler.db
```

### Automated Monitoring Script

Create `/opt/scheduler/monitor.sh`:

```bash
#!/bin/bash
LOG_FILE="/var/log/scheduler-monitor.log"
DB_PATH="/opt/scheduler/scheduler.db"

echo "=== Scheduler Monitor - $(date) ===" >> "$LOG_FILE"

# Check if service is running
if systemctl is-active --quiet scheduler; then
    echo "✓ Service is running" >> "$LOG_FILE"
else
    echo "✗ Service is DOWN" >> "$LOG_FILE"
    # Send alert (email, Slack, etc.)
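    # Hypothetical alert hook: POST to a Slack incoming webhook.
    # SLACK_WEBHOOK_URL is an assumed variable -- export it in the
    # environment (or hardcode your endpoint) to enable; left unset,
    # this is a no-op.
    if [ -n "${SLACK_WEBHOOK_URL:-}" ]; then
        curl -s -X POST -H 'Content-Type: application/json' \
            -d '{"text":"Movie scheduler service is DOWN"}' \
            "$SLACK_WEBHOOK_URL" > /dev/null
    fi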
    systemctl start scheduler
fi

# Check for failed jobs
FAILED=$(sqlite3 "$DB_PATH" "SELECT COUNT(*) FROM jobs WHERE prep_status='failed' OR play_status='failed';")
if [ "$FAILED" -gt 0 ]; then
    echo "⚠ Found $FAILED failed jobs" >> "$LOG_FILE"
    # Send alert
fi

# Check active streams
STREAMING=$(sqlite3 "$DB_PATH" "SELECT COUNT(*) FROM jobs WHERE play_status='streaming';")
echo "Active streams: $STREAMING" >> "$LOG_FILE"

# Check disk space
DISK_USAGE=$(df -h /mnt/storage | tail -1 | awk '{print $5}' | sed 's/%//')
if [ "$DISK_USAGE" -gt 90 ]; then
    echo "⚠ Disk usage is ${DISK_USAGE}%" >> "$LOG_FILE"
    # Send alert
fi

echo "" >> "$LOG_FILE"
```

```bash
# Make executable
chmod +x /opt/scheduler/monitor.sh

# Add to crontab (check every 5 minutes)
(crontab -l 2>/dev/null; echo "*/5 * * * * /opt/scheduler/monitor.sh") | crontab -
```

### External Monitoring

#### Prometheus + Grafana

Export metrics using node_exporter or a custom exporter:

```bash
# Install node_exporter for system metrics
# Create custom exporter for job metrics from database
# Set up Grafana dashboard
```

#### Uptime Monitoring

Use services like:

- UptimeRobot
- Pingdom
- Datadog

Monitor:

- Service availability
- RTMP server connectivity
- NocoDB API accessibility

---

## Backup & Recovery

### What to Backup

1. **Database** (scheduler.db) - Critical
2. **Configuration** (.env.production or /etc/scheduler/scheduler.env) - Critical
3. **Final videos** (if you want to keep processed videos)
4. **Logs** (optional, for forensics)

### Backup Script

Create `/opt/scheduler/backup.sh`:

```bash
#!/bin/bash
BACKUP_DIR="/backup/scheduler"
DATE=$(date +%Y%m%d_%H%M%S)
DB_PATH="/opt/scheduler/scheduler.db"
CONFIG_PATH="/etc/scheduler/scheduler.env"

# Create backup directory
mkdir -p "$BACKUP_DIR"

# Backup database
cp "$DB_PATH" "$BACKUP_DIR/scheduler_${DATE}.db"

# Backup config (careful with secrets!)
cp "$CONFIG_PATH" "$BACKUP_DIR/config_${DATE}.env"

# Compress old backups
find "$BACKUP_DIR" -name "*.db" -mtime +7 -exec gzip {} \;

# Delete backups older than 30 days
find "$BACKUP_DIR" -name "*.db.gz" -mtime +30 -delete
find "$BACKUP_DIR" -name "*.env" -mtime +30 -delete

# Optional: Upload to S3/cloud storage
# aws s3 sync "$BACKUP_DIR" s3://your-bucket/scheduler-backups/

echo "Backup completed: $BACKUP_DIR/scheduler_${DATE}.db"
```

```bash
# Make executable
chmod +x /opt/scheduler/backup.sh

# Run daily at 2 AM
(crontab -l 2>/dev/null; echo "0 2 * * * /opt/scheduler/backup.sh") | crontab -
```

### Recovery Procedure

#### 1. Restore from Backup

```bash
# Stop service
sudo systemctl stop scheduler

# Restore database
cp /backup/scheduler/scheduler_YYYYMMDD_HHMMSS.db /opt/scheduler/scheduler.db
chown scheduler:scheduler /opt/scheduler/scheduler.db

# Restore config if needed
cp /backup/scheduler/config_YYYYMMDD_HHMMSS.env /etc/scheduler/scheduler.env

# Start service
sudo systemctl start scheduler
```

#### 2. Disaster Recovery (Full Rebuild)

If the server is completely lost:

1. Provision new server
2. Follow installation steps above
3. Restore database and config from backup
4. Restart service
5. Verify jobs are picked up
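During a restore, a small helper saves hunting for the right file. A sketch, assuming the naming scheme from `backup.sh` above (`scheduler_YYYYMMDD_HHMMSS.db`, possibly gzipped):

```bash
# latest_backup DIR -- print the newest database backup in DIR
latest_backup() {
    ls -1t "$1"/scheduler_*.db* 2>/dev/null | head -n 1
}

# Example: latest_backup /backup/scheduler
```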
**Recovery Time Objective (RTO):** 30-60 minutes
**Recovery Point Objective (RPO):** Up to 24 hours (with daily backups)

---

## Maintenance

### Routine Tasks

#### Daily

- ✓ Check service status
- ✓ Review error logs
- ✓ Check failed jobs

#### Weekly

- ✓ Review disk space
- ✓ Check database size
- ✓ Clean up old processed videos (if not needed)

#### Monthly

- ✓ Update system packages
- ✓ Review and optimize database
- ✓ Test backup restoration
- ✓ Review and rotate logs

### Database Maintenance

```bash
# Vacuum database (reclaim space, optimize)
sqlite3 /opt/scheduler/scheduler.db "VACUUM;"

# Analyze database (update statistics)
sqlite3 /opt/scheduler/scheduler.db "ANALYZE;"

# Clean up old completed jobs (optional)
sqlite3 /opt/scheduler/scheduler.db "DELETE FROM jobs WHERE play_status='done' AND datetime(run_at) < datetime('now', '-30 days');"
```

### Log Rotation

For systemd (automatic via journald):

```bash
# Configure in /etc/systemd/journald.conf (under the [Journal] section)
SystemMaxUse=1G
RuntimeMaxUse=100M
```

For Docker:

```yaml
# Already configured in docker-compose.prod.yml
logging:
  driver: "json-file"
  options:
    max-size: "10m"
    max-file: "5"
```

### Video Cleanup

```bash
# Clean up old final videos (adjust retention as needed)
find /mnt/storage/final_movies -name "*.mp4" -mtime +7 -delete

# Or move to archive
find /mnt/storage/final_movies -name "*.mp4" -mtime +7 -exec mv {} /mnt/archive/ \;
```

---

## Troubleshooting

### Service Won't Start

```bash
# Check service status
systemctl status scheduler

# Check logs for errors
journalctl -u scheduler -n 50

# Common issues:

# 1. Missing environment variables
grep -i "error" /var/log/syslog | grep scheduler

# 2. Permission issues
ls -la /opt/scheduler
ls -la /mnt/storage

# 3. GPU access issues
ls -la /dev/dri/
groups scheduler   # Should include 'video' and 'render'
```
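A quick way to sweep the common causes above in one pass -- a minimal sketch, assuming the install paths used throughout this guide:

```bash
# Pre-flight: check that the key paths from this guide exist
for p in /opt/scheduler/agent.py /etc/scheduler/scheduler.env /dev/dri; do
    [ -e "$p" ] && echo "OK   $p" || echo "FAIL $p"
done
```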
### Streams Keep Failing

```bash
# Test RTMP server manually
ffmpeg -re -i test.mp4 -c copy -f flv rtmp://your-server/live/stream

# Check network connectivity
ping your-rtmp-server.com
telnet your-rtmp-server.com 1935

# Review stream logs
journalctl -u scheduler | grep -A 10 "Stream crashed"

# Check retry count
sqlite3 /opt/scheduler/scheduler.db "SELECT nocodb_id, title, stream_retry_count FROM jobs WHERE play_status='streaming';"
```

### High CPU/Memory Usage

```bash
# Check resource usage
top -u scheduler

# Or for Docker
docker stats movie_scheduler

# Common causes:
# 1. Large video file encoding - normal, wait for completion
# 2. whisper.cpp using all cores - normal
# 3. Multiple prep jobs running - adjust or wait

# Limit resources if needed (systemd)
systemctl edit scheduler
# Add:
[Service]
CPUQuota=200%
MemoryMax=4G
```

### Database Locked Errors

```bash
# Check for stale locks
lsof /opt/scheduler/scheduler.db

# Kill stale processes if needed
# Restart service
systemctl restart scheduler
```

### VAAPI Not Working

```bash
# Verify VAAPI support
vainfo

# Test FFmpeg VAAPI
ffmpeg -hwaccels

# Check permissions
ls -la /dev/dri/renderD128
groups scheduler   # Should include 'video' or 'render'

# Fallback to software encoding
# Comment out VAAPI_DEVICE in config
# Encoding will use CPU (slower but works)
```

---

## Performance Optimization

### Hardware Acceleration

```bash
# Verify GPU usage during encoding
intel_gpu_top   # For Intel GPUs
radeontop       # For AMD GPUs

# If GPU not being used, check:
# 1. VAAPI device path correct
# 2. User has GPU permissions
# 3. FFmpeg compiled with VAAPI support
```
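Point 3 can be confirmed directly from the ffmpeg binary; a minimal check:

```bash
# List the hardware accelerators this ffmpeg build supports
ffmpeg -hide_banner -hwaccels 2>/dev/null | grep -qx vaapi \
    && echo "vaapi: supported by this build" \
    || echo "vaapi: missing (rebuild ffmpeg with VAAPI support)"
```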
### Storage Performance

```bash
# Use SSD for final videos (they're streamed frequently)
# Use HDD for raw videos (accessed once for processing)

# Test disk performance
dd if=/dev/zero of=/mnt/storage/test bs=1M count=1024 oflag=direct
rm /mnt/storage/test
```

### Network Optimization

```bash
# For better streaming reliability:
# 1. Use dedicated network for RTMP
# 2. Enable QoS for streaming traffic
# 3. Consider local RTMP relay

# Test network throughput
iperf3 -c your-rtmp-server.com
```

---

## Production Checklist

Before going live:

- [ ] Secrets stored securely (not in git)
- [ ] Service auto-starts on boot
- [ ] Backups configured and tested
- [ ] Monitoring configured
- [ ] Logs being rotated
- [ ] Disk space alerts configured
- [ ] Test recovery procedure
- [ ] Document runbook for on-call
- [ ] GPU permissions verified
- [ ] RTMP connectivity tested
- [ ] NocoDB API tested
- [ ] Process one test video end-to-end
- [ ] Verify streaming watchdog works
- [ ] Test service restart during streaming
- [ ] Configure alerting for failures

---

## Support & Updates

### Getting Updates

```bash
# Git-based deployment
cd /opt/scheduler
git pull origin main
systemctl restart scheduler

# Docker-based deployment
cd /opt/scheduler
docker compose -f docker-compose.prod.yml pull
docker compose -f docker-compose.prod.yml up -d
```

### Reporting Issues

Include in bug reports:

- Service logs: `journalctl -u scheduler -n 100`
- Database state: `sqlite3 scheduler.db ".dump jobs"`
- System info: `uname -a`, `python3 --version`, `ffmpeg -version`
- Configuration (redact secrets!)

---

## Additional Resources

- FFmpeg VAAPI Guide: https://trac.ffmpeg.org/wiki/Hardware/VAAPI
- whisper.cpp: https://github.com/ggerganov/whisper.cpp
- NocoDB API: https://docs.nocodb.com
- Systemd Documentation: https://www.freedesktop.org/software/systemd/man/

---

## License

This production guide is provided as-is. Test thoroughly in staging before production deployment.