
Production Deployment Guide

Complete guide for running the Movie Scheduler in production environments.

Table of Contents

  1. Prerequisites
  2. Deployment Options
  3. Installation Methods
  4. Configuration
  5. Security
  6. Monitoring
  7. Backup & Recovery
  8. Maintenance
  9. Troubleshooting

Prerequisites

Hardware Requirements

Minimum:

  • CPU: 4 cores (for whisper.cpp and encoding)
  • RAM: 4GB
  • Storage: 100GB+ (depends on video library size)
  • GPU: Intel/AMD with VAAPI support (optional but recommended)

Recommended:

  • CPU: 8+ cores
  • RAM: 8GB+
  • Storage: 500GB+ SSD
  • GPU: Modern Intel/AMD GPU with VAAPI

Software Requirements

  • OS: Linux (Ubuntu 20.04+, Debian 11+, RHEL 8+, or compatible)
  • Python: 3.7+
  • FFmpeg: With VAAPI support
  • whisper.cpp: Compiled and in PATH
  • Network: Stable connection to NocoDB and RTMP server

Deployment Options

Option 1: Systemd Service

  • Direct hardware access (best VAAPI performance)
  • Low overhead
  • System integration
  • ⚠️ Manual dependency management

Option 2: Docker

  • Isolated environment
  • Easy updates
  • Portable configuration
  • ⚠️ Slight performance overhead
  • ⚠️ Requires GPU passthrough for VAAPI

Option 3: Kubernetes/Orchestration

  • High availability
  • Auto-scaling
  • Cloud-native
  • ⚠️ Complex setup
  • ⚠️ Overkill for single-instance deployment


Installation Methods

Method 1: Systemd Service Installation

1. Create Scheduler User

# Create dedicated user for security
sudo useradd -r -s /bin/bash -d /opt/scheduler -m scheduler

# Add to video group for GPU access
sudo usermod -aG video,render scheduler

2. Install Dependencies

# Install system packages
sudo apt-get update
sudo apt-get install -y python3 python3-pip python3-venv ffmpeg git build-essential

# Install whisper.cpp
sudo -u scheduler git clone https://github.com/ggerganov/whisper.cpp.git /tmp/whisper.cpp
cd /tmp/whisper.cpp
make
# Note: newer whisper.cpp releases build with CMake and name the binary
# whisper-cli (under build/bin/); adjust the copy below accordingly.
sudo cp main /usr/local/bin/whisper.cpp
sudo chmod +x /usr/local/bin/whisper.cpp

# Download whisper model
sudo mkdir -p /opt/models
cd /opt/models
sudo wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin
# Rename so the file matches the WHISPER_MODEL path used in configuration
sudo mv ggml-base.en.bin ggml-base.bin
sudo chown -R scheduler:scheduler /opt/models
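
To confirm the binary and model work together, a quick smoke test; this assumes the sample audio shipped with the whisper.cpp checkout is still in /tmp:

# Transcribe the bundled sample clip; expect text output within a few seconds
sudo -u scheduler whisper.cpp -m /opt/models/ggml-base.bin -f /tmp/whisper.cpp/samples/jfk.wav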

3. Deploy Application

# Create application directory
sudo mkdir -p /opt/scheduler
sudo chown scheduler:scheduler /opt/scheduler

# Copy application files
sudo -u scheduler cp agent.py /opt/scheduler/
sudo -u scheduler cp requirements.txt /opt/scheduler/

# Create Python virtual environment
sudo -u scheduler python3 -m venv /opt/scheduler/venv
sudo -u scheduler /opt/scheduler/venv/bin/pip install -r /opt/scheduler/requirements.txt

4. Configure Storage

# Create storage directories (adjust paths as needed)
sudo mkdir -p /mnt/storage/raw_movies
sudo mkdir -p /mnt/storage/final_movies
sudo chown -R scheduler:scheduler /mnt/storage

5. Configure Service

# Copy service file
sudo cp scheduler.service /etc/systemd/system/

# Create environment file with secrets
sudo mkdir -p /etc/scheduler
sudo nano /etc/scheduler/scheduler.env

Edit /etc/scheduler/scheduler.env:

NOCODB_URL=https://your-nocodb.com/api/v2/tables/YOUR_TABLE_ID/records
NOCODB_TOKEN=your_production_token
RTMP_SERVER=rtmp://your-rtmp-server.com/live/stream
RAW_DIR=/mnt/storage/raw_movies
FINAL_DIR=/mnt/storage/final_movies
WHISPER_MODEL=/opt/models/ggml-base.bin

Update scheduler.service to use the environment file:

# Replace Environment= lines with:
EnvironmentFile=/etc/scheduler/scheduler.env
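
If you need to write the unit from scratch, a minimal sketch consistent with the user and paths created above (the scheduler.service shipped with the project is authoritative):

[Unit]
Description=Movie Scheduler
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=scheduler
Group=scheduler
EnvironmentFile=/etc/scheduler/scheduler.env
WorkingDirectory=/opt/scheduler
ExecStart=/opt/scheduler/venv/bin/python /opt/scheduler/agent.py
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target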

6. Enable and Start Service

# Reload systemd
sudo systemctl daemon-reload

# Enable service (start on boot)
sudo systemctl enable scheduler

# Start service
sudo systemctl start scheduler

# Check status
sudo systemctl status scheduler

# View logs
sudo journalctl -u scheduler -f

Method 2: Docker Deployment

1. Prepare Environment

# Create project directory
mkdir -p /opt/scheduler
cd /opt/scheduler

# Copy application files
cp agent.py requirements.txt Dockerfile docker-compose.prod.yml ./

# Create production environment file
cp .env.production.example .env.production
nano .env.production  # Fill in your values
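
If you are assembling docker-compose.prod.yml yourself, a minimal sketch consistent with the paths in this guide; the container name movie_scheduler comes from the monitoring commands below, while the build context, mounts, and device passthrough are assumptions:

services:
  scheduler:
    build: .
    container_name: movie_scheduler
    env_file: .env.production
    restart: unless-stopped
    devices:
      - /dev/dri:/dev/dri    # GPU passthrough for VAAPI
    volumes:
      - /mnt/storage/raw_movies:/mnt/storage/raw_movies
      - /mnt/storage/final_movies:/mnt/storage/final_movies
      - /opt/models:/opt/models:ro
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "5"

Persist scheduler.db with an additional volume that matches wherever your Dockerfile places it.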

2. Configure Storage

# Ensure storage directories exist
mkdir -p /mnt/storage/raw_movies
mkdir -p /mnt/storage/final_movies

# Download whisper model
mkdir -p /opt/models
cd /opt/models
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin
mv ggml-base.en.bin ggml-base.bin

3. Deploy Container

cd /opt/scheduler

# Build image
docker compose -f docker-compose.prod.yml build

# Start service
docker compose -f docker-compose.prod.yml up -d

# Check logs
docker compose -f docker-compose.prod.yml logs -f

# Check status
docker compose -f docker-compose.prod.yml ps

4. Enable Auto-Start

# Create systemd service for docker compose
sudo nano /etc/systemd/system/scheduler-docker.service

[Unit]
Description=Movie Scheduler (Docker)
Requires=docker.service
After=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=/opt/scheduler
ExecStart=/usr/bin/docker compose -f docker-compose.prod.yml up -d
ExecStop=/usr/bin/docker compose -f docker-compose.prod.yml down
TimeoutStartSec=0

[Install]
WantedBy=multi-user.target

sudo systemctl daemon-reload
sudo systemctl enable scheduler-docker

Configuration

Essential Configuration

NocoDB Connection

# Get your table ID from NocoDB URL
# https://nocodb.com/nc/YOUR_BASE_ID/table_NAME
# API endpoint: https://nocodb.com/api/v2/tables/TABLE_ID/records

# Generate API token in NocoDB:
# Account Settings → Tokens → Create Token
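
Once the token exists, a quick connectivity check from the scheduler host; NocoDB's v2 API expects the token in the xc-token header:

# Should return one record as JSON; an auth error means the token or URL is wrong
curl -s -H "xc-token: $NOCODB_TOKEN" "$NOCODB_URL?limit=1"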

RTMP Server

# For nginx-rtmp:
RTMP_SERVER=rtmp://your-server.com:1935/live/stream

# For other RTMP servers, use their endpoint format

Storage Paths

# Use separate fast storage for final videos (streaming)
RAW_DIR=/mnt/storage/raw_movies      # Can be slower storage
FINAL_DIR=/mnt/fast-storage/final    # Should be fast SSD

# Ensure proper permissions (cover both storage mounts)
chown -R scheduler:scheduler /mnt/storage
chown -R scheduler:scheduler /mnt/fast-storage/final
chmod 755 /mnt/storage/raw_movies
chmod 755 /mnt/fast-storage/final

Performance Tuning

Sync Intervals

# High-load scenario (many jobs, frequent updates)
NOCODB_SYNC_INTERVAL_SECONDS=120  # Check less often
WATCHDOG_CHECK_INTERVAL_SECONDS=15  # Check streams less often

# Low-latency scenario (need fast response)
NOCODB_SYNC_INTERVAL_SECONDS=30
WATCHDOG_CHECK_INTERVAL_SECONDS=5

# Default (balanced)
NOCODB_SYNC_INTERVAL_SECONDS=60
WATCHDOG_CHECK_INTERVAL_SECONDS=10

FFmpeg VAAPI

# Find your VAAPI device
ls -la /dev/dri/

# Common devices:
# renderD128 - Primary GPU
# renderD129 - Secondary GPU

# Test VAAPI
ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -i test.mp4 -f null -

# Set in config
VAAPI_DEVICE=/dev/dri/renderD128
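
The test above only exercises decode. To check the full GPU pipeline (decode, scale, encode), a sketch of a VAAPI transcode, assuming an H.264 target; the input and output filenames are placeholders:

# Decode, scale, and encode entirely on the GPU
ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -vaapi_device /dev/dri/renderD128 \
    -i test.mp4 -vf 'scale_vaapi=w=1280:h=720' -c:v h264_vaapi -b:v 4M -c:a copy out.mp4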

Security

Secrets Management

DO NOT hardcode secrets in files tracked by git!

Option 1: Environment Files (Simple)

# Store secrets in protected file
sudo nano /etc/scheduler/scheduler.env
sudo chmod 600 /etc/scheduler/scheduler.env
sudo chown scheduler:scheduler /etc/scheduler/scheduler.env

Option 2: Secrets Management Tools

# Using Vault
export NOCODB_TOKEN=$(vault kv get -field=token secret/scheduler/nocodb)

# Using AWS Secrets Manager
export NOCODB_TOKEN=$(aws secretsmanager get-secret-value --secret-id scheduler/nocodb --query SecretString --output text)

# Using Docker Secrets (Swarm/Kubernetes)
# Mount secrets as files and read in application
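
For Docker secrets, the usual pattern is a *_FILE indirection resolved at container start. A minimal entrypoint sketch; the NOCODB_TOKEN_FILE convention is an assumption, not something agent.py implements:

#!/bin/bash
# entrypoint.sh: resolve file-based secrets into environment variables
if [ -n "$NOCODB_TOKEN_FILE" ] && [ -f "$NOCODB_TOKEN_FILE" ]; then
    export NOCODB_TOKEN="$(cat "$NOCODB_TOKEN_FILE")"
fi
exec python3 /opt/scheduler/agent.py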

Filesystem Permissions

# Application directory
chown -R scheduler:scheduler /opt/scheduler
chmod 750 /opt/scheduler

# Storage directories
chown -R scheduler:scheduler /mnt/storage
chmod 755 /mnt/storage/raw_movies      # Scheduler only reads from here
chmod 755 /mnt/storage/final_movies    # Scheduler writes processed output here

# Database file
chmod 600 /opt/scheduler/scheduler.db
chown scheduler:scheduler /opt/scheduler/scheduler.db

Network Security

# Firewall rules (if scheduler runs on separate server)
# Only allow outbound connections to NocoDB and RTMP

sudo ufw allow out to YOUR_NOCODB_IP port 443 proto tcp  # HTTPS
sudo ufw allow out to YOUR_RTMP_IP port 1935 proto tcp   # RTMP
sudo ufw allow out 53                                    # DNS (required if you deny by default)
sudo ufw default deny outgoing  # Deny all other outbound (optional)
sudo ufw status verbose         # Verify the rules

Regular Updates

# Update system packages weekly
sudo apt-get update && sudo apt-get upgrade

# Update Python dependencies
sudo -u scheduler /opt/scheduler/venv/bin/pip install --upgrade -r /opt/scheduler/requirements.txt

# Rebuild whisper.cpp quarterly (for performance improvements)

Monitoring

Service Health

Systemd Monitoring

# Check service status
systemctl status scheduler

# View recent logs
journalctl -u scheduler -n 100

# Follow logs in real-time
journalctl -u scheduler -f

# Check for errors in last hour
journalctl -u scheduler --since "1 hour ago" | grep ERROR

# Service restart count
systemctl show scheduler | grep NRestarts

Docker Monitoring

# Container status
docker compose -f docker-compose.prod.yml ps

# Resource usage
docker stats movie_scheduler

# Logs
docker compose -f docker-compose.prod.yml logs --tail=100 -f

# Health check status
docker inspect movie_scheduler | jq '.[0].State.Health'

Database Monitoring

# Check job status
sqlite3 /opt/scheduler/scheduler.db "SELECT prep_status, play_status, COUNT(*) FROM jobs GROUP BY prep_status, play_status;"

# Active streams
sqlite3 /opt/scheduler/scheduler.db "SELECT nocodb_id, title, play_status, stream_retry_count FROM jobs WHERE play_status='streaming';"

# Failed jobs
sqlite3 /opt/scheduler/scheduler.db "SELECT nocodb_id, title, prep_status, play_status, log FROM jobs WHERE prep_status='failed' OR play_status='failed';"

# Database size
ls -lh /opt/scheduler/scheduler.db
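
When validating a fresh deployment it also helps to see what is queued; a sketch using the run_at column referenced later in this guide:

# Next 10 upcoming jobs
sqlite3 /opt/scheduler/scheduler.db "SELECT nocodb_id, title, run_at FROM jobs WHERE datetime(run_at) > datetime('now') ORDER BY run_at LIMIT 10;"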

Automated Monitoring Script

Create /opt/scheduler/monitor.sh:

#!/bin/bash

LOG_FILE="/var/log/scheduler-monitor.log"
DB_PATH="/opt/scheduler/scheduler.db"

echo "=== Scheduler Monitor - $(date) ===" >> "$LOG_FILE"

# Check if service is running
if systemctl is-active --quiet scheduler; then
    echo "✓ Service is running" >> "$LOG_FILE"
else
    echo "✗ Service is DOWN" >> "$LOG_FILE"
    # Send alert (email, Slack, etc.)
    systemctl start scheduler  # Requires root; install this cron job as root
fi

# Check for failed jobs
FAILED=$(sqlite3 "$DB_PATH" "SELECT COUNT(*) FROM jobs WHERE prep_status='failed' OR play_status='failed';")
if [ "$FAILED" -gt 0 ]; then
    echo "⚠ Found $FAILED failed jobs" >> "$LOG_FILE"
    # Send alert
fi

# Check active streams
STREAMING=$(sqlite3 "$DB_PATH" "SELECT COUNT(*) FROM jobs WHERE play_status='streaming';")
echo "Active streams: $STREAMING" >> "$LOG_FILE"

# Check disk space
DISK_USAGE=$(df -h /mnt/storage | tail -1 | awk '{print $5}' | sed 's/%//')
if [ "$DISK_USAGE" -gt 90 ]; then
    echo "⚠ Disk usage is ${DISK_USAGE}%" >> "$LOG_FILE"
    # Send alert
fi

echo "" >> "$LOG_FILE"
# Make executable
chmod +x /opt/scheduler/monitor.sh

# Add to crontab (check every 5 minutes)
(crontab -l 2>/dev/null; echo "*/5 * * * * /opt/scheduler/monitor.sh") | crontab -
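
For the "# Send alert" placeholders above, a minimal sketch posting to a Slack incoming webhook; SLACK_WEBHOOK_URL is an assumption, substitute whatever alerting channel you use:

send_alert() {
    # POST a plain-text message to a Slack incoming webhook
    curl -s -X POST -H 'Content-Type: application/json' \
        -d "{\"text\": \"[scheduler] $1\"}" \
        "$SLACK_WEBHOOK_URL"
}

# Example: send_alert "Service is DOWN on $(hostname)"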

External Monitoring

Prometheus + Grafana

Export metrics using node_exporter or custom exporter:

# Install node_exporter for system metrics
# Create custom exporter for job metrics from database
# Set up Grafana dashboard

Uptime Monitoring

Use services like:

  • UptimeRobot
  • Pingdom
  • Datadog

Monitor:

  • Service availability
  • RTMP server connectivity
  • NocoDB API accessibility

Backup & Recovery

What to Backup

  1. Database (scheduler.db) - Critical
  2. Configuration (.env.production or /etc/scheduler/scheduler.env) - Critical
  3. Final videos (if you want to keep processed videos)
  4. Logs (optional, for forensics)

Backup Script

Create /opt/scheduler/backup.sh:

#!/bin/bash

BACKUP_DIR="/backup/scheduler"
DATE=$(date +%Y%m%d_%H%M%S)
DB_PATH="/opt/scheduler/scheduler.db"
CONFIG_PATH="/etc/scheduler/scheduler.env"

# Create backup directory
mkdir -p "$BACKUP_DIR"

# Backup database (use SQLite's online backup for a consistent copy of a live DB)
sqlite3 "$DB_PATH" ".backup '$BACKUP_DIR/scheduler_${DATE}.db'"

# Backup config (careful with secrets!)
cp "$CONFIG_PATH" "$BACKUP_DIR/config_${DATE}.env"

# Compress old backups
find "$BACKUP_DIR" -name "*.db" -mtime +7 -exec gzip {} \;

# Delete backups older than 30 days
find "$BACKUP_DIR" -name "*.db.gz" -mtime +30 -delete
find "$BACKUP_DIR" -name "*.env" -mtime +30 -delete

# Optional: Upload to S3/cloud storage
# aws s3 sync "$BACKUP_DIR" s3://your-bucket/scheduler-backups/

echo "Backup completed: $BACKUP_DIR/scheduler_${DATE}.db"
Make it executable and schedule a daily run:

# Make executable
chmod +x /opt/scheduler/backup.sh

# Run daily at 2 AM
(crontab -l 2>/dev/null; echo "0 2 * * * /opt/scheduler/backup.sh") | crontab -

Recovery Procedure

1. Restore from Backup

# Stop service
sudo systemctl stop scheduler

# Restore database (gunzip first if the backup was compressed)
cp /backup/scheduler/scheduler_YYYYMMDD_HHMMSS.db /opt/scheduler/scheduler.db
chown scheduler:scheduler /opt/scheduler/scheduler.db

# Restore config if needed
cp /backup/scheduler/config_YYYYMMDD_HHMMSS.env /etc/scheduler/scheduler.env

# Start service
sudo systemctl start scheduler

2. Disaster Recovery (Full Rebuild)

If server is completely lost:

  1. Provision new server
  2. Follow installation steps above
  3. Restore database and config from backup
  4. Restart service
  5. Verify jobs are picked up

  • Recovery Time Objective (RTO): 30-60 minutes
  • Recovery Point Objective (RPO): up to 24 hours (with daily backups)


Maintenance

Routine Tasks

Daily

  • ✓ Check service status
  • ✓ Review error logs
  • ✓ Check failed jobs

Weekly

  • ✓ Review disk space
  • ✓ Check database size
  • ✓ Clean up old processed videos (if not needed)

Monthly

  • ✓ Update system packages
  • ✓ Review and optimize database
  • ✓ Test backup restoration
  • ✓ Review and rotate logs

Database Maintenance

# Vacuum database (reclaim space, optimize)
sqlite3 /opt/scheduler/scheduler.db "VACUUM;"

# Analyze database (update statistics)
sqlite3 /opt/scheduler/scheduler.db "ANALYZE;"

# Clean up old completed jobs (optional)
sqlite3 /opt/scheduler/scheduler.db "DELETE FROM jobs WHERE play_status='done' AND datetime(run_at) < datetime('now', '-30 days');"

Log Rotation

For systemd (automatic via journald):

# Configure in /etc/systemd/journald.conf
SystemMaxUse=1G
RuntimeMaxUse=100M

# Apply the change
sudo systemctl restart systemd-journald

For Docker:

# Already configured in docker-compose.prod.yml
logging:
  driver: "json-file"
  options:
    max-size: "10m"
    max-file: "5"

Video Cleanup

# Clean up old final videos (adjust retention as needed)
find /mnt/storage/final_movies -name "*.mp4" -mtime +7 -delete

# Or move to archive
find /mnt/storage/final_movies -name "*.mp4" -mtime +7 -exec mv {} /mnt/archive/ \;

Troubleshooting

Service Won't Start

# Check service status
systemctl status scheduler

# Check logs for errors
journalctl -u scheduler -n 50

# Common issues:
# 1. Missing environment variables
grep -i "error" /var/log/syslog | grep scheduler

# 2. Permission issues
ls -la /opt/scheduler
ls -la /mnt/storage

# 3. GPU access issues
ls -la /dev/dri/
groups scheduler  # Should include 'video' and 'render'

Streams Keep Failing

# Test RTMP server manually
ffmpeg -re -i test.mp4 -c copy -f flv rtmp://your-server/live/stream

# Check network connectivity
ping your-rtmp-server.com
telnet your-rtmp-server.com 1935   # or: nc -vz your-rtmp-server.com 1935

# Review stream logs
journalctl -u scheduler | grep -A 10 "Stream crashed"

# Check retry count
sqlite3 /opt/scheduler/scheduler.db "SELECT nocodb_id, title, stream_retry_count FROM jobs WHERE play_status='streaming';"

High CPU/Memory Usage

# Check resource usage
top -u scheduler

# Or for Docker
docker stats movie_scheduler

# Common causes:
# 1. Large video file encoding - normal, wait for completion
# 2. whisper.cpp using all cores - normal
# 3. Multiple prep jobs running - adjust or wait

# Limit resources if needed (systemd)
systemctl edit scheduler
# Add:
[Service]
CPUQuota=200%
MemoryMax=4G

Database Locked Errors

# Check for stale locks
lsof /opt/scheduler/scheduler.db

# Kill stale processes if needed
# Restart service
systemctl restart scheduler

VAAPI Not Working

# Verify VAAPI support
vainfo

# Test FFmpeg VAAPI
ffmpeg -hwaccels

# Check permissions
ls -la /dev/dri/renderD128
groups scheduler  # Should include 'video' or 'render'

# Fallback to software encoding
# Comment out VAAPI_DEVICE in config
# Encoding will use CPU (slower but works)

Performance Optimization

Hardware Acceleration

# Verify GPU usage during encoding
intel_gpu_top  # For Intel GPUs
radeontop      # For AMD GPUs

# If GPU not being used, check:
# 1. VAAPI device path correct
# 2. User has GPU permissions
# 3. FFmpeg compiled with VAAPI support

Storage Performance

# Use SSD for final videos (they're streamed frequently)
# Use HDD for raw videos (accessed once for processing)

# Test disk performance
dd if=/dev/zero of=/mnt/storage/test bs=1M count=1024 oflag=direct
rm /mnt/storage/test

Network Optimization

# For better streaming reliability
# 1. Use dedicated network for RTMP
# 2. Enable QoS for streaming traffic
# 3. Consider local RTMP relay

# Test network throughput
iperf3 -c your-rtmp-server.com

Production Checklist

Before going live:

  • Secrets stored securely (not in git)
  • Service auto-starts on boot
  • Backups configured and tested
  • Monitoring configured
  • Logs being rotated
  • Disk space alerts configured
  • Test recovery procedure
  • Document runbook for on-call
  • GPU permissions verified
  • RTMP connectivity tested
  • NocoDB API tested
  • Process one test video end-to-end
  • Verify streaming watchdog works
  • Test service restart during streaming
  • Configure alerting for failures

Support & Updates

Getting Updates

# Git-based deployment
cd /opt/scheduler
git pull origin main
sudo -u scheduler /opt/scheduler/venv/bin/pip install -r requirements.txt
sudo systemctl restart scheduler

# Docker-based deployment
cd /opt/scheduler
docker compose -f docker-compose.prod.yml pull
docker compose -f docker-compose.prod.yml up -d

Reporting Issues

Include in bug reports:

  • Service logs: journalctl -u scheduler -n 100
  • Database state: sqlite3 scheduler.db ".dump jobs"
  • System info: uname -a, python3 --version, ffmpeg -version
  • Configuration (redact secrets!)


License

This production guide is provided as-is. Test thoroughly in staging before production deployment.