Server Contract
This document defines the contract between the platform infrastructure (bootstrapped by bootstrap-server.sh) and the app deployment workflows (deploy.yml, preview.yml). It explains the directory layout, lifecycle, and assumptions that connect them.
Server Directory Layout
```
/opt/platform/                  # Platform root (created by bootstrap-server.sh)
  docker-compose.yml            # Platform services (7 core + 3 optional metrics)
  .env                          # Platform credentials (POSTGRES_PASSWORD, MINIO_ROOT_*, GRAFANA_ADMIN_PASSWORD, ACME_EMAIL, OPS_DOMAIN, ALERT_REPO, COMPOSE_PROFILES)
  .bootstrapped                 # Timestamp marker from last bootstrap run
  Caddyfile                     # Global Caddyfile — imports /etc/caddy/apps/*.caddy
  caddy-apps/                   # Per-app Caddyfile fragments (written by deploy/preview workflows)
    <app-name>.caddy            # Production route for an app
    <app-name>-pr-<N>.caddy     # Preview route for a PR
    ops.caddy                   # Grafana route (created by bootstrap)
  credentials/                  # Per-app credential files (created by create-app-credentials.sh)
    <app-name>.env              # DB_USER, DB_PASSWORD, S3_ACCESS_KEY, S3_SECRET_KEY
  infrastructure/               # Ops scripts (copied from platform repo by bootstrap)
    deploy-blue-green.sh        # Blue-green zero-downtime deploy
    verify-backup.sh            # Backup restore verification
    rotate-credentials.sh       # Credential rotation without downtime
    backup-postgres.sh
    restore-postgres.sh
    check-alerts.sh
    update-images.sh
    create-app-credentials.sh
    usage-report.sh
  loki-config.yml               # Loki configuration
  promtail-config.yml           # Promtail configuration
  prometheus.yml                # Prometheus scrape config (always created)
  grafana/                      # Grafana provisioning files and dashboards
    provisioning/datasources/
    provisioning/dashboards/
    dashboards/

/opt/apps/                      # Application root (created by bootstrap-server.sh)
  <app-name>/                   # Cloned app repo (created manually by admin)
    .deploy-slot                # Active blue-green slot ("blue" or "green")
    deploy/
      .env                      # App runtime config (created manually from env.template)
      docker-compose.yml        # App services (joins towlion network)
  <app-name>-pr-<N>/            # Preview clone (created/destroyed by preview.yml)

/data/                          # Persistent data volumes (created by bootstrap-server.sh)
  postgres/                     # PostgreSQL data directory
  redis/                        # Redis data directory
  minio/                        # MinIO object storage
  caddy/data/                   # Caddy TLS certificates
  caddy/config/                 # Caddy config state
  loki/                         # Loki log storage
  grafana/                      # Grafana state
  prometheus/                   # Prometheus data (created always, used when metrics enabled)
  backups/postgres/             # pg_dump backup files (7-day retention, optional .dump.enc encryption)
```
Bootstrap to Deploy Lifecycle
1. **Bootstrap the server** — Run `sudo bash infrastructure/bootstrap-server.sh` on a fresh Debian machine. This creates the directory layout above, installs Docker, creates the `deploy` user, generates platform credentials, starts the 7 core platform services (plus 3 optional metrics services if enabled), copies infrastructure scripts, and installs cron jobs.
2. **Configure DNS** — Point app domains and `*.preview.<app>.<domain>` (one per app) to the server IP.
3. **Clone the app repo** — SSH in as `deploy` and clone the app to `/opt/apps/<name>/`.
4. **Create `deploy/.env`** — Copy `deploy/env.template` and fill in values (`DATABASE_URL`, S3 credentials, etc.).
5. **Provision per-app credentials (optional)** — Run `create-app-credentials.sh <name>` to create an isolated PostgreSQL user and MinIO bucket. Credentials are written to `/opt/platform/credentials/<name>.env`.
6. **Configure GitHub secrets** — Set `SERVER_HOST`, `SERVER_USER`, `SERVER_SSH_KEY`, and `APP_DOMAIN` on the app repo. Add `PREVIEW_DOMAIN` for preview environments.
7. **Push to main** — Triggers `deploy.yml`, which:
   - SSHes into the server as `deploy`
   - Calls `deploy-blue-green.sh`, which performs a zero-downtime blue-green deploy:
     - Reads the current slot from `/opt/apps/<name>/.deploy-slot` (defaults to "blue")
     - Pulls latest code, injects credentials, builds and starts the next slot
     - Waits for the Docker healthcheck, then runs Alembic migrations
     - Swaps the Caddyfile to point to the new slot and reloads Caddy
     - Verifies external health, then stops the old slot
     - On any failure: tears down the new slot; the old slot keeps serving traffic
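The slot-selection step of the blue-green deploy can be sketched as follows. This is a simplified illustration, not the real `deploy-blue-green.sh`; the function name `next_slot` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: decide which blue-green slot the next deploy should build into.
# Reads the active slot from a .deploy-slot marker (defaulting to "blue"
# when no marker exists) and returns the other slot.
set -euo pipefail

next_slot() {
  local slot_file="$1"
  local current="blue"                        # default on a fresh app dir
  [ -f "$slot_file" ] && current="$(cat "$slot_file")"
  if [ "$current" = "blue" ]; then
    echo "green"
  else
    echo "blue"
  fi
}

# Example: on a machine with no marker file, the next deploy targets "green"
next_slot "/opt/apps/myapp/.deploy-slot"
```

After the new slot passes its health checks, the real script writes the new slot name back to `.deploy-slot` so the following deploy alternates again.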
App Workflow Server Assumptions
The deploy.yml and preview.yml workflows SSH into the server and depend on the following structure being in place:
| Path / Resource | Purpose | Created By |
|---|---|---|
| `/opt/platform/docker-compose.yml` | Platform services (postgres, redis, caddy, etc.) | `bootstrap-server.sh` |
| `/opt/platform/.env` | `POSTGRES_PASSWORD` for database operations | `bootstrap-server.sh` |
| `/opt/platform/caddy-apps/` | Writable directory for per-app Caddyfile fragments | `bootstrap-server.sh` |
| `/opt/platform/credentials/<name>.env` | Per-app DB/S3 credentials (optional) | `create-app-credentials.sh` |
| `/opt/apps/<name>/` | Cloned app repo with `deploy/.env` configured | Admin (manual) |
| `towlion` Docker network | Shared network connecting platform services and app containers | `bootstrap-server.sh` |
| `deploy` user | SSH user with Docker group membership | `bootstrap-server.sh` |
If any of these are missing, the workflow will fail. The bootstrap script is idempotent and can be re-run to restore missing structure.
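Before a first deploy, the path prerequisites can be spot-checked by hand. A small sketch (the `check_paths` helper is hypothetical; `verify-server.sh` is the real, fuller check):

```shell
#!/usr/bin/env bash
# Sketch: print one status line per path the deploy workflows assume exists.
# Anything reported MISSING will make deploy.yml / preview.yml fail.
check_paths() {
  local p
  for p in "$@"; do
    if [ -e "$p" ]; then
      printf 'ok      %s\n' "$p"
    else
      printf 'MISSING %s\n' "$p"
    fi
  done
}

check_paths \
  /opt/platform/docker-compose.yml \
  /opt/platform/.env \
  /opt/platform/caddy-apps \
  /opt/apps
```

Since the bootstrap script is idempotent, re-running it is the simplest remedy for any MISSING line.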
Infrastructure Scripts Reference
All scripts live in the platform repo under infrastructure/ and are copied to /opt/platform/infrastructure/ during bootstrap.
| Script | Purpose | Invocation |
|---|---|---|
| `bootstrap-server.sh` | Transform a fresh Debian machine into a running platform | Manual (`sudo bash`) |
| `verify-server.sh` | Read-only health check of server state | Manual (`bash`) |
| `create-app-credentials.sh` | Provision per-app PostgreSQL user + MinIO bucket | Manual (`bash <script> <app-name>`) |
| `backup-postgres.sh` | Per-database `pg_dump` with 7-day retention, optional AES-256 encryption | Cron: daily at 02:00 |
| `restore-postgres.sh` | Restore a database from backup | Manual (`bash <script>`) |
| `check-alerts.sh` | Check container health, disk, memory; create GitHub Issues | Cron: every 5 minutes |
| `update-images.sh` | Pull latest Docker images and recreate containers | Cron: weekly Sunday at 03:00 |
| `usage-report.sh` | Generate 6-section resource usage report | Manual (`bash`) |
| `scan-images.sh` | Scan running container images for vulnerabilities (Trivy), create GitHub Issues | Cron: weekly Sunday at 04:00 |
| `deploy-blue-green.sh` | Zero-downtime blue-green deploy with automatic rollback | Called by `deploy.yml` workflow |
| `verify-backup.sh` | Restore backups to a temp DB and verify integrity | Cron: weekly Sunday at 05:00 |
| `rotate-credentials.sh` | Rotate per-app or platform master credentials without downtime | Manual (`bash <script> <app-name>` or `--platform`) |
Server Hardening
The bootstrap script applies several security measures automatically. Self-hosters get these out of the box; no manual hardening steps are needed beyond running the bootstrap.
Firewall — UFW is configured to deny all incoming traffic except ports 22 (SSH), 80 (HTTP), and 443 (HTTPS). All other ports, including PostgreSQL (5432), Redis (6379), and MinIO (9000), are only reachable via the internal Docker network.
Automatic security updates — unattended-upgrades is installed and configured for the Debian security channel. Security patches are applied automatically without manual intervention.
Non-root deployment — A deploy user with Docker group membership handles all deployments. Workflows SSH in as deploy, never as root. The deploy user cannot modify system packages or firewall rules.
Credential isolation — The platform .env file is mode 600 (readable only by its owner). Per-app credentials are generated by create-app-credentials.sh and stored in separate files under /opt/platform/credentials/, each also mode 600.
Container resource limits — Every platform service and every app container has explicit CPU and memory limits in its Docker Compose file. This prevents any single container from exhausting server resources. The 7 core services use ~2.66G / 3.25 CPU. The 3 optional metrics services add 448M / 1.00 CPU when enabled.
Brute-force protection — fail2ban is installed with an SSH jail (systemd backend, since Debian 12 uses journald). Configuration: maxretry=5, bantime=3600s. IPs that exceed the retry limit are banned for one hour.
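Translated into fail2ban configuration, those settings correspond to a jail like the following. The file location is an assumption; only the setting values come from this document.

```ini
# Sketch of an SSH jail matching the described settings
# (location, e.g. /etc/fail2ban/jail.d/sshd.local, is an assumption)
[sshd]
enabled  = true
backend  = systemd
maxretry = 5
bantime  = 3600
```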
SSH hardening — A drop-in config at /etc/ssh/sshd_config.d/99-towlion-hardening.conf enforces: PermitRootLogin no, PasswordAuthentication no, MaxAuthTries 3, X11Forwarding no. Only key-based authentication as the deploy user is permitted.
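Based on the directives listed, the drop-in presumably contains at least:

```
# /etc/ssh/sshd_config.d/99-towlion-hardening.conf
PermitRootLogin no
PasswordAuthentication no
MaxAuthTries 3
X11Forwarding no
```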
Security headers — The platform Caddyfile includes a (security_headers) snippet that sets: Strict-Transport-Security (HSTS, max-age=31536000, includeSubDomains), X-Content-Type-Options nosniff, X-Frame-Options DENY, Referrer-Policy strict-origin-when-cross-origin, Permissions-Policy (camera, microphone, and geolocation denied), and strips the Server header. All app and ops Caddy routes import this snippet.
Rate limiting — Application-level rate limiting via slowapi (FastAPI middleware). Default limit: 60 requests/minute per IP. The /health endpoint is exempt from rate limiting since Docker healthchecks and monitoring hit it frequently.
Read-only container filesystems — App containers, Caddy, and Promtail run with read_only: true in their Docker Compose configuration. Writable areas (/tmp, /app/__pycache__) are mounted as tmpfs. Database services (PostgreSQL, Redis, MinIO) are not read-only due to PID files and temp tables.
Docker event audit logging — A systemd service (docker-audit.service) runs docker events with JSON output to /var/log/docker-audit.log. Promtail scrapes this file and forwards events to Loki (label: job=docker-audit). All container start, stop, die, and health_status events are captured.
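A minimal unit with that behavior might look like this. The real unit's contents are an assumption; the documented behavior is `docker events` streamed as JSON into `/var/log/docker-audit.log`.

```ini
# Sketch of docker-audit.service (contents are an assumption)
[Unit]
Description=Docker event audit log

[Service]
ExecStart=/bin/sh -c 'exec docker events --format "{{json .}}" >> /var/log/docker-audit.log'
Restart=always

[Install]
WantedBy=multi-user.target
```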
Backup encryption — Backups can be encrypted at rest using AES-256-CBC. Set the BACKUP_ENCRYPTION_KEY environment variable to the path of a key file. When set, backup-postgres.sh pipes pg_dump output through openssl enc and produces .dump.enc files. restore-postgres.sh and verify-backup.sh automatically detect encrypted backups and decrypt them before restoring. If the key file is not set, backups are stored unencrypted (with a warning).
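As a sketch of that round trip: the exact `openssl` invocation in `backup-postgres.sh` is an assumption, but AES-256-CBC keyed from a key file is what the text specifies.

```shell
#!/usr/bin/env bash
# Sketch of the encrypt-at-rest round trip (flags are an assumption).
set -euo pipefail

KEY_FILE="$(mktemp)"                  # stand-in for the BACKUP_ENCRYPTION_KEY path
head -c 32 /dev/urandom > "$KEY_FILE"
ENC_FILE="$(mktemp)"                  # stand-in for a .dump.enc backup file

# Encrypt: what backup-postgres.sh pipes pg_dump output through
printf 'fake pg_dump output\n' \
  | openssl enc -aes-256-cbc -pbkdf2 -pass "file:$KEY_FILE" \
  > "$ENC_FILE"

# Decrypt: what restore-postgres.sh / verify-backup.sh do before restoring
openssl enc -d -aes-256-cbc -pbkdf2 -pass "file:$KEY_FILE" \
  < "$ENC_FILE"                       # prints: fake pg_dump output
```

Keying from a file (`-pass file:…`) rather than an environment variable keeps the key out of process listings and shell history.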
Log rotation — A logrotate config at /etc/logrotate.d/towlion rotates /var/log/towlion-*.log and /var/log/docker-audit.log daily, retaining 90 compressed copies. The docker-audit.service is restarted after rotation since it holds the log file open.
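A logrotate stanza matching that policy might look like the following; the directive names are standard logrotate, but the exact file contents are an assumption.

```
# Sketch of /etc/logrotate.d/towlion, per the described policy
/var/log/towlion-*.log /var/log/docker-audit.log {
    daily
    rotate 90
    compress
    missingok
    postrotate
        systemctl restart docker-audit.service
    endscript
}
```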
Log retention — Loki retains logs for 90 days (retention_period: 2160h). The compactor runs retention enforcement with a 2-hour delete delay.
Platform credential rotation — rotate-credentials.sh --platform rotates the PostgreSQL superuser password and/or MinIO root password. After rotation, all app health checks are verified. Use --yes to skip the confirmation prompt.
Image vulnerability scanning — Trivy is installed via the Aqua Security apt repository. Every deploy runs a non-blocking trivy image scan of the newly built app image (HIGH/CRITICAL severity). A weekly cron job (scan-images.sh, Sunday 04:00) scans all running container images.
Mandatory Access Control (AppArmor) — Debian 12 ships with AppArmor enabled by default. Docker automatically applies the docker-default AppArmor profile to all containers, which restricts capabilities like writing to /proc and /sys, mounting filesystems, and accessing raw sockets. No configuration is needed — this works out of the box.
SELinux is not used. While SELinux is the standard MAC system on RHEL/Fedora, it is not well-suited for Debian:
- AppArmor is Debian's native MAC system, maintained by the Debian security team
- SELinux policies on Debian are incomplete and poorly maintained — the `selinux-policy-default` package lags far behind RHEL equivalents
- Docker + SELinux on Debian causes bind-mount labeling issues (`:z`/`:Z` volume flags) with no community support for troubleshooting
- Enabling SELinux on Debian requires switching from AppArmor, losing Docker's automatic profile enforcement
Since AppArmor is already active and Docker integrates with it automatically, the platform's MAC requirements are met without any additional configuration.
Caddyfile Generation
The platform Caddyfile at /opt/platform/Caddyfile contains a security headers snippet and an import directive:
```caddyfile
{
    email {$ACME_EMAIL:admin@localhost}
}

(security_headers) {
    header {
        Strict-Transport-Security "max-age=31536000; includeSubDomains"
        X-Content-Type-Options "nosniff"
        X-Frame-Options "DENY"
        Referrer-Policy "strict-origin-when-cross-origin"
        Permissions-Policy "camera=(), microphone=(), geolocation=()"
        -Server
    }
}

import /etc/caddy/apps/*.caddy
```
The caddy-apps/ directory is bind-mounted into the Caddy container at /etc/caddy/apps/. App workflows write per-app .caddy files into this directory.
Production (`deploy.yml`) writes `/opt/platform/caddy-apps/<name>.caddy` via `deploy-blue-green.sh`. Container names include the active slot:
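The generated fragment is not reproduced verbatim here; a plausible shape, patterned on the preview fragment and the slot-named containers, would be (domain and upstream name are assumptions):

```caddyfile
<name>.example.com {
    import security_headers
    reverse_proxy <name>-app-blue-1:8000   # "blue" here is the currently active slot
}
```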
Preview (`preview.yml`) writes `/opt/platform/caddy-apps/<name>-pr-<N>.caddy`:

```caddyfile
pr-<N>.preview.<name>.example.com {
    import security_headers
    reverse_proxy <name>-pr-<N>-app-1:8000
}
```
After writing the file, both workflows reload Caddy:
```bash
docker compose -f /opt/platform/docker-compose.yml exec -T caddy \
  caddy reload --config /etc/caddy/Caddyfile
```
Preview cleanup removes the .caddy file and reloads Caddy again.
Per-App Credentials
By default, apps connect to PostgreSQL as the `postgres` superuser (credentials from `deploy/.env`). For credential isolation, run `create-app-credentials.sh <app-name>` from `/opt/platform/infrastructure/`.
This creates:
- PostgreSQL: A dedicated user (`<app_name>_user`) with access restricted to `<app_name>_db`
- MinIO: A dedicated user (`<app-name>-user`) with a scoped policy limiting access to the `<app-name>-uploads` bucket
- Credentials file: `/opt/platform/credentials/<app-name>.env` containing `DB_USER`, `DB_PASSWORD`, `S3_ACCESS_KEY`, `S3_SECRET_KEY` (mode 600, owned by `deploy`)
On subsequent deploys, deploy.yml checks for this credentials file and, if found, updates deploy/.env with the per-app values via sed:
```bash
CREDENTIALS_FILE="/opt/platform/credentials/${APP_NAME}.env"
if [ -f "$CREDENTIALS_FILE" ]; then
  source "$CREDENTIALS_FILE"
  sed -i "s|^DATABASE_URL=.*|DATABASE_URL=postgresql://${DB_USER}:${DB_PASSWORD}@postgres:5432/${APP_DB}|" deploy/.env
  sed -i "s|^S3_ACCESS_KEY=.*|S3_ACCESS_KEY=${S3_ACCESS_KEY}|" deploy/.env
  sed -i "s|^S3_SECRET_KEY=.*|S3_SECRET_KEY=${S3_SECRET_KEY}|" deploy/.env
  sed -i "s|^S3_BUCKET=.*|S3_BUCKET=${APP_NAME}-uploads|" deploy/.env
fi
```
If no credentials file exists, the workflow falls back to whatever is already in deploy/.env and logs a warning.
Resource Metrics (Optional)
Three additional services provide real-time resource visibility in Grafana. They are off by default and use Docker Compose profiles to control startup.
| Service | Image | Memory Limit | CPU Limit | Purpose |
|---|---|---|---|---|
| prometheus | prom/prometheus:v2.53.0 | 256M | 0.50 | Metrics storage and queries |
| cadvisor | gcr.io/cadvisor/cadvisor:v0.49.1 | 128M | 0.25 | Container metrics |
| node-exporter | prom/node-exporter:v1.8.1 | 64M | 0.25 | Host metrics |
To enable — add `COMPOSE_PROFILES=metrics` to `/opt/platform/.env`, then run `docker compose up -d` in `/opt/platform/`. Or bootstrap with `sudo ENABLE_METRICS=true bash bootstrap-server.sh`.
To disable — remove the `COMPOSE_PROFILES=metrics` line from `.env`, then stop the metrics services with `docker compose --profile metrics down`.
The Prometheus config (/opt/platform/prometheus.yml), Grafana datasource, and dashboard JSON are always created during bootstrap — they are harmless without running services and avoid needing to re-bootstrap to enable metrics later. Only the service startup is conditional.
The "Resource Metrics" dashboard in Grafana includes:
- Host overview: CPU %, memory %, disk %, uptime (stat panels)
- Host time series: CPU, memory, disk I/O, network I/O over time
- Container overview: table of all containers with CPU, memory, network
- Per-container detail: filterable CPU and memory time series per container