
Ops Subdomains Runbook

This runbook documents production access for operations services:

  • https://queues.aaperture.com
  • https://grafana.aaperture.com
  • https://prometheus.aaperture.com

It also clarifies host separation so ops vhosts do not accidentally capture app/public tenant traffic:

  • marketing: https://aaperture.com
  • app: https://app.aaperture.com
  • tenant public hosts: https://{slug}.aaperture.com

Tenant hosts are only meant for users whose plan includes the paid Studio page capability. DNS and TLS stay in place globally, but application URL generation must fall back to app.aaperture.com when that permission is not granted.

Goals

  • Keep ops UIs outside the main app routes.
  • Protect access with authentication.
  • Make TLS/DNS troubleshooting quick and repeatable.
  • Avoid wildcard collisions between ops hosts and tenant hosts.

Nginx Files

  • Main app vhost: infra/nginx-aaperture.conf
  • Ops subdomains vhost: infra/nginx-ops-subdomains.conf

Main app vhost must explicitly cover:

  • aaperture.com
  • www.aaperture.com
  • app.aaperture.com
  • *.aaperture.com

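The ops vhost must list each hostname explicitly and must not use a wildcard such as *.aaperture.com. Nginx prefers an exact server_name over a wildcard match, so the explicit ops names take precedence over the main vhost's *.aaperture.com entry.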

Install both files and ensure symlink names match your include pattern in nginx.conf:

  • If includes are sites-enabled/*.com, symlinks must end with .com.
  • If includes are sites-enabled/*.conf, symlinks must end with .conf.
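
The naming rule above can be checked before reloading Nginx. The following is a minimal sketch, assuming the include pattern is applied to each entry's base name the way a shell glob would be. Both helpers (matches_include, report_skipped) are hypothetical and not part of the repo, and the default directory assumes a Debian-style layout.

```shell
# Hypothetical helper: succeeds when NAME ($2) matches the glob PATTERN ($1),
# e.g. the last path component of the `include` directive in nginx.conf.
matches_include() {
  case "$2" in
    $1) return 0 ;;   # left unquoted on purpose so $1 is treated as a glob
    *)  return 1 ;;
  esac
}

# Lists entries in DIR that the include pattern would silently skip.
# /etc/nginx/sites-enabled is an assumed default, pass your own DIR if needed.
report_skipped() {
  pattern="$1"
  dir="${2:-/etc/nginx/sites-enabled}"
  for path in "$dir"/*; do
    [ -e "$path" ] || [ -L "$path" ] || continue
    name=$(basename "$path")
    matches_include "$pattern" "$name" || echo "not included: $name"
  done
}

# Usage:
#   report_skipped '*.conf'
```

Any name printed by report_skipped is a vhost that Nginx will not load with the given pattern.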

Required DNS

Create A records pointing to the production server IP:

  • queues.aaperture.com
  • grafana.aaperture.com
  • prometheus.aaperture.com
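
Before issuing certificates, it can help to confirm that all three records resolve to the expected address. This is a rough sketch, assuming dig is installed and each host has a single A record. check_ops_dns is a hypothetical helper, and the IP in the usage line is a documentation placeholder rather than the real server address.

```shell
# Succeeds only if every ops host resolves to EXPECTED_IP ($1).
check_ops_dns() {
  expected_ip="$1"
  rc=0
  for host in queues.aaperture.com grafana.aaperture.com prometheus.aaperture.com; do
    # `dig +short A` prints the answer only; empty output means no A record.
    resolved=$(dig +short A "$host")
    if [ "$resolved" = "$expected_ip" ]; then
      echo "ok       $host -> $resolved"
    else
      echo "MISMATCH $host -> ${resolved:-<no A record>}"
      rc=1
    fi
  done
  return "$rc"
}

# Usage (placeholder IP):
#   check_ops_dns 203.0.113.10
```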

TLS Certificates

Issue certs after DNS propagation (example with certbot/nginx plugin):

sudo certbot --nginx -d queues.aaperture.com
sudo certbot --nginx -d grafana.aaperture.com
sudo certbot --nginx -d prometheus.aaperture.com

Security

  • Bull Board auth in API:
    • QUEUE_BOARD_BASIC_AUTH_ENABLED=true
    • QUEUE_BOARD_BASIC_AUTH_USER
    • QUEUE_BOARD_BASIC_AUTH_PASSWORD
  • Prometheus can also be protected at Nginx level via auth_basic.
  • Grafana auth is managed by Grafana admin credentials:
    • GRAFANA_ADMIN_USER
    • GRAFANA_ADMIN_PASSWORD
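
A quick way to confirm that the basic-auth layers above are active is to request a host without credentials and expect a 401. The sketch below assumes Bull Board and an Nginx auth_basic block both answer anonymous requests with 401 at the URL you test. Grafana typically redirects to its login page instead, so it is left out. The helper names are hypothetical.

```shell
# Prints the HTTP status code returned for an unauthenticated GET.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Succeeds when the URL rejects anonymous requests with 401.
requires_basic_auth() {
  [ "$(http_status "$1")" = "401" ]
}

# Usage:
#   requires_basic_auth https://queues.aaperture.com     || echo "queues: no 401 without credentials"
#   requires_basic_auth https://prometheus.aaperture.com || echo "prometheus: no 401 without credentials"
```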

Admin Ops Checks (Application)

Admin users can validate ops health and alerting directly from the app:

  • GET /ops/status: checks internal reachability of queues/grafana/prometheus and returns up/down.
  • POST /observability/alerts/test: sends a synthetic alert to configured Slack/email channels.
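
The same two endpoints can be called from a terminal. This is only a sketch: API_BASE_URL and ADMIN_TOKEN are placeholders, and the bearer-token header is an assumption, since this runbook does not describe how admin requests are authenticated. Substitute whatever the API actually expects.

```shell
# Calls GET /ops/status. API_BASE_URL and ADMIN_TOKEN must be set by the caller;
# the Authorization header format is assumed, not taken from the API.
ops_status() {
  curl -fsS --max-time 15 \
    -H "Authorization: Bearer ${ADMIN_TOKEN:?set ADMIN_TOKEN}" \
    "${API_BASE_URL:?set API_BASE_URL}/ops/status"
}

# Calls POST /observability/alerts/test to send a synthetic alert.
send_test_alert() {
  curl -fsS --max-time 15 -X POST \
    -H "Authorization: Bearer ${ADMIN_TOKEN:?set ADMIN_TOKEN}" \
    "${API_BASE_URL:?set API_BASE_URL}/observability/alerts/test"
}

# Usage:
#   API_BASE_URL=... ADMIN_TOKEN=... ops_status
```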

UI location:

  • Admin panel → Server Status tab → "Ops Status" card.

Grafana Persistence Permissions

infra/docker-compose.prod.yml mounts ${APP_DIR}/.data/grafana to /var/lib/grafana. This host folder must be writable by UID/GID 472. The commands below assume APP_DIR is /var/www/aaperture; adjust the path if yours differs:

sudo mkdir -p /var/www/aaperture/.data/grafana
sudo chown -R 472:472 /var/www/aaperture/.data/grafana
sudo chmod -R u+rwX,g+rwX /var/www/aaperture/.data/grafana
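
To confirm the ownership change took effect, the folder's owner can be compared against the expected UID:GID. This is a small sketch using GNU stat (Linux); check_owner is a hypothetical helper.

```shell
# Succeeds when DIR ($1) is owned by EXPECTED ($2, "uid:gid", default 472:472).
check_owner() {
  dir="$1"
  expected="${2:-472:472}"
  # GNU stat: %u is the numeric owner, %g the numeric group.
  actual=$(stat -c '%u:%g' "$dir") || return 2
  if [ "$actual" = "$expected" ]; then
    echo "ok: $dir is owned by $actual"
  else
    echo "wrong owner on $dir: $actual (expected $expected)"
    return 1
  fi
}

# Usage:
#   check_owner /var/www/aaperture/.data/grafana
```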

Validation Commands

sudo nginx -t && sudo systemctl reload nginx

curl -I https://queues.aaperture.com
curl -I https://grafana.aaperture.com
curl -I https://prometheus.aaperture.com

Certificate/SAN check:

echo | openssl s_client -connect queues.aaperture.com:443 -servername queues.aaperture.com 2>/dev/null | openssl x509 -noout -subject -ext subjectAltName
echo | openssl s_client -connect grafana.aaperture.com:443 -servername grafana.aaperture.com 2>/dev/null | openssl x509 -noout -subject -ext subjectAltName
echo | openssl s_client -connect prometheus.aaperture.com:443 -servername prometheus.aaperture.com 2>/dev/null | openssl x509 -noout -subject -ext subjectAltName

Troubleshooting

ERR_CERT_COMMON_NAME_INVALID

  • Verify correct server_name blocks are loaded: sudo nginx -T | grep -n "server_name".
  • Verify include pattern (*.com vs *.conf) matches symlink names.
  • Recheck cert paths in each TLS server block.
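
When chasing this error it can be quicker to ask OpenSSL directly whether the served certificate covers the hostname. The sketch below relies on the -checkhost option of openssl x509 and on the wording of its output line ("does match certificate" / "does NOT match certificate"), which may differ between OpenSSL builds. cert_matches_host is a hypothetical helper.

```shell
# Succeeds when the certificate served for HOST ($1) over SNI covers HOST.
cert_matches_host() {
  host="$1"
  echo | openssl s_client -connect "$host:443" -servername "$host" 2>/dev/null \
    | openssl x509 -noout -checkhost "$host" \
    | grep -q "does match certificate"
}

# Usage:
#   for h in queues grafana prometheus; do
#     cert_matches_host "$h.aaperture.com" || echo "certificate mismatch: $h.aaperture.com"
#   done
```

A mismatch here usually means Nginx answered from a different server block than intended, which points back to the server_name and include checks above.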

502 on Grafana

  • Check container status/logs:
    • docker ps | grep grafana
    • docker logs --tail=200 <grafana-container>
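  • The most common cause is a permission-denied error on /var/lib/grafana; recheck the ownership of the mounted host folder (see Grafana Persistence Permissions).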

Certbot HTTP-01 challenge returns 404

  • Ensure the port 80 server block serves /.well-known/acme-challenge/ from a reachable webroot.
  • Confirm DNS points to the same server handling Nginx.
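
Both points can be tested together by dropping a throwaway file into the challenge directory and fetching it over plain HTTP. This is a sketch under assumptions: the webroot path is whatever your port 80 block actually uses (the one in the usage line is a placeholder), the caller can write to it, and acme_probe is a hypothetical helper.

```shell
# Writes a probe file under WEBROOT ($2) and fetches it from HOST ($1) over HTTP.
# Succeeds only if the response body matches what was written.
acme_probe() {
  host="$1"
  webroot="$2"
  dir="$webroot/.well-known/acme-challenge"
  token="probe-$$"
  mkdir -p "$dir" || return 2
  printf 'ok-%s' "$token" > "$dir/$token" || return 2
  # -f makes curl exit non-zero on HTTP errors such as 404.
  body=$(curl -fsS --max-time 10 "http://$host/.well-known/acme-challenge/$token")
  status=$?
  rm -f "$dir/$token"
  [ "$status" -eq 0 ] && [ "$body" = "ok-$token" ]
}

# Usage (placeholder webroot):
#   acme_probe queues.aaperture.com /var/www/letsencrypt || echo "challenge path not reachable"
```

A failure with a correct webroot suggests the request is reaching a different server or a different server block than the one you edited.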