Ops Subdomains Runbook
This runbook documents production access for operations services:
- https://queues.aaperture.com
- https://grafana.aaperture.com
- https://prometheus.aaperture.com
Goals
- Keep ops UIs outside the main app routes.
- Protect access with authentication.
- Make TLS/DNS troubleshooting quick and repeatable.
Nginx Files
- Main app vhost: infra/nginx-aaperture.conf
- Ops subdomains vhost: infra/nginx-ops-subdomains.conf
Install both files and ensure symlink names match your include pattern in nginx.conf:
- If includes are sites-enabled/*.com, symlinks must end with .com.
- If includes are sites-enabled/*.conf, symlinks must end with .conf.
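The suffix rule above can be checked mechanically. The following is a minimal sketch, assuming a Debian-style sites-enabled directory; the function name check_symlink_suffix is hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Report entries in a sites-enabled directory whose names do not end with the
# suffix expected by the nginx include pattern.
# Usage: check_symlink_suffix <dir> <suffix>
check_symlink_suffix() {
  local dir="$1" suffix="$2" status=0 entry
  for entry in "$dir"/*; do
    # Skip the literal pattern when the glob matched nothing.
    [ -e "$entry" ] || [ -L "$entry" ] || continue
    case "$entry" in
      *"$suffix") echo "ok: $entry" ;;
      *) echo "MISMATCH: $entry does not end with $suffix"; status=1 ;;
    esac
  done
  return "$status"
}
```

For example, check_symlink_suffix /etc/nginx/sites-enabled .conf (the path is an assumption) returns non-zero if any entry would be ignored by a sites-enabled/*.conf include.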
Required DNS
Create A records pointing to the production server IP:
- queues.aaperture.com
- grafana.aaperture.com
- prometheus.aaperture.com
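Before requesting certificates, it can help to confirm that all three records resolve to the production server. This is a sketch that assumes dig is installed; check_a_records is a hypothetical helper, and the expected IP is passed in by the caller.

```shell
#!/usr/bin/env bash
# Compare the A records of each ops hostname against the expected server IP.
# Usage: check_a_records <expected-ip>
check_a_records() {
  local expected="$1" status=0 host resolved
  for host in queues.aaperture.com grafana.aaperture.com prometheus.aaperture.com; do
    resolved="$(dig +short A "$host")"
    # A host may return several records; accept it if any line matches exactly.
    if printf '%s\n' "$resolved" | grep -qxF "$expected"; then
      echo "ok: $host -> $expected"
    else
      echo "MISMATCH: $host resolves to '${resolved:-nothing}'"
      status=1
    fi
  done
  return "$status"
}
```

Calling check_a_records with the production IP prints one line per hostname and returns non-zero when any record is missing or points elsewhere.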
TLS Certificates
Issue certs after DNS propagation (example with certbot/nginx plugin):
sudo certbot --nginx -d queues.aaperture.com
sudo certbot --nginx -d grafana.aaperture.com
sudo certbot --nginx -d prometheus.aaperture.com
Security
- Bull Board auth in API:
  - QUEUE_BOARD_BASIC_AUTH_ENABLED=true
  - QUEUE_BOARD_BASIC_AUTH_USER
  - QUEUE_BOARD_BASIC_AUTH_PASSWORD
- Prometheus can also be protected at Nginx level via auth_basic.
- Grafana auth is managed by Grafana admin credentials:
  - GRAFANA_ADMIN_USER
  - GRAFANA_ADMIN_PASSWORD
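For the Nginx-level option, auth_basic needs a password file referenced by auth_basic_user_file. One possible way to produce an entry is sketched below, assuming openssl is available; htpasswd_line is a hypothetical helper, and the user name and file location are left to the operator.

```shell
#!/usr/bin/env bash
# Print one htpasswd-style line ("user:hash") using an APR1 (MD5-based) hash,
# a format nginx accepts in auth_basic_user_file.
# Usage: htpasswd_line <user> <password>
htpasswd_line() {
  local user="$1" password="$2" hash
  # -stdin keeps the password out of the openssl command line.
  hash="$(printf '%s\n' "$password" | openssl passwd -apr1 -stdin)" || return 1
  printf '%s:%s\n' "$user" "$hash"
}
```

The output can be written to a file readable by Nginx and referenced from the Prometheus server block alongside auth_basic.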
Admin Ops Checks (Application)
Admin users can validate ops health and alerting directly from the app:
- GET /ops/status: checks internal reachability of queues/grafana/prometheus and returns up/down.
- POST /observability/alerts/test: sends a synthetic alert to configured Slack/email channels.
UI location:
- Admin panel → Server Status tab → "Ops Status" card.
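The two endpoints above can also be exercised from a shell. This sketch assumes the API accepts an admin bearer token, which the runbook does not state; the base URL, the token, and both function names are placeholders to adapt to the app's actual admin authentication.

```shell
#!/usr/bin/env bash
# Query the ops health endpoint.
# ASSUMPTION: admin auth is a bearer token; replace the header if it is not.
# Usage: ops_status <base-url> <token>
ops_status() {
  curl -fsS -H "Authorization: Bearer $2" "$1/ops/status"
}

# Trigger a synthetic alert to the configured Slack/email channels.
# Usage: ops_alert_test <base-url> <token>
ops_alert_test() {
  curl -fsS -X POST -H "Authorization: Bearer $2" "$1/observability/alerts/test"
}
```

With -f, curl exits non-zero on HTTP errors, so either function can gate a deploy script or a cron check.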
Grafana Persistence Permissions
infra/docker-compose.prod.yml mounts ${APP_DIR}/.data/grafana to /var/lib/grafana.
This host folder must be writable by UID/GID 472 (the commands below assume APP_DIR is /var/www/aaperture):
sudo mkdir -p /var/www/aaperture/.data/grafana
sudo chown -R 472:472 /var/www/aaperture/.data/grafana
sudo chmod -R u+rwX,g+rwX /var/www/aaperture/.data/grafana
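To confirm the ownership change took effect, a small check such as the following may be useful. It is a sketch that assumes GNU stat (as found on typical Linux hosts); check_owner is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Confirm a directory is owned by the expected UID:GID (GNU stat syntax).
# Usage: check_owner <dir> [uid:gid]   (defaults to 472:472)
check_owner() {
  local dir="$1" expected="${2:-472:472}" actual
  actual="$(stat -c '%u:%g' "$dir")" || return 2
  if [ "$actual" = "$expected" ]; then
    echo "ok: $dir is owned by $actual"
  else
    echo "MISMATCH: $dir is owned by $actual, expected $expected"
    return 1
  fi
}
```

For example, check_owner /var/www/aaperture/.data/grafana reports a mismatch if the directory was recreated by another user after a deploy.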
Validation Commands
sudo nginx -t
sudo systemctl reload nginx
curl -I https://queues.aaperture.com
curl -I https://grafana.aaperture.com
curl -I https://prometheus.aaperture.com
Certificate/SAN check:
echo | openssl s_client -connect queues.aaperture.com:443 -servername queues.aaperture.com 2>/dev/null | openssl x509 -noout -subject -ext subjectAltName
echo | openssl s_client -connect grafana.aaperture.com:443 -servername grafana.aaperture.com 2>/dev/null | openssl x509 -noout -subject -ext subjectAltName
echo | openssl s_client -connect prometheus.aaperture.com:443 -servername prometheus.aaperture.com 2>/dev/null | openssl x509 -noout -subject -ext subjectAltName
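The SAN checks above require reading the output by eye. A possible way to turn them into a pass/fail result is sketched here; san_covers and check_cert_san are hypothetical helpers, and wildcard SAN entries are not expanded, so a wildcard certificate would be reported as a mismatch.

```shell
#!/usr/bin/env bash
# Succeed when the SAN text on stdin lists the given hostname as an exact
# DNS entry. Wildcard entries (*.example.com) are deliberately not expanded.
san_covers() {
  grep -oE 'DNS:[^,[:space:]]+' | grep -qxF "DNS:$1"
}

# Fetch the live certificate for a host and report whether its SAN covers it.
# Usage: check_cert_san <host>
check_cert_san() {
  local host="$1"
  if echo | openssl s_client -connect "$host:443" -servername "$host" 2>/dev/null \
      | openssl x509 -noout -ext subjectAltName 2>/dev/null \
      | san_covers "$host"; then
    echo "ok: $host is listed in the certificate SAN"
  else
    echo "MISMATCH: $host is not listed in the certificate SAN"
    return 1
  fi
}
```

Running check_cert_san once per ops hostname gives a quick signal of whether Nginx is serving the intended certificate for each server block.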
Troubleshooting
ERR_CERT_COMMON_NAME_INVALID
- Verify correct server_name blocks are loaded: sudo nginx -T | grep -n "server_name".
- Verify include pattern (*.com vs *.conf) matches symlink names.
- Recheck cert paths in each TLS server block.
502 on Grafana
- Check container status/logs:
  docker ps | grep grafana
  docker logs --tail=200 <grafana-container>
- Most common cause is /var/lib/grafana permission denied.
Certbot HTTP-01 challenge returns 404
- Ensure port 80 block serves /.well-known/acme-challenge/ from a reachable webroot.
- Confirm DNS points to the same server handling Nginx.
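Both points above can be tested together by placing a probe file in the webroot and fetching it over plain HTTP, which roughly mimics an HTTP-01 validation request. This is a sketch only: check_acme_path is a hypothetical helper, the webroot path is whatever the port 80 block actually uses, and the caller needs write access to it.

```shell
#!/usr/bin/env bash
# Drop a probe file under the ACME challenge directory of a webroot and fetch
# it over HTTP (following redirects), then compare the returned body.
# Usage: check_acme_path <webroot> <host>
check_acme_path() {
  local webroot="$1" host="$2" dir name token body status=0
  dir="$webroot/.well-known/acme-challenge"
  name="probe-$$"
  token="acme-probe-$(date +%s)"
  mkdir -p "$dir" && printf '%s' "$token" > "$dir/$name" || return 2
  body="$(curl -fsSL "http://$host/.well-known/acme-challenge/$name")" || status=1
  rm -f "$dir/$name"   # always clean up the probe file
  if [ "$status" -eq 0 ] && [ "$body" = "$token" ]; then
    echo "ok: $host serves the challenge path from $webroot"
  else
    echo "FAIL: $host did not return the probe file"
    return 1
  fi
}
```

A failure here with correct DNS usually points at the port 80 server block; a failure on only one hostname suggests that hostname resolves to a different server.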