Repository: ~/writing/ansible/
Inventory: inventory.ini
Vault: ~/vault-pass
1. Playbook Structure
ansible/
├── ansible.cfg
├── inventory.ini
├── site.yml # main playbook
├── playbooks/
│ ├── deploy-pg-router.yml # binary deploy
│ ├── remediate-audit.yml # audit remediation
│ └── ssh-mesh-ca.yml # SSH certificate rollout
├── roles/
│ ├── common/ # base packages, timezone, NTP
│ ├── firewall/ # UFW rules from host_vars
│ ├── ssh_hardening/ # sshd config hardening
│ ├── fail2ban/ # fail2ban + sshd jail
│ ├── wg_mesh_keygen/ # WireGuard key generation
│ ├── wg_mesh_apply/ # WireGuard interface config
│ ├── postgres_primary/ # PostgreSQL primary setup
│ ├── postgres_replica/ # streaming replication
│ ├── pg_router/ # pg-router binary, service, dashboard timer
│ ├── ssh_mesh_ca/ # SSH CA certificate management
│ ├── chatmail_relay/ # chatmail config + service verification
│ └── chatmail_hardening/ # systemd drop-in overrides
├── group_vars/
├── host_vars/
└── tasks/
Execution Order (site.yml)
Phase 1: all hosts → common, firewall, ssh_hardening, fail2ban
Phase 2: wg_mesh → wg_mesh_keygen, wg_mesh_apply
Phase 3: primary → postgres_primary
Phase 4: replica → postgres_replica (serial: 1)
Phase 5: pg_router → pg_router (binary, service, dashboard timer)
Phase 6: verification → wg0 up, pg-router active, postgres listening, replication
Phase 7: chatmail → firewall, ssh_hardening, fail2ban, chatmail_relay, chatmail_hardening
Tag Reference
| Tag | Scope |
|---|---|
| firewall | UFW rules on all hosts |
| ssh | sshd hardening on all hosts |
| fail2ban | fail2ban installation on all hosts |
| mesh, wg | WireGuard keygen + interface apply |
| postgres | PostgreSQL primary + replica setup |
| pg_router | pg-router binary, service, dashboard |
| chatmail | chatmail relay + monitoring hardening |
| hardening | systemd service hardening (chatmail) |
| verify | verification tasks only |
2. Inventory
Host Groups
| Group | Hosts | Purpose |
|---|---|---|
| hubs | buf-01, mia-01, lax-01 | Mesh backbone, pg-router, PG |
| chatmail | lca-01 | Delta Chat relay (eijo.im) |
| ap | syd-01 | AP leaf node |
| nameservers | ns-01, ns-02 | Authoritative DNS |
| primary | buf-01 | PostgreSQL primary |
| replica | mia-01, lax-01 | PostgreSQL streaming replicas |
| pg_router | buf-01, mia-01, lax-01 | HTTP/DNS/SMTP dispatch |
| wg_mesh | all nodes except localhost | WireGuard mesh participants |
Connection Details
| Group | ansible_user | SSH Key |
|---|---|---|
| hubs | root | bacon |
| chatmail | ans | bacon |
| ap | ans | bacon |
| nameservers | ans | bacon |
3. Role Details
3.1 fail2ban
Installed on all hosts. The sshd jail bans an IP for 1 hour after 3 failed login attempts within 10 minutes. The WireGuard mesh range (fd53::/16) is whitelisted so that mesh peers are not banned during automated deployments.
# defaults
fail2ban_bantime: 1h
fail2ban_findtime: 10m
fail2ban_maxretry: 3
fail2ban_ignoreip: "127.0.0.1/8 ::1 fd53::/16"
Bans are applied through nftables; the jail reads log entries through the systemd backend.
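One way to confirm that a deployed jail matches these defaults is to query the running daemon. The sketch below assumes the jail is named sshd (the role summary mentions an sshd jail) and that fail2ban-client reports durations in seconds. The check_jail helper and the F2B_CLIENT override are illustrative names, not part of the role.

```shell
#!/bin/sh
# Hypothetical helper: compare live sshd-jail settings with the role defaults.
# F2B_CLIENT can be overridden for testing; the default is the real CLI.
F2B_CLIENT="${F2B_CLIENT:-fail2ban-client}"

check_jail() {
    # $1 = setting name, $2 = expected value (durations in seconds)
    actual="$($F2B_CLIENT get sshd "$1")"
    if [ "$actual" = "$2" ]; then
        echo "ok: $1=$actual"
    else
        echo "MISMATCH: $1=$actual (expected $2)"
        return 1
    fi
}

# Run on a managed host (as root), for example:
#   check_jail bantime 3600    # 1h
#   check_jail findtime 600    # 10m
#   check_jail maxretry 3
```

The function returns non-zero on a mismatch, so it can be chained in a verification script.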
3.2 pg_router (includes dashboard)
Deploys the pg-router binary, environment file, systemd service, and the dashboard rebuild timer.
The dashboard is a self-rebuilding HTML page stored in PostgreSQL:
- Two PL/pgSQL functions, build_infra_dashboard() and build_command_center(), generate HTML from live database state (VPS inventory, risk register, compliance controls, node heartbeats, geo-redirect rules)
- The HTML is upserted into the http_routes table at / and /command-center
- pg-router serves http_routes rows as HTTP responses on all hubs
- A systemd timer calls these functions periodically via psql
Dashboard rebuild chain:
pg-router-dashboard.timer (systemd, on primary only)
→ pg-router-dashboard.service (oneshot)
→ psql "$DATABASE_URL" -c "SELECT build_infra_dashboard(); SELECT build_command_center()"
→ Functions generate HTML from live tables
→ INSERT INTO http_routes (path_pattern='/command-center', ...)
→ PostgreSQL streaming replication → all replicas
→ pg-router on every hub serves the same HTML
Key defaults:
pg_router_dashboard_enabled: false # override in host_vars
pg_router_dashboard_calendar: "*:00:00" # hourly (or *:0/5:00 for every 5 min)
Only buf-01 sets pg_router_dashboard_enabled: true, because it is the PostgreSQL primary and only the primary accepts writes. The replicas (mia-01, lax-01) receive the http_routes table through streaming replication and serve the dashboard without a timer of their own.
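To check that the dashboard rows have reached a given hub, the http_routes table can be queried directly. This is a rough sketch: it assumes path_pattern holds the literal values / and /command-center and that DATABASE_URL points at the local PostgreSQL instance. The function name and the PSQL override are hypothetical.

```shell
#!/bin/sh
# Hypothetical check: are both dashboard routes present in http_routes?
# PSQL can be overridden for testing; the default is the real client.
PSQL="${PSQL:-psql}"

dashboard_routes_present() {
    # -A unaligned output, -t tuples only, -c run a single command
    count="$($PSQL "$DATABASE_URL" -Atc \
        "SELECT count(*) FROM http_routes WHERE path_pattern IN ('/', '/command-center')")"
    # Assumes at least one row per path once the timer has run.
    [ "${count:-0}" -ge 2 ]
}

# Example, on a replica:
#   dashboard_routes_present && echo "dashboard replicated"
```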
To enable on a new hub:
# host_vars/new-hub.kmsp42.com.yml
pg_router_dashboard_enabled: true # only if this is the primary
pg_router_dashboard_calendar: "*:0/5:00"
Then run:
ansible-playbook site.yml --tags pg_router --limit new-hub.kmsp42.com
Replicating to a new environment:
The dashboard requires no separate deployment — it’s embedded in the pg-router SQL migrations. When pg-router runs migrate against a fresh database, the build_command_center() and build_infra_dashboard() functions are created automatically. The timer then calls them to populate http_routes. Any pg-router instance connected to the same database (or a replica) will serve the dashboard.
3.3 ssh_mesh_ca
Converts SSH authentication from static authorized_keys to CA-signed certificates. Deployed via a standalone playbook (playbooks/ssh-mesh-ca.yml), not site.yml.
How it works:
- A CA ed25519 keypair is generated on the Ansible controller (once, stored in files/ssh-ca/)
- Each node’s host key is signed → clients verify host identity without TOFU
- A per-node user keypair is generated and signed → nodes authenticate to each other
- sshd is configured to present the host cert and trust user certs from the CA
- A mesh service user is created on each node with the signed cert
Certificate properties:
| Property | Value |
|---|---|
| Key type | ed25519 |
| Host validity | 52 weeks |
| User validity | 52 weeks |
| Principals | mesh |
| Revocation | KRL at /etc/ssh/mesh_revoked_keys |
Usage:
# Full deployment
ansible-playbook playbooks/ssh-mesh-ca.yml
# Single group
ansible-playbook playbooks/ssh-mesh-ca.yml --limit chatmail
# Verify only
ansible-playbook playbooks/ssh-mesh-ca.yml --tags verify
# Add a new node: add to inventory, then re-run
ansible-playbook playbooks/ssh-mesh-ca.yml --limit new-node.kmsp42.com
# Revoke a node
ssh-keygen -k -f revoked_keys -u <node_cert.pub>
# Copy to files/ssh-ca/mesh_revoked_keys, re-run playbook
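The revocation step can be rehearsed locally before touching the real CA. The sketch below is a throwaway round trip in a temporary directory: it creates a scratch CA, signs a scratch user key with the mesh principal and the 52-week validity from the table above, revokes the certificate, and asks ssh-keygen -Q whether the KRL rejects it. All file names are placeholders, and nothing here reads files/ssh-ca/.

```shell
#!/bin/sh
# Rehearse KRL revocation with throwaway keys (file names are placeholders).
set -eu
work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT

# Scratch CA and scratch node key, both ed25519 with no passphrase.
ssh-keygen -q -t ed25519 -N '' -C scratch-ca -f "$work/ca"
ssh-keygen -q -t ed25519 -N '' -C scratch-node -f "$work/node"

# Sign the node key as a user certificate: principal "mesh", valid 52 weeks.
# This writes $work/node-cert.pub next to the public key.
ssh-keygen -q -s "$work/ca" -I scratch-node -n mesh -V +52w "$work/node.pub"

# Create a KRL that revokes the certificate (add -u to extend an existing KRL).
ssh-keygen -q -k -f "$work/revoked_keys" "$work/node-cert.pub"

# -Q exits non-zero when a queried key or certificate is revoked.
if ssh-keygen -Q -f "$work/revoked_keys" "$work/node-cert.pub" >/dev/null; then
    krl_result="still trusted"
else
    krl_result="revoked"
fi
echo "$krl_result"
```

The same -Q query can be run against files/ssh-ca/mesh_revoked_keys and a real node certificate before re-running the playbook.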
sshd config deployed (/etc/ssh/sshd_config.d/10-mesh-ca.conf):
HostCertificate /etc/ssh/ssh_host_ed25519_key-cert.pub
TrustedUserCAKeys /etc/ssh/mesh_ca.pub
AuthorizedPrincipalsFile %h/.ssh/authorized_principals
RevokedKeys /etc/ssh/mesh_revoked_keys
3.4 chatmail_relay
See the ChatMail eijo.im write-up for full details on this role.
3.5 chatmail_hardening
See the ChatMail eijo.im write-up for systemd drop-in configurations and security scores.
4. Deployment Recipes
Full mesh deploy
ansible-playbook site.yml -i inventory.ini --vault-password-file ~/vault-pass
Single-role deploy
# Firewall only, all hosts
ansible-playbook site.yml --tags firewall
# fail2ban only, hubs
ansible-playbook site.yml --tags fail2ban --limit hubs
# pg-router only, single hub
ansible-playbook site.yml --tags pg_router --limit buf-01.kmsp42.com
# Chatmail relay + hardening
ansible-playbook site.yml --tags chatmail --limit chatmail
Verification only
# Verify everything
ansible-playbook site.yml --tags verify
# Verify chatmail
ansible-playbook site.yml --tags verify --limit chatmail
# Verify SSH certs
ansible-playbook playbooks/ssh-mesh-ca.yml --tags verify
Deploy pg-router binary
cargo build --release -p pg-router
cp target/release/pg-router files/pg-router
ansible-playbook playbooks/deploy-pg-router.yml
Force dashboard rebuild
ssh -o IdentitiesOnly=yes -i ~/.ssh/bacon root@107.175.116.190 \
"systemctl start pg-router-dashboard.service"
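The site.yml recipes above share the same inventory and vault arguments, so a small wrapper can cut down on retyping. The following is only a sketch: the site_run function name and the DRY_RUN switch are made up for illustration, while the inventory and vault paths are the ones listed at the top of this document.

```shell
#!/bin/sh
# Hypothetical wrapper around the site.yml recipes.
# usage: site_run [TAGS] [LIMIT]   (an empty TAGS or LIMIT means "no filter")
site_run() {
    tags="${1:-}"
    limit="${2:-}"
    set -- ansible-playbook site.yml -i inventory.ini \
        --vault-password-file "$HOME/vault-pass"
    if [ -n "$tags" ]; then set -- "$@" --tags "$tags"; fi
    if [ -n "$limit" ]; then set -- "$@" --limit "$limit"; fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"    # print the command only, run nothing
    else
        "$@"
    fi
}
```

Run from the repository root, `DRY_RUN=1 site_run fail2ban hubs` prints the command it would execute, and `site_run verify chatmail` runs the chatmail verification.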
5. Adding a New Node
- Provision the VPS and note its public IP
- Add to inventory.ini:
  [hubs] # or appropriate group
  new-01.kmsp42.com ansible_user=root ansible_host=<IP> mesh_rr=<RR> mesh_ss=<SS> wg_addr=fd53:<RRSS>:1000::1 wg_endpoint="<IP>:51820"
- Create host_vars (host_vars/new-01.kmsp42.com.yml) with firewall rules and any role-specific overrides
- Run the playbook:
  ansible-playbook site.yml --limit new-01.kmsp42.com
- Deploy SSH certificates:
  ansible-playbook playbooks/ssh-mesh-ca.yml --limit new-01.kmsp42.com
- Verify:
  ansible-playbook site.yml --tags verify --limit new-01.kmsp42.com
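The wg_addr value in the inventory line appears to embed the two mesh identifiers. Assuming <RRSS> is simply mesh_rr followed by mesh_ss (an inference from the placeholder names, not something the inventory documents), the host line can be generated rather than typed by hand. mesh_host_line is a hypothetical helper, and the address in the usage note is a documentation-range placeholder.

```shell
#!/bin/sh
# Hypothetical helper: print an inventory line for a new hub.
# Assumes <RRSS> is mesh_rr immediately followed by mesh_ss.
# usage: mesh_host_line HOSTNAME PUBLIC_IPV4 RR SS
mesh_host_line() {
    printf '%s ansible_user=root ansible_host=%s mesh_rr=%s mesh_ss=%s wg_addr=fd53:%s%s:1000::1 wg_endpoint="%s:51820"\n' \
        "$1" "$2" "$3" "$4" "$3" "$4" "$2"
}
```

For example, `mesh_host_line new-01.kmsp42.com 203.0.113.7 02 07` prints a line with wg_addr=fd53:0207:1000::1. The ansible_user=root default matches the hubs group; other groups use ans.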
6. Downloads
Roles
- fail2ban — tasks/main.yml
- fail2ban — defaults/main.yml
- fail2ban — jail.local
- chatmail_relay — tasks/main.yml
- chatmail_relay — defaults/main.yml
- chatmail_relay — handlers/main.yml
- chatmail_hardening — tasks/main.yml
- chatmail_hardening — defaults/main.yml