FMS mesh Ansible playbook: roles, dashboards, and certificate-based SSH

Repository: ~/writing/ansible/
Inventory:  inventory.ini
Vault:      ~/vault-pass


1. Playbook Structure

ansible/
├── ansible.cfg
├── inventory.ini
├── site.yml                          # main playbook
├── playbooks/
│   ├── deploy-pg-router.yml          # binary deploy
│   ├── remediate-audit.yml           # audit remediation
│   └── ssh-mesh-ca.yml              # SSH certificate rollout
├── roles/
│   ├── common/                       # base packages, timezone, NTP
│   ├── firewall/                     # UFW rules from host_vars
│   ├── ssh_hardening/                # sshd config hardening
│   ├── fail2ban/                     # fail2ban + sshd jail
│   ├── wg_mesh_keygen/               # WireGuard key generation
│   ├── wg_mesh_apply/                # WireGuard interface config
│   ├── postgres_primary/             # PostgreSQL primary setup
│   ├── postgres_replica/             # streaming replication
│   ├── pg_router/                    # pg-router binary, service, dashboard timer
│   ├── ssh_mesh_ca/                  # SSH CA certificate management
│   ├── chatmail_relay/               # chatmail config + service verification
│   └── chatmail_hardening/           # systemd drop-in overrides
├── group_vars/
├── host_vars/
└── tasks/

Execution Order (site.yml)

Phase 1: all hosts       → common, firewall, ssh_hardening, fail2ban
Phase 2: wg_mesh         → wg_mesh_keygen, wg_mesh_apply
Phase 3: primary         → postgres_primary
Phase 4: replica         → postgres_replica (serial: 1)
Phase 5: pg_router       → pg_router (binary, service, dashboard timer)
Phase 6: verification    → wg0 up, pg-router active, postgres listening, replication
Phase 7: chatmail        → firewall, ssh_hardening, fail2ban, chatmail_relay, chatmail_hardening

Tag Reference

Tag          Scope
firewall     UFW rules on all hosts
ssh          sshd hardening on all hosts
fail2ban     fail2ban installation on all hosts
mesh, wg     WireGuard keygen + interface apply
postgres     PostgreSQL primary + replica setup
pg_router    pg-router binary, service, dashboard
chatmail     chatmail relay + monitoring hardening
hardening    systemd service hardening (chatmail)
verify       verification tasks only
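Before running a tag against live hosts, it can help to preview what it selects. The wrapper below is a minimal sketch: preview_run and the ANSIBLE_PLAYBOOK override are hypothetical names, while --list-tasks, --check, and --diff are standard ansible-playbook options. Roles that shell out may not report accurately in check mode.

```shell
# Sketch only: preview what a tag/limit combination would touch.
# preview_run is a hypothetical wrapper; set ANSIBLE_PLAYBOOK to use a
# different binary than the default ansible-playbook.
preview_run() {
    tags="$1"
    limit="${2:-all}"
    bin="${ANSIBLE_PLAYBOOK:-ansible-playbook}"

    # --list-tasks prints the matching tasks without executing them.
    "$bin" site.yml --tags "$tags" --limit "$limit" --list-tasks
    # --check --diff performs a dry run and shows would-be file changes.
    "$bin" site.yml --tags "$tags" --limit "$limit" --check --diff
}

# Example:
# preview_run firewall hubs
```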

2. Inventory

Host Groups

Group         Hosts                        Purpose
hubs          buf-01, mia-01, lax-01       Mesh backbone, pg-router, PG
chatmail      lca-01                       Delta Chat relay (eijo.im)
ap            syd-01                       AP leaf node
nameservers   ns-01, ns-02                 Authoritative DNS
primary       buf-01                       PostgreSQL primary
replica       mia-01, lax-01               PostgreSQL streaming replicas
pg_router     buf-01, mia-01, lax-01       HTTP/DNS/SMTP dispatch
wg_mesh       all nodes except localhost   WireGuard mesh participants

Connection Details

Group         ansible_user   SSH Key
hubs          root           bacon
chatmail      ans            bacon
ap            ans            bacon
nameservers   ans            bacon

3. Role Details

3.1 fail2ban

Installed on all hosts. Protects SSH with a jail that bans an IP for 1 hour after 3 failed attempts within 10 minutes. The WireGuard mesh range (fd53::/16) is on the ignore list so that mesh peers are not banned during automated deployments.

# defaults
fail2ban_bantime: 1h
fail2ban_findtime: 10m
fail2ban_maxretry: 3
fail2ban_ignoreip: "127.0.0.1/8 ::1 fd53::/16"

Bans are applied through nftables, and the jail reads sshd log entries from the systemd journal (backend = systemd).
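To illustrate how these defaults could map onto a jail file, the sketch below writes a jail.local fragment to a path of your choosing. The function name render_sshd_jail, the output path, and the banaction name nftables-multiport are assumptions; the role's actual template is not shown in this write-up.

```shell
# Sketch only: render an sshd jail from the role defaults listed above.
# render_sshd_jail is a hypothetical helper and nftables-multiport is an
# assumed banaction; the role's real template may differ.
render_sshd_jail() {
    out="$1"
    cat > "$out" <<'EOF'
[DEFAULT]
ignoreip = 127.0.0.1/8 ::1 fd53::/16

[sshd]
enabled   = true
backend   = systemd
banaction = nftables-multiport
bantime   = 1h
findtime  = 10m
maxretry  = 3
EOF
}

# Example: write to a scratch file and review it before touching /etc/fail2ban.
# render_sshd_jail /tmp/jail.local && cat /tmp/jail.local
```

On a host where the jail is live, fail2ban-client status sshd lists the currently banned addresses.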

3.2 pg_router (includes dashboard)

Deploys the pg-router binary, environment file, systemd service, and the dashboard rebuild timer.

The dashboard is a self-rebuilding HTML page stored in PostgreSQL:

  1. Two PL/pgSQL functions — build_infra_dashboard() and build_command_center() — generate HTML from live database state (VPS inventory, risk register, compliance controls, node heartbeats, geo-redirect rules)
  2. The HTML is upserted into the http_routes table at / and /command-center
  3. pg-router serves http_routes rows as HTTP responses on all hubs
  4. A systemd timer calls these functions periodically via psql

Dashboard rebuild chain:

pg-router-dashboard.timer (systemd, on primary only)
  → pg-router-dashboard.service (oneshot)
    → psql "$DATABASE_URL" -c "SELECT build_infra_dashboard(); SELECT build_command_center()"
      → Functions generate HTML from live tables
        → Upsert into http_routes (path_pattern = '/' and '/command-center')
          → PostgreSQL streaming replication → all replicas
            → pg-router on every hub serves the same HTML
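The first two links of this chain can be sketched as a systemd timer/service pair. The helper below writes both units into a directory for review. Only the unit names and the SQL statement come from the chain above; render_dashboard_units, the EnvironmentFile path, the psql path, and Persistent=true are assumptions.

```shell
# Sketch only: write the timer/service pair into a directory for review.
# render_dashboard_units, /etc/pg-router/env, and /usr/bin/psql are assumed;
# the unit names and the SQL are taken from the rebuild chain.
render_dashboard_units() {
    dir="$1"
    calendar="$2"

    # Quoted heredoc: ${DATABASE_URL} must reach systemd unexpanded.
    cat > "$dir/pg-router-dashboard.service" <<'EOF'
[Unit]
Description=Rebuild pg-router dashboard pages

[Service]
Type=oneshot
EnvironmentFile=/etc/pg-router/env
ExecStart=/usr/bin/psql ${DATABASE_URL} -c "SELECT build_infra_dashboard(); SELECT build_command_center()"
EOF

    # Unquoted heredoc: the calendar expression is substituted here.
    cat > "$dir/pg-router-dashboard.timer" <<EOF
[Unit]
Description=Periodic pg-router dashboard rebuild

[Timer]
OnCalendar=$calendar
Persistent=true

[Install]
WantedBy=timers.target
EOF
}

# Example:
# render_dashboard_units /tmp/units "*:0/5:00"
```

On a systemd host, systemd-analyze calendar "*:0/5:00" prints the next elapse time for an expression, which is a quick way to check a schedule before deploying it.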

Key defaults:

pg_router_dashboard_enabled: false    # override in host_vars
pg_router_dashboard_calendar: "*:00:00"  # hourly (or *:0/5:00 for every 5 min)

Only buf-01 sets pg_router_dashboard_enabled: true, because it is the primary and only the primary accepts writes. The replicas (mia-01, lax-01) serve the dashboard from their replicated copy of http_routes and do not need the timer.

To enable on a new hub:

# host_vars/new-hub.kmsp42.com.yml
pg_router_dashboard_enabled: true    # only if this is the primary
pg_router_dashboard_calendar: "*:0/5:00"

Then run:

ansible-playbook site.yml --tags pg_router --limit new-hub.kmsp42.com

Replicating to a new environment:

The dashboard requires no separate deployment because it is embedded in the pg-router SQL migrations. When pg-router runs migrate against a fresh database, the build_command_center() and build_infra_dashboard() functions are created automatically. The timer then calls them to populate http_routes. Any pg-router instance connected to the same database (or a replica) will serve the dashboard.

3.3 ssh_mesh_ca

Converts SSH authentication from static authorized_keys to CA-signed certificates. Deployed via a standalone playbook (playbooks/ssh-mesh-ca.yml), not site.yml.

How it works:

  1. A CA ed25519 keypair is generated on the Ansible controller (once, stored in files/ssh-ca/)
  2. Each node’s host key is signed → clients verify host identity without TOFU
  3. A per-node user keypair is generated and signed → nodes authenticate to each other
  4. sshd is configured to present the host cert and trust user certs from the CA
  5. A mesh service user is created on each node with the signed cert
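Steps 1 to 3 can be reproduced in a scratch directory with stock ssh-keygen, which is a convenient way to inspect what the role produces. This is a minimal sketch, not the role's implementation: sign_mesh_node, the file names, and the key IDs are illustrative assumptions, and the sketch generates a throwaway host key rather than signing a node's real one.

```shell
# Sketch only: steps 1-3 in a scratch directory with stock ssh-keygen.
# sign_mesh_node, the file names, and the key IDs are assumptions; the role
# keeps its CA under files/ssh-ca/ and signs each node's existing host key.
sign_mesh_node() {
    dir="$1"     # scratch directory
    node="$2"    # node FQDN, used as the host certificate principal

    # Step 1: CA keypair (generated once, reused afterwards).
    [ -f "$dir/mesh_ca" ] ||
        ssh-keygen -q -t ed25519 -N "" -C "mesh-ca" -f "$dir/mesh_ca"

    # Step 2: host key signed with -h, which yields a host certificate.
    ssh-keygen -q -t ed25519 -N "" -f "$dir/$node.host"
    ssh-keygen -q -s "$dir/mesh_ca" -h -I "host:$node" -n "$node" \
        -V +52w "$dir/$node.host.pub"

    # Step 3: per-node user key signed for the "mesh" principal.
    ssh-keygen -q -t ed25519 -N "" -f "$dir/$node.user"
    ssh-keygen -q -s "$dir/mesh_ca" -I "user:$node" -n mesh \
        -V +52w "$dir/$node.user.pub"
}

# Example:
# d="$(mktemp -d)" && sign_mesh_node "$d" new-01.kmsp42.com
# ssh-keygen -L -f "$d/new-01.kmsp42.com.user-cert.pub"
```

ssh-keygen -L prints the certificate type, key ID, principals, and validity window, which can be compared against the certificate properties listed for this role.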

Certificate properties:

Property        Value
Key type        ed25519
Host validity   52 weeks
User validity   52 weeks
Principals      mesh
Revocation      KRL at /etc/ssh/mesh_revoked_keys

Usage:

# Full deployment
ansible-playbook playbooks/ssh-mesh-ca.yml

# Single group
ansible-playbook playbooks/ssh-mesh-ca.yml --limit chatmail

# Verify only
ansible-playbook playbooks/ssh-mesh-ca.yml --tags verify

# Add a new node: add to inventory, then re-run
ansible-playbook playbooks/ssh-mesh-ca.yml --limit new-node.kmsp42.com

# Revoke a node (-u appends to an existing KRL; omit -u to create a new one)
ssh-keygen -k -f revoked_keys -u <node_cert.pub>
# Copy revoked_keys to files/ssh-ca/mesh_revoked_keys, then re-run the playbook
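The revocation step can be wrapped so that the first call creates the KRL and later calls append to it, with a companion check that confirms a certificate is listed. This is a sketch under assumptions: revoke_cert and is_revoked are hypothetical helper names, and the check relies on ssh-keygen -Q exiting non-zero when a listed key is revoked.

```shell
# Sketch only: build or extend a KRL, then test certificates against it.
# revoke_cert and is_revoked are hypothetical helper names.
revoke_cert() {
    krl="$1"
    cert="$2"
    if [ -f "$krl" ]; then
        ssh-keygen -q -k -u -f "$krl" "$cert"   # -u appends to an existing KRL
    else
        ssh-keygen -q -k -f "$krl" "$cert"      # first revocation creates it
    fi
}

# Returns 0 when the certificate is reported as revoked.
# Caveat: ssh-keygen -Q also exits non-zero on errors such as a missing
# file, so a failure here is not proof of revocation on its own.
is_revoked() {
    krl="$1"
    cert="$2"
    if ssh-keygen -Q -f "$krl" "$cert" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}

# Example:
# revoke_cert revoked_keys node-cert.pub
# is_revoked revoked_keys node-cert.pub && echo "revoked"
```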

sshd config deployed (/etc/ssh/sshd_config.d/10-mesh-ca.conf):

HostCertificate /etc/ssh/ssh_host_ed25519_key-cert.pub
TrustedUserCAKeys /etc/ssh/mesh_ca.pub
AuthorizedPrincipalsFile %h/.ssh/authorized_principals
RevokedKeys /etc/ssh/mesh_revoked_keys
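Two small files complete this configuration: a known_hosts entry that tells clients to trust host certificates signed by the CA, and the authorized_principals file that the AuthorizedPrincipalsFile directive points at. The helpers below are a sketch; their names and the *.kmsp42.com host pattern are assumptions based on the host names used in this write-up.

```shell
# Sketch only: client-side CA trust plus the principals file for the mesh user.
# Helper names and the *.kmsp42.com pattern are assumptions.
known_hosts_ca_line() {
    ca_pub="$1"                     # path to the CA public key (mesh_ca.pub)
    pattern="${2:-*.kmsp42.com}"
    # known_hosts marker syntax: @cert-authority <host-pattern> <public key>
    printf '@cert-authority %s %s\n' "$pattern" "$(cat "$ca_pub")"
}

write_authorized_principals() {
    home_dir="$1"                   # home directory of the mesh service user
    mkdir -p "$home_dir/.ssh"
    chmod 700 "$home_dir/.ssh"
    # One principal per line; certificates must carry one of these names.
    printf 'mesh\n' > "$home_dir/.ssh/authorized_principals"
    chmod 600 "$home_dir/.ssh/authorized_principals"
}

# Example:
# known_hosts_ca_line files/ssh-ca/mesh_ca.pub >> ~/.ssh/known_hosts
```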

3.4 chatmail_relay

See the ChatMail eijo.im write-up for full details on this role.

3.5 chatmail_hardening

See the ChatMail eijo.im write-up for systemd drop-in configurations and security scores.


4. Deployment Recipes

Full mesh deploy

ansible-playbook site.yml -i inventory.ini --vault-password-file ~/vault-pass

Single-role deploy

# Firewall only, all hosts
ansible-playbook site.yml --tags firewall

# fail2ban only, hubs
ansible-playbook site.yml --tags fail2ban --limit hubs

# pg-router only, single hub
ansible-playbook site.yml --tags pg_router --limit buf-01.kmsp42.com

# Chatmail relay + hardening
ansible-playbook site.yml --tags chatmail --limit chatmail

Verification only

# Verify everything
ansible-playbook site.yml --tags verify

# Verify chatmail
ansible-playbook site.yml --tags verify --limit chatmail

# Verify SSH certs
ansible-playbook playbooks/ssh-mesh-ca.yml --tags verify
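
When a verify run fails, the same conditions can be checked by hand on a hub. The sketch below mirrors the Phase 6 checks (wg0 up, pg-router active, postgres listening, replication). verify_hub, run, and DRY_RUN are hypothetical names, and the psql call assumes local access as the postgres user.

```shell
# Sketch only: manual equivalents of the Phase 6 checks, run on a hub.
# verify_hub, run, and DRY_RUN are hypothetical; the psql invocation assumes
# local access as the postgres user.
run() { ${DRY_RUN:+echo} "$@"; }   # with DRY_RUN set, print instead of execute

verify_hub() {
    run wg show wg0                      # WireGuard interface is configured
    run systemctl is-active pg-router    # service is running
    run pg_isready                       # PostgreSQL accepts connections
    # On the primary, expect one row per connected replica.
    run psql -U postgres -Atc "SELECT client_addr, state FROM pg_stat_replication"
}

# Example: print the commands without running them.
# DRY_RUN=1 verify_hub
```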

Deploy pg-router binary

cargo build --release -p pg-router
cp target/release/pg-router files/pg-router
ansible-playbook playbooks/deploy-pg-router.yml

Force dashboard rebuild

ssh -o IdentitiesOnly=yes -i ~/.ssh/bacon root@107.175.116.190 \
  "systemctl start pg-router-dashboard.service"

5. Adding a New Node

  1. Provision the VPS and note its public IP

  2. Add to inventory.ini:

    [hubs]  # or appropriate group
    new-01.kmsp42.com  ansible_user=root  ansible_host=<IP>  mesh_rr=<RR>  mesh_ss=<SS>  wg_addr=fd53:<RRSS>:1000::1  wg_endpoint="<IP>:51820"

  3. Create host_vars (host_vars/new-01.kmsp42.com.yml) with firewall rules and any role-specific overrides

  4. Run the playbook:

    ansible-playbook site.yml --limit new-01.kmsp42.com

  5. Deploy SSH certificates:

    ansible-playbook playbooks/ssh-mesh-ca.yml --limit new-01.kmsp42.com

  6. Verify:

    ansible-playbook site.yml --tags verify --limit new-01.kmsp42.com
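
The inventory line in step 2 repeats the same values in several fields, so generating it can avoid typos. The helpers below are a sketch: mesh_wg_addr and inventory_line are hypothetical names, and treating mesh_rr and mesh_ss as short strings that concatenate into the <RRSS> group is an assumption about the numbering scheme.

```shell
# Sketch only: derive wg_addr from mesh_rr/mesh_ss and print an inventory line.
# mesh_wg_addr and inventory_line are hypothetical helpers; concatenating RR
# and SS into one address group is an assumption about the numbering scheme.
mesh_wg_addr() {
    rr="$1"
    ss="$2"
    printf 'fd53:%s%s:1000::1\n' "$rr" "$ss"
}

inventory_line() {
    host="$1"; ip="$2"; rr="$3"; ss="$4"
    printf '%s  ansible_user=root  ansible_host=%s  mesh_rr=%s  mesh_ss=%s  wg_addr=%s  wg_endpoint="%s:51820"\n' \
        "$host" "$ip" "$rr" "$ss" "$(mesh_wg_addr "$rr" "$ss")" "$ip"
}

# Example (203.0.113.10 is a documentation address):
# inventory_line new-01.kmsp42.com 203.0.113.10 01 02
```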
