
HomeNet Docker Swarm Infrastructure

[!abstract] Overview Production-grade, multi-node Docker Swarm infrastructure managing 70+ services across 5 nodes, with comprehensive monitoring, logging, and automation. This documentation combines manual architecture guides with auto-generated live status powered by agentic documentation patterns.

Quick Stats

| Metric | Value |
| --- | --- |
| Nodes | 5 (1 manager, 4 workers) |
| Stacks | 15 deployed |
| Services | 70+ total |
| Storage | ~3TB NFS |
| Docker Version | 29.1.3 |
| Host OS | Ubuntu 24.04.3 LTS |

[!tip] Live Status Available For real-time metrics, see the Live Status section below.


Core Infrastructure

Services & Stacks

Operations

Monitoring & Observability

Storage

Troubleshooting

Templates


Live Status (Auto-Generated)

[!info] Agentic Documentation These pages are automatically generated from live system state using the agentic documentation system. They update every 6 hours via cron.

| Report | Description | Pattern |
| --- | --- | --- |
| Service Health | Current status of all Docker Swarm services | Tool Use |
| Node Status | Cluster node health and resources | Tool Use |
| Stack Status | Deployed stack information | Tool Use |
| Storage Capacity | NFS and local storage usage | Tool Use |
| Recent Changes | Git commits and file modifications | Tool Use |
| Doc Drift Report | Documentation accuracy validation | Reflection |

Manual regeneration:

./scripts/docs/sh-update-docs.sh


Critical Alerts

[!danger] ELK Stack Offline Elasticsearch, Logstash, and Kibana are all scaled to 0/0 replicas. No centralized logging is operational, so the cluster is currently running without log visibility.

See: ELK Stack Offline
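
One way to confirm which services are scaled to zero is to filter the output of `docker service ls`. The following is a minimal sketch, not part of the repository: the helper name `zero_replica_services` is hypothetical, and it reads `name replicas` pairs on stdin so it can be tried without a cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print services whose desired replica count is 0.
# Input: lines of "<service-name> <running>/<desired>" on stdin, as produced by
#   docker service ls --format '{{.Name}} {{.Replicas}}'
zero_replica_services() {
  # Split the replicas field on "/" and keep rows where the desired count is 0.
  awk 'NF >= 2 { split($2, r, "/"); if (r[2] == "0") print $1 }'
}

# Usage on a manager node (not run here):
#   docker service ls --format '{{.Name}} {{.Replicas}}' | zero_replica_services
```

Any service this prints has been scaled down deliberately or by a stack definition, as opposed to a service that is failing to start, which would show a non-zero desired count.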

[!warning] Storage Near Capacity Multiple NFS mounts are approaching capacity, which risks service failures if a volume fills.

See: Storage Capacity (Live) | Capacity Planning
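
A capacity check of this kind can be approximated by filtering `df -P` output against a threshold. This is a sketch under stated assumptions: the helper name `mounts_over_threshold` and the default threshold of 85% are illustrative choices, not values taken from the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list mount points at or above a usage threshold.
# Input: `df -P` output on stdin (POSIX format: a header line, then
#   filesystem, 1024-blocks, used, available, capacity%, mount point).
# Argument: threshold percentage (defaults to 85, an assumed value).
# Limitation: mount points containing spaces are not handled.
mounts_over_threshold() {
  local threshold="${1:-85}"
  awk -v t="$threshold" 'NR > 1 {
    pct = $5
    sub(/%/, "", pct)                 # strip the trailing percent sign
    if (pct + 0 >= t + 0) print $6, $5
  }'
}

# Usage (not run here):
#   df -P | mounts_over_threshold 90
```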


Architecture Diagram

graph TB
    subgraph "Proxmox Hypervisors"
        PVE1[Proxmox-1<br/>100.1.100.10]
        PVE2[Proxmox-2<br/>100.1.100.15]
    end

    subgraph "Docker Swarm Cluster"
        MGR[Node 201 Manager<br/>8 CPU, 10GB RAM<br/>Databases & Logging]
        W1[Node 202 Worker<br/>12 CPU, 16GB RAM<br/>Media & Photos]
        W2[Node 203 Worker<br/>4 CPU, 3GB RAM<br/>Surveillance]
        W3[Node 204 Worker<br/>4 CPU, 4GB RAM<br/>Dashboards]
        W4[Node 205 Worker<br/>8 CPU, 8GB RAM<br/>General]
    end

    subgraph "Infrastructure Services"
        DNS[Pi DNS<br/>100.1.100.11<br/>AdGuard/Pi-hole]
        NFS[OMV NFS Server<br/>100.1.100.199<br/>3TB Storage]
    end

    PVE1 --> MGR
    PVE1 --> W1
    PVE2 --> W2
    PVE2 --> W3
    PVE2 --> W4

    MGR -.-> DNS
    W1 -.-> DNS
    W2 -.-> DNS
    W3 -.-> DNS
    W4 -.-> DNS

    MGR --> NFS
    W1 --> NFS
    W2 --> NFS
    W3 --> NFS
    W4 --> NFS

Quick Access

| Service | URL | Purpose |
| --- | --- | --- |
| Grafana | http://100.1.100.201:3010 | Metrics visualization |
| Prometheus | http://100.1.100.201:9090 | Metrics storage |
| Uptime Kuma | http://100.1.100.204:3001 | Service monitoring |
| Traefik | Port 8080 | Reverse proxy dashboard |
| Swarmpit | - | Cluster management UI |

Documentation System

This documentation uses three agentic patterns to stay accurate:

| Pattern | Implementation | Purpose |
| --- | --- | --- |
| Tool Use | Shell scripts query Docker API, df, git | Generate from live state |
| Reflection | Validation compares docs vs reality | Detect drift |
| Planning | Orchestrator runs generators in sequence | Coordinate updates |

Key files:

- scripts/docs/sh-update-docs.sh - Main orchestrator
- scripts/docs/sh-validate-docs.sh - Drift detection
- config/mkdocs/mkdocs.yml - Site configuration
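
The Planning pattern, running generators in sequence and then validating, can be sketched as follows. This is an illustration only, not the contents of `sh-update-docs.sh`: `run_pipeline` and the stand-in generator and validator functions are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch of the Planning pattern: run each step in order, stop on first failure.
# All function names here are hypothetical; the real logic lives in
# scripts/docs/sh-update-docs.sh and scripts/docs/sh-validate-docs.sh.

# run_pipeline STEP...  - run each step (a command or function name) in
# sequence; stop at the first failure and report which step failed.
run_pipeline() {
  local step
  for step in "$@"; do
    echo "running: $step"
    if ! "$step"; then
      echo "failed: $step" >&2
      return 1
    fi
  done
  echo "all steps succeeded"
}

# Stand-in generators (Tool Use) and validator (Reflection). Real generators
# would query Docker, df, and git and write the Live Status pages.
gen_service_health()   { echo "service health page written"; }
gen_storage_capacity() { echo "storage capacity page written"; }
validate_docs()        { echo "no drift detected"; }

run_pipeline gen_service_health gen_storage_capacity validate_docs
```

Stopping at the first failed step keeps the validator from reporting drift against pages that were only partly regenerated. Scheduling the real orchestrator every 6 hours could use a crontab entry such as `0 */6 * * * cd /home/cjustin/homenet-docker-services && ./scripts/docs/sh-update-docs.sh`, assuming the script is meant to be run from the repository root.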


- Repository: /home/cjustin/homenet-docker-services/
- Primary Docs: CLAUDE.md, README.md
- Last Updated: 2026-01-17
- Documentation Version: 2.0 (Agentic)