# Docker Swarm Cluster Overview
> [!info] Cluster Status
> **State:** ✅ Healthy - All 5 nodes Active/Ready
> **Docker Version:** 29.1.3 (all nodes)
> **Host OS:** Ubuntu 24.04.3 LTS
> **Architecture:** x86_64
## Cluster Topology

```mermaid
graph LR
    MGR[Node 201<br/>MANAGER]
    W1[Node 202<br/>WORKER]
    W2[Node 203<br/>WORKER]
    W3[Node 204<br/>WORKER]
    W4[Node 205<br/>WORKER]
    MGR -.Swarm.-> W1
    MGR -.Swarm.-> W2
    MGR -.Swarm.-> W3
    MGR -.Swarm.-> W4
```
## Nodes Summary
| Node | IP | Role | Resources | Status | Primary Function |
|---|---|---|---|---|---|
| [[Node-201-Manager|homenet-ubuntu1]] | 100.1.100.201 | Manager (Leader) | 8 CPU, 10GB RAM | ✅ Active | Critical Infrastructure |
| [[Node-202-Worker|homenet-ubuntu2]] | 100.1.100.202 | Worker | 12 CPU, 16GB RAM | ✅ Active | Media & Photos |
| [[Node-203-Worker|homenet-ubuntu3]] | 100.1.100.203 | Worker | 4 CPU, 3GB RAM | ✅ Active | Surveillance |
| [[Node-204-Worker|homenet-ubuntu4]] | 100.1.100.204 | Worker | 4 CPU, 4GB RAM | ✅ Active | Dashboards & Automation |
| [[Node-205-Worker|homenet-ubuntu5]] | 100.1.100.205 | Worker | 8 CPU, 8GB RAM | ✅ Active | General Workloads |
## Node IDs

```
# Manager
homenet-ubuntu1: y8yu1d46pv8gh8w4v7cyzi4cj

# Workers
homenet-ubuntu2: jjbyr8m4xffdgzypbsz8nzqua
homenet-ubuntu3: ytxjh4ba2wrxfp7vk3ohxicbk
homenet-ubuntu4: k05ar70cavs4wkc4846axyj2y
homenet-ubuntu5: gytpo6oaql553za0wfkxt2jsu
```
## Resource Distribution

### Total Cluster Resources

- Total CPUs: 36 cores (8 + 12 + 4 + 4 + 8)
- Total RAM: 41GB
- Network: 10Gbps internal (Docker overlay)
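The totals above can be cross-checked against what the Swarm itself reports. A minimal sketch, run on the manager node; the helper name `cluster_cpus` is illustrative, not an existing script:

```shell
# Sum CPU cores reported by every Swarm node. Run on a manager,
# where `docker node ls` is available. Docker reports CPUs in
# NanoCPUs, hence the division by 10^9.
cluster_cpus() {
  total=0
  for id in $(docker node ls -q); do
    nano=$(docker node inspect "$id" --format '{{.Description.Resources.NanoCPUs}}')
    total=$((total + nano / 1000000000))
  done
  echo "$total"
}
```

For this cluster the output should match the 36-core total computed from the nodes table.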
### Resource Allocation by Node

```mermaid
pie title CPU Distribution
    "Node 201 (Manager)" : 8
    "Node 202 (Worker)" : 12
    "Node 203 (Worker)" : 4
    "Node 204 (Worker)" : 4
    "Node 205 (Worker)" : 8
```

```mermaid
pie title RAM Distribution
    "Node 201 (Manager)" : 10
    "Node 202 (Worker)" : 16
    "Node 203 (Worker)" : 3
    "Node 204 (Worker)" : 4
    "Node 205 (Worker)" : 8
```
## Service Placement Strategy

> [!note] Node Label System
> Services are pinned to specific nodes using Docker node labels. Labels are managed via the `sh-label-nodes.sh` script.
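A sketch of what such a labeling script might contain; the actual `sh-label-nodes.sh` may differ, and the `role=` label values here are assumptions based on each node's primary function, not the cluster's real label set:

```shell
# Hypothetical sketch of sh-label-nodes.sh -- label names and values
# are illustrative. Run on the manager; labels only apply there.
label_nodes() {
  docker node update --label-add role=infra homenet-ubuntu1
  docker node update --label-add role=media homenet-ubuntu2
  docker node update --label-add role=cctv  homenet-ubuntu3
  docker node update --label-add role=ui    homenet-ubuntu4
  docker node update --label-add role=apps  homenet-ubuntu5
}
```

See the per-node label pages below for the labels actually assigned.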
### Placement Rules
| Service Type | Preferred Node(s) | Reason |
|---|---|---|
| Databases | Node 201 | Manager node, high availability |
| Media Services | Node 202 | NVIDIA GPU for transcoding |
| Photo Services | Node 202 | Large storage needs, GPU for ML |
| Cameras | Node 203 | Dedicated node for video processing |
| Dashboards | Node 204 | User-facing services |
| General Apps | Node 205 | Overflow and general workloads |
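Placement rules like these are enforced through Swarm constraints. A minimal sketch of pinning a service to the media node; the service name, image, and the `role=media` label are assumptions for illustration:

```shell
# Illustrative only: create a service constrained to nodes carrying
# a hypothetical role=media label (cf. the placement table above).
deploy_pinned() {
  docker service create \
    --name example-transcoder \
    --constraint 'node.labels.role == media' \
    --replicas 1 \
    alpine:3 sleep infinity
}
```

In a stack file the equivalent lives under `deploy.placement.constraints`.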
### Label Configuration

See individual node pages for complete label assignments:

- [[Node-201-Manager#Node Labels|Node 201 Labels]]
- [[Node-202-Worker#Node Labels|Node 202 Labels]]
- [[Node-203-Worker#Node Labels|Node 203 Labels]]
- [[Node-204-Worker#Node Labels|Node 204 Labels]]
- [[Node-205-Worker#Node Labels|Node 205 Labels]]
## Infrastructure Dependencies

### Hypervisor Layer

Two Proxmox hosts run the VMs for all five Docker nodes:
| Host | IP | Nodes Hosted | Management URL |
|---|---|---|---|
| proxmox-1 | 100.1.100.10 | Nodes 201, 202 | https://100.1.100.10:8006 |
| proxmox-2 | 100.1.100.15 | Nodes 203, 204, 205 | https://100.1.100.15:8006 |
### Network Services

**Pi DNS Server (100.1.100.11)**

- Primary DNS for the entire network
- AdGuard/Pi-hole DNS filtering
- Critical for domain resolution
- All nodes use it as their primary nameserver
### Storage Layer

**OpenMediaVault NFS Server (100.1.100.199)**

- Provides all persistent storage
- 6 NFS shares mounted on each node
- Single point of failure for storage
- See [[05-Storage/NFS-Architecture|NFS Architecture]]
## Overlay Networks

Docker Swarm uses encrypted overlay networks for inter-service communication:

| Network | Driver | Purpose | Services |
|---|---|---|---|
| `homenet` | overlay | Primary service network | Most services |
| `traefik-public` | overlay | Reverse proxy | Public-facing services |
| `elastic` | overlay | ELK stack | Elasticsearch cluster |
| `logs-network` | overlay | Log aggregation | Logstash, collectors |
| `swarmpit_net` | overlay | Cluster management | Swarmpit services |
See [[Network-Architecture|Network Architecture]] for details.
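Overlays like these are created once on the manager. A sketch of how such a network could be defined; the `--attachable` flag is an assumption (drop it if only Swarm services need to join):

```shell
# Create an encrypted overlay network on the manager. --opt encrypted
# enables IPSec for the data plane; --attachable also lets standalone
# containers join (assumed here, not confirmed for this cluster).
create_overlay() {
  docker network create \
    --driver overlay \
    --opt encrypted \
    --attachable \
    "$1"
}

# Usage: create_overlay homenet
```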
## Stack Distribution

**14 stacks** are deployed across the cluster:
| Stack | Primary Node | Services | Status |
|---|---|---|---|
| [[02-Services/Stack-Homenet1|homenet1]] | 201 | 7 | ⚠️ 3/7 running |
| [[02-Services/Stack-Homenet2|homenet2]] | 204 | 6 | ✅ 6/6 running |
| [[02-Services/Stack-Homenet3|homenet3]] | 203 | 1 | ✅ 1/1 running |
| [[02-Services/Stack-Homenet4|homenet4]] | Mixed | 15 | ⚠️ 11/15 running |
| [[02-Services/Stack-Traefik|traefik]] | 201 | 2 | ✅ 2/2 running |
| [[02-Services/Monitoring-Stack|monitoring]] | All | 12 | ⚠️ 8/12 running |
| swarmpit | 201 | 4 | ⚠️ 3/4 running |
| immich | 202 | 3 | ✅ 3/3 running |
| librephotos | 202 | 5 | ⚠️ 4/5 running |
| photoprism | 202 | 2 | ✅ 2/2 running |
| paperless | 201 | 5 | ✅ 5/5 running |
| rxresume | 205 | 3 | ✅ 3/3 running |
| crm | 201 | 5 | ✅ 5/5 running |
| backup | Mixed | 1 | ❌ 0/1 running |
## High Availability Considerations

> [!warning] Single Manager Node
> The cluster has only one manager node. For production HA, add two more managers (three total) so the cluster keeps Raft quorum when a single manager fails.
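Moving to three managers is a one-command change per node. A sketch; which two workers to promote is a judgment call, and the two named here are examples only:

```shell
# Promote two existing workers to managers for a 3-manager quorum
# (tolerates one manager failure). Which nodes to pick is an
# assumption -- lightly loaded nodes are usually preferred.
add_managers() {
  docker node promote homenet-ubuntu4 homenet-ubuntu5
}
```

Note that managers also run the Raft store, so promoted nodes should have reliable disk and network.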
### Current HA Status
- ✅ Service replicas can reschedule to healthy workers
- ✅ Global services (cAdvisor, node-exporter) run on all nodes
- ❌ Single manager = single point of control plane failure
- ❌ No automatic manager failover
### Failure Scenarios

**Manager Node (201) Failure:**

- ❌ Cannot deploy new services or stacks
- ❌ Cannot update existing services
- ✅ Existing services continue running
- ✅ Workers remain operational

**Worker Node Failure:**

- ✅ Services reschedule to healthy nodes
- ⚠️ May cause service disruption if the node had unique labels
## Useful Commands

### Cluster Status

```sh
# View all nodes
docker node ls

# Detailed node info
docker node inspect homenet-ubuntu1

# View node labels
docker node inspect homenet-ubuntu1 --format '{{.Spec.Labels}}'
```
### Service Distribution

```sh
# List all services and their placement
docker service ls

# See which node runs a service
docker service ps homenet1_elasticsearch

# View services on a specific node
docker node ps homenet-ubuntu2
```
### Maintenance Operations

```sh
# Drain node for maintenance
docker node update --availability drain homenet-ubuntu2

# Return node to active
docker node update --availability active homenet-ubuntu2

# Apply node labels
./sh-label-nodes.sh
```
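For unattended maintenance, the drain step can be combined with a wait until the node is actually empty. A sketch; the polling interval and the `desired-state=running` filter usage are assumptions worth verifying against your Docker version:

```shell
# Drain a node, then poll until no tasks remain on it before
# maintenance begins. Run on the manager.
drain_and_wait() {
  node="$1"
  docker node update --availability drain "$node"
  # Wait for Swarm to reschedule every running task off this node.
  while [ -n "$(docker node ps "$node" --filter desired-state=running -q)" ]; do
    sleep 5
  done
  echo "$node drained"
}

# Usage: drain_and_wait homenet-ubuntu2
```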
## Related Documentation
- [[Node-201-Manager|Node 201 Details]]
- [[Node-202-Worker|Node 202 Details]]
- [[Node-203-Worker|Node 203 Details]]
- [[Node-204-Worker|Node 204 Details]]
- [[Node-205-Worker|Node 205 Details]]
- [[Network-Architecture|Network Architecture]]
- [[02-Services/Service-Catalog|Service Catalog]]
- [[03-Operations/Stack-Deployment|Stack Deployment]]
**Last Updated:** 2026-01-11
**Health Status:** ✅ Healthy (all nodes active)
**Next Review:** Monitor storage capacity and consider manager HA