- HA Concepts
- Pacemaker & Corosync
- HAProxy β Load Balancing
- Keepalived β Floating IP
- Database HA
- GlusterFS β Distributed Storage
- Practice Exercises
Active-Passive (Failover):
ββββββββββββ ββββββββββββ
β Node A ββββββββββΆβ Node B β
β (active) β sync β(standby) β
ββββββββββββ ββββββββββββ
β VIP: 10.0.0.100
Clients
Active-Active (Load Balanced):
ββββββββββββ
Clients βββββΆ β HAProxy β
βββββββ¬βββββ
βββββββββΌββββββββ
βββββββΌβββ ββββΌβββββ ββΌββββββββ
β Node A β βNode B β β Node C β
ββββββββββ βββββββββ ββββββββββ
| Term | Description |
|---|---|
| Uptime | Percentage of time system is available |
| RTO | Recovery Time Objective β max downtime allowed |
| RPO | Recovery Point Objective β max data loss allowed |
| SLA | Service Level Agreement β guaranteed uptime |
| Five 9s | 99.999% uptime = ~5 minutes downtime/year |
| Split-brain | Both nodes think they're primary |
| STONITH | Shoot The Other Node In The Head (fencing) |
| Quorum | Majority agreement (prevents split-brain) |
The standard Linux HA stack.
# Install (on all nodes)
sudo apt install pacemaker corosync pcs
# Set password for cluster user
sudo passwd hacluster
# Start PCS daemon
sudo systemctl enable --now pcsd
# Authenticate nodes (from one node)
sudo pcs host auth node1 node2 node3
# Create cluster
sudo pcs cluster setup mycluster node1 node2 node3
# Start cluster
sudo pcs cluster start --all
sudo pcs cluster enable --all
# Status
sudo pcs status
sudo pcs cluster status
sudo crm_mon -1 # One-shot status# Add a floating IP
sudo pcs resource create cluster_vip ocf:heartbeat:IPaddr2 \
ip=10.0.0.100 cidr_netmask=24 op monitor interval=30s
# Add Nginx resource
sudo pcs resource create webserver ocf:heartbeat:nginx \
configfile=/etc/nginx/nginx.conf op monitor interval=10s
# Colocation (keep resources together)
sudo pcs constraint colocation add webserver with cluster_vip INFINITY
# Order (start VIP before web server)
sudo pcs constraint order cluster_vip then webserver
# Prefer specific node
sudo pcs constraint location webserver prefers node1=100
# STONITH/Fencing (required for production!)
sudo pcs stonith create fence_node1 fence_ipmilan \
ipaddr=192.168.1.200 login=admin passwd=secret \
pcmk_host_list=node1
# Disable STONITH (testing only!)
sudo pcs property set stonith-enabled=falsesudo apt install haproxy
sudo vim /etc/haproxy/haproxy.cfgglobal
log /dev/log local0
maxconn 4096
user haproxy
group haproxy
daemon
stats socket /var/run/haproxy.sock mode 660 level admin
defaults
log global
mode http
option httplog
option dontlognull
timeout connect 5000ms
timeout client 50000ms
timeout server 50000ms
retries 3
frontend http_front
bind *:80
bind *:443 ssl crt /etc/ssl/certs/site.pem
redirect scheme https code 301 if !{ ssl_fc }
default_backend http_back
backend http_back
balance roundrobin
option httpchk GET /health
server web1 10.0.0.11:8080 check
server web2 10.0.0.12:8080 check
server web3 10.0.0.13:8080 check backup
listen stats
bind *:8404
stats enable
stats uri /stats
stats auth admin:password
stats refresh 10s
sudo haproxy -c -f /etc/haproxy/haproxy.cfg # Validate
sudo systemctl enable --now haproxy
# Stats dashboard: http://server:8404/stats| Algorithm | Description |
|---|---|
roundrobin |
Rotate evenly |
leastconn |
Fewest active connections |
source |
Same client β same server (sticky) |
uri |
Same URI β same server (cache friendly) |
sudo apt install keepalived
# /etc/keepalived/keepalived.conf (Primary)
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass mysecret
}
virtual_ipaddress {
10.0.0.100/24
}
track_script {
check_haproxy
}
}
vrrp_script check_haproxy {
script "killall -0 haproxy"
interval 2
weight 2
}
# On Backup node: change state to BACKUP, priority to 99# Primary: /etc/postgresql/16/main/postgresql.conf
wal_level = replica
max_wal_senders = 3
synchronous_standby_names = 'standby1'
# Replica setup
pg_basebackup -h primary -D /var/lib/postgresql/16/main -U replicator -R
# -R creates standby.signal and connection info# Multi-master synchronous replication
sudo apt install mariadb-server galera-4
# /etc/mysql/mariadb.conf.d/60-galera.cnf
[mysqld]
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
wsrep_on=ON
wsrep_cluster_name="my_cluster"
wsrep_cluster_address="gcomm://node1,node2,node3"
wsrep_node_address="10.0.0.11"
wsrep_provider=/usr/lib/galera/libgalera_smm.so
# Bootstrap first node
sudo galera_new_cluster
# Start remaining nodes normallysudo apt install glusterfs-server
sudo systemctl enable --now glusterd
# From one node, probe peers
sudo gluster peer probe node2
sudo gluster peer probe node3
sudo gluster peer status
# Create replicated volume
sudo gluster volume create gvol0 replica 3 \
node1:/data/brick node2:/data/brick node3:/data/brick
# Start volume
sudo gluster volume start gvol0
sudo gluster volume info gvol0
# Mount on clients
sudo mount -t glusterfs node1:/gvol0 /mnt/shared- (VMs) Set up a 2-node Pacemaker cluster with a floating IP
- HAProxy: Configure load balancing across 2 web servers
- Keepalived: Implement floating IP failover
- Stats: Access the HAProxy stats dashboard
- Failover: Test failover by stopping a service on the active node
- PostgreSQL: Set up primary-replica streaming replication
- GlusterFS: Create a replicated volume across 2 nodes
β Previous: Embedded Linux Β· π Home Β· Next: Disaster Recovery β