Clustering and High Availability

Architecture

The PeSIT Wizard server supports clustering for high availability:

┌─────────────────────────────────────────────────────────┐
│                      LoadBalancer                       │
│           (selector: pesitwizard-leader=true)           │
└────────────────────────────┬────────────────────────────┘
                             │
               ┌─────────────┼─────────────┐
               │             │             │
               ▼             ▼             ▼
          ┌─────────┐   ┌─────────┐   ┌─────────┐
          │  Pod 1  │   │  Pod 2  │   │  Pod 3  │
          │ LEADER  │   │ Standby │   │ Standby │
          │    ✓    │   │         │   │         │
          └─────────┘   └─────────┘   └─────────┘
               │             │             │
               └─────────────┼─────────────┘
                             │
                             ▼
                     ┌──────────────┐
                     │  PostgreSQL  │
                     │   (shared)   │
                     └──────────────┘

How It Works

Leader Election

  • Uses JGroups for discovery and leader election
  • The first pod to join the cluster becomes the leader
  • If the leader is lost, a new one is automatically elected

Kubernetes Labeling

The leader pod is automatically labeled pesitwizard-leader=true:

bash
# View the current leader
kubectl get pods -l pesitwizard-leader=true

# View all pods with their labels
kubectl get pods --show-labels
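
Under the hood, relabeling is a pod PATCH performed by the server itself. It is conceptually equivalent to running the following kubectl commands (pod names are placeholders):

bash
# What the elected leader does internally, expressed as kubectl (illustrative)
kubectl label pod <new-leader-pod> pesitwizard-leader=true --overwrite
# The trailing "-" removes the label from the previous leader
kubectl label pod <old-leader-pod> pesitwizard-leader-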

Traffic Routing

The Kubernetes Service uses a selector to route traffic only to the leader:

yaml
spec:
  selector:
    app: pesitwizard-server
    pesitwizard-leader: "true"
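
For context, a complete Service manifest might look like the following; the name, type, and port are illustrative, only the selector is prescribed above:

yaml
apiVersion: v1
kind: Service
metadata:
  name: pesitwizard-server        # illustrative name
spec:
  type: LoadBalancer
  selector:
    app: pesitwizard-server
    pesitwizard-leader: "true"    # only the leader pod receives traffic
  ports:
  - name: pesit
    port: 1761                    # illustrative PeSIT listener port
    targetPort: 1761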

Configuration

Enable Clustering

yaml
pesitwizard:
  cluster:
    enabled: true
    name: pesitwizard-cluster

Required Environment Variables

yaml
env:
- name: POD_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name
- name: POD_NAMESPACE
  valueFrom:
    fieldRef:
      fieldPath: metadata.namespace
- name: PESIT_CLUSTER_ENABLED
  value: "true"

Required RBAC

The ServiceAccount must be able to modify pod labels:

yaml
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "patch"]
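
A minimal Role and RoleBinding wiring these rules to the ServiceAccount might look like this (resource and ServiceAccount names are illustrative):

yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pesitwizard-pod-labeler
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pesitwizard-pod-labeler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pesitwizard-pod-labeler
subjects:
- kind: ServiceAccount
  name: pesitwizard-server       # illustrative ServiceAccount name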

Failover Behavior

Scenario: Leader Goes Down

  1. JGroups detects the leader loss (timeout ~10s)
  2. A new leader is elected among the remaining pods
  3. The new leader:
    • Adds the label pesitwizard-leader=true to its pod
    • Starts the configured PeSIT servers
  4. The LoadBalancer routes to the new leader
  5. In-flight connections are lost (clients must reconnect)
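
You can rehearse this scenario in a test environment. The commands below assume the labels described earlier:

bash
# Kill the current leader and watch the pesitwizard-leader label move (drill)
kubectl delete pod -l pesitwizard-leader=true
kubectl get pods -l app=pesitwizard-server -L pesitwizard-leader --watch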

Scenario: Standby Pod Goes Down

  1. JGroups detects the member loss
  2. The cluster continues operating normally
  3. Kubernetes recreates the pod automatically
  4. The new pod joins the cluster in standby mode

Cluster Monitoring

Status API

bash
curl http://localhost:8080/api/cluster/status -u admin:admin

Response:

json
{
  "clusterName": "pesitwizard-cluster",
  "isLeader": true,
  "members": [
    {
      "name": "pesitwizard-server-abc123",
      "address": "10.42.0.100",
      "isLeader": true
    },
    {
      "name": "pesitwizard-server-def456",
      "address": "10.42.0.101",
      "isLeader": false
    },
    {
      "name": "pesitwizard-server-ghi789",
      "address": "10.42.0.102",
      "isLeader": false
    }
  ],
  "memberCount": 3
}
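
To extract just the current leader from this response on the command line (assumes jq is installed; credentials as configured):

bash
curl -s -u admin:admin http://localhost:8080/api/cluster/status \
  | jq -r '.members[] | select(.isLeader) | .name'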

Clustering Logs

bash
# View leader logs
kubectl logs -l pesitwizard-leader=true -f

# Filter JGroups logs
kubectl logs -l app=pesitwizard-server | grep -i "cluster\|leader\|jgroups"

Metrics

  • pesitwizard_cluster_members: Number of cluster members
  • pesitwizard_cluster_is_leader: 1 if this pod is leader, 0 otherwise
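
If these metrics are scraped by Prometheus, they support a no-leader alert. A sketch, with illustrative rule names and thresholds:

yaml
groups:
- name: pesitwizard-cluster
  rules:
  - alert: PesitWizardNoLeader
    # Fires when no pod has reported itself as leader for over a minute
    expr: max(pesitwizard_cluster_is_leader) == 0
    for: 1m
    labels:
      severity: critical
  - alert: PesitWizardClusterDegraded
    # Fires when fewer members than expected are visible (3 assumed here)
    expr: max(pesitwizard_cluster_members) < 3
    for: 5m
    labels:
      severity: warning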

Best Practices

Number of Replicas

Environment           Replicas   Rationale
Dev/Test              1          No HA needed
Staging               2          Failover testing
Production            3          Tolerates 1 failure
Critical Production   5          Tolerates 2 failures

Anti-affinity

Spread pods across different nodes:

yaml
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: pesitwizard-server
          topologyKey: kubernetes.io/hostname
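
If pods must never share a node, the required variant enforces the spread strictly, at the risk of leaving pods unschedulable when nodes are scarce:

yaml
spec:
  affinity:
    podAntiAffinity:
      # Hard constraint: scheduling fails rather than co-locating two pods
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: pesitwizard-server
        topologyKey: kubernetes.io/hostname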

PodDisruptionBudget

Ensure a minimum number of pods remains available during voluntary disruptions such as node drains (with 3 replicas, minAvailable: 2 allows one pod to be evicted at a time):

yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: pesitwizard-server-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: pesitwizard-server
