Monitoring
Available metrics
The PeSIT Wizard server exposes Prometheus metrics at /actuator/prometheus.
Key metrics
| Metric | Description |
|---|---|
| pesitwizard_connections_active | Active PeSIT Wizard connections |
| pesitwizard_connections_total | Total number of connections |
| pesitwizard_transfers_total | Number of transfers |
| pesitwizard_transfers_bytes_total | Volume transferred (bytes) |
| pesitwizard_transfers_duration_seconds | Transfer duration |
| pesitwizard_errors_total | Number of errors |
| pesitwizard_cluster_members | Cluster members |
| pesitwizard_cluster_is_leader | 1 if leader, 0 otherwise |
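For a quick look at these values without a Prometheus server, the endpoint can be read directly. The following is a minimal sketch using only the Python standard library; the function names `parse_metrics` and `fetch_metrics` are illustrative and not part of PeSIT Wizard, and the parser handles only the common `name{labels} value` line shape of the Prometheus text format.

```python
import urllib.request


def parse_metrics(text, prefix="pesitwizard_"):
    """Return {series: value} for sample lines whose name starts with prefix."""
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Series with labels: keep everything up to the closing brace.
            end = line.rfind("}") + 1
            series, rest = line[:end], line[end:]
        else:
            series, _, rest = line.partition(" ")
        fields = rest.split()
        if not series.startswith(prefix) or not fields:
            continue
        # The first field is the value; an optional timestamp may follow.
        samples[series] = float(fields[0])
    return samples


def fetch_metrics(base_url, timeout=5.0):
    """Fetch /actuator/prometheus from base_url and parse the PeSIT Wizard series."""
    url = base_url.rstrip("/") + "/actuator/prometheus"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Calling `fetch_metrics("http://localhost:8080")` would return a dictionary keyed by series name; the host is an assumption, and the port should match your deployment.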
Prometheus integration
Prometheus configuration
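```yaml
# prometheus.yml
scrape_configs:
  - job_name: 'pesitwizard-server'
    # Prometheus defaults to /metrics; the server exposes metrics on /actuator/prometheus
    metrics_path: /actuator/prometheus
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names: ['pesitwizard']
    relabel_configs:
      # Keep only pods labelled app=pesitwizard-server
      - source_labels: [__meta_kubernetes_pod_label_app]
        regex: pesitwizard-server
        action: keep
      # Keep only the container port 8080 target
      - source_labels: [__meta_kubernetes_pod_container_port_number]
        regex: "8080"
        action: keep
```
Useful queries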
```promql
# Transfers per minute
rate(pesitwizard_transfers_total[5m]) * 60
# Volume transferred per hour
increase(pesitwizard_transfers_bytes_total[1h])
# Error rate (errors per transfer)
rate(pesitwizard_errors_total[5m]) / rate(pesitwizard_transfers_total[5m])
# Average transfer duration
rate(pesitwizard_transfers_duration_seconds_sum[5m]) / rate(pesitwizard_transfers_duration_seconds_count[5m])
```
Grafana dashboards
Main dashboard
Import the dashboard from: /grafana/pesitwizard-dashboard.json
Included panels:
- Transfers per minute
- Volume transferred
- Active connections
- Error rate
- Cluster status
- Top partners
Recommended alerts
```yaml
# alerting-rules.yml
groups:
  - name: pesitwizard
    rules:
      # Fires above 0.1 errors per second, averaged over 5 minutes
      - alert: PesitHighErrorRate
        expr: rate(pesitwizard_errors_total[5m]) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High PeSIT Wizard error rate"
      - alert: PesitNoLeader
        expr: sum(pesitwizard_cluster_is_leader) == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "No PeSIT Wizard leader"
      - alert: PesitClusterDegraded
        expr: pesitwizard_cluster_members < 3
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "PeSIT Wizard cluster degraded"
```
Logs
Log format
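The sample lines in this section follow a fixed layout: timestamp, level, service name in brackets, session in brackets, an event name, then `key=value` pairs. A minimal parsing sketch, assuming every line follows that layout exactly (the regular expression and the name `parse_line` are illustrative and not taken from PeSIT Wizard):

```python
import re
from datetime import datetime

# One line: "<date> <time.millis> <LEVEL> [<service>] [<session>] <EVENT> k=v k=v ..."
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<service>[^\]]+)\]\s+"
    r"\[(?P<session>[^\]]+)\]\s+"
    r"(?P<event>[A-Z_]+)"
    r"(?P<fields>(?:\s+\S+=\S+)*)\s*$"
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["ts"] = datetime.strptime(record["ts"], "%Y-%m-%d %H:%M:%S.%f")
    # Split the trailing "key=value" pairs into a dictionary of strings.
    raw_fields = record.pop("fields")
    record["fields"] = dict(pair.split("=", 1) for pair in raw_fields.split())
    return record
```

Values are kept as strings (for example `duration` stays `"4333ms"`), so unit handling is left to the caller.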
```
2025-01-10 10:30:00.123 INFO [pesitwizard-server] [session-123] CONNECT partner=CLIENT_XYZ ip=192.168.1.100
2025-01-10 10:30:01.456 INFO [pesitwizard-server] [session-123] CREATE file=VIREMENT.XML virtualFile=VIREMENTS
2025-01-10 10:30:05.789 INFO [pesitwizard-server] [session-123] TRANSFER_COMPLETE bytes=15234 duration=4333ms
```
Centralization with ELK
```yaml
# filebeat.yml
filebeat.inputs:
  - type: container
    paths:
      - /var/log/containers/pesitwizard-server-*.log
processors:
  - add_kubernetes_metadata: ~
output.elasticsearch:
  hosts: ["elasticsearch:9200"]
  # A custom index name usually also requires setup.template.name and setup.template.pattern
  index: "pesitwizard-%{+yyyy.MM.dd}"
```
Useful Kibana queries
```
# Errors in the last 24 hours
level:ERROR AND kubernetes.labels.app:pesitwizard-server
# Transfers for one partner
message:"TRANSFER_COMPLETE" AND partner:CLIENT_XYZ
# Failed connections
message:"CONNECT" AND status:FAILED
```
Health checks
Endpoints
| Endpoint | Description |
|---|---|
| /actuator/health | Overall health |
| /actuator/health/readiness | Ready to receive traffic |
| /actuator/health/liveness | Application is alive |
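These endpoints can also be polled from a script, for example in a smoke test after deployment. The sketch below assumes the responses follow the usual Spring Boot Actuator shape, a JSON object with a top-level `status` field such as `UP` or `DOWN`; that shape, and the names `parse_status` and `check_health`, are assumptions rather than documented PeSIT Wizard behavior.

```python
import json
import urllib.error
import urllib.request

HEALTH_PATHS = (
    "/actuator/health",
    "/actuator/health/readiness",
    "/actuator/health/liveness",
)


def parse_status(body):
    """Extract the top-level "status" field from a JSON health response."""
    try:
        doc = json.loads(body)
    except ValueError:
        return "UNKNOWN"
    if isinstance(doc, dict):
        return str(doc.get("status", "UNKNOWN"))
    return "UNKNOWN"


def check_health(base_url, timeout=5.0):
    """Return {path: status} for each health endpoint under base_url."""
    results = {}
    for path in HEALTH_PATHS:
        url = base_url.rstrip("/") + path
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                results[path] = parse_status(resp.read())
        except urllib.error.HTTPError as err:
            # A non-2xx response may still carry a JSON body with a status.
            results[path] = parse_status(err.read())
        except (urllib.error.URLError, OSError):
            results[path] = "UNREACHABLE"
    return results
```

For instance, `check_health("http://localhost:8080")` would return one status per endpoint; the host is an assumption.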
Kubernetes probes
```yaml
readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 60
  periodSeconds: 30
```
Alerting
Email
Configure email alerts in the application:
```yaml
pesitwizard:
  alerting:
    email:
      enabled: true
      smtp-host: smtp.example.com
      from: pesitwizard@example.com
      to: ops@example.com
      triggers:
        - type: TRANSFER_FAILED
        - type: CONNECTION_FAILED
        - type: CLUSTER_DEGRADED
```
Webhook
```yaml
pesitwizard:
  alerting:
    webhook:
      enabled: true
      url: https://hooks.slack.com/services/xxx
      events:
        - TRANSFER_FAILED
        - CLUSTER_LEADER_CHANGED
```