Observability

Technical metrics (values are illustrative)

Latency

  1. Broker → Data Receiver: <0.1ms (P99)
  2. Data Receiver → Redis: <0.2ms (P99)
  3. Redis → API Services: <0.3ms (P99)
  4. API Services → Client: <0.4ms (P99)
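
A minimal sketch of how the per-hop P99 targets above could be checked against collected samples. The hop keys, the `percentile` helper, and `hops_over_budget` are hypothetical names introduced for illustration. The nearest-rank method is one common percentile definition and is assumed here. Real monitoring stacks usually estimate percentiles from histograms instead.

```python
import math

# Per-hop P99 budgets in milliseconds, copied from the list above.
# The key names are illustrative, not part of the original design.
HOP_BUDGETS_MS = {
    "broker->data_receiver": 0.1,
    "data_receiver->redis": 0.2,
    "redis->api_services": 0.3,
    "api_services->client": 0.4,
}


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples <= it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def hops_over_budget(samples_by_hop, budgets=HOP_BUDGETS_MS, p=99):
    """Return {hop: observed percentile} for hops that are not strictly below budget."""
    over = {}
    for hop, budget_ms in budgets.items():
        observed = percentile(samples_by_hop[hop], p)
        if observed >= budget_ms:
            over[hop] = observed
    return over
```

Note that per-hop P99 values do not simply add up to the end-to-end P99, so the end-to-end figure is better measured directly than derived from the four hops.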

Resource utilization

  1. Redis Memory Usage: <75% per replica
  2. CPU Utilization: <70% (P95) across all services
  3. Network I/O: <90% of interface capacity

Product metrics

  • Client Session Duration: average >4 hours
  • Client Churn Rate

Service Level Indicators (SLIs) & Objectives (SLOs)

Service             SLI            SLO        Error Budget
Quote Delivery      Availability   99.99%     4.3 min/month
API Response        Latency        P99 <5ms   5% requests
Data Accuracy       Error Rate     <0.001%    10 errors/million
Connection Success  Success Rate   >99.9%     0.1% failure rate
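
The error budgets in the table follow directly from the SLO values. The sketch below shows the arithmetic, assuming a 30-day month. The function names are hypothetical and introduced only for illustration.

```python
def downtime_budget_minutes(slo_percent, window_days=30):
    """Allowed downtime in minutes for an availability SLO over the window."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - slo_percent) / 100


def failures_per_million(max_error_percent):
    """Convert a maximum error rate in percent to allowed errors per million events."""
    return max_error_percent / 100 * 1_000_000


def budget_consumed(bad_events, total_events, slo_percent):
    """Fraction of the error budget used so far (1.0 means fully spent)."""
    allowed = total_events * (100 - slo_percent) / 100
    if allowed <= 0:
        raise ValueError("SLO leaves no error budget")
    return bad_events / allowed
```

For example, `downtime_budget_minutes(99.99)` gives about 4.32 minutes, which matches the 4.3 min/month in the table, and `failures_per_million(0.001)` gives about 10 errors per million. With a 99.9% connection success SLO, 250 failed connections out of 1,000,000 would consume roughly a quarter of the budget.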

Also, don't forget about Chaos Engineering.