Capacity Planning
Capacity planning replaces guesswork with measured numbers, so you scale deliberately instead of in a panic.
Inputs you need
- expected requests/sec and WebSocket concurrency
- payload size profiles
- target latency SLO
- available CPU/memory budget
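Once these inputs are known, a first sizing estimate can come from Little's law (in-flight requests = arrival rate × time in system). A minimal sketch; the 2x headroom factor is an assumption, not a Palfrey recommendation:

```python
import math


def required_concurrency(rps: float, p95_latency_s: float, headroom: float = 2.0) -> int:
    """Estimate in-flight request slots via Little's law: L = lambda * W.

    headroom pads the estimate so a latency spike does not immediately
    exhaust the concurrency budget.
    """
    return math.ceil(rps * p95_latency_s * headroom)


# 1,000 req/s at 50 ms p95 with 2x headroom -> 100 concurrent slots
print(required_concurrency(rps=1_000, p95_latency_s=0.050))
```

This is a starting point for the load tests below, not a substitute for them.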
Baseline process
- run representative load tests
- capture throughput and p95/p99 latency
- inspect CPU and memory saturation points
- test failure and recovery behavior
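Capturing p95/p99 from raw latency samples can be done without extra dependencies. A nearest-rank sketch; the sample data is illustrative, not from a real run:

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile over a list of latency samples (seconds)."""
    ordered = sorted(samples)
    rank = max(0, math.ceil(pct / 100 * len(ordered)) - 1)
    return ordered[rank]


latencies = [0.012, 0.015, 0.011, 0.090]  # illustrative samples
print(percentile(latencies, 95))  # -> 0.09
```

Nearest-rank is deliberately pessimistic for small sample sets; interpolating methods give smoother numbers but can hide outliers.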
Tuning levers in Palfrey
- worker count
- concurrency limit
- keep-alive timeout
- protocol backend mode choices
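The levers above can be captured as a config object so each test run records exactly what it varied. The field names here are illustrative assumptions; check Palfrey's own configuration reference for the actual option names:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TuningProfile:
    """One immutable combination of tuning levers for a test run."""

    workers: int = 4                  # worker count
    concurrency_limit: int = 200     # max in-flight requests
    keep_alive_timeout_s: float = 5.0
    backend_mode: str = "auto"       # protocol backend mode choice


baseline = TuningProfile()
candidate = TuningProfile(workers=8, concurrency_limit=400)
```

Freezing the dataclass means a profile cannot drift mid-run, which keeps benchmark results attributable to a known configuration.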
Planning workflow example
```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class BenchmarkCommand:
    """Represents one benchmark invocation."""

    target: str
    requests: int
    concurrency: int

    def render(self) -> str:
        """Render a command line suitable for local runs."""
        return (
            f"python -m benchmarks.run --target {self.target} "
            f"--requests {self.requests} --concurrency {self.concurrency}"
        )


print(BenchmarkCommand(target="palfrey", requests=100_000, concurrency=200).render())
```
Practical guardrails
- change one variable at a time
- keep benchmark commands and environment details in version control
- prefer repeatable tests over single "hero" numbers
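"Change one variable at a time" can be made mechanical by sweeping a single knob while everything else stays fixed. A sketch; the command flags mirror the BenchmarkCommand example above and are assumptions, not a documented CLI:

```python
def sweep_concurrency(levels: list[int], requests: int = 100_000) -> list[str]:
    """Build one benchmark command per concurrency level, all else held constant."""
    return [
        f"python -m benchmarks.run --target palfrey "
        f"--requests {requests} --concurrency {c}"
        for c in levels
    ]


for cmd in sweep_concurrency([50, 100, 200, 400]):
    print(cmd)
```

Committing the generated command list alongside the results makes every run reproducible rather than a one-off "hero" number.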
Non-technical summary
Capacity planning is budgeting for runtime behavior before incidents force emergency changes.