Capacity Planning

Capacity planning replaces guesswork with measurement, so scaling decisions are made ahead of demand rather than in a panic during an incident.

Inputs you need

expected requests/sec and concurrent WebSocket connections
payload size profiles
target latency SLO
available CPU/memory budget
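
The inputs above can be turned into rough first estimates before any load test runs. The following is a minimal sketch, not part of Palfrey: the `CapacityInputs` class, its field names, and all numbers are hypothetical placeholders. It applies Little's law (mean concurrency = arrival rate × mean time in system) and uses the latency SLO as a stand-in for mean latency, which gives a pessimistic estimate.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CapacityInputs:
    """Planning inputs; every field name here is illustrative."""

    requests_per_sec: float
    websocket_connections: int
    avg_payload_bytes: int
    latency_slo_ms: float
    cpu_cores: int
    memory_mb: int

    def in_flight_requests(self) -> float:
        """Little's law: mean concurrency = arrival rate x mean time in system."""
        return self.requests_per_sec * (self.latency_slo_ms / 1000.0)

    def payload_bandwidth_mbit(self) -> float:
        """Approximate payload bandwidth in megabits per second."""
        return self.requests_per_sec * self.avg_payload_bytes * 8 / 1_000_000


# Placeholder numbers; substitute your own traffic forecast.
inputs = CapacityInputs(
    requests_per_sec=2_000,
    websocket_connections=5_000,
    avg_payload_bytes=4_096,
    latency_slo_ms=250,
    cpu_cores=8,
    memory_mb=16_384,
)
print(inputs.in_flight_requests())
print(inputs.payload_bandwidth_mbit())
```

The in-flight estimate is a useful sanity check when choosing a concurrency limit: a limit far below it will reject traffic that still meets the SLO.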

Baseline process

run representative load tests
capture throughput and p95/p99 latency
inspect CPU and memory saturation points
test failure and recovery behavior
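
The latency step above can be sketched with the standard library alone. This assumes you already have per-request latencies in milliseconds from your load tool; the `latency_summary` helper and the synthetic sample data are illustrative only.

```python
from __future__ import annotations

import statistics


def latency_summary(samples_ms: list[float]) -> dict[str, float]:
    """Return p50, p95, and p99 for a list of latency samples in milliseconds."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


# Synthetic data standing in for real measurements: 1 ms, 2 ms, ..., 100 ms.
samples = [float(ms) for ms in range(1, 101)]
summary = latency_summary(samples)
print(summary)
```

Tail percentiles need many samples to be stable, so treat a p99 computed from a few hundred requests as indicative rather than exact.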

Tuning levers in Palfrey

worker count
concurrency limit
keep-alive timeout
protocol backend mode choices
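
Worker count is usually the first lever to size, because it is bounded by both the CPU and the memory budget. The sketch below is a back-of-the-envelope estimate, not a Palfrey API: `plan_workers`, its parameters, and the sample numbers are hypothetical, and it assumes one worker per core with memory growing linearly in worker count.

```python
from __future__ import annotations

import math


def plan_workers(
    target_rps: float,
    measured_rps_per_worker: float,
    headroom: float,
    cpu_cores: int,
    worker_rss_mb: float,
    memory_budget_mb: float,
) -> dict[str, int | bool]:
    """Estimate how many workers the target load needs and whether the budget allows it."""
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")
    # Workers needed to serve the target while keeping `headroom` (0.3 = 30%) spare.
    needed = math.ceil(target_rps / (measured_rps_per_worker * (1.0 - headroom)))
    # Assumption: one worker per core, and memory grows linearly with worker count.
    affordable = min(cpu_cores, int(memory_budget_mb // worker_rss_mb))
    return {"needed": needed, "affordable": affordable, "fits": needed <= affordable}


# Placeholder measurements; take the real ones from your baseline load test.
plan = plan_workers(
    target_rps=2_000,
    measured_rps_per_worker=450,
    headroom=0.3,
    cpu_cores=8,
    worker_rss_mb=180,
    memory_budget_mb=2_048,
)
print(plan)
```

If `fits` is false, the options are a larger budget, a lower target, or improving per-worker throughput through the other levers.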

Planning workflow example

from __future__ import annotations

import shlex
from dataclasses import dataclass


@dataclass
class BenchmarkCommand:
    """Represents one benchmark invocation."""

    target: str
    requests: int
    concurrency: int

    def render(self) -> str:
        """Render a shell-quoted command line suitable for local runs."""
        args = [
            "python", "-m", "benchmarks.run",
            "--target", self.target,
            "--requests", str(self.requests),
            "--concurrency", str(self.concurrency),
        ]
        # shlex.join quotes any argument that contains shell metacharacters.
        return shlex.join(args)


print(BenchmarkCommand(target="palfrey", requests=100_000, concurrency=200).render())

Practical guardrails

change one variable at a time
keep the benchmark command and environment details in version control
prefer repeatable tests over single "hero" numbers
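
The first guardrail can be enforced mechanically by generating the run matrix instead of writing it by hand. This is a minimal sketch: the `one_at_a_time` helper and the setting names in the example are hypothetical labels, not Palfrey option names.

```python
from __future__ import annotations


def one_at_a_time(
    baseline: dict[str, int],
    candidates: dict[str, list[int]],
) -> list[dict[str, int]]:
    """Return the baseline plus runs that each differ from it in exactly one setting."""
    runs = [dict(baseline)]
    for name, values in candidates.items():
        if name not in baseline:
            raise KeyError(f"unknown setting: {name}")
        for value in values:
            # Skip values equal to the baseline; that run is already in the list.
            if value != baseline[name]:
                runs.append({**baseline, name: value})
    return runs


baseline = {"workers": 4, "concurrency_limit": 200, "keep_alive_s": 5}
candidates = {"workers": [2, 4, 8], "keep_alive_s": [5, 15]}
runs = one_at_a_time(baseline, candidates)
for run in runs:
    print(run)
```

Because every run differs from the baseline in a single setting, any change in throughput or latency can be attributed to that setting alone.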

Non-technical summary

Capacity planning is budgeting for runtime behavior before incidents force emergency changes.