Workers¶
Workers are additional server processes that provide parallelism and process-level isolation.
Why workers matter¶
- use multiple CPU cores
- isolate crashes to one process
- support rolling process replacement patterns
Reference app:
from __future__ import annotations

from palfrey import run

if __name__ == "__main__":
    run(
        "docs_src.reference.programmatic_run:app",
        host="0.0.0.0",
        port=8000,
        workers=4,
        timeout_worker_healthcheck=10,
        backlog=2048,
    )
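Assuming the snippet above is saved as docs_src/reference/programmatic_run.py, matching the application string it passes to run, it can be launched directly:
python docs_src/reference/programmatic_run.py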
CLI example:
palfrey main:app --workers 4 --host 0.0.0.0 --port 8000
Gunicorn integration with PalfreyWorker¶
Palfrey includes Gunicorn worker classes:
- palfrey.workers.PalfreyWorker
- palfrey.workers.PalfreyH11Worker
Direct command example:
gunicorn main:app -k palfrey.workers.PalfreyWorker -w 4 -b 0.0.0.0:8000
H11-specific worker example:
gunicorn main:app -k palfrey.workers.PalfreyH11Worker -w 4 -b 0.0.0.0:8000
Gunicorn config file example:
"""Example Gunicorn config using Palfrey worker classes."""
from __future__ import annotations
bind = "0.0.0.0:8000"
workers = 4
worker_class = "palfrey.workers.PalfreyWorker"
# Optional Gunicorn settings that interact with Palfrey worker runtime.
keepalive = 5
timeout = 30
max_requests = 20000
max_requests_jitter = 2000
# Forwarded header trust can also be controlled in Gunicorn settings.
forwarded_allow_ips = "127.0.0.1"
Run using config:
gunicorn main:app -c docs_src/operations/gunicorn_conf.py
Worker health and recycle controls¶
- --timeout-worker-healthcheck
- --limit-max-requests
- --limit-max-requests-jitter
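A sketch of a CLI invocation combining these flags. The values are illustrative, not recommendations: a 10-second health-check timeout (mirroring the programmatic example above) and worker recycling after roughly 20,000 requests with up to 2,000 requests of jitter (mirroring the Gunicorn config above):
palfrey main:app --workers 4 --timeout-worker-healthcheck 10 --limit-max-requests 20000 --limit-max-requests-jitter 2000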
Sizing guidance¶
- start near core count (see the sketch after this list)
- benchmark realistic workload
- observe CPU, memory, tail latency
- adjust incrementally
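A minimal sketch of the "start near core count" rule, assuming a helper named initial_worker_count (not part of Palfrey) and the main:app application string from the CLI example above:
from __future__ import annotations

import multiprocessing

from palfrey import run


def initial_worker_count() -> int:
    """Hypothetical starting point: one worker per CPU core, at least one."""
    return max(multiprocessing.cpu_count(), 1)


if __name__ == "__main__":
    # Treat this as a starting value; benchmark a realistic workload,
    # watch CPU, memory, and tail latency, then adjust incrementally.
    run("main:app", host="0.0.0.0", port=8000, workers=initial_worker_count())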
Important behavior notes¶
- each worker has independent memory/process state
- each worker runs its own lifespan startup/shutdown
- worker count multiplies load on external dependencies (for example, database connection pool pressure); see the sketch below
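A sketch of how per-worker state and lifespan events interact, using a bare ASGI application and a hypothetical create_pool factory (neither is part of Palfrey). Every worker runs this startup and shutdown independently, so four workers each hold their own connection pool:
from __future__ import annotations


async def create_pool(size: int):
    """Hypothetical stand-in for a database connection pool factory."""
    return {"size": size}


pool = None  # Module-level state is independent in every worker process.


async def app(scope, receive, send):
    global pool
    if scope["type"] == "lifespan":
        while True:
            message = await receive()
            if message["type"] == "lifespan.startup":
                # Runs once per worker: 4 workers with size=5 means
                # up to 20 connections against the database.
                pool = await create_pool(size=5)
                await send({"type": "lifespan.startup.complete"})
            elif message["type"] == "lifespan.shutdown":
                pool = None
                await send({"type": "lifespan.shutdown.complete"})
                return
    elif scope["type"] == "http":
        await send({"type": "http.response.start", "status": 200, "headers": []})
        await send({"type": "http.response.body", "body": b"ok"})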
Non-technical summary¶
Workers are additional runtime lanes. More lanes can increase throughput, but each lane consumes resources.