Benchmarks

Performance comparison of django-vtasks against Celery and django-tasks-rq.

Methodology

All benchmarks simulate an async Django view dispatching tasks — the common ASGI use case.

  • Enqueue: All frameworks enqueue from an async context (await aenqueue() for VTasks, sync_to_async(task.delay) for Celery, sync_to_async(backend.enqueue) for RQ). This reflects real async Django views.
  • Processing: Each framework runs a single worker process with a concurrency of 200 (except RQ, which is single-threaded).
  • Task types: NoOp (raw overhead) and Sleep 10ms (simulates a lightweight DB query or API call).
  • Infrastructure: Valkey 9, Python 3.14, Docker containers on the same host.
  • Cloud simulation: 2ms network latency added via tc netem on the Valkey container, simulating cloud-hosted Valkey/ElastiCache.
  • Measurement: Total time from worker start to all tasks completed, including worker startup.
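
To make the enqueue setup above concrete, here is a minimal, self-contained sketch of the two enqueue styles being compared. It is not the code in benchmarks/compare.py: native_enqueue and blocking_enqueue are hypothetical stand-ins for aenqueue() and task.delay(), asyncio.to_thread stands in for sync_to_async, the 0.5ms sleep is an arbitrary placeholder for a broker call, and issuing all enqueues concurrently is an assumption.

```python
import asyncio
import time


def blocking_enqueue(payload: int) -> int:
    """Hypothetical sync client call, standing in for task.delay()."""
    time.sleep(0.0005)  # placeholder for a broker round trip
    return payload


async def native_enqueue(payload: int) -> int:
    """Hypothetical async client call, standing in for aenqueue()."""
    await asyncio.sleep(0.0005)  # placeholder for a broker round trip
    return payload


async def wrapped_enqueue(payload: int) -> int:
    # asyncio.to_thread plays the role of sync_to_async here: the blocking
    # call runs in a worker thread so the event loop is not blocked.
    return await asyncio.to_thread(blocking_enqueue, payload)


async def enqueue_rate(enqueue_one, n: int) -> float:
    """Enqueue n payloads concurrently and return the rate in ops/s."""
    start = time.perf_counter()
    results = await asyncio.gather(*(enqueue_one(i) for i in range(n)))
    elapsed = time.perf_counter() - start
    assert len(results) == n
    return n / elapsed


async def main(n: int = 200) -> dict:
    return {
        "native": await enqueue_rate(native_enqueue, n),
        "wrapped": await enqueue_rate(wrapped_enqueue, n),
    }


if __name__ == "__main__":
    for name, rate in asyncio.run(main()).items():
        print(f"{name}: {rate:,.0f} ops/s")
```

The absolute numbers this prints say nothing about any real framework. The sketch only shows where the thread hop sits in the wrapped path and how an ops/s figure is derived from a wall-clock window.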

Results — Local (0ms latency)

Framework       Task type   Tasks  Enqueue (ops/s)  Process (ops/s)  Peak RSS (MB)  Valkey conns
VTasks          NoOp        5,000            4,471            4,493             76             3
VTasks          Sleep 10ms  5,000            5,203            3,796             76             3
Celery Threads  NoOp        5,000            2,142            1,042            122            11
Celery Threads  Sleep 10ms  5,000            2,228              894            123            11
RQ              NoOp          500              500               52            174             4
RQ              Sleep 10ms    500              436               25            170             4

Results — Simulated Cloud (2ms RTT)

To simulate a production environment where Valkey runs on a separate host (e.g. AWS ElastiCache), we add 2ms network latency using tc netem on the Valkey container.

Framework       Task type   Tasks  Enqueue (ops/s)  Process (ops/s)  Peak RSS (MB)  Valkey conns
VTasks          NoOp        5,000              398              403             74             3
VTasks          Sleep 10ms  5,000              395              411             74             3
Celery Threads  NoOp        5,000              293              121            122            11
Celery Threads  Sleep 10ms  5,000              292              121            122            11
RQ              NoOp          500               55               11            174             4
RQ              Sleep 10ms    500               55                9            170             4

Analysis

vs Celery Threads (same deployment model: 1 process, 200 concurrency)

  • Enqueue: VTasks is ~2x faster locally (4,471 vs 2,142 ops/s for NoOp) and ~1.4x faster at 2ms RTT (398 vs 293 ops/s), reflecting native async vs sync_to_async wrapping
  • Processing (local): VTasks is ~4x faster for I/O tasks (3,796 vs 894 ops/s)
  • Processing (cloud): VTasks is ~3.4x faster (411 vs 121 ops/s)
  • Memory: VTasks uses 38% less RAM (76 MB vs 123 MB)
  • Connections: VTasks uses 3 Valkey connections vs 11

vs RQ (django-tasks ecosystem)

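  • Processing: VTasks is roughly 85-150x faster locally (4,493 vs 52 ops/s for NoOp, 3,796 vs 25 ops/s for Sleep 10ms) and roughly 35-45x faster at 2ms RTT (403 vs 11 and 411 vs 9 ops/s); RQ is single-threaded with no async support
  • Enqueue: VTasks is ~10x faster locally (4,471 vs 500 ops/s for NoOp) and ~7x faster at 2ms RTT (398 vs 55 ops/s)
  • Memory: VTasks uses 56% less RAM (76 MB vs 174 MB)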
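
The ratios quoted in this analysis follow directly from the two results tables. As a quick check, the sketch below recomputes a few of them. The dictionary layout and the helper names speedup and ram_saving are illustrative choices, not part of the benchmark suite.

```python
# (enqueue ops/s, process ops/s, peak RSS in MB), copied from the results tables.
LOCAL = {
    ("VTasks", "NoOp"): (4471, 4493, 76),
    ("VTasks", "Sleep 10ms"): (5203, 3796, 76),
    ("Celery", "NoOp"): (2142, 1042, 122),
    ("Celery", "Sleep 10ms"): (2228, 894, 123),
    ("RQ", "NoOp"): (500, 52, 174),
    ("RQ", "Sleep 10ms"): (436, 25, 170),
}
CLOUD = {
    ("VTasks", "NoOp"): (398, 403, 74),
    ("VTasks", "Sleep 10ms"): (395, 411, 74),
    ("Celery", "NoOp"): (293, 121, 122),
    ("Celery", "Sleep 10ms"): (292, 121, 122),
    ("RQ", "NoOp"): (55, 11, 174),
    ("RQ", "Sleep 10ms"): (55, 9, 170),
}
ENQUEUE, PROCESS, RSS = 0, 1, 2  # column indexes into the tuples above


def speedup(table, other, task, column):
    """How many times higher the VTasks figure is than `other` for one column."""
    return table[("VTasks", task)][column] / table[(other, task)][column]


def ram_saving(table, other, task):
    """Fraction of peak RSS that VTasks saves relative to `other`."""
    return 1 - table[("VTasks", task)][RSS] / table[(other, task)][RSS]


if __name__ == "__main__":
    for label, table in (("local", LOCAL), ("2ms RTT", CLOUD)):
        for other in ("Celery", "RQ"):
            for task in ("NoOp", "Sleep 10ms"):
                print(
                    f"{label:8} vs {other:6} {task:10} "
                    f"enqueue x{speedup(table, other, task, ENQUEUE):.1f} "
                    f"process x{speedup(table, other, task, PROCESS):.1f} "
                    f"RAM -{ram_saving(table, other, task):.0%}"
                )
```

Printing every combination also makes it easy to see which ratios hold only for the local run and which survive the added latency.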

Why VTasks is faster

  • Rust I/O driver (django-vcache) — async Valkey communication with minimal overhead
  • Native asyncio — no thread pool overhead for async tasks, no sync_to_async tax on enqueue
  • Minimal connections — 2-3 multiplexed connections vs per-worker connections
  • Efficient serialization: orjson with optional zstd compression

Network latency impact

With 2ms RTT, every framework slows down sharply because each Valkey operation pays the round-trip cost. VTasks keeps most of its relative advantage over Celery (~3.4x vs ~4x locally), but its absolute throughput drops ~10x (4,493 to 403 ops/s for NoOp). The worker currently fetches one task at a time via BLMOVE; a prefetch optimization that batch-fetches tasks with BLMPOP is planned.
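
A back-of-envelope model helps explain why a one-task-per-fetch loop is so sensitive to latency. The sketch below is an assumption-laden illustration, not the project's own analysis: it treats fetches as strictly serialized, charges one round trip per batch, and takes the local NoOp rate as the per-task cost without network latency. The function name modelled_rate and the batch sizes are hypothetical.

```python
def modelled_rate(local_rate: float, rtt_s: float, batch: int = 1) -> float:
    """Estimate tasks/s when every `batch` tasks cost one extra round trip.

    Assumes fetches are strictly serialized, which is a simplification.
    """
    if local_rate <= 0 or rtt_s < 0 or batch < 1:
        raise ValueError("local_rate > 0, rtt_s >= 0 and batch >= 1 required")
    per_task_s = 1.0 / local_rate + rtt_s / batch
    return 1.0 / per_task_s


if __name__ == "__main__":
    local_noop = 4493  # VTasks NoOp processing rate at 0ms latency (ops/s)
    rtt = 0.002        # 2ms round trip added by tc netem
    for batch in (1, 10, 100):  # batch sizes are hypothetical
        print(batch, round(modelled_rate(local_noop, rtt, batch)))
```

With batch=1 the model gives roughly 450 ops/s, in the same range as the measured 403 ops/s, which is consistent with about one round trip per task. Larger batch values shrink the latency term, which is the intuition behind prefetching. Actual gains would depend on how the prefetch is implemented.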

Running Benchmarks

# Start services
docker compose up -d db cache

# Run the comparison suite
docker compose run --rm web uv run python benchmarks/compare.py

# For cloud simulation, add 2ms latency to the Valkey container:
# (requires cap_add: NET_ADMIN on the cache service in compose.yml)
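# (sh -c keeps the whole && chain inside the container instead of running apt-get install on the host)
docker compose exec cache sh -c "apt-get update -qq && apt-get install -y -qq iproute2"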
docker compose exec cache tc qdisc add dev eth0 root netem delay 2ms

# Run benchmarks again
docker compose run --rm web uv run python benchmarks/compare.py

# Remove latency
docker compose exec cache tc qdisc del dev eth0 root

Notes

  • These benchmarks focus on throughput under controlled conditions
  • Real-world performance depends on task complexity, network latency, and infrastructure
  • Celery offers features (chains, chords, result backends) that django-vtasks intentionally omits
  • RQ uses 500 tasks (vs 5,000) because its single-threaded worker would take too long otherwise
  • Celery enqueue uses sync_to_async(task.delay) — this is what you actually do in an async Django view
  • VTasks enqueue uses native await task.aenqueue() — no sync wrapping needed

Benchmarks last updated: April 2026