ameer 2136286275 add live stress harness, app-level admin login rate limit
tests/stress/live_accuracy.mjs: classroom-scale accuracy + latency
test that targets the deployed server (single-session, sid=main).
Logs in as admin via /admin/login, resets the session, joins N
students serially over HTTP, opens N student WebSockets in batches
of 8 (250ms apart) plus the instructor WS, then drives every
question through the admin "next" command. Each student picks
uniformly random A-D, sends the submit, waits for the submit_ack,
and records the round-trip latency. After session_ended, the script
verifies that every student whose pick == correct got score > 0,
every other submission got score == 0, and reports p50/p95/p99
ack latency. First live run: 50 students, 100 submits, 100% acks,
100% accuracy match, p99 555ms (≈intercontinental RTT to HK).
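
The end-of-session check and latency report described above could be sketched roughly as follows. This is not the code in live_accuracy.mjs: the record shape ({ pick, correct, score, ackMs }) and the function names are hypothetical, and the nearest-rank percentile is one common definition chosen for illustration.

```javascript
// Hypothetical per-submission record: { pick, correct, score, ackMs }.
// ackMs is the submit -> submit_ack round trip, or null if no ack arrived.

// Nearest-rank percentile over an ascending-sorted array of numbers.
function percentile(sortedAsc, p) {
  if (sortedAsc.length === 0) return NaN;
  const rank = Math.ceil((p / 100) * sortedAsc.length); // 1-based rank
  return sortedAsc[Math.min(sortedAsc.length, Math.max(1, rank)) - 1];
}

function summarize(submissions) {
  // A correct pick must score > 0; any other submission must score exactly 0.
  const mismatches = submissions.filter((s) =>
    s.pick === s.correct ? !(s.score > 0) : s.score !== 0
  );
  const acked = submissions.filter((s) => Number.isFinite(s.ackMs));
  const latencies = acked.map((s) => s.ackMs).sort((a, b) => a - b);
  const n = submissions.length;
  return {
    submits: n,
    ackRate: n ? acked.length / n : 0,
    accuracyMatch: n ? 1 - mismatches.length / n : 0,
    p50: percentile(latencies, 50),
    p95: percentile(latencies, 95),
    p99: percentile(latencies, 99),
    mismatches,
  };
}
```

Returning the mismatching records, rather than only a ratio, keeps a failed run debuggable from the summary alone.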

tests/stress/live_loop.sh: tmux-friendly loop that runs the live
test every 60s and appends a JSONL summary line per cycle to
runs/live_summary.jsonl. Mirrors the morning's api_stress run_loop
shape so per-cycle aggregates are easy to scrape.

app/rate_limit.py: tiny in-memory token bucket. Capacity and refill
rate are given in tokens/minute; buckets are keyed by client IP from
X-Forwarded-For (falling back to request.client.host). State is
process-local; admin login is the only caller.

POST /admin/login: rate-limited at 10 attempts/minute/IP. Generous
for the legit instructor (who succeeds in 1-2 tries) and prohibitive
for brute force from a single attacker IP. Student endpoints are
deliberately NOT rate-limited because campus students share NAT
gateways and IP-level limits would false-positive a whole class.

The bucket is per-app-instance (instantiated inside the router
factory), so test apps each get a fresh one and tests don't poison
each other.
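
For readers unfamiliar with the technique, a token bucket keyed by client IP can be sketched as below. The real module is Python (app/rate_limit.py); this illustration is written in JavaScript to match the harness and is not a port of it. The class name, the injectable clock, and the choice to take the first X-Forwarded-For entry are all assumptions.

```javascript
// Illustrative token bucket; NOT the code in app/rate_limit.py.
class TokenBucket {
  // `now` is injectable so tests can control time (an assumption of this sketch).
  constructor(capacity, refillPerMinute, now = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerMs = refillPerMinute / 60000;
    this.now = now;
    this.buckets = new Map(); // key -> { tokens, last }
  }

  // Returns true if the request identified by `key` may proceed.
  allow(key) {
    const t = this.now();
    const b = this.buckets.get(key) ?? { tokens: this.capacity, last: t };
    // Refill in proportion to elapsed time, capped at capacity.
    b.tokens = Math.min(this.capacity, b.tokens + (t - b.last) * this.refillPerMs);
    b.last = t;
    const ok = b.tokens >= 1;
    if (ok) b.tokens -= 1;
    this.buckets.set(key, b);
    return ok;
  }
}

// Client key: first X-Forwarded-For entry if present, else the peer address.
// Which entry the real module trusts is not stated; "first" is an assumption.
function clientKey(headers, peerHost) {
  const xff = headers["x-forwarded-for"];
  return xff ? xff.split(",")[0].trim() : peerHost;
}
```

Creating one TokenBucket per app instance, as the commit describes, means each test app starts with full buckets.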
2026-05-03 00:23:07 +08:00

Quiz portal stress harness

Adversarial frontend + API stress tests for the quiz portal. Built 2026-05-02.

Files

  • lib.mjs — shared helpers: server boot, cookie jar, Student and Admin WS wrappers, the fixed STRESS_POOL.
  • api_stress.mjs — pure WS adversarial scenarios (no browser): happy path with 20 concurrent students, late join, mid-question disconnect, sleep/wake to next question (the phone-screen-sleep scenario), cookie tampering, cross-session cookie reuse, duplicate student_id, bad submits (out-of-order, wrong idx, resubmit), close-boundary race, malformed-JSON fuzz, flaky reconnect.
  • ui_stress.mjs — Playwright/Chromium scenarios that exercise the real SPA: happy UI flow, sleep/wake by closing+reopening browser context with persisted cookie, cookie-tamper via document.cookie, two browsers with same student_id.
  • run_loop.sh — bash wrapper that runs api_stress.mjs every cycle and ui_stress.mjs every UI_EVERY cycles (default 5), with a fresh random seed each time. Logs JSON summary lines to runs/summary.jsonl and full output to runs/run-<timestamp>.jsonl.

Quick start

# One-shot
node api_stress.mjs              # uses Date.now() seed
node api_stress.mjs 12345 8210   # explicit seed + port
node ui_stress.mjs               # browser-based; HEADLESS=0 to watch

# Long-running loop in tmux
tmux new -d -s quiz_stress 'cd /home/ameer/RD/Projects/Apps/quiz/tests/stress && bash run_loop.sh'
tmux attach -t quiz_stress       # to watch
tmux send -t quiz_stress C-c     # to stop

Each cycle boots a fresh uvicorn on its own port with a clean DB, runs the scenarios, then tears everything down. Failures are recorded in the failures array of the per-cycle summary line.
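
Since every cycle appends one JSON object per line, the summary file can be scanned with a few lines of code. The sketch below assumes only what is stated above, namely that each line carries a failures array. The function name is hypothetical and no other field is interpreted.

```javascript
// Assumes one JSON object per line with a `failures` array; the shape of the
// entries inside `failures` is not assumed, they are passed through as-is.
function summarizeCycles(jsonlText) {
  const cycles = jsonlText
    .split("\n")
    .filter((line) => line.trim() !== "") // tolerate blank or trailing lines
    .map((line) => JSON.parse(line));
  const failed = cycles.filter(
    (c) => Array.isArray(c.failures) && c.failures.length > 0
  );
  return { cycles: cycles.length, failedCycles: failed.length, failed };
}
```

A caller would typically pass the contents of runs/summary.jsonl, for example read with Node's fs.readFileSync(path, "utf8"). A line truncated by killing the loop mid-write would make JSON.parse throw, so a long-running monitor may want to catch that per line.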

Known findings (tracked outside this dir)

  • Codex bug: in app/room.py, student_ws (line ~87) and instructor_ws call await websocket.receive_json(), whose JSON parsing can raise JSONDecodeError, but the surrounding try/except only catches WebSocketDisconnect. Result: a single malformed message kills that client's WS handler. The fuzz scenario in api_stress.mjs flags this consistently. Fix: catch (JSONDecodeError, RuntimeError) around the receive, then either close cleanly or send {"type":"error","code":"bad_message"} and continue.

Adding scenarios

Write an async function name(server) { ... } in api_stress.mjs (or name(server, browser) for UI scenarios), add it to the SCENARIOS map / array, and re-run. Use expect(cond, scenario, msg, extra) for assertions and note(scenario, msg) for warnings that shouldn't fail the suite.

Critical pattern: pre-register waitFor waiters BEFORE the action that triggers the message. Student.waitFor(type) resolves only on NEW messages, not cached ones, so that stale state cannot produce a false pass.
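
The ordering rule for waitFor is easiest to see with a toy client. MiniClient below is a stand-in written for this illustration, not the Student class from lib.mjs. It only mimics the documented behaviour, where a waiter sees messages that arrive after it was registered, and the timeout parameter is an assumption.

```javascript
// Stand-in for the real WS wrapper: waitFor(type) resolves only on messages
// that arrive AFTER the call.
class MiniClient {
  constructor() {
    this.waiters = []; // { type, resolve }
  }
  waitFor(type, timeoutMs = 1000) {
    return new Promise((resolve, reject) => {
      const waiter = { type, resolve: null };
      const timer = setTimeout(() => {
        this.waiters = this.waiters.filter((w) => w !== waiter);
        reject(new Error(`timeout waiting for ${type}`));
      }, timeoutMs);
      waiter.resolve = (msg) => {
        clearTimeout(timer);
        resolve(msg);
      };
      this.waiters.push(waiter);
    });
  }
  // Called for each incoming message; messages with no waiter are dropped.
  deliver(msg) {
    const hit = this.waiters.filter((w) => w.type === msg.type);
    this.waiters = this.waiters.filter((w) => w.type !== msg.type);
    hit.forEach((w) => w.resolve(msg));
  }
}

async function demo() {
  const student = new MiniClient();

  // Correct: register the waiter first, then trigger the message.
  const ack = student.waitFor("submit_ack");
  student.deliver({ type: "submit_ack", ok: true }); // stands in for the server reply
  const good = await ack;

  // Wrong: the ack arrives before anyone is waiting, so waitFor times out.
  student.deliver({ type: "submit_ack", ok: true });
  const late = await student.waitFor("submit_ack", 50).catch((e) => e.message);
  return { good, late };
}
```

In a real scenario the deliver call corresponds to whatever action makes the server emit the message, such as sending a submit or issuing the admin "next" command.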