perf: add ingest and search load-test harnesses

scripts/generate_synthetic_pdfs.py builds real PDF/1.4 documents with
a hand-written xref so we can generate tens of thousands of ~2 KB
PDFs locally. Helvetica only covers latin-1, which is fine for a
load generator (throughput, not retrieval relevance); the docstring
calls this out so no one mistakes the output for a quality corpus.
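
For reviewers unfamiliar with the format, the hand-written xref idea
can be sketched roughly as below. This is a minimal illustration, not
the code in scripts/generate_synthetic_pdfs.py: build_pdf and the
five-object layout are assumptions made for the example.

```python
def build_pdf(text: str) -> bytes:
    """Return a one-page PDF/1.4 document showing `text` in Helvetica.

    Hypothetical helper for illustration only.
    """
    # Escape the characters that are special inside a PDF literal string.
    escaped = text.replace("\\", r"\\").replace("(", r"\(").replace(")", r"\)")
    # latin-1 encoding fails loudly on anything Helvetica cannot show here.
    stream = f"BT /F1 12 Tf 72 720 Td ({escaped}) Tj ET".encode("latin-1")
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica "
        b"/Encoding /WinAnsiEncoding >>",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset where "N 0 obj" starts
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_offset = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 is the free-list head
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # every entry is 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_offset
    return bytes(out)
```

Because the page text is encoded as latin-1, Cyrillic input raises
UnicodeEncodeError in this sketch, which mirrors the Helvetica
limitation noted above.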

scripts/load_ingest.py drives POST /ingest/folder, then polls a
hypothetical /documents/stats endpoint every poll-interval seconds
to track how many documents have reached a terminal state. It writes
a JSON history report so results can be diffed between runs.
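
The poll-and-report loop could look roughly like the sketch below.
It is not the code in scripts/load_ingest.py: poll_until_done,
write_report, the TERMINAL status names, and the report layout are
all assumptions. The stats fetcher is passed in as a callable so the
loop can be exercised without a running service.

```python
from __future__ import annotations

import json
import time
from typing import Callable

# Hypothetical terminal status names; the real service may use others.
TERMINAL = ("indexed", "failed")


def poll_until_done(
    fetch_stats: Callable[[], dict],
    total: int,
    poll_interval: float = 5.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[dict]:
    """Poll until `total` documents are terminal or `timeout` elapses."""
    started = time.monotonic()
    history = []
    while True:
        stats = fetch_stats()  # e.g. the parsed JSON body of the stats call
        elapsed = time.monotonic() - started
        done = sum(stats.get(name, 0) for name in TERMINAL)
        history.append(
            {"elapsed_s": round(elapsed, 3), "done": done, "stats": stats}
        )
        if done >= total or elapsed >= timeout:
            return history
        sleep(poll_interval)


def write_report(history: list[dict], path: str) -> None:
    """Write the samples as stable, sorted JSON so two runs diff cleanly."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"samples": history}, fh, indent=2, sort_keys=True)
```

Sorting keys and using a fixed indent keeps the report layout stable,
so a plain text diff between two runs shows only changed values.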

scripts/locustfile_search.py defines a SearchUser profile mixing
hybrid / lexical / semantic queries (weighted 3:1:1) against
POST /search plus a health-check sampler (GET /health, task weight 1
against 8 for search). It asserts non-empty results so a "200 with
zero hits" regression surfaces as a failure rather than a green
percentile graph.

RUNBOOK gains a Load testing section with CPU/GPU SLO tables for
both axes (sustained docs/min, search latency p50/p95/p99).
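
To compare raw measurements against those SLO tables, the two axes
can be computed along these lines. This is a sketch only: the
function names are hypothetical, and the percentile method (the
"inclusive" method of statistics.quantiles) is an assumption that may
differ slightly from what Locust itself reports.

```python
from __future__ import annotations

import statistics


def latency_percentiles(latencies_ms: list[float]) -> dict[str, float]:
    """Return p50/p95/p99 for a list of at least two latency samples."""
    # quantiles(n=100) returns the 99 cut points p1..p99, so pN is cuts[N-1].
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def docs_per_minute(docs_done: int, elapsed_s: float) -> float:
    """Sustained ingest throughput over the whole measurement window."""
    return docs_done / elapsed_s * 60.0
```

Averaging over the whole window, rather than over the fastest
interval, is what makes the docs/min figure a sustained rate.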

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Vadim Malanov
2026-05-13 17:11:08 +03:00
parent 349f4ea838
commit a97d0bbcfd
4 changed files with 379 additions and 0 deletions


@@ -0,0 +1,72 @@
"""Locust load profile for the LegacyHUB hybrid search API.

Run:

    pip install locust
    locust -f scripts/locustfile_search.py \
        --host http://localhost:8000 \
        --users 50 --spawn-rate 5 --run-time 5m

Or headless with HTML report:

    locust -f scripts/locustfile_search.py --host http://localhost:8000 \
        --headless --users 100 --spawn-rate 10 --run-time 10m \
        --html load_search.html
"""

from __future__ import annotations

import random

from locust import HttpUser, between, task


QUERIES = [
    "ГОСТ 21.501-93 рабочие чертежи",
    "класс бетона B25",
    "регламент технического обслуживания",
    "контроль качества сварных соединений",
    "схема электропитания корпус 3",
    "журнал ремонтов узлов",
    "правила производства земляных работ",
    "акты приемки скрытых работ",
    "fundament concrete grade",
    "maintenance schedule appendix",
]

MODES = ["hybrid", "hybrid", "hybrid", "lexical", "semantic"]


class SearchUser(HttpUser):
    wait_time = between(0.5, 2.5)
    api_prefix = "/api/v1"

    @task(8)
    def hybrid_search(self):
        body = {
            "query": random.choice(QUERIES),
            "limit": random.choice([5, 10, 20]),
            "filters": {
                "document_id": None,
                "source_path": None,
                "block_type": None,
                "min_ocr_confidence": None,
            },
            "search_mode": random.choice(MODES),
        }
        with self.client.post(
            f"{self.api_prefix}/search",
            json=body,
            name="POST /search",
            catch_response=True,
        ) as res:
            if res.status_code != 200:
                res.failure(f"HTTP {res.status_code}: {res.text[:120]}")
                return
            data = res.json()
            if not data.get("results"):
                res.failure("empty results")

    @task(1)
    def health(self):
        self.client.get(f"{self.api_prefix}/health", name="GET /health")