Commit Graph

11 Commits

Author SHA1 Message Date
Vadim Malanov
463622c644 deps: tighten version ranges, pin Docling to <2.15
Docling's DocumentConverter shape (text_items, prov[0].page_no,
export_to_markdown signature) still moves between 2.x minor releases.
Cap docling to >=2.0.0,<2.15 so a wheel bump cannot silently break
the defensive walkers in app/ingestion/docling_extractor.py until a
staging smoke test has run against the new minor.

Every other runtime dep gets the same major/minor upper bound:
- web/api: fastapi <0.117, uvicorn <0.33, pydantic <3
- db: sqlalchemy <2.1, psycopg <3.3, alembic <1.14
- search: opensearch-py <3, qdrant-client <1.13
- ingest: ocrmypdf <17, pikepdf <10, pypdf <6
- ml: FlagEmbedding <2, sentence-transformers <4, transformers <5,
      torch <3, numpy <3
- ops/utils: structlog <26, orjson <4, httpx <0.29, click <9

Lift any specific upper bound only after the corresponding regression
test passes on a staging upgrade.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 17:12:15 +03:00
Vadim Malanov
a97d0bbcfd perf: add ingest and search load-test harnesses
scripts/generate_synthetic_pdfs.py builds real PDF/1.4 documents with
a hand-written xref so we can generate tens of thousands of ~2 KB
PDFs locally. Helvetica only covers latin-1, which is fine for a
load generator (throughput, not retrieval relevance); the docstring
calls this out so no one mistakes the output for a quality corpus.

scripts/load_ingest.py drives POST /ingest/folder, then polls a
hypothetical /documents/stats endpoint every poll-interval seconds
to track terminal-state progression. Writes a JSON history report so
results can be diffed between runs.

scripts/locustfile_search.py defines a SearchUser profile mixing
hybrid / lexical / semantic queries against POST /search plus a
health-check sampler. Asserts non-empty results so a "200 with
zero hits" regression surfaces as a failure rather than a green
percentile graph.

RUNBOOK gains a Load testing section with CPU/GPU SLO tables for
both axes (sustained docs/min, search latency p50/p95/p99).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 17:11:08 +03:00
Vadim Malanov
349f4ea838 perf(reranker): add benchmark harness and passage clipping
- scripts/benchmark_reranker.py exercises the configured reranker
  with synthetic queries or live OpenSearch samples and prints
  p50/p95/p99 latency, mean latency, and pairs/sec throughput.
  Supports --warmup, --candidates, --passage-length, --source, and a
  --json-only mode for CI.
- app/indexing/reranker.py clips passages to 2048 characters before
  scoring so a runaway chunk cannot starve the cross-encoder beyond
  bge-reranker-v2-m3's training window.
- RUNBOOK.md gains a Reranker benchmark section with CPU/GPU SLO
  targets and a remediation ladder (lower top-K, raise batch size,
  switch device, disable reranker) when measured p95 exceeds budget.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 17:08:04 +03:00
Vadim Malanov
f42fb978a8 chore: drop dead _qid helper and surface ocr_confidence on SearchHit
- app/indexing/qdrant_client.py: remove the identity-only _qid()
  helper and pass chunk_id straight to PointStruct (Qdrant accepts
  the UUID string directly).
- services/types.ts: SearchHit gets an explicit, optional
  ocr_confidence field so consumers can type the value instead of
  casting through metadata.
- widgets/SearchResultCard.tsx: replaces the
  (hit.metadata as { ocr_confidence? }) cast with the new field. No
  behavior change when the backend omits it.

tsc --noEmit: clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:55:32 +03:00
Vadim Malanov
785d3be970 test: add Alembic migration smoke and /search contract tests
tests/test_alembic.py points Alembic at an in-process SQLite database
in --sql mode so the migration files are validated end to end without
needing the real Postgres compose service. Asserts the documents,
chunks, and processing_events tables plus the unique constraints
appear in the generated DDL, and that the revision graph stays
linear at 0001_initial.

tests/test_routes_search.py monkeypatches
app.indexing.hybrid_search.run_search so the FastAPI route can be
exercised with the real SearchRequest/SearchResponse schemas. Covers
the happy path (rank, citation, reranked flag) and that empty queries
are rejected at schema validation before the backend is called.

pytest tests/test_alembic.py tests/test_routes_search.py -q: 4 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:54:15 +03:00
Vadim Malanov
d3c96161b0 ops: add docker-compose.prod.yml overlay
Production overlay narrows the dev defaults:
- removes published ports from postgres, minio, opensearch, qdrant,
  redis - only the api container stays externally reachable;
- enables the OpenSearch security plugin and requires
  OPENSEARCH_ADMIN_PASSWORD via ?:required interpolation;
- requires Qdrant API key, MinIO root credentials, postgres password,
  and CORS_ALLOWED_ORIGINS to be set (no localhost fallback);
- doubles OpenSearch heap (-Xms2g -Xmx2g) and worker concurrency to 4;
- drops the MinIO management console.

Validated with:
  set -a; . .env.prod.example; CORS_ALLOWED_ORIGINS=https://example.com
  docker compose -f docker-compose.yml -f docker-compose.prod.yml config

The RUNBOOK was updated in the initial commit and already documents
the overlay invocation and credential rotation workflow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:52:57 +03:00
Vadim Malanov
a375ca55b9 refactor: extract ensure_artifact into app/storage/artifacts.py
The artifact-upsert helper was duplicated four times (scanner.py,
table_processor.py, figure_processor.py, pipeline.py) with slightly
different signatures. Consolidates into a single keyword-only function
keyed on (document_id, storage_key) - the identity the schema already
enforces - so re-running the pipeline never creates duplicate rows.

scanner / table_processor / figure_processor now import the shared
helper directly. pipeline.py keeps a thin local wrapper to preserve
the positional call sites at three artifact upsert points (OCR_PDF,
MARKDOWN, DOCLING_JSON).

Tests: 24 passed (5 health + 19 original).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:51:54 +03:00
Vadim Malanov
cd9977f8c3 feat(api): add CORS middleware and /health contract test
CORS:
- New setting CORS_ALLOWED_ORIGINS (comma separated). Defaults cover
  the three local Vite ports (5173, 5273, 4173); production overlay
  expects the real origin in .env.prod.
- main.py wires CORSMiddleware from settings.cors_origins. No * in
  production - see RUNBOOK and .env.prod.example.
- docker-compose.yml forwards the variable to both api and worker.

Tests:
- tests/test_api_health.py uses FastAPI TestClient and monkeypatches
  the five probe functions (postgres/minio/opensearch/qdrant/redis).
  Verifies the all-ok, any-error, and degraded paths, that the root
  endpoint reports the configured api prefix, and that the CORS
  preflight echoes the allowed origin.
- pytest tests/test_api_health.py -q: 5 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:48:49 +03:00
Vadim Malanov
eecdfaa847 fix(frontend): clear TypeScript strict-mode errors
- vite-env.d.ts now declares ImportMetaEnv with the three VITE_*
  variables the project uses, restoring proper typing for
  import.meta.env in apiClient.ts.
- QualityFlag.tsx widens its 'flags' prop to accept the domain
  QualityFlags type, the loose Record form used in mocks, or null,
  ending the structural-mismatch errors at five callsites
  (DocumentsPage, DocumentViewerPage, QualityControlPage,
  ChunkPreview, SearchResultCard).
- DashboardPage trend callbacks are typed against DashboardStats so
  the implicit-any complaints disappear without weakening intent.

npx tsc --noEmit -> clean. vite build -> ok.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:46:56 +03:00
Vadim Malanov
54714b5757 ci: add GitHub Actions workflow and ESLint v9 config
Adds two-job CI (backend + frontend) running ruff, pytest (unit only -
skipping heavy ML deps), docker compose config validation for both dev
and prod overlays, plus npm ci -> eslint -> tsc -> vite build for the
frontend.

ESLint config uses the v9 flat-config format that the project was
already on (eslint v9 dropped .eslintrc support); replaces the broken
'eslint . --ext' invocation and adds @typescript-eslint, react-hooks,
and react-refresh plugins.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:44:04 +03:00
Vadim Malanov
7f72171572 chore: bootstrap repository with governance docs
Initialize git, add Apache-2.0 LICENSE, .gitattributes (LF line
endings), AGENTS.md (entry points, stack, discovery order, baseline
checks), RUNBOOK.md (dev boot, prod deploy with overlay, ingestion,
failures, rollback, scaling notes), .env.prod.example with rotated
credential placeholders, and dev-only warnings on .env.example.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:41:50 +03:00