13 Commits

Author SHA1 Message Date
Vadim Malanov
0a407f1f09 perf(frontend): route-based code splitting + vendor chunking
Splits each page into its own lazily-imported chunk via React.lazy
with Suspense fallback (a skeleton matching the dashboard layout
shape). Adds a vite manualChunks function that pushes heavy third-
party libraries into long-lived vendor chunks so page chunks stay
small and the vendor cache survives release cycles.

Vendor groupings: vendor-react, vendor-router, vendor-tanstack,
vendor-radix (+ cmdk), vendor-motion, vendor-recharts (+ d3 deps),
vendor-axios, vendor-state (zustand), vendor-toast (sonner),
vendor-lucide, vendor (everything else).

Build output (before -> after, gzipped):
  initial entry      348.65 kB -> 8.75 kB
  largest chunk      1163.97 kB -> 81.65 kB (vendor-recharts, only
                                 loaded on Dashboard + SystemHealth)
  build warning      "chunks > 500 kB" -> gone

DocumentsPage, SettingsPage, etc. no longer pull recharts into their
critical path; the dashboard pays the chart cost once, cached.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 17:19:06 +03:00
Vadim Malanov
24282d1279 feat(api): optional API-key auth middleware
Adds defence-in-depth shared-secret auth that activates when API_KEY
is set. Behaviour:

- empty API_KEY (dev default): every request allowed, middleware is
  not even installed;
- non-empty API_KEY: every request under APP_API_PREFIX except
  /health must carry X-API-Key: <value> or
  Authorization: Bearer <value>. /, /docs, /redoc, /openapi.json and
  CORS preflight stay open. hmac.compare_digest is used for the
  constant-time comparison.
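
A minimal sketch of the gating rules listed above, using only the
standard library. The function names, the "/api/v1" prefix value, the
assumption that /health lives under the prefix, and the treatment of
every OPTIONS request as a CORS preflight are illustrative assumptions,
not the code in this commit.

```python
import hmac
from typing import Mapping, Optional

# Paths the commit message lists as always open.
OPEN_PATHS = {"/", "/docs", "/redoc", "/openapi.json"}


def extract_key(headers: Mapping[str, str]) -> Optional[str]:
    """Return the key presented via X-API-Key or Authorization: Bearer."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if "x-api-key" in lowered:
        return lowered["x-api-key"]
    scheme, _, token = lowered.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and token.strip():
        return token.strip()
    return None


def is_authorized(
    method: str,
    path: str,
    headers: Mapping[str, str],
    api_key: str,
    api_prefix: str = "/api/v1",  # assumed value of APP_API_PREFIX
) -> bool:
    """Decide whether a request may pass the optional API-key gate."""
    if not api_key:
        return True  # dev default: auth disabled
    if method.upper() == "OPTIONS":
        return True  # simplification: treat OPTIONS as CORS preflight
    if path in OPEN_PATHS or not path.startswith(api_prefix):
        return True  # only routes under the prefix are gated
    if path == f"{api_prefix}/health":
        return True  # assumption: /health is mounted under the prefix
    presented = extract_key(headers)
    if presented is None:
        return False
    # Constant-time comparison avoids leaking key prefixes via timing.
    return hmac.compare_digest(presented.encode(), api_key.encode())
```

A caller would translate a False result into a 401 response.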

The middleware resolves settings lazily so test fixtures can reload
app.config and have the new API_KEY take effect on the next install.

Tests (tests/test_api_security.py, 5 cases):
- /health remains open;
- protected route rejects missing key (401);
- protected route accepts X-API-Key header;
- protected route accepts Authorization: Bearer header;
- protected route rejects a wrong key.

Frontend:
- the key is read from the VITE_API_KEY env variable and Axios
  injects it on every request; when the variable is empty no header
  is sent, so SSO/reverse-proxy deployments stay unchanged.
- vite-env.d.ts adds the new env entry.

Docs/ops:
- .env.example documents the dev-default empty key;
- .env.prod.example marks API_KEY as a required rotation point;
- docker-compose.yml forwards API_KEY (defaults to empty);
- docker-compose.prod.yml fails the stack with ?:required when API_KEY
  is missing;
- RUNBOOK gains an API authentication section with header examples
  and the reverse-proxy + key layering recommendation.

pytest -q: 33 passed (5 new security + 28 prior).
npx tsc --noEmit: clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 17:17:27 +03:00
Vadim Malanov
463622c644 deps: tighten version ranges, pin Docling to <2.15
Docling's DocumentConverter shape (text_items, prov[0].page_no,
export_to_markdown signature) still moves between 2.x minor releases.
Cap docling to >=2.0.0,<2.15 so a wheel bump cannot silently break
the defensive walkers in app/ingestion/docling_extractor.py until a
staging smoke test has run against the new minor.

Every other runtime dep gets an upper bound at its next major or minor:
- web/api: fastapi <0.117, uvicorn <0.33, pydantic <3
- db: sqlalchemy <2.1, psycopg <3.3, alembic <1.14
- search: opensearch-py <3, qdrant-client <1.13
- ingest: ocrmypdf <17, pikepdf <10, pypdf <6
- ml: FlagEmbedding <2, sentence-transformers <4, transformers <5,
      torch <3, numpy <3
- ops/utils: structlog <26, orjson <4, httpx <0.29, click <9

Lift any specific upper bound only after the corresponding regression
test passes on a staging upgrade.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 17:12:15 +03:00
Vadim Malanov
a97d0bbcfd perf: add ingest and search load-test harnesses
scripts/generate_synthetic_pdfs.py builds real PDF/1.4 documents with
a hand-written xref so we can generate tens of thousands of ~2 KB
PDFs locally. Helvetica only covers latin-1, which is fine for a
load generator (throughput, not retrieval relevance); the docstring
calls this out so no one mistakes the output for a quality corpus.
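
For illustration, a one-page PDF/1.4 file with a hand-written xref can
be assembled as below. This is a sketch under assumptions (the
build_pdf name, page size, and font size are hypothetical), not the
contents of scripts/generate_synthetic_pdfs.py.

```python
def build_pdf(text: str) -> bytes:
    """Build a one-page PDF 1.4 document with a hand-written xref table."""
    # Escape the characters that are special inside a PDF literal string.
    safe = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    # Encoding as latin-1 raises UnicodeEncodeError for anything outside it.
    stream = f"BT /F1 12 Tf 72 720 Td ({safe}) Tj ET".encode("latin-1")
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset where this object starts
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_at = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 is the free-list head
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # each entry is 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_at
    return bytes(out)
```

Recording each offset while the buffer is built keeps the xref table
consistent with the bytes actually written.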

scripts/load_ingest.py drives POST /ingest/folder, then polls a
hypothetical /documents/stats endpoint every poll-interval seconds
to track terminal-state progression. Writes a JSON history report so
results can be diffed between runs.
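
The polling half of that flow might look like the sketch below. The
fetch_stats callable stands in for an HTTP GET of the stats endpoint,
and the terminal state names ("indexed", "failed") are assumptions; the
real script may use different names and fields.

```python
import json
import time
from typing import Callable, Dict, List

TERMINAL_STATES = ("indexed", "failed")  # assumed state names


def poll_until_done(
    fetch_stats: Callable[[], Dict[str, int]],
    expected_total: int,
    poll_interval: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Poll per-state counts until all documents reach a terminal state."""
    started = time.monotonic()
    history: List[dict] = []
    for _ in range(max_polls):
        stats = fetch_stats()
        done = sum(stats.get(state, 0) for state in TERMINAL_STATES)
        history.append({
            "elapsed_s": round(time.monotonic() - started, 3),
            "done": done,
            "stats": dict(stats),
        })
        if done >= expected_total:
            break
        sleep(poll_interval)
    return history


def to_report(history: List[dict]) -> str:
    """Serialize the history so two runs can be diffed as text."""
    return json.dumps(history, indent=2, sort_keys=True)
```

Injecting fetch_stats and sleep keeps the loop testable without a
running API.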

scripts/locustfile_search.py defines a SearchUser profile mixing
hybrid / lexical / semantic queries against POST /search plus a
health-check sampler. Asserts non-empty results so a "200 with
zero hits" regression surfaces as a failure rather than a green
percentile graph.

RUNBOOK gains a Load testing section with CPU/GPU SLO tables for
both axes (sustained docs/min, search latency p50/p95/p99).
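
As a reference for reading those latency columns, a nearest-rank
percentile can be computed as sketched here. The function names are
hypothetical and the harnesses may use a different percentile
definition (for example, linear interpolation).

```python
import math
from typing import Dict, Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with >= pct% at or below it."""
    if not samples:
        raise ValueError("percentile() needs at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Multiply before dividing so integer pct values stay exact.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_summary(samples_ms: Sequence[float]) -> Dict[str, float]:
    """Summarize latencies the way the SLO tables are laid out."""
    return {
        "p50": percentile(samples_ms, 50),
        "p95": percentile(samples_ms, 95),
        "p99": percentile(samples_ms, 99),
        "mean": sum(samples_ms) / len(samples_ms),
    }
```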

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 17:11:08 +03:00
Vadim Malanov
349f4ea838 perf(reranker): add benchmark harness and passage clipping
- scripts/benchmark_reranker.py exercises the configured reranker
  with synthetic queries or live OpenSearch samples and prints
  p50/p95/p99 latency, mean latency, and pairs/sec throughput.
  Supports --warmup, --candidates, --passage-length, --source, and a
  --json-only mode for CI.
- app/indexing/reranker.py clips passages to 2048 characters before
  scoring so a runaway chunk cannot push the cross-encoder input past
  bge-reranker-v2-m3's training window.
- RUNBOOK.md gains a Reranker benchmark section with CPU/GPU SLO
  targets and a remediation ladder (lower top-K, raise batch size,
  switch device, disable reranker) when measured p95 exceeds budget.
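
The clipping step can be pictured as follows. This is a sketch: the
helper names are hypothetical and only the 2048-character limit comes
from this commit.

```python
from typing import List, Sequence, Tuple

MAX_PASSAGE_CHARS = 2048  # limit stated in this commit


def clip_passage(text: str, limit: int = MAX_PASSAGE_CHARS) -> str:
    """Truncate a passage to at most `limit` characters."""
    return text if len(text) <= limit else text[:limit]


def build_pairs(query: str, passages: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair the query with clipped passages, ready for a cross-encoder."""
    return [(query, clip_passage(passage)) for passage in passages]
```

Clipping by characters is cheap and needs no tokenizer, at the cost of
being only an approximate bound on token count.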

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 17:08:04 +03:00
Vadim Malanov
f42fb978a8 chore: drop dead _qid helper and surface ocr_confidence on SearchHit
- app/indexing/qdrant_client.py: remove the identity-only _qid()
  helper and pass chunk_id straight to PointStruct (Qdrant accepts
  the UUID string directly).
- services/types.ts: SearchHit gets an explicit, optional
  ocr_confidence field so consumers can type the value instead of
  casting through metadata.
- widgets/SearchResultCard.tsx: replaces the
  (hit.metadata as { ocr_confidence? }) cast with the new field. No
  behavior change when the backend omits it.

tsc --noEmit: clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:55:32 +03:00
Vadim Malanov
785d3be970 test: add Alembic migration smoke and /search contract tests
tests/test_alembic.py points Alembic at an in-process SQLite database
in --sql mode so the migration files are validated end to end without
needing the real Postgres compose service. Asserts the documents,
chunks, and processing_events tables plus the unique constraints
appear in the generated DDL, and that the revision graph stays
linear at 0001_initial.

tests/test_routes_search.py monkeypatches
app.indexing.hybrid_search.run_search so the FastAPI route can be
exercised with the real SearchRequest/SearchResponse schemas. Covers
the happy path (rank, citation, reranked flag) and that empty queries
are rejected at schema validation before the backend is called.

pytest tests/test_alembic.py tests/test_routes_search.py -q: 4 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:54:15 +03:00
Vadim Malanov
d3c96161b0 ops: add docker-compose.prod.yml overlay
Production overlay narrows the dev defaults:
- removes published ports from postgres, minio, opensearch, qdrant,
  redis - only the api container stays externally reachable;
- enables the OpenSearch security plugin and requires
  OPENSEARCH_ADMIN_PASSWORD via ?:required interpolation;
- requires Qdrant API key, MinIO root credentials, postgres password,
  and CORS_ALLOWED_ORIGINS to be set (no localhost fallback);
- doubles the OpenSearch heap (to -Xms2g -Xmx2g) and sets worker
  concurrency to 4;
- drops the MinIO management console.

Validated with:
  set -a; . .env.prod.example; CORS_ALLOWED_ORIGINS=https://example.com
  docker compose -f docker-compose.yml -f docker-compose.prod.yml config

The RUNBOOK was updated in the initial commit and already documents
the overlay invocation and credential rotation workflow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:52:57 +03:00
Vadim Malanov
a375ca55b9 refactor: extract ensure_artifact into app/storage/artifacts.py
The artifact-upsert helper was duplicated four times (scanner.py,
table_processor.py, figure_processor.py, pipeline.py) with slightly
different signatures. Consolidates into a single keyword-only function
keyed on (document_id, storage_key) - the identity the schema already
enforces - so re-running the pipeline never creates duplicate rows.
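
A sketch of the idempotent upsert idea, using sqlite3 so it runs
standalone. The table layout, column names, and return value are
assumptions for illustration; the project's helper targets its own
schema and database layer.

```python
import sqlite3

# Hypothetical schema: only the (document_id, storage_key) identity
# is taken from the commit message.
SCHEMA = """
CREATE TABLE artifacts (
    id INTEGER PRIMARY KEY,
    document_id TEXT NOT NULL,
    storage_key TEXT NOT NULL,
    kind TEXT NOT NULL,
    size_bytes INTEGER NOT NULL,
    UNIQUE (document_id, storage_key)
)
"""


def ensure_artifact(
    conn: sqlite3.Connection,
    *,  # keyword-only, so call sites cannot swap arguments
    document_id: str,
    storage_key: str,
    kind: str,
    size_bytes: int,
) -> int:
    """Insert or update the artifact row and return its id."""
    conn.execute(
        """
        INSERT INTO artifacts (document_id, storage_key, kind, size_bytes)
        VALUES (?, ?, ?, ?)
        ON CONFLICT (document_id, storage_key)
        DO UPDATE SET kind = excluded.kind, size_bytes = excluded.size_bytes
        """,
        (document_id, storage_key, kind, size_bytes),
    )
    row = conn.execute(
        "SELECT id FROM artifacts WHERE document_id = ? AND storage_key = ?",
        (document_id, storage_key),
    ).fetchone()
    return row[0]
```

Calling it twice with the same (document_id, storage_key) updates the
existing row instead of adding a second one.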

scanner / table_processor / figure_processor now import the shared
helper directly. pipeline.py keeps a thin local wrapper to preserve
the positional call sites at three artifact upsert points (OCR_PDF,
MARKDOWN, DOCLING_JSON).

Tests: 24 passed (5 health + 19 original).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:51:54 +03:00
Vadim Malanov
cd9977f8c3 feat(api): add CORS middleware and /health contract test
CORS:
- New setting CORS_ALLOWED_ORIGINS (comma separated). Defaults cover
  the three local Vite ports (5173, 5273, 4173); production overlay
  expects the real origin in .env.prod.
- main.py wires CORSMiddleware from settings.cors_origins. No * in
  production - see RUNBOOK and .env.prod.example.
- docker-compose.yml forwards the variable to both api and worker.
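
Parsing the comma-separated value could be sketched like this. The
function name, the http://localhost hosts for the dev defaults, and the
production flag are assumptions; only the three ports and the "no * in
production" rule come from this commit.

```python
from typing import List, Optional

# Assumed hosts; the commit only names the three Vite ports.
DEFAULT_DEV_ORIGINS = [
    "http://localhost:5173",
    "http://localhost:5273",
    "http://localhost:4173",
]


def parse_cors_origins(raw: Optional[str], *, production: bool = False) -> List[str]:
    """Split CORS_ALLOWED_ORIGINS into a clean list of origins."""
    origins = [
        item.strip().rstrip("/")
        for item in (raw or "").split(",")
        if item.strip()
    ]
    if production:
        if not origins:
            raise ValueError("CORS_ALLOWED_ORIGINS must be set in production")
        if "*" in origins:
            raise ValueError("wildcard origin is not allowed in production")
        return origins
    # Dev: fall back to the local Vite ports when nothing is configured.
    return origins or list(DEFAULT_DEV_ORIGINS)
```

The resulting list is the shape that a CORS middleware's allowed-origins
setting typically expects.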

Tests:
- tests/test_api_health.py uses FastAPI TestClient and monkeypatches
  the five probe functions (postgres/minio/opensearch/qdrant/redis).
  Verifies the all-ok, any-error, and degraded paths, that the root
  endpoint reports the configured api prefix, and that the CORS
  preflight echoes the allowed origin.
- pytest tests/test_api_health.py -q: 5 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:48:49 +03:00
Vadim Malanov
eecdfaa847 fix(frontend): clear TypeScript strict-mode errors
- vite-env.d.ts now declares ImportMetaEnv with the three VITE_*
  variables the project uses, restoring proper typing for
  import.meta.env in apiClient.ts.
- QualityFlag.tsx widens its 'flags' prop to accept the domain
  QualityFlags type, the loose Record form used in mocks, or null,
  ending the structural-mismatch errors at five callsites
  (DocumentsPage, DocumentViewerPage, QualityControlPage,
  ChunkPreview, SearchResultCard).
- DashboardPage trend callbacks are typed against DashboardStats so
  the implicit-any complaints disappear without weakening intent.

npx tsc --noEmit -> clean. vite build -> ok.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:46:56 +03:00
Vadim Malanov
54714b5757 ci: add GitHub Actions workflow and ESLint v9 config
Adds two-job CI (backend + frontend) running ruff, pytest (unit only -
skipping heavy ML deps), docker compose config validation for both dev
and prod overlays, plus npm ci -> eslint -> tsc -> vite build for the
frontend.

ESLint config uses the flat-config format that ESLint v9, which the
project was already on, expects by default (v9 no longer reads
.eslintrc unless explicitly opted in); replaces the broken
'eslint . --ext' invocation and adds @typescript-eslint, react-hooks,
and react-refresh plugins.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:44:04 +03:00
Vadim Malanov
7f72171572 chore: bootstrap repository with governance docs
Initialize git, add Apache-2.0 LICENSE, .gitattributes (LF line
endings), AGENTS.md (entry points, stack, discovery order, baseline
checks), RUNBOOK.md (dev boot, prod deploy with overlay, ingestion,
failures, rollback, scaling notes), .env.prod.example with rotated
credential placeholders, and dev-only warnings on .env.example.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:41:50 +03:00