Commit Graph

4 Commits

Author SHA1 Message Date
Vadim Malanov
f42fb978a8 chore: drop dead _qid helper and surface ocr_confidence on SearchHit
- app/indexing/qdrant_client.py: remove the identity-only _qid()
  helper and pass chunk_id straight to PointStruct (Qdrant accepts
  the UUID string directly).
- services/types.ts: SearchHit gets an explicit, optional
  ocr_confidence field so consumers can type the value instead of
  casting through metadata.
- widgets/SearchResultCard.tsx: replaces the
  (hit.metadata as { ocr_confidence? }) cast with the new field. No
  behavior change when the backend omits it.

tsc --noEmit: clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:55:32 +03:00
Vadim Malanov
a375ca55b9 refactor: extract ensure_artifact into app/storage/artifacts.py
The artifact-upsert helper was duplicated four times (scanner.py,
table_processor.py, figure_processor.py, pipeline.py) with slightly
different signatures. Consolidates into a single keyword-only function
keyed on (document_id, storage_key) - the identity the schema already
enforces - so re-running the pipeline never creates duplicate rows.

scanner / table_processor / figure_processor now import the shared
helper directly. pipeline.py keeps a thin local wrapper to preserve
the positional call sites at three artifact upsert points (OCR_PDF,
MARKDOWN, DOCLING_JSON).

Tests: 24 passed (5 health + 19 original).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:51:54 +03:00
Vadim Malanov
cd9977f8c3 feat(api): add CORS middleware and /health contract test
CORS:
- New setting CORS_ALLOWED_ORIGINS (comma separated). Defaults cover
  the three local Vite ports (5173, 5273, 4173); production overlay
  expects the real origin in .env.prod.
- main.py wires CORSMiddleware from settings.cors_origins. No * in
  production - see RUNBOOK and .env.prod.example.
- docker-compose.yml forwards the variable to both api and worker.

Tests:
- tests/test_api_health.py uses FastAPI TestClient and monkeypatches
  the five probe functions (postgres/minio/opensearch/qdrant/redis).
  Verifies the all-ok, any-error, and degraded paths, that the root
  endpoint reports the configured api prefix, and that the CORS
  preflight echoes the allowed origin.
- pytest tests/test_api_health.py -q: 5 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:48:49 +03:00
Vadim Malanov
7f72171572 chore: bootstrap repository with governance docs
Initialize git, add Apache-2.0 LICENSE, .gitattributes (LF line
endings), AGENTS.md (entry points, stack, discovery order, baseline
checks), RUNBOOK.md (dev boot, prod deploy with overlay, ingestion,
failures, rollback, scaling notes), .env.prod.example with rotated
credential placeholders, and dev-only warnings on .env.example.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:41:50 +03:00