chore: bootstrap repository with governance docs
Initialize git, add Apache-2.0 LICENSE, .gitattributes (LF line endings), AGENTS.md (entry points, stack, discovery order, baseline checks), RUNBOOK.md (dev boot, prod deploy with overlay, ingestion, failures, rollback, scaling notes), .env.prod.example with rotated credential placeholders, and dev-only warnings on .env.example.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
146
RUNBOOK.md
Normal file
@@ -0,0 +1,146 @@
# LegacyHUB: Operational Runbook

## Quick boot (dev)

```bash
cp .env.example .env
docker compose up -d --build
docker compose exec api python scripts/init_db.py
docker compose exec api python scripts/init_opensearch.py
docker compose exec api python scripts/init_qdrant.py
docker compose exec api python scripts/smoke_test.py
```

Verify:

```bash
curl -fsS http://localhost:8000/api/v1/health | jq .
```
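Right after `docker compose up`, the API may need a few seconds before the health endpoint answers, so a single `curl` can fail spuriously. The following is a minimal polling sketch, not part of the repository: the function names, retry count, and delay are assumptions, and the only facts taken from this runbook are the URL and the `"status": "ok"` payload.

```python
import json
import time
import urllib.request
from typing import Callable

HEALTH_URL = "http://localhost:8000/api/v1/health"  # URL from the curl check above


def fetch_health(url: str = HEALTH_URL, timeout: float = 5.0) -> dict:
    """GET the health endpoint and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def wait_for_health(
    fetch: Callable[[], dict] = fetch_health,
    attempts: int = 10,   # hypothetical default
    delay: float = 2.0,   # hypothetical default, in seconds
) -> bool:
    """Poll until the payload reports status "ok"; return False if attempts run out."""
    for attempt in range(attempts):
        try:
            payload = fetch()
            if isinstance(payload, dict) and payload.get("status") == "ok":
                return True
        except (OSError, ValueError):
            # OSError covers refused connections, timeouts, and HTTP errors;
            # ValueError covers a body that is not valid JSON.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


if __name__ == "__main__":
    raise SystemExit(0 if wait_for_health() else 1)
```

Passing `fetch` as a parameter keeps the retry loop testable without a running stack.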

Frontend dev:

```bash
cd frontend && cp .env.example .env && npm install && npm run dev
# http://localhost:5273
```

## Production deploy

The production overlay enables the OpenSearch security plugin, removes default ports,
forces externally supplied credentials, and disables debug routes.

```bash
# 1. Ensure secrets exist
cp .env.prod.example .env.prod
$EDITOR .env.prod   # rotate every credential, never commit

# 2. Build + recreate
docker compose \
  -f docker-compose.yml -f docker-compose.prod.yml \
  --env-file .env.prod \
  up -d --build --force-recreate api worker

# 3. Migrations
docker compose -f docker-compose.yml -f docker-compose.prod.yml \
  --env-file .env.prod exec api python scripts/init_db.py

# 4. Health gate
docker compose -f docker-compose.yml -f docker-compose.prod.yml \
  --env-file .env.prod exec api python scripts/smoke_test.py
curl -fsS https://<host>/api/v1/health | jq -e '.status == "ok"'
```
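Every production command above repeats the same two `-f` files and the same `--env-file`; dropping the overlay by accident would run the dev configuration. One way to avoid that is to build the argument list in one place. This is a sketch only: `prod_compose` and `DEPLOY_STEPS` are hypothetical names, and the file names are copied from the commands above.

```python
import shlex

# File names taken from the deploy commands above.
COMPOSE_FILES = ("docker-compose.yml", "docker-compose.prod.yml")
ENV_FILE = ".env.prod"


def prod_compose(*args: str) -> list[str]:
    """Return a `docker compose` argv with the prod overlay and env file applied."""
    cmd = ["docker", "compose"]
    for path in COMPOSE_FILES:
        cmd += ["-f", path]
    cmd += ["--env-file", ENV_FILE]
    cmd += list(args)
    return cmd


# Steps 2-4 of the deploy, in order (step 1, editing .env.prod, stays manual).
DEPLOY_STEPS = [
    prod_compose("up", "-d", "--build", "--force-recreate", "api", "worker"),
    prod_compose("exec", "api", "python", "scripts/init_db.py"),
    prod_compose("exec", "api", "python", "scripts/smoke_test.py"),
]

if __name__ == "__main__":
    # Print the commands rather than running them. To execute, pass each
    # list to subprocess.run(step, check=True) so a failure stops the deploy.
    for step in DEPLOY_STEPS:
        print(shlex.join(step))
```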

Hardening notes (mandatory for prod):

- Rotate every credential in `.env.prod`; do not keep any placeholder from `.env.prod.example`.
- Put OpenSearch behind TLS and an admin password. Remove
  `DISABLE_SECURITY_PLUGIN=true` (handled by the overlay).
- Front the API with a reverse proxy that performs auth and TLS termination.
- Restrict CORS via `CORS_ALLOWED_ORIGINS` (comma-separated); never use `*` in
  prod.
- MinIO root key/secret in prod must come from a secret store, not the repo.
- Mount `data/input` and `data/work` from durable storage, not the workstation.
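Two of the notes above can be checked mechanically before a deploy: unrotated placeholders and a wildcard CORS origin. The sketch below is one possible pre-deploy check, not an existing script. It assumes both env files use plain `KEY=VALUE` lines, and it guesses which keys are secrets from a hypothetical list of name fragments (`SECRET_HINTS`); adjust both to the real files.

```python
# Hypothetical heuristic: keys whose names contain one of these are secrets.
SECRET_HINTS = ("PASSWORD", "SECRET", "KEY", "TOKEN")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def hardening_problems(prod: dict[str, str], example: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means both checks passed."""
    problems = []
    for key, placeholder in example.items():
        looks_secret = any(hint in key.upper() for hint in SECRET_HINTS)
        # A secret still equal to the example value was never rotated.
        if looks_secret and placeholder and prod.get(key) == placeholder:
            problems.append(f"{key}: still set to the example placeholder")
    origins = [o.strip() for o in prod.get("CORS_ALLOWED_ORIGINS", "").split(",")]
    if "*" in origins:
        problems.append("CORS_ALLOWED_ORIGINS: wildcard '*' is not allowed in prod")
    return problems
```

A caller would read `.env.prod` and `.env.prod.example`, pass both through `parse_env`, and abort the deploy if `hardening_problems` returns anything.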

## Ingestion

```bash
# trigger from the API
curl -X POST http://localhost:8000/api/v1/ingest/folder \
  -H "Content-Type: application/json" \
  -d '{"path":"/data/input","recursive":true,"force":false}'

# or inline (no Celery)
docker compose exec api python scripts/ingest_folder.py \
  --path /data/input --recursive --mode inline

# re-index a single doc
docker compose exec api python scripts/reindex_document.py \
  --document-id <uuid>
```
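For scripted ingestion, the API call above can be issued from Python with the standard library. This is a sketch under stated assumptions: the endpoint and JSON fields are copied from the `curl` example, the function names are hypothetical, and the response is returned undecoded because its schema is not described in this runbook.

```python
import json
import urllib.request

API_BASE = "http://localhost:8000/api/v1"  # same base URL as the curl example


def build_ingest_request(
    path: str,
    recursive: bool = True,
    force: bool = False,
    api_base: str = API_BASE,
) -> urllib.request.Request:
    """Build the POST request for /ingest/folder without sending it."""
    body = json.dumps({"path": path, "recursive": recursive, "force": force})
    return urllib.request.Request(
        f"{api_base}/ingest/folder",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger_ingest(path: str, **kwargs) -> bytes:
    """Send the request and return the raw response body."""
    request = build_ingest_request(path, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return resp.read()


if __name__ == "__main__":
    print(trigger_ingest("/data/input").decode("utf-8", errors="replace"))
```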

## Failure handling

Each pipeline stage writes a row to `processing_events` with a `level` and `data`. To inspect the most recent events:

```bash
docker compose exec postgres psql -U legacyhub -d legacyhub -c \
  "SELECT created_at, stage, level, message FROM processing_events
   ORDER BY created_at DESC LIMIT 50;"
```

| Failure             | Where to look                                  | Fix                                                                                        |
|---------------------|------------------------------------------------|--------------------------------------------------------------------------------------------|
| `OCR_FAILED`        | `processing_events` → `OCR_STARTED` then error | Confirm the `tesseract-ocr-rus` package is installed; rerun `scripts/reindex_document.py`  |
| `EXTRACTION_FAILED` | `processing_events` → Docling stage            | Check the timeout; verify the Docling version pin                                          |
| Indexing stuck      | OpenSearch + Qdrant health                     | Rerun `scripts/init_opensearch.py` and `scripts/init_qdrant.py`                            |
| Reranker disabled   | API logs → `reranker.disabled`                 | Ensure `RERANKER_ENABLED=true` and that the HF cache is mounted                            |
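When many documents fail at once, counting errors per stage points to the right table row faster than reading 50 raw events. The sketch below works on rows shaped like the query above (`created_at, stage, level, message`). The `Event` type and function name are hypothetical, and the level value `"ERROR"` is an assumption; check which values `level` actually takes.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Event(NamedTuple):
    """One row of the processing_events query, in SELECT column order."""
    created_at: str
    stage: str
    level: str
    message: str


def failures_by_stage(
    events: Iterable[Event],
    error_levels: tuple[str, ...] = ("ERROR",),  # assumed level name
) -> Counter:
    """Count events per stage whose level (case-insensitive) is an error level."""
    wanted = {level.upper() for level in error_levels}
    return Counter(e.stage for e in events if e.level.upper() in wanted)
```

`failures_by_stage(rows).most_common(3)` then lists the stages with the most errors first.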

## Verification gates (per change)

1. `python -m pytest tests/ -q`: full unit suite (19+ tests).
2. `python -m compileall -q app scripts tests`.
3. `docker compose config --quiet`.
4. Frontend: `npx tsc --noEmit && npm run build`.
5. `/api/v1/health` returns `{"status":"ok"}`.
6. One smoke ingest of a known PDF; verify that `/search` returns a result.
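Gates 1 to 4 are plain commands, so they can be chained and stopped at the first failure. A minimal runner sketch follows; it is not an existing script. The commands are copied from the list above, while the gate labels, the function names, and running the frontend commands inside `frontend/` are assumptions. Gates 5 and 6 need a running stack and are left out.

```python
import subprocess
from typing import Callable, Optional, Sequence

# (label, argv, working directory). None means the repository root.
GATES: list[tuple[str, list[str], Optional[str]]] = [
    ("pytest", ["python", "-m", "pytest", "tests/", "-q"], None),
    ("compileall", ["python", "-m", "compileall", "-q", "app", "scripts", "tests"], None),
    ("compose config", ["docker", "compose", "config", "--quiet"], None),
    ("tsc", ["npx", "tsc", "--noEmit"], "frontend"),       # assumed directory
    ("frontend build", ["npm", "run", "build"], "frontend"),  # assumed directory
]


def run_command(argv: Sequence[str], cwd: Optional[str]) -> int:
    """Run one gate and return its exit code."""
    return subprocess.run(list(argv), cwd=cwd).returncode


def run_gates(
    run: Callable[[Sequence[str], Optional[str]], int] = run_command,
) -> Optional[str]:
    """Run gates in order; return the label of the first failing gate, or None."""
    for label, argv, cwd in GATES:
        if run(argv, cwd) != 0:
            return label
    return None


if __name__ == "__main__":
    failed = run_gates()
    raise SystemExit(f"gate failed: {failed}" if failed else 0)
```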

## Rollback

1. Capture the deployed commit SHA before each deploy (`git rev-parse HEAD`).
2. To roll back the API/worker image only, check out the captured SHA and rebuild:
   ```bash
   docker compose -f docker-compose.yml -f docker-compose.prod.yml \
     --env-file .env.prod up -d --build --force-recreate --no-deps \
     api worker   # --no-deps keeps PG/MinIO/OS/Qdrant intact
   ```
3. Data services (PostgreSQL, MinIO, OpenSearch, Qdrant) are stateful and
   should not be rolled back casually. Restore from backup via the standard
   TeamHUB Suite backup runbook.
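Step 1 only helps if the SHA is still available when a rollback is needed. One option is to write it to a marker file before each deploy. This sketch is illustrative: the function names are hypothetical and where the marker file lives is up to the operator. Only `git rev-parse HEAD` comes from the steps above.

```python
import re
import subprocess
from pathlib import Path

# Full object names: 40 hex digits (SHA-1) or 64 (SHA-256 repositories).
SHA_RE = re.compile(r"[0-9a-f]{40}(?:[0-9a-f]{24})?")


def current_sha() -> str:
    """Return the commit SHA printed by `git rev-parse HEAD`."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def save_sha(sha: str, marker: Path) -> None:
    """Persist the SHA, refusing anything that is not a full commit hash."""
    if not SHA_RE.fullmatch(sha):
        raise ValueError(f"not a full commit SHA: {sha!r}")
    marker.write_text(sha + "\n", encoding="utf-8")


def load_sha(marker: Path) -> str:
    """Read the SHA back and validate it before it is used for a checkout."""
    sha = marker.read_text(encoding="utf-8").strip()
    if not SHA_RE.fullmatch(sha):
        raise ValueError(f"marker file does not hold a commit SHA: {marker}")
    return sha
```

Before a deploy, call `save_sha(current_sha(), marker_path)`; for a rollback, `load_sha(marker_path)` gives the commit to check out.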

## Scaling notes (~70k PDFs)

- Workers scale horizontally: `docker compose up -d --scale worker=8`.
- Set `EMBEDDING_DEVICE=cuda` on a GPU-capable worker image for ~10× embedding
  throughput.
- A single OpenSearch shard suffices up to ~10M chunks; increase shards and add
  replicas in prod.
- A single Qdrant node is fine for ~5M vectors; switch to a cluster build beyond that.
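The thresholds above are in chunks and vectors, while the corpus size is in PDFs, so a rough conversion helps. The arithmetic below is a back-of-the-envelope sketch: the pages-per-PDF and chunks-per-page figures are invented for illustration, and it assumes one vector per chunk. Replace them with measured values from a sample ingest.

```python
# Thresholds quoted in the scaling notes above.
OPENSEARCH_SINGLE_SHARD_MAX = 10_000_000  # chunks
QDRANT_SINGLE_NODE_MAX = 5_000_000        # vectors


def estimate_chunks(pdfs: int, pages_per_pdf: float, chunks_per_page: float) -> int:
    """Estimate total chunks as pdfs x pages per PDF x chunks per page."""
    return round(pdfs * pages_per_pdf * chunks_per_page)


def sizing(chunks: int) -> dict[str, bool]:
    """Compare an estimate to the thresholds, assuming one vector per chunk."""
    return {
        "opensearch_single_shard_ok": chunks <= OPENSEARCH_SINGLE_SHARD_MAX,
        "qdrant_single_node_ok": chunks <= QDRANT_SINGLE_NODE_MAX,
    }


if __name__ == "__main__":
    # Illustrative inputs only: 20 pages per PDF, 3 chunks per page.
    chunks = estimate_chunks(70_000, pages_per_pdf=20, chunks_per_page=3)
    print(chunks, sizing(chunks))
```

With those illustrative inputs the estimate is 70,000 × 20 × 3 = 4.2M chunks, under both thresholds; at 30 pages per PDF it would be 6.3M and exceed the single-node Qdrant figure.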

## Common one-liners

```bash
# count indexed chunks in OpenSearch
curl 'http://localhost:9200/legacy_chunks/_count' | jq .

# inspect Qdrant collection
curl 'http://localhost:6333/collections/legacy_chunks' | jq .

# list MinIO buckets (the $MINIO_* variables are expanded by the host shell,
# so they must be set there before running this)
docker compose exec minio mc alias set local http://localhost:9000 \
  "$MINIO_ACCESS_KEY" "$MINIO_SECRET_KEY"
docker compose exec minio mc ls local

# document counts per status (shows how many reached INDEXING_COMPLETED)
docker compose exec postgres psql -U legacyhub -d legacyhub -c \
  "SELECT status, COUNT(*) FROM documents GROUP BY status;"
```
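The first two one-liners can be combined into a quick consistency check between the two indexes. The sketch below compares the `count` field of the OpenSearch `_count` response with `result.points_count` from the Qdrant collection response. Those field names reflect my understanding of the two APIs and should be verified against the deployed versions; the function names are hypothetical, and the comparison assumes one Qdrant point per chunk.

```python
def opensearch_count(payload: dict) -> int:
    """Extract `count` from an OpenSearch `_count` response."""
    return int(payload["count"])


def qdrant_points(payload: dict) -> int:
    """Extract `result.points_count` from a Qdrant collection response.

    A missing or null value is treated as 0.
    """
    return int(payload["result"].get("points_count") or 0)


def index_drift(os_payload: dict, qdrant_payload: dict) -> int:
    """Return OpenSearch chunks minus Qdrant points; 0 means they agree."""
    return opensearch_count(os_payload) - qdrant_points(qdrant_payload)
```

Save the two `curl` outputs to files, load them with `json.load`, and a non-zero `index_drift` suggests some documents were indexed in only one of the stores.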