API cleanup & alignment for packaged Chroma docs

TL;DR

  • Canonical endpoint is GET /chroma/docs (lists portal_chroma_doc).
  • /extract/docs is deprecated. Keep a hidden 308 redirect to /chroma/docs for one sprint, then remove.
  • DB DDL stays as-is (030_portal_chroma_doc.sql). Do not add columns like model, status, payload, etc. Those live inside meta (JSONB) and are projected by the API if needed.
  • Unify implementation on the repository layer (PortalChromaDocRepo) with response_model=ChromaDocsList to avoid schema drift.

Tasks

1) Canonicalize /chroma/docs (group under “Package” in Swagger)

Replace the current direct-SQL handler with the repo version:

# app/routers/chroma_docs.py
+from typing import Optional, Literal
 from fastapi import APIRouter, Depends, Query
 from sqlalchemy.orm import Session
 from app.db import get_session
+from app.schemas.chroma_package import ChromaDocsList
+try:
+    from app.repos.portal_chroma_doc_repo import PortalChromaDocRepo
+except ImportError:
+    from app.repos.portal_chroma_doc_repo import PortalChromaDocRepository as PortalChromaDocRepo

-router = APIRouter(tags=["Chroma"])
+router = APIRouter()

-@router.get("/chroma/docs", summary="portal_chroma_doc の一覧(キーセット)")
-def list_docs(...):
-    ...
+@router.get(
+    "/chroma/docs",
+    response_model=ChromaDocsList,
+    tags=["Package"],  # show under “Package”
+    summary="portal_chroma_doc の一覧(キーセット)",  # i.e. "portal_chroma_doc list (keyset)"
+)
+def list_chroma_docs(
+    status: Optional[Literal["queued","upserted","failed"]] = Query(None),
+    entity: Optional[Literal["field","view_common"]] = Query(None),
+    model: Optional[str] = None,
+    collection: Optional[str] = None,
+    limit: int = Query(50, ge=1, le=500),
+    cursor: Optional[str] = None,
+    session: Session = Depends(get_session),
+):
+    # A malformed cursor falls back to the first page instead of raising a 500.
+    try:
+        cur_id = int(cursor) if cursor else None
+    except ValueError:
+        cur_id = None
+    repo = PortalChromaDocRepo(session)
+    items, next_cursor = repo.list_keyset(
+        status=status, entity=entity, model=model,
+        collection=collection, limit=limit, cursor=cur_id,
+    )
+    return {"items": items, "next_cursor": (str(next_cursor) if next_cursor is not None else None)}

If there’s any duplicate/older chroma_docs.py (direct SQL version), remove it to avoid import collisions.
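
The handler delegates paging to repo.list_keyset, whose body is not shown in this note. As a rough illustration of the keyset contract it is expected to honor (return up to limit rows after the cursor, plus the id to resume from), here is a minimal in-memory sketch. The ascending-id ordering, the limit+1 look-ahead, and the row shape are assumptions for illustration, not the actual PortalChromaDocRepo implementation.

```python
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Any]


def list_keyset(
    rows: List[Row],
    limit: int = 50,
    cursor: Optional[int] = None,
) -> Tuple[List[Row], Optional[int]]:
    """Hypothetical stand-in for repo.list_keyset, paging by ascending id."""
    # Keyset step: only rows strictly after the cursor id are candidates.
    candidates = sorted(
        (r for r in rows if cursor is None or r["id"] > cursor),
        key=lambda r: r["id"],
    )
    # Fetch one extra row to learn whether another page exists.
    page = candidates[: limit + 1]
    has_more = len(page) > limit
    items = page[:limit]
    # The next cursor is the last id actually returned, or None at the end.
    next_cursor = items[-1]["id"] if has_more and items else None
    return items, next_cursor


rows = [{"id": i, "entity": "field"} for i in range(1, 6)]  # made-up data
first_page, cur = list_keyset(rows, limit=2)
second_page, cur2 = list_keyset(rows, limit=2, cursor=cur)
last_page, cur3 = list_keyset(rows, limit=2, cursor=cur2)
```

In a SQL-backed repo the same idea would typically become a WHERE id > :cursor ORDER BY id LIMIT :limit + 1 query, which stays fast on deep pages because it never uses OFFSET.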

2) Deprecate /extract/docs with a hidden 308 redirect

Replace the old handler in app/routers/extracts.py:

-from fastapi import APIRouter, Depends, Query
+from fastapi import APIRouter, Request
+from fastapi.responses import RedirectResponse

 router = APIRouter()

-@router.get("/docs", ... tags=["Package"], summary="portal_chroma_doc の一覧")
-def list_chroma_docs(...):
-    ...
+@router.get("/docs", include_in_schema=False)
+def extract_docs_alias(request: Request):
+    # Compat for one sprint: /extract/docs -> /chroma/docs
+    url = request.url_for("list_chroma_docs")  # function name, not operation_id
+    return RedirectResponse(url=url, status_code=308)

Ensure main.py includes routers like:

app.include_router(chroma_docs.router)               # absolute path /chroma/docs
app.include_router(extracts.router, prefix="/extract")

(The chroma_docs router already declares the absolute path /chroma/docs, so do not also pass prefix="/chroma" when including it.)
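
One thing to watch with the alias: request.url_for("list_chroma_docs") resolves to the bare /chroma/docs URL, so query parameters sent to the old path (status, limit, cursor) are not carried over unless they are copied explicitly. If any client calls the alias with filters, the redirect target may need to keep the query string. The helper below is a framework-free sketch of that idea; redirect_target is a hypothetical name, and wiring it into the handler is left open.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_target(request_url: str, canonical_path: str = "/chroma/docs") -> str:
    """Swap the path for the canonical one and keep the query string."""
    parts = urlsplit(request_url)
    # Scheme, host, and query are reused; the fragment is dropped.
    return urlunsplit((parts.scheme, parts.netloc, canonical_path, parts.query, ""))


old = "http://localhost:8080/extract/docs?status=failed&limit=10"
new = redirect_target(old)
bare = redirect_target("http://localhost:8080/extract/docs")
```

Whether this matters depends on how existing clients call /extract/docs; if none of them send parameters, the simpler handler above is sufficient for the one-sprint window.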

3) Database DDL — no change required

Keep 030_portal_chroma_doc.sql exactly as it is. Design intent:

  • Fixed columns: keys and lifecycle (entity, natural_key, lang, doc_id, collection, state, etc.)
  • Variable business fields belong in meta JSONB: model, model_table, field_name, action_xmlid (note: xmlid, not xlmid), status, payload, etc.
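
To make "projected by the API" concrete, the sketch below copies selected meta keys onto each item and fills missing ones with None, which is why they are Optional in the schema. The key list, the project_row name, and the sample row are assumptions for illustration; the real projection belongs to the repo layer and ChromaDocsList.

```python
from typing import Any, Dict

# Keys assumed to be projected from meta; the real list lives in ChromaDocsList.
PROJECTED_KEYS = (
    "model", "model_table", "field_name", "action_xmlid", "status", "payload",
)


def project_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Copy selected meta keys onto the item; absent keys become None."""
    meta = row.get("meta") or {}
    item = dict(row)  # keep the fixed columns (and meta itself) untouched
    for key in PROJECTED_KEYS:
        # Fixed columns win if a name ever collides with a meta key.
        item.setdefault(key, meta.get(key))
    return item


row = {  # made-up example row
    "doc_id": "field::res.partner::name",
    "entity": "field",
    "collection": "portal_fields",
    "meta": {"model": "res.partner", "status": "upserted"},
}
item = project_row(row)
```

Because the values are read at response time, adding a new business field only means writing it into meta and, if it should appear at the top level, extending the projected key list; the DDL stays untouched.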

If filtering by model or action_xmlid becomes frequent, keep the values in meta and add expression indexes on those keys instead of new columns:

-- optional perf indices
CREATE INDEX IF NOT EXISTS idx_chroma_doc_meta_model
  ON public.portal_chroma_doc ((meta->>'model'));
CREATE INDEX IF NOT EXISTS idx_chroma_doc_meta_action_xmlid
  ON public.portal_chroma_doc ((meta->>'action_xmlid'));

QA checklist

  • /openapi.json contains /chroma/docs under the Package tag; /extract/docs is not shown.
  • GET /extract/docs returns 308 Permanent Redirect to /chroma/docs.
  • GET /chroma/docs returns { "items": [...], "next_cursor": "..." } matching ChromaDocsList.
  • Fields like model, status, payload are projections from meta (Optional in the schema).

Quick commands:

curl -s http://localhost:8080/openapi.json | jq -r '.paths | keys[]' | grep extract/docs || echo "OK: hidden"
curl -s -o /dev/null -D - http://localhost:8080/extract/docs | head -n 1   # GET, not HEAD (curl -I); expect HTTP/1.1 308 Permanent Redirect
curl -s "http://localhost:8080/chroma/docs?limit=1" | jq '.'
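
The first checklist item can also be asserted in a script rather than eyeballed. The function below is a small sketch that inspects an already-parsed OpenAPI document; check_openapi is a hypothetical helper and the two sample specs are made up to show a passing and a failing case.

```python
from typing import Any, Dict, List


def check_openapi(spec: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the checks pass."""
    problems: List[str] = []
    paths = spec.get("paths", {})
    get_op = paths.get("/chroma/docs", {}).get("get")
    if get_op is None:
        problems.append("GET /chroma/docs is missing")
    elif "Package" not in get_op.get("tags", []):
        problems.append("/chroma/docs is not tagged Package")
    # The alias is registered with include_in_schema=False, so it must not appear.
    if "/extract/docs" in paths:
        problems.append("/extract/docs should be hidden from the schema")
    return problems


good = {"paths": {"/chroma/docs": {"get": {"tags": ["Package"]}}}}
bad = {
    "paths": {
        "/chroma/docs": {"get": {"tags": ["Chroma"]}},
        "/extract/docs": {"get": {}},
    }
}
good_problems = check_openapi(good)
bad_problems = check_openapi(bad)
```

In practice the spec would come from json.load on the /openapi.json response, and a non-empty result could fail the CI job.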

Commit message

refactor(api): canonicalize packaged docs under /chroma/docs; deprecate /extract/docs via 308
- unify implementation via PortalChromaDocRepo (response_model=ChromaDocsList)
- tag as Package in Swagger to remove duplication
- add hidden compatibility alias /extract/docs -> /chroma/docs (remove next sprint)

Notes for the team

  • Please do not add top-level columns like model/status/payload to portal_chroma_doc. These belong to meta by design.
  • If any client still uses /extract/docs, switch to /chroma/docs. The alias will be removed after one sprint.
