
Stock Knowledge Graph + LLM — System Requirements

Implementation spec derived from two design docs:

  • Stock_Factor_Graph.docx — the factor universe (9 node categories, 6 edge types, 6 regimes, sector factor maps, indicator catalog, edge-weighting methodology).
  • Stock_KG_Schema_and_LLM_Integration.docx — the schema (entities, relationships, properties, queries) and the LLM integration (pattern-based hybrid pipeline, citation grounding, verification, audit log).

Treat both docs as authoritative; this file translates them into buildable requirements, a reference architecture, a GitHub-only free-tier hosting plan, and a phased delivery plan suitable for a public repo.


1. Goal & non-goals

Goal. A queryable property graph of the equity universe (entities, relationships, time-bounded properties, provenance) plus an LLM orchestration layer that answers multi-hop, explainable financial questions with citations. The whole project — code, CI, docs, container images, scheduled ingestion, demo UI, live backend — must be hostable for free using GitHub and free-tier partners reachable from GitHub.

Non-goals (explicit).

  • Not a price forecaster. No alpha model, no LSTM/transformer over OHLCV.
  • Not a trading system. No execution, no OMS/EMS, no broker integration.
  • Not an unconstrained "ask anything" chatbot. The LLM is restricted to a finite set of templated reasoning patterns.
  • Not a Bloomberg replacement. Free-tier data only by default; licensed feeds via gated config.


2. Functional requirements

FR-1. Entity store (graph)

9 node types from the schema doc + auxiliaries:

  • Core: Company, Sector, Indicator, FinancialItem, Event, Person, Product, Regime, Document
  • Auxiliary: Segment, Listing, Index, CausalMechanism, InfluenceLink (reified edge), CompanyAlias

Each Indicator carries a category from the factor graph taxonomy: MACRO | POLICY | GEOPOLITICAL | INDUSTRY | NARRATIVE | TECHNICAL | FLOW | EXOGENOUS (the 9th, COMPANY, maps to FinancialItem + Event + company properties rather than to Indicator).

FR-2. Relationship store (graph)

5 edge families from the schema doc, mapped onto the 6 factor-graph edge types:

Schema family | Factor-graph edge types | Notes
Categorical / structural | (none) | BELONGS_TO, CHILD_OF, INCLUDED_IN, etc.
Inter-company | (none) | SUPPLIED_BY, HAS_CUSTOMER, COMPETES_WITH, PARTNERS_WITH
Causal-influence | CAUSAL, CORRELATIONAL, REFLEXIVE, AMPLIFYING, INHIBITORY, SUBSTITUTIVE | INFLUENCES is the canonical edge; AMPLIFIES/INHIBITS operate on reified InfluenceLink nodes; REFLEXIVE loops are detected by traversal
Temporal / event-link | (none) | UPDATES, INVOLVES, TRIGGERED, RESOLVED_BY
Provenance | (none) | SOURCED_FROM, REPORTED_IN, DERIVED_FROM, RESTATES

FR-3. Temporal validity

Every property and edge that can change over time MUST carry effective/end dates. Historical queries resolve to historical state. Lag stored as a distribution (median, p10, p90), not a point estimate.
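A minimal sketch of how a time-bounded property could be resolved as of a historical date, and how a lag could be held as a distribution. The names `TimedValue`, `LagDistribution`, `effective_from`, `effective_to` and `resolve_as_of` are illustrative assumptions, not identifiers from the schema doc, and the half-open interval convention is likewise assumed.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class LagDistribution:
    """Lag in days, stored as a distribution rather than a point estimate."""
    median: float
    p10: float
    p90: float

    def __post_init__(self) -> None:
        if not (self.p10 <= self.median <= self.p90):
            raise ValueError("expected p10 <= median <= p90")


@dataclass(frozen=True)
class TimedValue:
    """One version of a property; effective_to=None means still current."""
    value: object
    effective_from: date
    effective_to: Optional[date] = None

    def is_valid_on(self, as_of: date) -> bool:
        # Assumed convention: half-open interval [effective_from, effective_to).
        if as_of < self.effective_from:
            return False
        return self.effective_to is None or as_of < self.effective_to


def resolve_as_of(versions: Sequence[TimedValue], as_of: date) -> Optional[TimedValue]:
    """Return the version valid on `as_of`, or None if none applies."""
    matches = [v for v in versions if v.is_valid_on(as_of)]
    if len(matches) > 1:
        raise ValueError("overlapping validity intervals")
    return matches[0] if matches else None
```

With two versions of a sector assignment, a query dated inside the first interval returns the first value, and a query dated on the changeover day returns the second.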

FR-4. Provenance

Every fact traces to a Document via SOURCED_FROM / REPORTED_IN. Confidence required on inferred or estimated facts.

FR-5. Hybrid storage

Layer | Local dev | Hosted free demo
Property graph | Neo4j Community 5.x in Docker | Neo4j AuraDB Free (1 instance, 200k nodes / 400k rels, sleeps after 3d, auto-resumes)
Time-series + relational | TimescaleDB (Postgres) in Docker | Supabase Free (500MB Postgres + pgvector + auth, sleeps after 7d)
Vector | pgvector on the same Postgres | pgvector on Supabase
Object store | MinIO in Docker | Cloudflare R2 Free (10GB storage, no egress fees); alternatively Git LFS / GitHub Releases for static artifacts
Audit log | Postgres | Supabase

This choice keeps free-tier limits realistic for a public demo (S&P 100 + ~50 indicators + 1 yr of events fits in 200k nodes; full-universe coverage requires user-provided paid tier).

FR-6. Ingestion pipelines (free-tier sources only by default)

Idempotent upsert workers, lineage-tagged. Each runs as a GitHub Actions scheduled workflow writing to the hosted free DBs.

Source | Type | Refresh | Source URL/SDK
SEC EDGAR | Company, FinancialItem, Document, Event (8-K) | Daily cron in Actions | https://data.sec.gov/submissions/CIK*.json, XBRL frames API
FRED | Indicator (macro) | Daily cron | https://api.stlouisfed.org/fred/ (free API key)
Treasury Direct | Indicator (rates, curve) | Daily cron | XML feeds
BLS / BEA | Indicator (CPI, PCE, NFP, GDP) | Monthly cron | Free public APIs
GDELT / RSS | Event (news) | Hourly cron | GDELT GKG, free RSS aggregators
Wikidata | Person, Product, basic Company | Weekly cron | SPARQL endpoint
GitHub-curated YAML | Inter-company edges (supply chain, partnerships, competition) | On PR merge | Repo files, validated in CI
Optional, gated | Bloomberg / Refinitiv / FactSet | n/a | Disabled by default; activated via secrets

Earnings-transcript ingestion and licensed sell-side: documented but not part of the free baseline.
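One way the "idempotent upsert, lineage-tagged" requirement could look in practice is a helper that builds a Cypher MERGE keyed on a natural identifier and attaches lineage properties. This is a sketch under assumptions: `build_upsert` is a hypothetical helper, and the lineage keys (`source`, `ingestion_run_id`, `parser_version`) are taken from the lineage bullet in §6. It only builds the query text and parameters; executing it through a Neo4j driver is left out.

```python
import re
from typing import Any, Dict, Tuple

# Labels and property keys cannot be passed as Cypher parameters, so they are
# validated against a conservative identifier pattern before interpolation.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_upsert(label: str, key_field: str, key_value: Any,
                 props: Dict[str, Any],
                 lineage: Dict[str, str]) -> Tuple[str, Dict[str, Any]]:
    """Build a MERGE statement that can be re-run with the same input safely."""
    for name in (label, key_field):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    cypher = (
        f"MERGE (n:{label} {{{key_field}: $key}}) "
        "SET n += $props, n += $lineage "
        "RETURN n"
    )
    return cypher, {"key": key_value, "props": props, "lineage": lineage}
```

Because MERGE matches on the key before creating, running the same statement twice leaves one node with the latest properties rather than two nodes.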

FR-7. Update cadence (per schema doc §39.1)

Same as the schema doc; in this build all cadences are GitHub Actions cron schedules (see the workflow files in §4.1 and the hosting summary in §10).

FR-8. Multi-hop reasoning patterns

The 6 patterns from the schema doc, each as a parameterised, tested template:

  1. causal_attribution: "why is X happening?"
  2. forward_propagation: "what if X moves?"
  3. comparative_analysis: "compare A vs B"
  4. supply_chain_traversal: "who depends on X?"
  5. risk_decomposition: "biggest risks to X?"
  6. regime_counterfactual: "what if regime changes?"

FR-9. LLM orchestration (pattern-based hybrid)

Six pipeline stages from the schema doc §28: intent → params → query → enrich → compose → verify. Each stage independently logged.
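The stage sequence can be sketched as a small runner that calls each stage in order and logs it independently. This is an illustration only: `run_pipeline`, `STAGE_ORDER`, the shared `state` dict and the `trace_id` field are assumed names, and real stages would call the graph and the LLM rather than return stub values.

```python
import logging
import uuid
from typing import Any, Callable, Dict

logger = logging.getLogger("orchestrator.pipeline")

# A stage reads the accumulated state and returns the keys it adds.
Stage = Callable[[Dict[str, Any]], Dict[str, Any]]

STAGE_ORDER = ("intent", "params", "query", "enrich", "compose", "verify")


def run_pipeline(question: str, stages: Dict[str, Stage]) -> Dict[str, Any]:
    """Run the six stages in order, logging each one under a shared trace id."""
    missing = [name for name in STAGE_ORDER if name not in stages]
    if missing:
        raise KeyError(f"missing stages: {missing}")
    trace_id = uuid.uuid4().hex
    state: Dict[str, Any] = {"question": question, "trace_id": trace_id, "stage_log": []}
    for name in STAGE_ORDER:
        update = stages[name](state)
        state.update(update)
        state["stage_log"].append(name)
        # One log line per stage so each can be inspected on its own.
        logger.info("stage=%s trace_id=%s keys=%s", name, trace_id, sorted(update))
    return state
```

Keeping each stage a plain function over a dict makes it straightforward to replace the verifier stub later without touching the runner.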

FR-10. Citation grounding

Every numerical claim cites a result row; every factual claim cites a Document. Verifier rejects unsupported claims. [DOC: <id>] markers rendered as links.
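A rough sketch of the marker check a verifier might run before any LLM-based verification. It assumes the `[DOC: <id>]` marker sits before the sentence's closing punctuation and uses a naive sentence splitter; the function names are hypothetical.

```python
import re
from typing import Iterable, List, Set

# Matches markers such as "[DOC: 10k-2023]" and captures the id.
DOC_MARKER = re.compile(r"\[DOC:\s*([^\]\s]+)\s*\]")


def cited_doc_ids(answer: str) -> List[str]:
    """Return every document id cited in the answer, in order of appearance."""
    return DOC_MARKER.findall(answer)


def unsupported_citations(answer: str, retrieved_ids: Iterable[str]) -> Set[str]:
    """Ids cited in the answer that were not among the retrieved documents."""
    return set(cited_doc_ids(answer)) - set(retrieved_ids)


def uncited_sentences(answer: str) -> List[str]:
    """Sentences that carry no [DOC: ...] marker (naive punctuation split)."""
    parts = re.split(r"(?<=[.!?])\s+", answer)
    return [s.strip() for s in parts if s.strip() and not DOC_MARKER.search(s)]
```

An answer passes this check only when `unsupported_citations` and `uncited_sentences` both come back empty; anything else is a candidate for rejection.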

FR-11. Regime conditioning

6 regimes (GOLDILOCKS, REFLATION, STAGFLATION, SLOWDOWN, CRISIS, NARRATIVE_DOMINATED) plus hybrid co-active states. INFLUENCES.regime_magnitudes is a map keyed by regime category. Active regime resolved at query time.
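To make the `regime_magnitudes` map concrete, the sketch below resolves an edge's effective magnitude at query time. The weighted blend for hybrid co-active states is an assumption made for illustration; the source docs may define a different combination rule. `effective_magnitude` and the `active` weight map are hypothetical names.

```python
from typing import Mapping

REGIMES = frozenset({
    "GOLDILOCKS", "REFLATION", "STAGFLATION",
    "SLOWDOWN", "CRISIS", "NARRATIVE_DOMINATED",
})


def effective_magnitude(regime_magnitudes: Mapping[str, float],
                        active: Mapping[str, float]) -> float:
    """Blend per-regime magnitudes by the weights of the active regimes.

    `active` maps regime name to a non-negative weight; a single active
    regime is just a one-entry map.
    """
    unknown = set(active) - REGIMES
    if unknown:
        raise ValueError(f"unknown regimes: {sorted(unknown)}")
    total = sum(active.values())
    if total <= 0:
        raise ValueError("active regime weights must sum to a positive number")
    missing = [r for r in active if r not in regime_magnitudes]
    if missing:
        raise KeyError(f"edge has no magnitude for: {missing}")
    return sum(regime_magnitudes[r] * w for r, w in active.items()) / total
```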

FR-12. Audit log

Every answer logs: question, classified pattern, extracted params, executed Cypher, raw graph results, retrieved Document IDs, active regime, model + version, prompt hash, composed answer, verification outcome. Append-only in Supabase.
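The logged fields map naturally onto an immutable record. The sketch below is one possible shape, with field names chosen for illustration; `hash_prompt` uses SHA-256 as an assumed hashing scheme, and writing the serialized record to Supabase is omitted.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, List


def hash_prompt(prompt_text: str) -> str:
    """Stable fingerprint of the exact prompt text (SHA-256, hex)."""
    return hashlib.sha256(prompt_text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class AuditRecord:
    """One append-only audit entry per answered question."""
    question: str
    pattern: str
    params: Dict[str, Any]
    cypher: str
    graph_results: List[Dict[str, Any]]
    document_ids: List[str]
    active_regime: str
    model: str
    model_version: str
    prompt_hash: str
    answer: str
    verification_outcome: str

    def to_json(self) -> str:
        # sort_keys keeps the serialized form stable across runs.
        return json.dumps(asdict(self), sort_keys=True)
```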

FR-13. Reproducibility

Versioned: schema (with migrations), data (point-in-time snapshots committed to a separate data-snapshots branch or to GitHub Releases as Parquet), prompt hashes (committed in llm/prompts/).

FR-14. Evaluation harness

  • ~500 labelled benchmark questions in eval/benchmark/. CI gate on regression.
  • Adversarial set: nonexistent companies, wrong premises, ambiguous tickers, time-traveling questions.
  • Metrics: factual accuracy (>97%), citation completeness (>95%), retrieval relevance (>40%), pattern classification (>90%), hallucination rate (trend-monitored).
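A CI gate over these metrics could be as small as the function below. The thresholds are the ones listed above, expressed as fractions; the metric key names and `failed_gates` are assumptions, and hallucination rate is left out because it is trend-monitored rather than gated.

```python
from typing import Dict, List

# Floors from the metrics list; each metric must be strictly above its floor.
THRESHOLDS: Dict[str, float] = {
    "factual_accuracy": 0.97,
    "citation_completeness": 0.95,
    "retrieval_relevance": 0.40,
    "pattern_classification": 0.90,
}


def failed_gates(metrics: Dict[str, float]) -> List[str]:
    """Return the names of metrics that are missing or not above their floor."""
    failures = []
    for name, floor in THRESHOLDS.items():
        value = metrics.get(name)
        if value is None or value <= floor:
            failures.append(name)
    return failures
```

A benchmark workflow would exit non-zero whenever the returned list is non-empty.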

FR-15. Query/API surface

  • POST /ask (NL question), POST /query/{pattern} (typed pattern call), SSE streaming with stage-by-stage events.
  • Read-only Cypher passthrough behind admin auth.
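For the stage-by-stage SSE stream, each completed stage can be serialized as one server-sent event. The sketch below only formats the wire text; the event names (including the final `done` event) are assumptions, and hooking the generator into a streaming HTTP response is not shown.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Tuple


def sse_event(event: str, data: Dict[str, Any]) -> str:
    """Format one server-sent event; JSON keeps the data on a single line."""
    payload = json.dumps(data, separators=(",", ":"))
    return f"event: {event}\ndata: {payload}\n\n"


def stream_stages(results: Iterable[Tuple[str, Dict[str, Any]]]) -> Iterator[str]:
    """Yield one event per completed stage, then a terminating 'done' event."""
    for stage, data in results:
        yield sse_event(stage, data)
    yield sse_event("done", {})
```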

FR-16. UI (MVP scope)

  • CLI client first (Python, Typer).
  • Static demo UI: Next.js exported to static HTML, hosted on GitHub Pages (docs/ or gh-pages branch). No SSR.
  • The static UI calls the public backend (BYOK Anthropic key entered by the user in the browser, stored in localStorage; no secrets on the server).

FR-17. Bring-your-own-key (BYOK) LLM

Free-tier hosting cannot fund LLM calls, so the public demo works as follows:

  • It accepts the user's own Anthropic API key in the UI (or, alternatively, an OpenAI / OpenRouter / local-Ollama URL).
  • The server never persists the key; it is forwarded per request in an X-LLM-Key header.
  • Self-hosters can set the key via .env / Codespaces secret instead.
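Server-side handling of the BYOK header might look like the helpers below. They are framework-agnostic sketches over a plain header mapping: `extract_llm_key`, `headers_for_logging` and `answer_mode` are hypothetical names, and the `graph_only` mode mirrors the read-only fallback described for users without a key in §4.

```python
from typing import Dict, Mapping, Optional

LLM_KEY_HEADER = "x-llm-key"  # header names are compared case-insensitively


def extract_llm_key(headers: Mapping[str, str]) -> Optional[str]:
    """Return the per-request key, or None when absent or blank."""
    for name, value in headers.items():
        if name.lower() == LLM_KEY_HEADER:
            return value.strip() or None
    return None


def headers_for_logging(headers: Mapping[str, str]) -> Dict[str, str]:
    """Copy of the headers with the key redacted, safe to write to logs."""
    return {
        name: "[REDACTED]" if name.lower() == LLM_KEY_HEADER else value
        for name, value in headers.items()
    }


def answer_mode(key: Optional[str]) -> str:
    """Without a key, fall back to read-only graph queries (no LLM compose)."""
    return "compose" if key else "graph_only"
```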

FR-18. Indicator seed catalog

Ship a versioned schema/seed/indicators.yaml derived from the factor-graph doc, covering ~50 indicators across the 8 indicator categories. Minimal viable list, with FRED/BLS/Treasury source IDs:

Category | Examples (subset)
MACRO (monetary) | FED_FUNDS_TARGET (FRED:DFEDTARU), 10Y_TREASURY (FRED:DGS10), 2Y_TREASURY, 2s10s_SLOPE, 10Y_TIPS (real yield), M2, FED_BALANCE_SHEET
MACRO (inflation) | CPI_HEADLINE (BLS), CORE_PCE (BEA), CLEVELAND_INFLATION_NOWCAST, 5Y5Y_INFLATION
MACRO (growth) | GDP_NOWCAST_ATLANTA, ISM_MFG, ISM_SERVICES, LEI_CONFERENCE_BOARD, NFP, UNEMPLOYMENT_RATE
MACRO (currency) | DXY, USDJPY, USDCNH, EUR_USD
MACRO (commodity) | BRENT_OIL_PRICE, WTI_OIL_PRICE, NATGAS_HENRY_HUB, GOLD, COPPER, URANIUM_U3O8
POLICY | EFFECTIVE_TARIFF_RATE, TARIFF_UNCERTAINTY_INDEX, SECTION_232_LIVE, IRA_SUBSIDY_FLOW
GEOPOLITICAL | IRAN_CONFLICT_ACTIVE, TAIWAN_TENSION_SCORE, SANCTIONS_REGIME_CHANGE
INDUSTRY | HYPERSCALER_AI_CAPEX_BUDGET, GLOBAL_RIG_COUNT, SEMI_BB_RATIO, REIT_CAP_RATE_AVG
NARRATIVE | AI_NARRATIVE_STRENGTH, SOFT_LANDING_NARRATIVE_STRENGTH, CUDA_MOAT_NARRATIVE, GLP1_DISRUPTION_NARRATIVE
TECHNICAL | MAG7_RELATIVE_BREADTH, SP500_PCT_ABOVE_200DMA, MCCLELLAN_OSCILLATOR, VIX, MOVE
FLOW | CTA_TECH_POSITIONING, DEALER_GAMMA, AAII_BULL_BEAR, BUYBACK_BLACKOUT, ETF_FLOWS_QQQ
EXOGENOUS | PANDEMIC_ACTIVE, MAJOR_INFRA_OUTAGE

Narrative and exogenous indicators need NLP/manual curation; ship them as scaffolding with a manual ingestion path.

FR-19. Sector factor maps

Ship schema/seed/sector_maps.yaml encoding the 7 sector partitions from the factor-graph doc (Tech/Semis, Energy, Financials, Healthcare/Pharma, Consumer, Industrials/Defense, Real Estate/REITs) with: sub-industry partition, dominant factors, critical edges, regime-dependent edges, non-obvious factors. Used by risk_decomposition and comparative_analysis patterns to bootstrap before empirical calibration is available.

FR-20. Regime detection

Daily classifier job. Phase 1: rule-based (transparent). Inputs: GDP nowcast, core PCE, Fed funds direction, 10Y direction, VIX, IG/HY spreads, DXY, sector leadership, top-10 concentration, narrative scores. Output: dominant regime + co-active regime(s) + confidence. Hidden Markov / multi-label classifier deferred to Phase 7.
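The Phase 1 rule-based classifier could start from a transparent rule table like the one sketched here. Everything numeric is a placeholder: the thresholds are invented purely to make the example runnable and are not calibrated values from the factor-graph doc. The sketch uses only four of the listed inputs, omits NARRATIVE_DOMINATED (which needs narrative scores), and does not compute a confidence value.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MacroSnapshot:
    gdp_nowcast: float    # annualised real growth, percent
    core_pce_yoy: float   # year-over-year, percent
    vix: float
    hy_spread_bps: float  # high-yield credit spread, basis points


# Placeholder thresholds (assumptions, not calibrated).
GROWTH_OK = 1.5
INFLATION_HOT = 3.0
VIX_STRESS = 35.0
HY_STRESS_BPS = 600.0


def classify_regime(s: MacroSnapshot) -> Dict[str, object]:
    """Evaluate rules in priority order; the first hit is the dominant regime."""
    growing = s.gdp_nowcast >= GROWTH_OK
    hot = s.core_pce_yoy >= INFLATION_HOT
    rules = {
        "CRISIS": s.vix >= VIX_STRESS or s.hy_spread_bps >= HY_STRESS_BPS,
        "STAGFLATION": (not growing) and hot,
        "REFLATION": growing and hot,
        "SLOWDOWN": (not growing) and (not hot),
        "GOLDILOCKS": growing and (not hot),
    }
    # The four growth/inflation quadrants are exhaustive, so `fired` is never empty.
    fired = [name for name, hit in rules.items() if hit]
    return {"dominant": fired[0], "co_active": fired[1:]}
```

Because the rules are a plain ordered mapping, the list of fired rules doubles as the explanation shown to the user.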

FR-21. Fragility score

Composite indicator combining concentration, breadth, leverage, narrative dominance, credit-spread tightness. Surfaced as a top-level UI badge. Used to caveat answers ("model output trustworthy unless fragility-score elevated").
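One simple way to form the composite is a weighted mean of the five components. The sketch assumes each component has already been scaled to [0, 1] with 1 meaning "more fragile" (so narrow breadth maps to a high `breadth` score); the equal default weights and the 0.7 caveat threshold are placeholders, not values from the source docs.

```python
from typing import Mapping, Optional

COMPONENTS = (
    "concentration",
    "breadth",
    "leverage",
    "narrative_dominance",
    "credit_spread_tightness",
)


def fragility_score(components: Mapping[str, float],
                    weights: Optional[Mapping[str, float]] = None) -> float:
    """Weighted mean of component scores, each pre-scaled to [0, 1]."""
    weights = weights or {name: 1.0 for name in COMPONENTS}
    total_weight = sum(weights[name] for name in COMPONENTS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    score = 0.0
    for name in COMPONENTS:
        value = components[name]  # raises KeyError if a component is missing
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
        score += weights[name] * value
    return score / total_weight


def answer_caveat(score: float, threshold: float = 0.7) -> Optional[str]:
    """Return a caveat string when fragility is elevated, else None."""
    if score >= threshold:
        return f"Fragility score elevated ({score:.2f}); treat model output with caution."
    return None
```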

FR-22. Reflexive-loop detection

Cycle detection over INFLUENCES edges of mechanism_class IN {WEALTH_EFFECT, CAPEX_AFFORDABILITY, EMPLOYMENT_EARNINGS, VOL_OF_VOL}. Surfaced when the user asks about a node that participates in an active loop.
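The cycle check can be illustrated in memory with a depth-first search restricted to the four mechanism classes. In production this would more likely be a Cypher traversal; the edge tuple shape, the `max_len` bound and the node names in the usage note are assumptions.

```python
from typing import Dict, Iterable, List, Tuple

LOOP_MECHANISMS = frozenset({
    "WEALTH_EFFECT", "CAPEX_AFFORDABILITY", "EMPLOYMENT_EARNINGS", "VOL_OF_VOL",
})

Edge = Tuple[str, str, str]  # (source, target, mechanism_class)


def loops_through(node: str, edges: Iterable[Edge], max_len: int = 6) -> List[List[str]]:
    """Return simple cycles that start and end at `node`, using only
    INFLUENCES edges whose mechanism_class is loop-eligible."""
    graph: Dict[str, List[str]] = {}
    for src, dst, mechanism in edges:
        if mechanism in LOOP_MECHANISMS:
            graph.setdefault(src, []).append(dst)

    loops: List[List[str]] = []

    def walk(current: str, path: List[str]) -> None:
        for nxt in graph.get(current, []):
            if nxt == node:
                loops.append(path + [node])  # closed the loop
            elif nxt not in path and len(path) < max_len:
                walk(nxt, path + [nxt])

    walk(node, [node])
    return loops
```

For a hypothetical chain INDEX → HOUSEHOLD_WEALTH → CONSUMER_SPENDING → INDEX built from eligible edges, `loops_through("INDEX", edges)` returns that single cycle, while a node reachable only through a non-eligible edge returns an empty list.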


3. Non-functional requirements

Area | Requirement
Latency | Templated query p95 < 500 ms (graph only). End-to-end p95 < 8 s (incl. LLM compose + verify). On an AuraDB Free cold start, the first call may take ~30 s; document this.
Throughput | Public demo: 1 QPS (Hugging Face Spaces / Fly free tier). Self-host: 10+ QPS.
Scale (free tier) | 200k nodes, 400k rels (AuraDB Free). Enough for S&P 100 + 50 indicators + 1 yr events.
Availability | Best-effort; sleeps allowed; document the cold-start UX.
Reproducibility | Any answer ≤6 mo old reproducible from audit log.
Compliance | Free-tier sources only by default; vendor data behind opt-in flag. PII limited to publicly disclosed roles. No redistribution of vendor data.
Security | Secrets via GitHub Actions secrets / Codespaces secrets / .env. No keys committed. BYOK pattern for the LLM.
Observability | Structured logs to stdout (captured by Actions / hosting platform). Pipeline trace IDs. Metrics scraped by GitHub Actions weekly into a static dashboard on Pages.
Cost | $0 for the maintainer. End user funds their own LLM key.

4. Reference architecture (free-tier deployment)

                  GITHUB.COM (free, public repo)
   ┌───────────────────────────────────────────────────────┐
   │  Source │ CI │ Actions cron │ Pages (UI) │ Releases   │
   │         │    │              │            │ (Parquet)  │
   │         │    │              │            │            │
   │  Issues │ PRs│ Codespaces   │ Container  │ LFS for    │
   │         │    │ (60h/mo)     │ Registry   │ large data │
   └────┬────┴────┴──────┬───────┴─────┬──────┴──────┬─────┘
        │                │             │             │
        │ commits        │ scheduled   │ static UI   │ docker
        │                │ ingest      │ host        │ pulls
        ▼                ▼             ▼             ▼
   ┌────────────────────────────────────────────────────────┐
   │         FREE-TIER DATA + RUNTIME PARTNERS              │
   │                                                        │
   │  Neo4j AuraDB Free  ◀──── graph writes ─── Actions     │
   │  (200k/400k, sleeps)                                   │
   │                                                        │
   │  Supabase Free      ◀──── docs, ts, vectors, audit     │
   │  (Postgres+pgvector,                                   │
   │   500MB, sleeps)                                       │
   │                                                        │
   │  Cloudflare R2 Free ◀──── raw filings, blobs           │
   │  (10GB)                                                │
   │                                                        │
   │  Hugging Face Space ◀──── FastAPI orchestrator         │
   │  (16GB RAM, 2 vCPU,        (Python, container          │
   │   sleeps after 48h)        from GHCR)                  │
   │                                                        │
   │  User's browser     ─── BYOK Anthropic ───▶ Anthropic  │
   │  (key in localStorage)    (forwarded as header)        │
   └────────────────────────────────────────────────────────┘

Why each free-tier partner:

Partner | Why this one | Free-tier cap | Failure mode / notes
Neo4j AuraDB Free | The doc's recommended graph DB; only managed-Neo4j free tier; native Cypher | 200k nodes / 400k rels / 1 instance / sleeps after 3d (auto-resume on first query) | Cold-start latency on demo
Supabase Free | Postgres + pgvector + auth + REST in one; same DB serves time-series, vectors, docs, audit log | 500MB DB / 50k MAU / pauses after 7d inactive | Pause means manual resume; mitigated by weekly Actions cron pinging
Cloudflare R2 Free | S3-compatible, no egress fees on free tier; works with boto3 | 10GB storage / 1M Class A ops/mo | Sufficient for ~10k filing PDFs
Hugging Face Spaces | Runs Docker containers built from GHCR; persistent URL; CORS-friendly for the Pages UI | 16GB RAM, sleeps after 48h, auto-wakes | Cold start; mitigated by warm-up ping
GitHub Pages | Static hosting for Next.js export; project URL <user>.github.io/<repo> | 1GB site, 100GB/mo bandwidth | Plenty of headroom
GitHub Actions | Scheduled cron, CI, container builds | 2000 min/mo private; unlimited for public repos | Public repo solves it
GHCR | Container registry, free for public | Unlimited public | (none)
Anthropic (BYOK) | User funds their own usage | Per-user $5 starter credit | If user has no key, fall back to read-only graph queries (no LLM compose)

4.1 Repository layout

stock-kg/
├── README.md                      # quickstart, architecture diagram, demo link
├── LICENSE                        # Apache-2.0 (recommended)
├── pyproject.toml                 # uv / poetry
├── docker-compose.yml             # neo4j + postgres+timescale+pgvector + minio
├── .github/
│   ├── workflows/
│   │   ├── ci.yml                 # lint + type + unit + golden patterns
│   │   ├── benchmark.yml          # nightly eval suite, results to Pages
│   │   ├── ingest-edgar.yml       # daily SEC pull → AuraDB + R2
│   │   ├── ingest-fred.yml        # daily macro indicators → Supabase
│   │   ├── ingest-news.yml        # hourly GDELT/RSS → AuraDB Events
│   │   ├── ingest-treasury.yml    # daily curve / yields
│   │   ├── classify-regime.yml    # daily regime classifier
│   │   ├── calibrate-edges.yml    # monthly INFLUENCES recalibration
│   │   ├── pages-deploy.yml       # static UI build + deploy to gh-pages
│   │   ├── space-deploy.yml       # push container to HF Space
│   │   └── keepalive.yml          # weekly ping to wake Supabase / AuraDB
│   └── ISSUE_TEMPLATE/
├── docs/                          # MkDocs or Docusaurus → Pages
│   ├── architecture.md
│   ├── schema.md                  # rendered from /schema
│   ├── patterns.md
│   ├── factor-catalog.md          # from Stock_Factor_Graph.docx
│   ├── sector-maps.md
│   └── adr/                       # architecture decision records
├── schema/
│   ├── nodes.cypher
│   ├── constraints.cypher
│   ├── seed/
│   │   ├── gics_sectors.yaml
│   │   ├── indicators.yaml        # FR-18 catalog, ~50 indicators
│   │   ├── regimes.yaml           # 6 regimes + transition seeds
│   │   ├── sector_maps.yaml       # FR-19, 7 sector factor maps
│   │   ├── companies_sp100.csv    # bootstrap universe
│   │   └── inter_company.yaml     # curated edges (PR-reviewed)
│   └── migrations/0001_*.cypher
├── ingestion/
│   ├── edgar/
│   ├── fred/
│   ├── treasury/
│   ├── news/
│   ├── transcripts/               # opt-in, licensed
│   ├── curation_validate.py       # CI hook for inter_company.yaml
│   └── common/                    # http client, idempotent upserts, lineage
├── kg/
│   ├── client_neo4j.py            # works against AuraDB and local docker
│   ├── repositories/
│   └── queries/                   # 6 patterns as templated Cypher
├── orchestrator/
│   ├── pipeline.py                # 6-stage pipeline
│   ├── intent.py
│   ├── params.py
│   ├── compose.py
│   └── verify.py
├── llm/
│   ├── client.py                  # provider-pluggable; reads X-LLM-Key
│   ├── prompts/                   # versioned, hashed
│   └── providers/                 # anthropic, openai, openrouter, ollama
├── api/
│   ├── main.py                    # FastAPI; CORS open for Pages origin
│   └── Dockerfile                 # built and pushed to GHCR + HF Space
├── eval/
│   ├── benchmark/                 # 500 labelled Q&A YAML
│   ├── adversarial/
│   └── runner.py
├── ui/                            # Next.js, exports to static
│   ├── pages/
│   ├── components/                # answer card, citation, regime badge,
│   │                              # fragility chip, reasoning trace, 3D graph
│   └── next.config.js             # output: 'export'
└── tests/
    ├── unit/
    ├── integration/               # against ephemeral neo4j (testcontainers)
    └── golden/                    # pattern → expected Cypher

5. Phased delivery plan (free-tier from Phase 0)

Each phase ends with a tagged release. Phase 0 already publishes to GitHub Pages; Phase 4 is the first public end-to-end live demo.

Phase | Scope | Exit criteria
0. Skeleton + Pages | Repo, license (Apache-2.0), CI (lint + type + unit), docker-compose up for local, MkDocs site auto-deployed to GitHub Pages, ADR-0 records architecture. | Pages renders; docker-compose up brings local stack healthy; all schema constraints applied.
1. Static seed | Load GICS sectors, S&P 100 Company, indicator seed catalog (FR-18), 6 regimes, sector maps (FR-19). Local + AuraDB Free both populated by a single bootstrap script. | MATCH (c:Company) RETURN count(c) = 100 on AuraDB; sector hierarchy + indicator catalog + regimes traversable.
2. Scheduled SEC + FRED | GitHub Actions cron jobs ingest 10-K/10-Q (last 4 quarters) for S&P 100 → FinancialItem + Document (raw to R2) and FRED indicators → Supabase TS. Provenance edges in place. | Last 4 quarters of revenue/EPS for all 100 names with SOURCED_FROM to a real EDGAR URL; daily macro indicator ticks on AuraDB nodes via Supabase pointer.
3. Three patterns + LLM v0 | Implement causal_attribution, comparative_analysis, risk_decomposition. Implement 6-stage pipeline (verifier stub initially). Anthropic SDK with prompt caching. CLI ask. Provider abstraction supports BYOK header. | Local CLI returns cited answers end-to-end against AuraDB.
4. Static UI on Pages + Space backend | Next.js static UI on Pages with answer card, inline citations, regime badge, reasoning-trace expander. FastAPI container in GHCR, deployed to Hugging Face Space. UI accepts BYOK in localStorage. | Public URL on Pages calls the Space backend, returns a cited answer. Live demo for 3 questions.
5. News + Events | Hourly GDELT/RSS → Event nodes with LLM subtype classification. Link to Indicator / Company. Keepalive ping to prevent Supabase pause. | "Why is NVDA down?" returns event-grounded answer.
6. Inter-company curation | Curated YAML for supply chain, customer concentration, partnerships of top 50 names. CI validation. | supply_chain_traversal and forward_propagation work for curated names.
7. Regime classifier + remaining patterns | Daily rule-based regime classifier (Action). regime_magnitudes populated. regime_counterfactual and forward_propagation patterns. Fragility score in UI. | Counterfactual responses show magnitude deltas across regimes; fragility chip live.
8. Vectors + transcripts (opt-in) | Earnings transcript chunks → pgvector. Hybrid retrieval (vector → graph). Docs explain that transcript-vendor licensing is the user's responsibility. | Hybrid retrieval question returns chunks + graph context.
9. Verifier + audit log | Independent verification LLM call. Full audit log in Supabase. Audit-replay command. | Replay reproduces a prior answer modulo flagged drift.
10. Eval harness on Pages | 500-question benchmark + adversarial set. Nightly Actions run; metrics published as static dashboard on Pages. | Public scoreboard live; CI gates regressions.
11. 3D graph viz (optional) | Three.js / react-force-graph component in the UI. Filters: time-horizon, regime, top-N. | Demo visualises top-N edges for a selected company in current regime.

A single developer working part-time can reasonably reach Phase 4 (first public live demo) in ~6 weeks. Phases 5–11 are the long tail.


6. Operational requirements

  • Calibration jobs. Monthly rolling and quarterly full INFLUENCES recalibration via calibrate-edges.yml. Surface last_calibrated in UI when stale.
  • Lineage. Every value carries source, ingestion_run_id (the GitHub Actions run URL), parser_version. One-click trace from a UI cell back to the workflow run.
  • Schema versioning. _schema_version tag on every node. Migrations in schema/migrations/. CI rejects ingestions that target an unsupported version.
  • Keepalive. keepalive.yml runs weekly to ping AuraDB Free and Supabase. The weekly cadence targets Supabase's 7-day pause; AuraDB Free sleeps after 3 days, so document that the demo may still cold-start.
  • Cost guards. Budget per question; verifier uses cheaper model (Haiku); max-retry cap. Forward BYOK key only — never proxy with maintainer credit.
  • Monitoring. Track classification accuracy, hallucination rate (verifier rejection rate), retrieval relevance, p95 latency per stage. Published nightly on Pages.

7. Indicator + edge calibration methodology

Three layers from the factor-graph doc §32 applied in sequence:

  1. Empirical correlation. Rolling 5y and 10y correlations between factor changes and stock-return responses, computed in Actions and persisted on INFLUENCES edges.
  2. Regime adjustment. Each historical period tagged with active regime; per-regime magnitudes stored in regime_magnitudes.
  3. Causal-mechanism check. Free-form mechanism_text field required before promoting a correlation to a causal edge. CI rejects edges without a mechanism.

Edge required fields: empirical_corr_5y, empirical_corr_10y, regime_magnitudes (map), mechanism, mechanism_text, confidence, last_calibrated, calibration_window.
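The CI check on edge fields could be a plain validation function over the edge's property map. This is a sketch: `validate_edge` is a hypothetical name, and the range check on correlations is an added sanity test rather than something the source docs specify.

```python
from typing import Any, List, Mapping

REQUIRED_EDGE_FIELDS = (
    "empirical_corr_5y",
    "empirical_corr_10y",
    "regime_magnitudes",
    "mechanism",
    "mechanism_text",
    "confidence",
    "last_calibrated",
    "calibration_window",
)


def validate_edge(edge: Mapping[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the edge passes."""
    errors = [
        f"missing field: {name}"
        for name in REQUIRED_EDGE_FIELDS
        if edge.get(name) is None
    ]
    text = edge.get("mechanism_text")
    if isinstance(text, str) and not text.strip():
        errors.append("mechanism_text is empty")
    if "regime_magnitudes" in edge and not isinstance(edge["regime_magnitudes"], Mapping):
        errors.append("regime_magnitudes must be a map")
    for name in ("empirical_corr_5y", "empirical_corr_10y"):
        value = edge.get(name)
        if isinstance(value, (int, float)) and not -1.0 <= value <= 1.0:
            errors.append(f"{name} out of range [-1, 1]")
    return errors
```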


8. Open decisions for the user (recommended option labelled in each item)

  1. Free-tier hosts. Recommended: Neo4j AuraDB Free + Supabase + Cloudflare R2 + Hugging Face Spaces + GitHub Pages. Alternatives: Render free (cold starts), Fly.io free (256MB RAM is tight).
  2. License. Recommended: Apache-2.0 (patent grant + permissive). Alternatives: MIT (slightly simpler), AGPL (forces SaaS forks open).
  3. Public repo or private with public release? Recommended: public from day one, to get unlimited Actions minutes.
  4. LLM provider abstraction from day one vs Anthropic-first. Recommended: pluggable from day one; minor cost, big win for free-tier users who want OpenRouter / Ollama.
  5. Static UI framework. Recommended: Next.js with static export (familiar, ecosystem). Alternatives: Astro (lighter), plain Vite + React.
  6. Docs framework. Recommended: MkDocs Material (Python-native, simple). Alternative: Docusaurus (richer, more setup).
  7. Inter-company curation strategy. Recommended: manual YAML in repo, PR-reviewed for transparency. Bloomberg Supply Chain only for self-hosters with licenses.
  8. Regime classifier. Recommended: rule-based for v0 (transparent); HMM/multi-label later.
  9. Streaming responses. Recommended: SSE, so the UI can render the pipeline stages as they complete.
  10. Universe size. Recommended: S&P 100 + ~30 ETFs + indicator catalog, which fits AuraDB Free comfortably. Full S&P 500 is also feasible (~5k nodes for companies; the relationship count is the gating factor).

9. Risks & mitigations

Risk | Mitigation
Free-tier service shrinks or disappears | Architecture is local-first; docker-compose up reproduces full stack offline. Hosted demo is a deployment target, not a dependency.
AuraDB Free 200k/400k cap hit | Cap demo universe at S&P 100. Document growth plan: paid AuraDB Pro or self-hosted Neo4j on a $5/mo VPS.
Cold starts hurt demo UX | Keepalive workflow + UI shows "warming up…" with progress.
LLM hallucination | Five-layer mitigation per schema doc §31 (schema, validation, pattern, grounded compose, verifier).
Stale calibrations silently degrade answers | Surface last_calibrated; CI alert on staleness.
Schema sprawl | ADR + PR review required for any new node type or edge family.
Demo-to-real gap | Prioritise benchmark over demo; CI gates on benchmark regression.
BYOK key leakage | Key stored in browser localStorage only; forwarded per request; never logged server-side; documented threat model.
Vendor data licensing | Free-tier sources only by default; vendor sources behind opt-in flag and gated config.

10. GitHub-only free-tier hosting summary

Everything below is free for a public repo, indefinitely:

Need | GitHub feature / partner | Notes
Source + collaboration | GitHub repo (public) | Unlimited
CI / lint / tests | Actions (public repo) | Unlimited minutes
Scheduled ingestion | Actions cron (schedule: cron) | Up to 20 concurrent jobs on the Free plan
Container images | GHCR | Free for public images
Static UI / docs | GitHub Pages | 1GB site, 100GB/mo bandwidth
Backend runtime | Hugging Face Spaces (Docker) | 16GB RAM, sleeps after 48h
Graph DB | Neo4j AuraDB Free | 200k/400k, sleeps
Postgres + vectors + audit | Supabase Free | 500MB, sleeps
Object storage | Cloudflare R2 Free | 10GB
Dev environment | Codespaces | 60h/mo free
Releases / large data | GitHub Releases + Git LFS | 2GB LFS free
Issue tracking, PR review | GitHub | (none)
LLM | BYOK from user | Maintainer pays $0

Maintainer cost: $0/mo. End-user cost: their own LLM key (Anthropic $5 free credit, or Ollama local for $0).


11. Definition of "done" for v1

  • All 9 entity types and 5 edge families ingestable and queryable.
  • All 6 reasoning patterns implemented with golden tests.
  • Pipeline end-to-end with composer + verifier.
  • ≥97% factual accuracy and ≥95% citation completeness on the benchmark.
  • Audit log entries reproducible.
  • One-command local bring-up via docker-compose up.
  • Public live demo on GitHub Pages calling Hugging Face Space backend, BYOK pattern.
  • Daily ingestion + regime classifier + keepalive workflows green.
  • README, architecture doc, schema doc, factor catalog, sector maps, ADRs published.
  • Apache-2.0 licensed.