HEALTHCARE DATA PLATFORM: L1 CONTROL ROOM
Can humans + AI trust hospital data today?
PROD | Updated 08:02 AM
SYSTEM STATUS: SHOULD WE PANIC TODAY?
No. All systems operational.
No fake patients
No broken feeds
No critical quality failures
Charts are safe to read. Data is safe to feed agents.
Can we trust the chart?
QC passed? 99.2%
Missing keys? 0.04%
Duplicate visits? 0.00%
Fake patients? none
[Open B2]
Is the data fresh?
Latest ingest 08:01
Data delay 2m
Stale alert ON
SLA met 99.1%
[Open B4]
Is it alive?
Uptime 99.94%
Jobs OK 99.1%
MTTR 38m
Silent fails 0
[Open B4]
Can people use this?
Star schema PASS
Contracts PASS
Query marts YES
[Open B3] [Open B5]
Will compliance yell?
PII scan PASS
Audit lineage ON
HIPAA tagging PASS
[Open B2] [Open B6]
Is it agent-ready?
Marts agent-allowed 4 / 5
PII-safe views ON
Contract coverage 100%
Agent freshness SLA met
[Open B2] [Open B5]
IF SOMETHING TURNS RED
Trust issue → Open B2
Modeling issue → Open B3
Pipeline issue → Open B4
Warehouse issue → Open B5
Agent-readiness issue → Open B5
Architecture → Open B6
traces to:
/api/control-room 200
dashboard_spec.yml
data/quality/l1_checkpoint_report.json
TRUST INVESTIGATION ROOM
Can we trust the patient and visit numbers?
PROD | Updated 08:02 AM
CURRENT STATUS
Mostly healthy: 1 issue needs attention (NOT patient-facing yet)
No fake patients detected
TRUST VITALS (value + inline benchmark)
1. QC passed? 99.2% (good ≥95 | strong ≥99)
2. Missing key fields? 0.04% (good <1 | strong <0.1)
3. Duplicate visits? 0.00% (good <1 | strong =0)
4. Free of fake patients? 100.0% (good ≥99 | strong =100)
5. Can we trace every number? 94% (good ≥90 | strong ≥95)
6. Do systems agree? 99.94% (good ≥99 | strong ≥99.9)
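The inline benchmarks read as two thresholds per vital. A minimal sketch of how a value could be graded against them, using a few of the vitals listed above; the metric keys, the `grade` helper, and the "needs attention" label are illustrative assumptions, not names taken from the platform:

```python
# Each entry: (current value, "good" predicate, "strong" predicate).
# Values and thresholds are copied from the vitals list; key names are hypothetical.
VITALS = {
    "qc_passed_pct": (99.2, lambda v: v >= 95, lambda v: v >= 99),
    "missing_key_pct": (0.04, lambda v: v < 1, lambda v: v < 0.1),
    "duplicate_visit_pct": (0.00, lambda v: v < 1, lambda v: v == 0),
    "traceable_pct": (94.0, lambda v: v >= 90, lambda v: v >= 95),
    "systems_agree_pct": (99.94, lambda v: v >= 99, lambda v: v >= 99.9),
}


def grade(value, is_good, is_strong):
    """Return the best benchmark band the value satisfies."""
    if is_strong(value):
        return "strong"
    if is_good(value):
        return "good"
    return "needs attention"  # assumed label for anything below "good"


grades = {name: grade(*spec) for name, spec in VITALS.items()}
for name, band in grades.items():
    print(f"{name}: {band}")
```

Under these thresholds only the traceability vital (94%) lands in "good" rather than "strong", which matches it being the weakest number in the list.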
EVIDENCE (PROOF): show me receipts (direct clicks to bad files)
visit → patient relationship 97.8% (expected =100)
direct proof clicks:
blast radius
impacts: Patient Count KPI, ER Census, RAG Patient Lookup
scope: clinical marts only
patient-facing dashboards: NOT impacted (yet)
MRN null spike 0.12% (strong <0.10)
direct proof clicks:
blast radius
impacts: reporting only (for now)
risk: if it grows → becomes patient-identity risk
duplicate visit_id 0.00% (required =0)
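The relationship figure above is the share of visit rows whose patient key resolves to a known patient. A minimal sketch of that check, assuming hypothetical `visit_id` / `patient_id` field names and made-up sample rows (none of this is the platform's actual data or code):

```python
def relationship_pass_rate(visits, patient_ids):
    """Percent of visit rows whose patient_id matches a known patient."""
    if not visits:
        return 100.0
    known = set(patient_ids)
    matched = sum(1 for v in visits if v["patient_id"] in known)
    return 100.0 * matched / len(visits)


def orphan_visits(visits, patient_ids):
    """visit_ids that point at a patient that does not exist."""
    known = set(patient_ids)
    return [v["visit_id"] for v in visits if v["patient_id"] not in known]


# Illustrative sample only.
patients = ["P1", "P2", "P3"]
visits = [
    {"visit_id": "V1", "patient_id": "P1"},
    {"visit_id": "V2", "patient_id": "P2"},
    {"visit_id": "V3", "patient_id": "P9"},  # no such patient: an orphan
    {"visit_id": "V4", "patient_id": "P3"},
]

rate = relationship_pass_rate(visits, patients)
orphans = orphan_visits(visits, patients)
print(f"{rate:.1f}% of visits resolve to a patient; orphans: {orphans}")
```

Returning the orphan ids alongside the percentage is what makes "direct proof clicks" possible: the number alone says something is wrong, the ids say where.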
TRIAGE: what do we do + who owns it
Broken visit → patient join
owner: Data Platform (on-call) | ETA: < 1h | action: Page on-call
MRN null spike
owner: Analytics Engineering | ETA: next sprint | action: Create ticket
AUTO-MITIGATIONS: what we already auto-healed so humans don't panic
Auto-remediation coverage 50%
✅ Machines stabilized the symptoms: dashboards stayed up, no bad data shipped.
The other 50% needs a human: the KPI definition call below.
GOOD LUCK, HUMAN: HITL, your turn now
Finance visits: 1,024 vs Billing visits: 1,011 ⚠️
Machine verdict: both valid (definition fight, not a data bug)
What happens if you ignore this:
Finance ships "1,024" to execs
Billing ships "1,011" to Ops
BI publishes both (by accident)
Someone screenshots the mismatch in Slack
Congratulations: you just scheduled a 90-minute "who's lying" meeting
Who fights who:
Finance Lead: "Visits = posted revenue events"
Billing Lead: "Visits = billable encounters"
Data Lead: "Please stop redefining reality in Google Sheets"
Compliance (walks in late): "Which one is in the audit report?"
Blast radius:
Patient Count KPI (exec dashboard)
ER Census (ops)
Downstream RAG "patient lookup" confidence (counts stop matching)
Fix it in 3 moves: each step has a button
1 Pick the winning definition (or publish both, clearly labelled, like civilized people)
2 Write it down as a KPI contract (one paragraph, not a novel)
3 Enforce it in dbt (tests + semantic layer) so this argument can't respawn next week
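The Finance vs Billing mismatch above is small in absolute terms, which is part of why it is easy to ignore. A minimal sketch of the arithmetic behind a reconciliation check; the `recon_gap` helper and the 0.5% tolerance are assumptions for illustration, not values from the platform:

```python
from typing import Tuple

# Hypothetical tolerance: the real recon threshold is not stated in this room.
TOLERANCE_PCT = 0.5


def recon_gap(a: int, b: int) -> Tuple[int, float]:
    """Absolute gap and gap as a percent of the larger count."""
    diff = abs(a - b)
    base = max(a, b)
    pct = 100.0 * diff / base if base else 0.0
    return diff, pct


finance_visits = 1024
billing_visits = 1011

diff, pct = recon_gap(finance_visits, billing_visits)
needs_human = pct > TOLERANCE_PCT
print(f"gap: {diff} visits ({pct:.2f}%), needs a definition call: {needs_human}")
```

A check like this can only say the two numbers disagree; deciding which definition wins stays a human decision, as the three moves above describe.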
traces to:
/api/trust-room 200
trust_metrics_spec.yml
evidence_links.md
B3 DATA MARKETPLACE: Mart Catalog + dbt Lineage
Can humans + AI pick the right dataset without join hell?
dbt + SQL · Gold Layer · BI / AI ready
MART CATALOG: ready-to-query data products
mart_er_triage
ER ops / census / triage
Grain 1 row = 1 ER visit
Consumers BI + AI + Ops
✅ precomputed ✅ prejoined ✅ precleaned ✅ preaggregated
SELECT * FROM mart_er_triage;
mart_patient_summary
patient lookup / repeat visits
Grain 1 row = 1 patient
Consumers BI + AI
✅ precomputed ✅ prejoined ✅ precleaned ✅ preaggregated
SELECT * FROM mart_patient_summary;
mart_claims_summary
billing / recon
Grain 1 row = 1 claim
Consumers Fin + Ops
✅ precomputed ✅ prejoined ✅ precleaned ✅ preaggregated
SELECT * FROM mart_claims_summary;
LINEAGE PREVIEW: where mart_er_triage comes from
▼
▼
▼
consumed by ▼
Executive Dashboard
AI Retrieval
CONTRACT SNAPSHOT: do we agree what the mart means?
visit = completed care encounter PASS
ER census = active ER encounters in reporting window PASS
patient = unique human receiving care PASS
claim = billable / reimbursable event PASS
Main message: precomputed + prejoined + precleaned + preaggregated, so humans can SELECT * FROM mart_ instead of writing join hell.
traces to:
mart_catalog_ascii.md
lineage_ascii.md
sample_queries.sql
B4 PIPELINE OPERATIONS DAG
What runs first? What depends on what? What breaks downstream?
tech: Airflow + Python + dbt + GitHub Actions
1 data/raw/
source healthcare data
▼
2 ingest_raw.py
validate / load raw
▼
3 identity_resolver.py
patient_identity_map.json
4 provider_cleaning.py
provider reference data
both must finish ▼
5 dbt build
bronze → silver → gold
▼
6 dbt tests
not_null / unique
7 schema checks
contracts valid
8 recon checks
finance vs billing
all must pass ▼
9 quality_gate.py
PASS → publish | FAIL → block + alert
▼
10 mart_patient
precomputed mart
11 mart_visit
prejoined mart
12 mart_claims
finance mart
marts published together ▼
13 api_refresh
FastAPI / OpenAPI
portfolio/ = consumers only, not pipeline ▼
14 B1 dashboard
executive cockpit
15 B2 trust view
quality cockpit
16 AI consumers
RAG / agents
DEPENDENCY RULES
1 → 2
2 → 3, 4
3, 4 → 5
5 → 6, 7, 8
6, 7, 8 → 9
9 → 10, 11, 12
10, 11, 12 → 13
13 → 14, 15, 16
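The dependency rules describe a directed acyclic graph over the 16 numbered steps, so a valid run order can be derived mechanically. A minimal sketch using the standard library's `graphlib` (Python 3.9+); the `RULES` encoding and the `predecessors` helper are illustrative, not how the actual Airflow DAG is declared:

```python
from graphlib import TopologicalSorter

# (upstream steps, downstream steps), transcribed from the dependency rules.
RULES = [
    ((1,), (2,)),
    ((2,), (3, 4)),
    ((3, 4), (5,)),
    ((5,), (6, 7, 8)),
    ((6, 7, 8), (9,)),
    ((9,), (10, 11, 12)),
    ((10, 11, 12), (13,)),
    ((13,), (14, 15, 16)),
]


def predecessors(rules):
    """Map each step to the set of steps that must finish before it starts."""
    preds = {}
    for upstream, downstream in rules:
        for step in upstream:
            preds.setdefault(step, set())
        for step in downstream:
            preds.setdefault(step, set()).update(upstream)
    return preds


# static_order() yields every step after all of its predecessors.
order = list(TopologicalSorter(predecessors(RULES)).static_order())
print(order)
```

Steps with no dependency between them (for example 3 and 4) may appear in either order; only the "must finish before" constraints are guaranteed.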
BLAST RADIUS EXAMPLES
If (3) identity_resolver.py fails:
patient_identity_map.json is not produced → dbt build may still run, but trust quality drops → B2 Trust Dashboard turns yellow / red
If (6) dbt tests fail:
quality_gate.py blocks publish → marts do not refresh → API / dashboard serve last-known-good snapshot
If (9) quality_gate.py fails:
mart_patient / mart_visit / mart_claims blocked → B1 Executive Dashboard shows degraded mode → AI consumers do not receive bad data
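The examples above can be generalized: the blast radius of a failed step is every step reachable from it in the DAG. A minimal sketch with a breadth-first walk; the `DOWNSTREAM` map is transcribed from the dependency rules by step number, and `blast_radius` is a hypothetical helper, not part of the pipeline:

```python
from collections import deque

# step -> steps that directly depend on it (from the dependency rules).
DOWNSTREAM = {
    1: [2], 2: [3, 4], 3: [5], 4: [5], 5: [6, 7, 8],
    6: [9], 7: [9], 8: [9], 9: [10, 11, 12],
    10: [13], 11: [13], 12: [13], 13: [14, 15, 16],
}


def blast_radius(failed_step, downstream=DOWNSTREAM):
    """All steps reachable from failed_step, i.e. everything it can affect."""
    seen = set()
    queue = deque([failed_step])
    while queue:
        step = queue.popleft()
        for nxt in downstream.get(step, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


print(blast_radius(9))  # quality gate: marts, API refresh, all consumers
print(blast_radius(6))  # dbt tests: the gate plus everything behind it
```

Reachability only says what can be affected; whether a downstream step is blocked, degraded, or served from a last-known-good snapshot is a policy choice made at each node.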
✅ WHAT B4 PROVES
Freshness: data arrives + refreshes on schedule
Reliability: tasks run in dependency order
Recovery: failed jobs retry, or block publish safely
Blast radius: you know exactly what breaks downstream
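The "block publish safely" behaviour comes down to a gate that publishes only when every upstream check passed. A minimal sketch of that decision, assuming a hypothetical result dictionary shape; the real `quality_gate.py` is not shown in this document and may work differently:

```python
def quality_gate(check_results):
    """Decide whether marts may publish, given {check_name: passed} results."""
    failed = sorted(name for name, ok in check_results.items() if not ok)
    if failed:
        # Block the publish and keep serving the previous snapshot.
        return {"publish": False, "serve": "last-known-good", "alert": failed}
    return {"publish": True, "serve": "fresh", "alert": []}


decision = quality_gate(
    {"dbt_tests": True, "schema_checks": True, "recon_checks": False}
)
print(decision)
```

Failing closed like this is what lets consumers keep reading a stale but trusted snapshot instead of fresh but unverified data.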
traces to:
dag_ascii.md
runbook.md
B5 WAREHOUSE EXPLORER
Do the tables exist, and are they modelled so every join is safe?
dataset: healthcare_dw · BigQuery (dbt core models)
Real tables → modelled as a star schema → integrity enforced by dbt tests → every number traces back to SQL.
WAREHOUSE AT A GLANCE
1 dataset
11 tables
8 gold models
0 views
last refresh 12:51
health Healthy
497 encounters
THE STAR SCHEMA: 7 conformed dimensions → 1 fact
dim_patient
dim_doctor
dim_hospital
dim_diagnosis
dim_insurance
dim_medication
dim_date
7 FK relationships → 1 fact ▼
fact_patient_encounters
1 row = 1 encounter · 7 surrogate FKs + 8 measures
🥉🥈🥇 MEDALLION PATH: where the star comes from
🥉 raw/healthcare_dataset.csv
▼
▼
int_encounters_enriched
int_readmissions
▼
🥇 gold star schema 8 models
INTEGRITY ENFORCED: dbt tests that gate every build
✅ encounter_id → not_null + unique → no duplicate or ghost encounters
✅ 7 FK relationships (patient/doctor/hospital/diagnosis/insurance/medication/date keys) → joins can't silently drop rows
✅ accepted_values on is_emergency / is_readmission [0,1] → clinical flags can't go dirty
These run on dbt test and gate dbt build. Honest scope: row-shape integrity (the cheap tests that stop silent FK drops), not semantic clinical validation.
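The four test types named above each reduce to a simple row-level assertion. A minimal sketch of their intent in plain Python, not dbt's own implementation; `patient_key` and the sample rows are hypothetical, while `encounter_id`, `is_emergency` and `is_readmission` come from the list above:

```python
def not_null(rows, col):
    """Every row has a value in col."""
    return all(r.get(col) is not None for r in rows)


def unique(rows, col):
    """No value in col appears twice."""
    values = [r.get(col) for r in rows]
    return len(values) == len(set(values))


def relationships(rows, fk, parent_keys):
    """Every foreign key points at an existing parent row."""
    parents = set(parent_keys)
    return all(r.get(fk) in parents for r in rows)


def accepted_values(rows, col, allowed):
    """col only ever holds one of the allowed values."""
    allowed = set(allowed)
    return all(r.get(col) in allowed for r in rows)


# Illustrative rows only.
fact_rows = [
    {"encounter_id": 1, "patient_key": 10, "is_emergency": 0, "is_readmission": 1},
    {"encounter_id": 2, "patient_key": 11, "is_emergency": 1, "is_readmission": 0},
]
dim_patient_keys = [10, 11]

results = {
    "encounter_id not_null": not_null(fact_rows, "encounter_id"),
    "encounter_id unique": unique(fact_rows, "encounter_id"),
    "patient_key relationship": relationships(fact_rows, "patient_key", dim_patient_keys),
    "is_emergency accepted_values": accepted_values(fact_rows, "is_emergency", [0, 1]),
    "is_readmission accepted_values": accepted_values(fact_rows, "is_readmission", [0, 1]),
}
print(results)
```

These checks confirm the shape of the data, in line with the honest-scope note: they would not notice a clinically implausible but well-formed row.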
PROOF QUERY: verified against the real dbt model
SELECT medical_condition, COUNT(*) AS encounters
FROM fact_patient_encounters
GROUP BY 1
ORDER BY encounters DESC;
497 encounters (synthetic dataset). The star schema + enforced FKs are the skill: they hold the same at 497 rows or 497M. Row count isn't the flex; the modelling is.
traces to:
/api/warehouse-room 200
dbt-project/models/marts/core/
warehouse_room_payload.json
B6 SYSTEM ARCHITECTURE
How does the whole machine connect? (the 10-second version)
Sources: EHR · claims · providers
▼
dbt → BigQuery: transform + test → trusted marts
▼
API on Cloud Run: serves the room payloads
▼
Humans: B1 / B2 cockpit
Agents: RAG · agent-allowed only
API surface: what Cloud Run serves
/api/control-room /api/trust-room /api/warehouse-room /api/retrieve /api/ask
L2 grounded agent: answers grounded on trusted marts, every claim cites [doc N]
BM25 retrieves top-K from the redacted enriched corpus → Gemini answers only from that evidence. No raw PII indexed.
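The retrieval step can be illustrated with a small, self-contained Okapi BM25 scorer. This is a sketch under stated assumptions: the whitespace tokenizer, the `k1` / `b` defaults, the `bm25_top_k` name, and the three sample snippets are all illustrative, and the Gemini answering step is left out entirely. It is not the platform's `/api/retrieve` implementation.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive lowercase whitespace tokenizer (an assumption for this sketch)."""
    return text.lower().split()


def bm25_top_k(query, docs, k=2, k1=1.5, b=0.75):
    """Rank docs against query with Okapi BM25; return up to k (index, score)."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: how many docs contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for idx, toks in enumerate(tokenized):
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append((idx, score))
    ranked = sorted(scores, key=lambda s: s[1], reverse=True)
    # Drop zero-score docs so the agent never cites irrelevant evidence.
    return [(i, s) for i, s in ranked[:k] if s > 0]


# Hypothetical redacted snippets standing in for the enriched corpus.
docs = [
    "er census counts active er encounters in the reporting window",
    "claims summary lists billable events per claim",
    "patient summary tracks repeat visits per patient",
]

hits = bm25_top_k("er census", docs, k=2)
citations = [f"[doc {i + 1}]" for i, _ in hits]
print(citations)
```

Returning document indices alongside scores is what makes the `[doc N]` citation format possible: the answering model can be handed only the retrieved snippets, each tagged with its number.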
[Ask grounded agent →]
Why Cloud Run: stateless API, scales to zero when idle, one container, deploys from the same repo CI already guards.
traces to:
architecture.mmd
dependency_map.mmd