System Architecture
The Smart Data Platform is organized into four layers, each hosted in a dedicated Fabric workspace per environment. Data flows from external source systems through Bronze (raw storage), Gold (dbt transformations), Semantic (business-friendly models), and finally into Reports consumed by end users.
Architecture Diagram
graph TD
subgraph Sources["External Sources"]
DV["Dataverse / AX"]
VS["Vesper API"]
SP["SharePoint"]
DC["Datacollect"]
BD["Broker Data"]
end
subgraph Azure["Azure Resource Group: rg-fabric-dbt-platform"]
KV["Key Vault<br/>kv-fabric-dbt-keys"]
BLOB["Blob Storage<br/>gerisdbtartifacts"]
FA["Function App<br/>func-fabric-ingest"]
SWA["Static Web App<br/>Documentation Site"]
TF["Terraform State<br/>Azure Storage"]
end
subgraph Fabric["Microsoft Fabric"]
subgraph Bronze["Bronze Workspace"]
SC["Lakehouse Shortcuts"]
end
subgraph Gold["Gold Workspace"]
STG["Staging (views)"]
INT["Intermediate (views)"]
MRT["Marts (tables)"]
end
subgraph Semantic["Semantic Workspace"]
SM["Semantic Models"]
end
subgraph Reports["Reports Workspace"]
RPT["Power BI Reports"]
APP["Workspace App"]
end
end
subgraph DevOps["Azure DevOps: geris-devops"]
PIPE["CI/CD Pipelines"]
REPO["Git Repository"]
end
DV -->|shortcuts| SC
SP -->|shortcuts| SC
VS -->|API| FA
DC -->|API| FA
BD -->|API| FA
FA --> SC
SC --> STG --> INT --> MRT
MRT -->|DirectLake| SM --> RPT --> APP
MRT -->|SQL query| FA
KV -.->|secrets| PIPE
KV -.->|secrets| FA
BLOB -.->|dbt artifacts| PIPE
TF -.->|state| PIPE
REPO --> PIPE -->|deploy| Fabric
Interactive Diagram
For an interactive version of this diagram with clickable components, open the full-screen architecture diagram in a new tab.
Key Components
| Component | Location | Purpose | Management |
|---|---|---|---|
| dbt project | dbt/ | SQL models across staging, intermediate, and mart layers. Dual-dialect: DuckDB locally, T-SQL on Fabric. See Model Inventory. | Git + CI/CD pipelines |
| Terraform | terraform/ | Provisions 4 Fabric workspaces, Gold Warehouse, Bronze Lakehouse, Dataverse shortcuts, Azure Function App, role assignments. | infra-deploy.yml pipeline |
| fabric-cicd | deployment/ | Deploys semantic models and reports from git to Fabric workspaces. Uses parameter files for environment-specific GUIDs and connection strings. | fabric-deploy.yml pipeline |
| Security layer | security/ | SQL scripts for warehouse roles, grants, column-level restrictions, and row-level security (RLS). | security-deploy.yml pipeline |
| Azure Functions | functions/ | API ingestion (Datacollect, broker data), observability (pipeline metrics, CU monitoring on 15-min timer), and config-driven exports (SQL to Excel to email). | functions-deploy.yml pipeline |
| Scripts | scripts/ | 50+ utility scripts for deployment, validation, provisioning, data export, semantic model management, and monitoring. | Manual or pipeline-invoked |
Pipeline Orchestration
After a push to main, pipelines run in a defined sequence: infra-deploy (Terraform) runs first; fabric-deploy, security-deploy, and functions-deploy then run in parallel; dbt-dev-build runs last. All pipelines set lockBehavior: sequential, so queued runs are processed one at a time, in order, rather than being skipped in favor of the latest run. See Pipeline Architecture for the full dependency chain, trigger model, and Key Vault integration.
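The ordering described above can be modeled as a small dependency graph. The following is a minimal sketch, not the actual pipeline definition: the pipeline names come from this page, while the `PIPELINE_DEPENDENCIES` dictionary and the `execution_waves` helper are hypothetical, and it assumes dbt-dev-build waits for all three parallel pipelines.

```python
from graphlib import TopologicalSorter

# Each pipeline maps to the set of pipelines that must finish before it starts.
# Assumption: dbt-dev-build waits for all three parallel deploy pipelines.
PIPELINE_DEPENDENCIES = {
    "infra-deploy": set(),
    "fabric-deploy": {"infra-deploy"},
    "security-deploy": {"infra-deploy"},
    "functions-deploy": {"infra-deploy"},
    "dbt-dev-build": {"fabric-deploy", "security-deploy", "functions-deploy"},
}


def execution_waves(dependencies):
    """Group pipelines into waves; pipelines within a wave may run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # get_ready() returns every pipeline whose prerequisites are all done.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(PIPELINE_DEPENDENCIES), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

Under these assumptions the sketch yields three waves: infra-deploy alone, then the three deploy pipelines together, then dbt-dev-build. The authoritative dependency chain is the one documented in Pipeline Architecture.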
External Dependencies
| Dependency | Purpose | Notes |
|---|---|---|
| Azure Resource Group | rg-fabric-dbt-platform — contains all Azure resources | Function App, Key Vault, Storage Account, Static Web App, App Insights |
| Microsoft Fabric | Lakehouse (Bronze), Warehouse (Gold), Semantic Models, Power BI Reports, OneLake storage | 4 workspaces per env, all provisioned via Terraform + Fabric REST API |
| Azure DevOps | Git repo, CI/CD pipelines, service connections | Org: geris-devops, project: insights-requests |
| Azure Key Vault | kv-fabric-dbt-keys — SPN credentials, connection strings, deployment tokens | Single vault for all environments |
| Azure Blob Storage | gerisdbtartifacts — dbt manifest.json, compiled artifacts | One storage account, per-env blob prefix |
| Azure Function App | func-fabric-ingest-ENV — API ingestion, exports, observability | Linux Consumption plan, Python 3.11, System-assigned MI |
| Static Web App | swa-geris-docs-ENV — documentation site | Free tier, Entra ID auth (pending IT) |
| Source Systems | Dataverse/AX, Vesper, SharePoint, Datacollect, Broker Data | See Data Sources for full inventory |
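The table mixes shared resources (one Key Vault, one storage account) with per-environment ones (names ending in ENV). A minimal sketch of that split follows. The `resources_for` helper and the `EnvironmentResources` type are hypothetical, the environment names used are assumptions, and the exact blob prefix layout is not specified on this page, so `"<env>/"` is only a placeholder.

```python
from dataclasses import dataclass

# Shared across all environments, per the table above.
RESOURCE_GROUP = "rg-fabric-dbt-platform"
KEY_VAULT = "kv-fabric-dbt-keys"
STORAGE_ACCOUNT = "gerisdbtartifacts"


@dataclass(frozen=True)
class EnvironmentResources:
    env: str
    function_app: str
    static_web_app: str
    key_vault: str
    storage_account: str
    artifact_prefix: str


def resources_for(env: str) -> EnvironmentResources:
    """Derive resource names for one environment (hypothetical helper)."""
    env = env.strip().lower()
    if not env.isalnum():
        raise ValueError(f"invalid environment name: {env!r}")
    return EnvironmentResources(
        env=env,
        function_app=f"func-fabric-ingest-{env}",
        static_web_app=f"swa-geris-docs-{env}",
        key_vault=KEY_VAULT,              # single vault for all environments
        storage_account=STORAGE_ACCOUNT,  # one account, per-env blob prefix
        artifact_prefix=f"{env}/",        # assumption: real prefix layout may differ
    )


if __name__ == "__main__":
    print(resources_for("dev"))
```

Only the Function App and Static Web App names vary by environment here; the vault and storage account stay constant, which matches the "single vault" and "one storage account" notes in the table.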
What Does NOT Exist Here
Understanding what the platform is NOT helps set correct expectations:
- No frontend/UI application -- end users consume data through Power BI reports and the Fabric Workspace App. There is no custom web application.
- No real-time/streaming data -- all data is batch-processed via dbt builds (a nightly full build, plus slim CI builds on pull requests). Dataverse shortcuts provide near real-time Bronze data, but Gold is batch-refreshed.
- No direct database writes from users -- Gold Warehouse is read-only for consumers. Only dbt and security scripts write to it.
- No PySpark notebooks -- the legacy datalake/notebooks are reference-only. All new transforms are dbt SQL.
For additional "what we don't use" decisions (no per-env Key Vaults, no variable groups, no Bicep, etc.), see Technology Stack — What We Deliberately Do NOT Use.
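To make the batch model mentioned in the list above more concrete, the sketch below shows how a nightly full build and a slim CI build might differ at the dbt command line. It assumes the slim build compares against a previously stored manifest.json using dbt's `state:modified+` selector with `--defer` and `--state`. The exact flags the pipelines use are not shown on this page, and the `dbt_build_command` helper and the `./prod-artifacts` path are hypothetical.

```python
from typing import List, Optional


def dbt_build_command(mode: str, state_dir: Optional[str] = None) -> List[str]:
    """Return the dbt CLI arguments for a build mode (hypothetical helper)."""
    if mode == "full":
        # Nightly build: run and test every model in the project.
        return ["dbt", "build"]
    if mode == "slim":
        # Slim CI: build only models changed relative to a stored manifest,
        # plus their downstream dependents; defer unchanged parents.
        if not state_dir:
            raise ValueError("slim mode needs a directory containing manifest.json")
        return [
            "dbt", "build",
            "--select", "state:modified+",
            "--defer",
            "--state", state_dir,
        ]
    raise ValueError(f"unknown build mode: {mode!r}")


if __name__ == "__main__":
    print(" ".join(dbt_build_command("full")))
    # Assumption: the manifest was downloaded to ./prod-artifacts beforehand.
    print(" ".join(dbt_build_command("slim", "./prod-artifacts")))
```

In this sketch the state directory would hold the manifest.json kept in Blob Storage; how the pipelines actually fetch it is described in the pipeline documentation rather than here.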
Explore Further
- Data Flow Pipeline -- detailed source-to-report data flow with ingestion methods
- Workspace Layout -- the 4-workspace model, management methods, and environment isolation
- Technology Stack -- tools, versions, key decisions, and what we deliberately avoid
- FAQ -- common questions about architecture decisions