System Architecture

The Smart Data Platform is organized into four layers, each hosted in a dedicated Fabric workspace per environment. Data flows from external source systems through Bronze (raw storage), Gold (dbt transformations), Semantic (business-friendly models), and finally into Reports consumed by end users.

Architecture Diagram

graph TD
    subgraph Sources["External Sources"]
        DV["Dataverse / AX"]
        VS["Vesper API"]
        SP["SharePoint"]
        DC["Datacollect"]
        BD["Broker Data"]
    end

    subgraph Azure["Azure Resource Group: rg-fabric-dbt-platform"]
        KV["Key Vault<br/>kv-fabric-dbt-keys"]
        BLOB["Blob Storage<br/>gerisdbtartifacts"]
        FA["Function App<br/>func-fabric-ingest"]
        SWA["Static Web App<br/>Documentation Site"]
        TF["Terraform State<br/>Azure Storage"]
    end

    subgraph Fabric["Microsoft Fabric"]
        subgraph Bronze["Bronze Workspace"]
            SC["Lakehouse Shortcuts"]
        end

        subgraph Gold["Gold Workspace"]
            STG["Staging (views)"]
            INT["Intermediate (views)"]
            MRT["Marts (tables)"]
        end

        subgraph Semantic["Semantic Workspace"]
            SM["Semantic Models"]
        end

        subgraph Reports["Reports Workspace"]
            RPT["Power BI Reports"]
            APP["Workspace App"]
        end
    end

    subgraph DevOps["Azure DevOps: geris-devops"]
        PIPE["CI/CD Pipelines"]
        REPO["Git Repository"]
    end

    DV -->|shortcuts| SC
    SP -->|shortcuts| SC
    VS -->|API| FA
    DC -->|API| FA
    BD -->|API| FA
    FA --> SC

    SC --> STG --> INT --> MRT
    MRT -->|DirectLake| SM --> RPT --> APP
    MRT -->|SQL query| FA

    KV -.->|secrets| PIPE
    KV -.->|secrets| FA
    BLOB -.->|dbt artifacts| PIPE
    TF -.->|state| PIPE
    REPO --> PIPE -->|deploy| Fabric
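
The data-flow part of the diagram can also be read as a small directed graph. The sketch below transcribes only the solid edges that move data towards the reports and traces a path from a source to the Workspace App. The node IDs are the ones used in the diagram. The `DATA_FLOW` dictionary and the `trace` helper are illustrative names, not part of the platform code.

```python
from collections import deque

# Solid data-flow edges transcribed from the diagram above.
# Dotted edges (secrets, artifacts, state), the deploy edge, and the
# export edge MRT -> FA are left out so the graph stays acyclic.
DATA_FLOW = {
    "DV": ["SC"],
    "SP": ["SC"],
    "VS": ["FA"],
    "DC": ["FA"],
    "BD": ["FA"],
    "FA": ["SC"],
    "SC": ["STG"],
    "STG": ["INT"],
    "INT": ["MRT"],
    "MRT": ["SM"],
    "SM": ["RPT"],
    "RPT": ["APP"],
    "APP": [],
}


def trace(source, target="APP"):
    """Return the shortest node path from source to target, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in DATA_FLOW.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    print(" -> ".join(trace("VS")))
```

Tracing `VS` shows that API-based sources reach Bronze through the Function App, while `DV` and `SP` land in the Lakehouse shortcuts directly.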

Interactive Diagram

For an interactive version of this diagram with clickable components, open the full-screen architecture diagram in a new tab.

Key Components

Component	Location	Purpose	Management
dbt project	`dbt/`	SQL models across staging, intermediate, and mart layers. Dual-dialect: DuckDB locally, T-SQL on Fabric. See Model Inventory.	Git + CI/CD pipelines
Terraform	`terraform/`	Provisions 4 Fabric workspaces, Gold Warehouse, Bronze Lakehouse, Dataverse shortcuts, Azure Function App, role assignments.	`infra-deploy.yml` pipeline
fabric-cicd	`deployment/`	Deploys semantic models and reports from git to Fabric workspaces. Uses parameter files for environment-specific GUIDs and connection strings.	`fabric-deploy.yml` pipeline
Security layer	`security/`	SQL scripts for warehouse roles, grants, column-level restrictions, and row-level security (RLS).	`security-deploy.yml` pipeline
Azure Functions	`functions/`	API ingestion (Datacollect, broker data), observability (pipeline metrics, CU monitoring on 15-min timer), and config-driven exports (SQL to Excel to email).	`functions-deploy.yml` pipeline
Scripts	`scripts/`	50+ utility scripts for deployment, validation, provisioning, data export, semantic model management, and monitoring.	Manual or pipeline-invoked
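
The fabric-cicd row mentions parameter files that carry environment-specific GUIDs and connection strings. The sketch below illustrates only the general idea: swapping a committed value for a per-environment one at deploy time. It is not fabric-cicd's actual parameter format or API. The `REPLACEMENTS` mapping, the `apply_parameters` helper, the placeholder GUIDs, and the `dev` and `prod` environment names are all assumptions made for illustration.

```python
# Hypothetical mapping: value as committed in git -> per-environment value.
# The GUIDs below are placeholders, not real workspace or item IDs.
REPLACEMENTS = {
    "00000000-0000-0000-0000-000000000001": {
        "dev": "11111111-1111-1111-1111-111111111111",
        "prod": "22222222-2222-2222-2222-222222222222",
    },
}


def apply_parameters(definition: str, env: str) -> str:
    """Replace each committed value with the one registered for env."""
    for committed, per_env in REPLACEMENTS.items():
        if env not in per_env:
            # Fail loudly rather than deploy with another environment's ID.
            raise KeyError(f"no replacement for {committed} in {env!r}")
        definition = definition.replace(committed, per_env[env])
    return definition


if __name__ == "__main__":
    committed = '{"warehouseId": "00000000-0000-0000-0000-000000000001"}'
    print(apply_parameters(committed, "prod"))
```

Raising on a missing environment is a deliberate choice in this sketch: a silent fallback could point a report at the wrong workspace.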

Pipeline Orchestration

After a git push to main, the pipelines run in a defined sequence. infra-deploy (Terraform) runs first. fabric-deploy, security-deploy, and functions-deploy then run in parallel. dbt-dev-build runs last. All pipelines set lockBehavior: sequential. See Pipeline Architecture for the full dependency chain, trigger model, and Key Vault integration.
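
The ordering described above can be sketched as a dependency graph and resolved into stages. This is a minimal illustration using Python's standard-library `graphlib`, not the pipeline definitions themselves. It assumes that dbt-dev-build waits for all three parallel pipelines, which is one reading of "followed by". The `DEPENDS_ON` mapping and the `execution_stages` helper are hypothetical names.

```python
from graphlib import TopologicalSorter

# Each pipeline maps to the set of pipelines that must finish before it starts.
DEPENDS_ON = {
    "infra-deploy": set(),
    "fabric-deploy": {"infra-deploy"},
    "security-deploy": {"infra-deploy"},
    "functions-deploy": {"infra-deploy"},
    # Assumption: the dbt build waits for all three parallel pipelines.
    "dbt-dev-build": {"fabric-deploy", "security-deploy", "functions-deploy"},
}


def execution_stages(graph):
    """Group pipelines into stages; members of one stage can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


if __name__ == "__main__":
    for number, stage in enumerate(execution_stages(DEPENDS_ON), start=1):
        print(number, ", ".join(stage))
```

The three stages correspond to the Terraform run, the parallel deploys, and the dbt build.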

External Dependencies

Dependency	Purpose	Notes
Azure Resource Group	`rg-fabric-dbt-platform` — contains all Azure resources	Function App, Key Vault, Storage Account, Static Web App, App Insights
Microsoft Fabric	Lakehouse (Bronze), Warehouse (Gold), Semantic Models, Power BI Reports, OneLake storage	4 workspaces per env, all provisioned via Terraform + Fabric REST API
Azure DevOps	Git repo, CI/CD pipelines, service connections	Org: `geris-devops`, project: `insights-requests`
Azure Key Vault	`kv-fabric-dbt-keys` — SPN credentials, connection strings, deployment tokens	Single vault for all environments
Azure Blob Storage	`gerisdbtartifacts` — dbt manifest.json, compiled artifacts	One storage account, per-env blob prefix
Azure Function App	`func-fabric-ingest-ENV` — API ingestion, exports, observability	Linux Consumption plan, Python 3.11, System-assigned MI
Static Web App	`swa-geris-docs-ENV` — documentation site	Free tier, Entra ID auth (pending IT)
Source Systems	Dataverse/AX, Vesper, SharePoint, Datacollect, Broker Data	See Data Sources for full inventory
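
The table implies a simple naming scheme: the Function App and Static Web App carry an environment suffix, while the Key Vault and the storage account are shared. The sketch below derives the names for one environment. It assumes that `ENV` is replaced by a short lowercase name such as `dev`, and that the per-environment blob prefix is simply `<env>/`. Neither detail is stated in the table. `EnvResources` and `resources_for` are hypothetical names.

```python
from dataclasses import dataclass

# Shared across all environments, per the table above.
KEY_VAULT = "kv-fabric-dbt-keys"
STORAGE_ACCOUNT = "gerisdbtartifacts"


@dataclass(frozen=True)
class EnvResources:
    function_app: str
    static_web_app: str
    key_vault: str
    storage_account: str
    blob_prefix: str


def resources_for(env: str) -> EnvResources:
    """Derive the Azure resource names for one environment, e.g. 'dev'."""
    env = env.strip().lower()
    if not env.isalnum():
        raise ValueError(f"invalid environment name: {env!r}")
    return EnvResources(
        function_app=f"func-fabric-ingest-{env}",
        static_web_app=f"swa-geris-docs-{env}",
        key_vault=KEY_VAULT,
        storage_account=STORAGE_ACCOUNT,
        blob_prefix=f"{env}/",  # assumed prefix layout, not confirmed by the table
    )


if __name__ == "__main__":
    print(resources_for("dev"))
```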

What Does NOT Exist Here

Understanding what the platform is NOT helps set correct expectations:

No frontend/UI application -- end users consume data through Power BI reports and the Fabric Workspace App. There is no custom web application.
No real-time/streaming data -- all data is batch-processed via dbt builds (a nightly full build, plus slim CI builds on PRs). Dataverse shortcuts expose near-real-time data in Bronze, but Gold is refreshed only in batches.
No direct database writes from users -- Gold Warehouse is read-only for consumers. Only dbt and security scripts write to it.
No PySpark notebooks -- the legacy datalake/ notebooks are reference-only. All new transforms are dbt SQL.

For additional "what we don't use" decisions (no per-env Key Vaults, no variable groups, no Bicep, etc.), see Technology Stack — What We Deliberately Do NOT Use.
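
The two batch modes mentioned above, the nightly full build and slim CI on PRs, differ mainly in which models dbt selects. The sketch below builds the command line for each mode using dbt's state-based selection, a common pattern for slim CI. The exact flags used by this platform's pipelines are not stated here, so treat them as an assumption. The `dbt_build_command` helper and the `state_dir` default are hypothetical; `state_dir` stands for a local folder holding a previously published `manifest.json`.

```python
import shlex


def dbt_build_command(slim: bool, state_dir: str = "prod-artifacts") -> list[str]:
    """Return the argv for a full build or a slim CI build."""
    cmd = ["dbt", "build"]
    if slim:
        # state:modified+ selects changed models plus everything downstream.
        # --defer resolves unchanged upstream refs against the state manifest.
        cmd += ["--select", "state:modified+", "--defer", "--state", state_dir]
    return cmd


if __name__ == "__main__":
    print(shlex.join(dbt_build_command(slim=False)))
    print(shlex.join(dbt_build_command(slim=True)))
```

Returning a list rather than a string keeps the command safe to pass to `subprocess.run` without shell quoting.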

Explore Further

Data Flow Pipeline -- detailed source-to-report data flow with ingestion methods
Workspace Layout -- the 4-workspace model, management methods, and environment isolation
Technology Stack -- tools, versions, key decisions, and what we deliberately avoid
FAQ -- common questions about architecture decisions