Development Documentation

System Architecture

The Smart Data Platform is organized into four layers, each hosted in a dedicated Fabric workspace per environment. Data flows from external source systems through Bronze (raw storage), Gold (dbt transformations), Semantic (business-friendly models), and finally into Reports consumed by end users.

Architecture Diagram

graph TD
    subgraph Sources["External Sources"]
        DV["Dataverse / AX"]
        VS["Vesper API"]
        SP["SharePoint"]
        DC["Datacollect"]
        BD["Broker Data"]
    end

    subgraph Azure["Azure Resource Group: rg-fabric-dbt-platform"]
        KV["Key Vault<br/>kv-fabric-dbt-keys"]
        BLOB["Blob Storage<br/>gerisdbtartifacts"]
        FA["Function App<br/>func-fabric-ingest"]
        SWA["Static Web App<br/>Documentation Site"]
        TF["Terraform State<br/>Azure Storage"]
    end

    subgraph Fabric["Microsoft Fabric"]
        subgraph Bronze["Bronze Workspace"]
            SC["Lakehouse Shortcuts"]
        end

        subgraph Gold["Gold Workspace"]
            STG["Staging (views)"]
            INT["Intermediate (views)"]
            MRT["Marts (tables)"]
        end

        subgraph Semantic["Semantic Workspace"]
            SM["Semantic Models"]
        end

        subgraph Reports["Reports Workspace"]
            RPT["Power BI Reports"]
            APP["Workspace App"]
        end
    end

    subgraph DevOps["Azure DevOps: geris-devops"]
        PIPE["CI/CD Pipelines"]
        REPO["Git Repository"]
    end

    DV -->|shortcuts| SC
    SP -->|shortcuts| SC
    VS -->|API| FA
    DC -->|API| FA
    BD -->|API| FA
    FA --> SC

    SC --> STG --> INT --> MRT
    MRT -->|DirectLake| SM --> RPT --> APP
    MRT -->|SQL query| FA

    KV -.->|secrets| PIPE
    KV -.->|secrets| FA
    BLOB -.->|dbt artifacts| PIPE
    TF -.->|state| PIPE
    REPO --> PIPE -->|deploy| Fabric

Interactive Diagram

For a detailed, interactive version of this architecture diagram with clickable components, open the full-screen architecture diagram in a new tab.

Key Components

| Component | Location | Purpose | Management |
|---|---|---|---|
| dbt project | dbt/ | SQL models across staging, intermediate, and mart layers. Dual-dialect: DuckDB locally, T-SQL on Fabric. See Model Inventory. | Git + CI/CD pipelines |
| Terraform | terraform/ | Provisions 4 Fabric workspaces, Gold Warehouse, Bronze Lakehouse, Dataverse shortcuts, Azure Function App, role assignments. | infra-deploy.yml pipeline |
| fabric-cicd | deployment/ | Deploys semantic models and reports from git to Fabric workspaces. Uses parameter files for environment-specific GUIDs and connection strings. | fabric-deploy.yml pipeline |
| Security layer | security/ | SQL scripts for warehouse roles, grants, column-level restrictions, and row-level security (RLS). | security-deploy.yml pipeline |
| Azure Functions | functions/ | API ingestion (Datacollect, broker data), observability (pipeline metrics, CU monitoring on 15-min timer), and config-driven exports (SQL to Excel to email). | functions-deploy.yml pipeline |
| Scripts | scripts/ | 50+ utility scripts for deployment, validation, provisioning, data export, semantic model management, and monitoring. | Manual or pipeline-invoked |

Pipeline Orchestration

After a git push to main, pipelines execute in a defined sequence: infra-deploy (Terraform) runs first; fabric-deploy, security-deploy, and functions-deploy then run in parallel; dbt-dev-build runs last. All pipelines set lockBehavior: sequential. See Pipeline Architecture for the full dependency chain, trigger model, and Key Vault integration.
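
The ordering described above can be modeled as a small dependency graph. The sketch below is illustrative only and is not part of the repository. It assumes that dbt-dev-build waits for all three parallel pipelines, which the text implies but does not state outright. The names `PIPELINE_DEPENDENCIES` and `execution_waves` are hypothetical.

```python
from graphlib import TopologicalSorter

# Each pipeline maps to the set of pipelines that must finish before it starts.
# Assumption: dbt-dev-build depends on all three parallel pipelines.
PIPELINE_DEPENDENCIES = {
    "infra-deploy": set(),
    "fabric-deploy": {"infra-deploy"},
    "security-deploy": {"infra-deploy"},
    "functions-deploy": {"infra-deploy"},
    "dbt-dev-build": {"fabric-deploy", "security-deploy", "functions-deploy"},
}


def execution_waves(dependencies):
    """Group pipelines into waves; pipelines in the same wave may run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # get_ready() returns every pipeline whose prerequisites are all done.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(PIPELINE_DEPENDENCIES), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

Running the script prints three waves: infra-deploy alone, then the three deploy pipelines together, then dbt-dev-build. The real trigger model lives in the pipeline YAML; see Pipeline Architecture for the authoritative chain.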

External Dependencies

| Dependency | Purpose | Notes |
|---|---|---|
| Azure Resource Group | rg-fabric-dbt-platform: contains all Azure resources | Function App, Key Vault, Storage Account, Static Web App, App Insights |
| Microsoft Fabric | Lakehouse (Bronze), Warehouse (Gold), Semantic Models, Power BI Reports, OneLake storage | 4 workspaces per env, all provisioned via Terraform + Fabric REST API |
| Azure DevOps | Git repo, CI/CD pipelines, service connections | Org: geris-devops, project: insights-requests |
| Azure Key Vault | kv-fabric-dbt-keys: SPN credentials, connection strings, deployment tokens | Single vault for all environments |
| Azure Blob Storage | gerisdbtartifacts: dbt manifest.json, compiled artifacts | One storage account, per-env blob prefix |
| Azure Function App | func-fabric-ingest-ENV: API ingestion, exports, observability | Linux Consumption plan, Python 3.11, System-assigned MI |
| Static Web App | swa-geris-docs-ENV: documentation site | Free tier, Entra ID auth (pending IT) |
| Source Systems | Dataverse/AX, Vesper, SharePoint, Datacollect, Broker Data | See Data Sources for full inventory |
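
The table mixes shared resources (one Key Vault, one storage account) with per-environment ones (the `-ENV` suffix, the per-env blob prefix). A minimal sketch of that convention follows. The environment names `dev`, `test`, and `prod`, the lowercase suffix, and the `<env>/` prefix layout are all assumptions, and `resources_for` is a hypothetical helper rather than a function from the repository.

```python
from dataclasses import dataclass

# Shared across all environments, per the table above.
KEY_VAULT = "kv-fabric-dbt-keys"
STORAGE_ACCOUNT = "gerisdbtartifacts"

# Assumption: the actual environment names are not stated in this page.
KNOWN_ENVIRONMENTS = ("dev", "test", "prod")


@dataclass(frozen=True)
class EnvironmentResources:
    environment: str
    function_app: str
    static_web_app: str
    key_vault: str
    storage_account: str
    artifact_prefix: str


def resources_for(environment: str) -> EnvironmentResources:
    """Derive the Azure resource names for one environment."""
    if environment not in KNOWN_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    return EnvironmentResources(
        environment=environment,
        function_app=f"func-fabric-ingest-{environment}",
        static_web_app=f"swa-geris-docs-{environment}",
        key_vault=KEY_VAULT,  # single vault for all environments
        storage_account=STORAGE_ACCOUNT,  # one account, separated by prefix
        artifact_prefix=f"{environment}/",  # assumption: prefix layout
    )


if __name__ == "__main__":
    print(resources_for("dev"))
```

Keeping the shared names as module constants makes the single-vault, single-account decision visible in one place, while everything environment-specific is derived from the environment name.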

What Does NOT Exist Here

Understanding what the platform is NOT helps set correct expectations:

  • No frontend/UI application -- end users consume data through Power BI reports and the Fabric Workspace App. There is no custom web application.
  • No real-time/streaming data -- all data is batch-processed via dbt builds (nightly full build + slim CI on PRs). Dataverse shortcuts provide near real-time Bronze data, but Gold is batch-refreshed.
  • No direct database writes from users -- Gold Warehouse is read-only for consumers. Only dbt and security scripts write to it.
  • No PySpark notebooks -- the legacy datalake/ notebooks are reference-only. All new transforms are dbt SQL.

For additional "what we don't use" decisions (no per-env Key Vaults, no variable groups, no Bicep, etc.), see Technology Stack — What We Deliberately Do NOT Use.
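
The batch model mentioned above (nightly full build plus slim CI on PRs) is commonly implemented with dbt's state-based selection, which compares the working tree against a previously saved manifest.json. The sketch below builds the two command lines under that assumption. Whether this platform uses exactly these flags is not confirmed here, and `dbt_build_command` and the `state_dir` default are hypothetical.

```python
import shlex


def dbt_build_command(mode: str, state_dir: str = "artifacts/state") -> list[str]:
    """Return the dbt CLI arguments for a full or slim build.

    state_dir is a hypothetical local folder holding a previously saved
    manifest.json (for example, one downloaded from Blob Storage).
    """
    command = ["dbt", "build"]
    if mode == "slim":
        # Build only modified models and their downstream dependents,
        # deferring unchanged upstream references to the saved state.
        command += ["--select", "state:modified+", "--defer", "--state", state_dir]
    elif mode != "full":
        raise ValueError(f"mode must be 'full' or 'slim', got {mode!r}")
    return command


if __name__ == "__main__":
    print(shlex.join(dbt_build_command("full")))
    print(shlex.join(dbt_build_command("slim")))
```

The full build rebuilds every model, which suits a nightly schedule. The slim variant keeps PR feedback fast by touching only what changed.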

Explore Further

  • Data Flow Pipeline -- detailed source-to-report data flow with ingestion methods
  • Workspace Layout -- the 4-workspace model, management methods, and environment isolation
  • Technology Stack -- tools, versions, key decisions, and what we deliberately avoid
  • FAQ -- common questions about architecture decisions