ADR Decision Log

This page records all significant architectural and technical decisions made during the platform build. Each ADR (Architecture Decision Record) captures not just what was decided, but why — including the alternatives that were considered and rejected.

Status legend:

Active — decision is in effect and guiding implementation
GO-LIVE BLOCKER — must be resolved before production cutover (Epic 11)
Blocked — decision is accepted but implementation is blocked by an external dependency
Superseded — replaced by a newer decision

ADR-1: 4-Workspace Layout (Gold/Bronze/Semantic/Reports)

Date: 2026-03-25 | Status: Active

Decision: Each environment has 4 Fabric workspaces: Gold (Terraform-only, no git), Bronze (git-sync, Lakehouse), Semantic (git-sync, TMDL), Reports (git-sync or fabric-cicd).

Reasoning: Terraform-created items (such as the Warehouse) and git-synced items cannot coexist in the same workspace, for three reasons: service principals (SPNs) cannot call CommitToGit; PreferRemote on a non-empty workspace removes everything; and .platform logicalIds are workspace-specific. Separating workspaces by lifecycle avoids all of these conflicts.

Alternatives considered:

Single workspace — rejected because Terraform + git sync conflict causes data loss
2-workspace split — rejected because semantic models and reports need separate deployment lifecycles (models deploy before reports, reports reference model GUIDs)

Consequences: More workspaces to manage (16 total across 4 environments), but clean separation of concerns. Gold Warehouse is never git-synced. Each workspace has a clear owner (Terraform vs git).
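
The layout can be sketched as a small lookup that pairs each workspace layer with its deployment mechanism. This is a minimal illustration, not the project's Terraform or pipeline code: the environment names and the "env-layer" naming pattern are placeholders assumed for the example.

```python
# Deployment mechanism per workspace layer, as described in ADR-1.
LAYER_OWNER = {
    "gold": "terraform",                   # Warehouse; never git-synced
    "bronze": "git-sync",                  # Lakehouse
    "semantic": "git-sync",                # TMDL semantic models
    "reports": "git-sync or fabric-cicd",
}


def workspace_plan(environments):
    """Return one (workspace_name, owner) pair per environment and layer."""
    return [
        (f"{env}-{layer}", owner)  # naming pattern is hypothetical
        for env in environments
        for layer, owner in LAYER_OWNER.items()
    ]


# Placeholder environment names, assumed for illustration only.
plan = workspace_plan(["env1", "env2", "env3", "env4"])
print(len(plan))
```

Keeping the owner next to the layer makes it easy to check that no workspace is claimed by both Terraform and git sync.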

ADR-2: CLI Authentication for dbt-fabric (Not SPN)

Date: 2026-03-20 | Status: Active

Decision: All Fabric targets use authentication: CLI in dbt profiles instead of ServicePrincipal.

Reasoning: ODBC Driver 18's ActiveDirectoryServicePrincipal auth times out on Azure DevOps Ubuntu hosted agents due to a libmsal issue. az account get-access-token works reliably from the same agents with the same SPN credentials.

Alternatives considered:

ServicePrincipal auth — rejected due to unreliable timeouts in CI (30-60s hangs followed by connection failure)
Managed Identity — not supported by dbt-fabric adapter

Consequences: All dbt build steps must run inside AzureCLI@2 tasks (not bare script: tasks). Local development uses az login before running dbt commands. This adds a dependency on the Azure CLI being available in the execution environment.
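
Because every dbt run now depends on a logged-in Azure CLI, a preflight check can fail fast before dbt starts. The sketch below is one possible shape for such a check, not part of the project. The resource URI is an assumption and should match whatever audience the warehouse endpoint expects.

```python
import json
import shutil
import subprocess

# Assumed token audience for the SQL endpoint; adjust if yours differs.
SQL_RESOURCE = "https://database.windows.net/"


def token_command(az="az", resource=SQL_RESOURCE):
    """Build the az invocation that prints an access token as JSON."""
    return [az, "account", "get-access-token",
            "--resource", resource, "--output", "json"]


def parse_token(az_output):
    """Extract the bearer token from the JSON printed by the Azure CLI."""
    return json.loads(az_output)["accessToken"]


def preflight():
    """Raise if the Azure CLI is missing or has no valid login."""
    az_path = shutil.which("az")
    if az_path is None:
        raise RuntimeError("Azure CLI not found on PATH")
    result = subprocess.run(token_command(az_path),
                            capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError("az login required: " + result.stderr.strip())
    return parse_token(result.stdout)
```

Calling preflight() at the start of a local session or pipeline step surfaces a missing az login as a clear error instead of a dbt connection failure.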

ADR-3: DuckDB for Local Development

Date: 2026-03-18 | Status: Active

Decision: Local dbt development uses DuckDB (--target local) with Parquet seed files in dev-data/.

Reasoning: Fast iteration without requiring a Fabric connection. The full test suite runs in seconds locally versus minutes against Fabric, and developers can work offline (for example, on a plane).

Alternatives considered:

Always-online against DEV Fabric — rejected because it's slow (minutes per build), blocks on network issues, and costs CU capacity for every developer iteration

Consequences: Must maintain dual-dialect SQL (DuckDB + T-SQL). Source YAML must exclude local and duckdb targets. The load_parquet_sources() macro bootstraps DuckDB from Parquet files on run-start. Some Fabric-specific SQL behaviors (case sensitivity, varchar defaults) are invisible locally — the CI pipeline catches these.
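
The bootstrapping idea behind load_parquet_sources() can be illustrated in a few lines. The macro itself is not shown on this page, so the sketch below only assumes that each Parquet file in dev-data/ is exposed as a DuckDB view named after the file. The table name in the usage note is hypothetical.

```python
from pathlib import Path


def parquet_view_statements(paths):
    """Return one DuckDB CREATE VIEW statement per Parquet file path."""
    statements = []
    for path in sorted(Path(p) for p in paths):
        view_name = path.stem  # file name without extension becomes the view name
        statements.append(
            f'CREATE OR REPLACE VIEW "{view_name}" AS '
            f"SELECT * FROM read_parquet('{path.as_posix()}')"
        )
    return statements


# Typical call: every Parquet file under dev-data/.
for sql in parquet_view_statements(Path("dev-data").glob("*.parquet")):
    print(sql)
```

For a hypothetical file dev-data/account.parquet, this yields a view called account that dbt models can select from as if it were a source table.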

ADR-4: Keep Dataverse Shortcuts (No Direct Extraction)

Date: 2026-03-22 | Status: Active

Decision: Bronze layer uses Fabric Lakehouse shortcuts to Dataverse tables, not custom extraction pipelines.

Reasoning: Zero-maintenance data ingestion — Microsoft handles the sync for approximately 100 source tables. No ETL code to write, test, or maintain for the bulk of source data.

Alternatives considered:

Custom Dataverse extraction pipelines — rejected due to maintenance burden (would need to handle schema changes, incremental loads, error handling for each table)
Synapse Link — rejected because it requires additional infrastructure and licensing

Consequences: Dependent on Microsoft's shortcut sync latency (typically minutes, occasionally hours). Cannot transform data at the Bronze layer — all transformations happen in dbt (Gold). If a Dataverse table schema changes, dbt sources need updating but no extraction code changes.

ADR-5: Single Key Vault for All Environments

Date: 2026-03-20 | Status: Active

Decision: One Azure Key Vault (kv-fabric-dbt-keys) stores secrets for all environments (DEV, UAT, PROD).

Reasoning: Simplicity — fewer resources to manage, single source of truth for SPN credentials. The project's scale (3 environments, 2 SPNs) does not justify the operational overhead of per-environment vaults.

Alternatives considered:

Per-environment Key Vaults — rejected as unnecessary complexity for this project's scale

Consequences: Secret names must be environment-prefixed if they differ per environment. All pipelines reference the same vault. Access control is at the vault level, not per-environment.
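
One way to keep environment-prefixed names consistent is to build them through a single helper. The sketch below is an illustration only: the "env-key" pattern and the example key are assumptions, not the project's actual secret names. The validation reflects the Key Vault rule that secret names may contain only letters, digits, and hyphens.

```python
import re

ENVIRONMENTS = ("dev", "uat", "prod")

# Key Vault secret names: 1-127 characters, letters, digits, and hyphens only.
_VALID_NAME = re.compile(r"[0-9A-Za-z-]{1,127}")


def secret_name(env, key):
    """Build an environment-prefixed secret name, e.g. 'uat-spn-client-secret'."""
    env = env.lower()
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env}")
    name = f"{env}-{key}"
    if not _VALID_NAME.fullmatch(name):
        raise ValueError(f"invalid Key Vault secret name: {name}")
    return name
```

Pipelines can then look up secret_name(env, "spn-client-secret") in the shared vault without hard-coding a name per environment.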

ADR-6: Side-by-Side Deployment (No In-Place Migration)

Date: 2026-03-18 | Status: Active

Decision: The new dbt platform is built entirely in new workspaces alongside the legacy PySpark system. Cutover switches users to new workspaces after full validation.

Reasoning: Zero risk to production reporting during migration. The legacy system stays untouched until the new system is proven through parallel run comparison and stakeholder sign-off.

Alternatives considered:

In-place migration — rejected because modifying existing production workspaces risks breaking reports during development. Any bug would immediately affect business users.

Consequences: Every model output must be validated against the legacy PySpark output before cutover (parallel run comparison). The old datalake/ notebooks are reference-only. Rollback is simple: point users back to the old workspace. See Go-Live Checklist.
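
A parallel run comparison boils down to matching rows from both systems on a business key and reporting the differences. The function below is a minimal sketch of that idea over in-memory rows. It is not the project's comparison tooling, and the column names in the example are hypothetical.

```python
def compare_outputs(legacy_rows, new_rows, key):
    """Compare two row sets keyed by `key`; report missing, extra, and changed rows."""
    legacy = {row[key]: row for row in legacy_rows}
    new = {row[key]: row for row in new_rows}
    shared = legacy.keys() & new.keys()
    return {
        "missing_in_new": sorted(legacy.keys() - new.keys()),
        "extra_in_new": sorted(new.keys() - legacy.keys()),
        "mismatched": sorted(k for k in shared if legacy[k] != new[k]),
    }


# Hypothetical sample rows from the legacy and the new model output.
legacy = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
new = [{"id": 2, "amount": 25}, {"id": 3, "amount": 30}]
print(compare_outputs(legacy, new, key="id"))
```

An empty result in all three lists for a model is the signal that its output matches the legacy system.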

ADR-7: Cherry-Pick Promotion (Not Branch Merge)

Date: 2026-03-25 | Status: Active

Decision: Changes are promoted from DEV to UAT to PROD via cherry-pick, not full branch merge. Each environment branch (main, release-uat, release-prod) is independently managed.

Reasoning: Selective promotion allows deploying specific features or fixes to UAT/PROD without carrying all DEV-only changes. Full branch merges would bring experimental or in-progress work to production.

Alternatives considered:

Full branch merge (GitFlow-style) — rejected because DEV accumulates experimental changes that should not reach PROD until explicitly selected

Consequences: Cherry-pick conflicts must be resolved manually. Each environment's deployment config (deployment/ENV.yml, deployment/parameter-ENV.yml) is independently maintained. The auto-changelog tool tracks which commits have been promoted.
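
The promotion flow can be sketched as a helper that turns a list of selected commits into git commands for the next environment branch. The branch names come from this ADR. The use of the -x flag, which records the original commit hash in the new commit message, is a suggestion and not necessarily what the project does.

```python
# Promotion path between environment branches: DEV -> UAT -> PROD.
PROMOTION_TARGET = {"main": "release-uat", "release-uat": "release-prod"}


def promotion_commands(source_branch, commit_shas):
    """Return the git commands that promote the given commits one stage up."""
    target = PROMOTION_TARGET[source_branch]
    commands = [["git", "checkout", target]]
    # One cherry-pick per commit keeps conflicts isolated to a single change.
    commands += [["git", "cherry-pick", "-x", sha] for sha in commit_shas]
    return commands


for cmd in promotion_commands("main", ["abc1234", "def5678"]):  # placeholder SHAs
    print(" ".join(cmd))
```

Generating the commands instead of running them leaves room to review the list, and to stop and resolve conflicts manually when a cherry-pick fails.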

ADR-8: Export UI Auth Bypassed for Testing

Date: 2026-04-12 | Status: GO-LIVE BLOCKER

Decision: Auth checks on Export Manager endpoints (ui.py, api.py) are commented out with TODO markers. The export UI is currently open to anyone with the Function App URL.

Reasoning: The DEV Function App is manually provisioned (provision_function_app = false), so the Terraform auth_settings_v2 block was never applied. Easy Auth requires an App Registration + Function App auth config that doesn't exist on DEV yet.

Fix before go-live:

Create App Registration for the Function App
Enable Easy Auth (Authentication blade) pointing to that App Registration
Set groupMembershipClaims: SecurityGroup in the app manifest
Uncomment auth checks in functions/exports/endpoints/ui.py and api.py (search for TODO: Re-enable)
Set ALLOWED_GROUP_IDS app setting with group IDs from terraform/environments/dev/terraform.tfvars

Consequences: Until fixed, anyone with the URL can trigger exports and download data. This is acceptable in DEV but not in production.
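
Once Easy Auth is enabled, a group check could look roughly like the sketch below. The commented-out checks in ui.py and api.py are not shown on this page, so this is an assumption about their shape: it reads the X-MS-CLIENT-PRINCIPAL header that App Service authentication passes to the app (base64-encoded JSON with a claims list) and assumes ALLOWED_GROUP_IDS is a comma-separated string.

```python
import base64
import json


def principal_groups(header_value):
    """Decode the X-MS-CLIENT-PRINCIPAL header and return the group IDs it carries."""
    principal = json.loads(base64.b64decode(header_value))
    return {
        claim["val"]
        for claim in principal.get("claims", [])
        if claim.get("typ") == "groups"
    }


def is_allowed(header_value, allowed_group_ids):
    """True when the caller belongs to at least one allowed group."""
    allowed = {g.strip() for g in allowed_group_ids.split(",") if g.strip()}
    return bool(principal_groups(header_value) & allowed)
```

An endpoint would call is_allowed() with the request header and the ALLOWED_GROUP_IDS app setting, and return HTTP 403 when it is false. The groups claim is only present when groupMembershipClaims is set in the app manifest.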

ADR-9: Export Email Sender Not Configured

Date: 2026-04-12 | Status: GO-LIVE BLOCKER

Decision: The export system's email sender is set to daan.aerts@geris.nl as a placeholder. Email sending does not work — the Function App's Managed Identity lacks Mail.Send permission.

Reasoning: A dedicated shared mailbox (e.g., exports@geris.nl) will be created for production use. The Download button on the UI allows testing the full query-to-Excel pipeline without email.

Fix before go-live:

Create the dedicated shared mailbox
Grant Mail.Send application permission (Microsoft Graph) to the Function App MI
Scope the permission to the shared mailbox (Application Access Policy)
Admin-consent the permission in Entra ID
Update EXPORT_SENDER_EMAIL and exports.yml sender
Test end-to-end: trigger export from UI, verify email arrives

Consequences: The Send button fails until fixed. Use the Download button for testing.
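
For end-to-end testing after the fix, it helps to know what a Microsoft Graph sendMail request looks like. The export code is not shown on this page, so the helper below is only a sketch of the request shape for a plain-text message. Attachments and token acquisition are left out, and the recipient address is a placeholder.

```python
def send_mail_request(sender, recipient, subject, body_text):
    """Return the (url, json_payload) pair for a Microsoft Graph sendMail call."""
    url = f"https://graph.microsoft.com/v1.0/users/{sender}/sendMail"
    payload = {
        "message": {
            "subject": subject,
            "body": {"contentType": "Text", "content": body_text},
            "toRecipients": [{"emailAddress": {"address": recipient}}],
        },
        "saveToSentItems": False,
    }
    return url, payload


url, payload = send_mail_request(
    sender="exports@geris.nl",     # example shared mailbox from this ADR
    recipient="user@example.com",  # placeholder recipient
    subject="Export ready",
    body_text="Your export is attached.",
)
print(url)
```

The request is sent as an HTTP POST with a bearer token for the Function App's Managed Identity. Without the Mail.Send application permission, Graph rejects the call, which matches the current failure of the Send button.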

ADR-10: CU Utilization Monitoring Blocked by Trial Capacity

Date: 2026-04-13 | Status: Blocked (pre-live)

Decision: CU utilization data will remain empty until the platform moves off trial capacity.

Reasoning: The Fabric Admin API (/v1/admin/capacities) requires the Managed Identity to be a Capacity Admin. The current capacity is a trial started by someone else — Capacity Admin cannot be granted on a trial that is not owned by the organization.

Fix before go-live:

Provision a paid Fabric capacity (F2+) owned by Geris
Add the Function App MI as Capacity Admin in Fabric admin portal
CU data starts flowing automatically — no code changes needed

Consequences: The CU utilization page in the ETL Monitoring report shows no data. All other monitoring (pipeline runs, dbt metrics, freshness SLA, ingestion health) works normally.

Related Pages

Operations Overview: how decisions affect daily operations
Architecture: system architecture informed by these decisions
Go-Live Checklist: go-live blockers in action
Troubleshooting: gotchas that led to some of these decisions
