Development Documentation

Local Development Setup

This page walks through setting up a local development environment from scratch. After completing these steps, you will be able to build and test the entire dbt pipeline locally using DuckDB, with no Fabric connectivity required.

System Requirements

The local-first loop runs the entire dbt graph against a DuckDB file under dev-data/. A handful of models (notably fact_inventory_snapshot, dim_logistics, and fact_trade) materialise intermediate results on the order of tens of GB during their build. DuckDB spills those intermediates to the directory named by DBT_DUCKDB_TEMP (default dev-data/.duckdb_tmp/), so plan for the free disk space listed in the table below.

| Resource | Minimum | Recommended | Notes |
| --- | --- | --- | --- |
| RAM | 16 GB | 32 GB+ | DuckDB's default memory_limit is 80% of RAM; more RAM means fewer disk spills and much faster fact_inventory_snapshot builds. With 16 GB everything still builds, just slowly. |
| Free disk (SSD) | 80 GB | 150 GB+ | dev-data/fabric_datalake.duckdb (~1.6 GB) + Bronze parquets (~5 GB) + peak spill; dev-data/.duckdb_tmp/ can reach 60 GB during fact_inventory_snapshot. NVMe is strongly preferred over SATA because spill read/write is the dominant wait. |
| CPU | 4 cores | 8+ cores | The local target in dbt/profiles.yml runs 8 threads; DuckDB also uses intra-query parallelism within each model. |
| OS | Windows 11 / macOS / Linux | | Repository is developed on Windows 11. Scripts use Unix shell (run via Git Bash on Windows). |
| Python | 3.10 | 3.11 | See the Prerequisites table below. |
| Close DuckDB viewers before building | | | DBeaver or VS Code "Power User for dbt" keep the .duckdb file locked and will block dbt with "file being used by another process" errors. Close them (or connect read-only) during a build. |

Running provisioning (fabric dev start …) additionally needs an az login session and the platform SPN credentials loaded from Key Vault (handled by scripts/_lib_spn.sh).
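The minimums in the table above can be checked before a first build. The following is a minimal sketch using only the Python standard library; the function name preflight and the constants are illustrative and not part of the repository. RAM is left out because the standard library has no portable way to read it.

```python
import os
import shutil
import sys

# Values copied from the "Minimum" column of the requirements table.
MIN_FREE_DISK_GB = 80
MIN_CPU_CORES = 4
# 3.10-3.12 are accepted; 3.13 is incompatible with dbt-fabric.
SUPPORTED_PYTHON = {(3, 10), (3, 11), (3, 12)}


def preflight(path=".", python_version=None, free_bytes=None, cores=None):
    """Return a list of human-readable problems; an empty list means OK.

    The keyword arguments exist so the check can be exercised without
    touching the real machine; by default the live values are used.
    """
    version = tuple(python_version or sys.version_info[:2])
    if free_bytes is None:
        free_bytes = shutil.disk_usage(path).free
    if cores is None:
        cores = os.cpu_count() or 0

    problems = []
    if version not in SUPPORTED_PYTHON:
        problems.append(
            f"Python {version[0]}.{version[1]} is outside the supported 3.10-3.12 range"
        )
    free_gb = free_bytes / 1024**3
    if free_gb < MIN_FREE_DISK_GB:
        problems.append(f"only {free_gb:.0f} GB free; minimum is {MIN_FREE_DISK_GB} GB")
    if cores < MIN_CPU_CORES:
        problems.append(f"{cores} CPU cores detected; minimum is {MIN_CPU_CORES}")
    return problems
```

Calling preflight("dev-data") from the repository root checks the volume that holds the DuckDB file and, with the default DBT_DUCKDB_TEMP, the spill directory as well.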

Prerequisites

| Tool | Version | Purpose | Install |
| --- | --- | --- | --- |
| Python | 3.11 (3.10-3.12 accepted; 3.13 incompatible) | dbt runtime, scripts | python.org |
| Node.js | LTS v20+ | MCP servers (npx-based) | nodejs.org |
| Azure CLI | Latest | Fabric authentication, deployments | winget install Microsoft.AzureCLI |
| Terraform | >= 1.8 | Feature environment provisioning | terraform.io |
| ODBC Driver 18 | Latest | Direct Fabric Warehouse connections | Microsoft docs |
| Git | Latest | Version control | Included with Azure DevOps access |
| Docker | Latest (optional) | Terraform MCP server only | Docker Desktop |

Why Python 3.11 specifically? The dbt-fabric adapter is incompatible with Python 3.13. While 3.10-3.12 all work, 3.11 is recommended for consistency across the team and CI.

Step-by-Step Setup

1. Clone the Repository

git clone https://geris-devops@dev.azure.com/geris-devops/insights-requests/_git/fabric_monorepo
cd fabric_monorepo

2. Configure Git Hooks

git config core.hooksPath .githooks

This enables the drift-check post-commit hook and the gitleaks pre-commit secret scanner.

3. Install Python Dependencies

The easiest path is to run any scripts/fabric … subcommand. The wrapper auto-bootstraps a local .venv/ with Python 3.11 or 3.12, installs dbt-core, dbt-fabric==1.9.8, and dbt-duckdb==1.10.1 plus runtime extras (pyarrow, azure-storage-file-datalake, deltalake), and runs dbt deps on first use. On later invocations the bootstrap step is a no-op.

The deltalake extra is needed by sync_cloud_parquets.py, which reads cloud_only tables directly from OneLake Delta storage (abfss://<workspace_id>@onelake.dfs.fabric.microsoft.com/<warehouse_id>/Tables/dbo/<table>/) and bypasses the Warehouse SQL endpoint. This reduces sync time for large tables like fact_inventory_snapshot from ~10 minutes to under a minute, and consumes zero Warehouse CU.
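To make the OneLake path pattern concrete, here is a small sketch that assembles the URI from its parts. The helper name onelake_table_uri is hypothetical and the IDs in the usage note are placeholders; how sync_cloud_parquets.py builds its paths internally is not shown on this page.

```python
def onelake_table_uri(workspace_id: str, warehouse_id: str, table: str,
                      schema: str = "dbo") -> str:
    """Build the abfss:// URI of a Warehouse table's Delta folder in OneLake.

    Follows the pattern quoted above:
    abfss://<workspace_id>@onelake.dfs.fabric.microsoft.com/<warehouse_id>/Tables/dbo/<table>/
    """
    return (
        f"abfss://{workspace_id}@onelake.dfs.fabric.microsoft.com/"
        f"{warehouse_id}/Tables/{schema}/{table}/"
    )
```

For example, onelake_table_uri("<workspace_id>", "<warehouse_id>", "fact_inventory_snapshot") yields the folder a Delta reader such as the deltalake library would be pointed at, presumably together with Azure credentials from the az login session.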

If you prefer to do it manually:

bash scripts/setup-local.sh      # creates .venv/, installs dbt, runs dbt deps
pip install uv                   # provides uvx for Python-based MCP servers

Note: dbt-fabric and dbt-duckdb are pip packages, NOT dbt packages. Adding them to packages.yml breaks dbt deps.

4. (Optional) Refresh dbt Packages

cd dbt && dbt deps --profiles-dir .

This step is already performed by setup-local.sh / the fabric wrapper; re-run it only if dbt/packages.yml changes.

5. Verify Local Build

cd dbt && dbt build --target local --profiles-dir .

This runs the full pipeline against DuckDB using Parquet seed data from dev-data/. If this succeeds, your local environment is correctly set up.

6. Authenticate for Fabric Targets (Optional)

If you need to run against the DEV Fabric Warehouse:

az login

Then set environment variables and build:

export FABRIC_SERVER="<warehouse-sql-endpoint>.datawarehouse.fabric.microsoft.com"
export FABRIC_DATABASE="Gold_Warehouse"
cd dbt && dbt build --target dev --profiles-dir .

All Fabric dbt targets use authentication: CLI, so dbt builds need no SPN credentials locally; the az login session is enough.
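A build against a Fabric target fails early if either variable is missing, so it can help to check them first. This is a minimal sketch; missing_fabric_vars is an illustrative helper, not something the repository is known to ship.

```python
import os

# The two variables exported in the step above.
REQUIRED_FABRIC_VARS = ("FABRIC_SERVER", "FABRIC_DATABASE")


def missing_fabric_vars(env=None):
    """Return the required variables that are unset or empty.

    `env` defaults to the live process environment; passing a dict makes
    the check easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_FABRIC_VARS if not env.get(name)]
```

An empty list means both variables are present; anything else names what still needs to be exported before running dbt build --target dev --profiles-dir .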

Understanding the Local Build

DuckDB and Parquet Seed Data

The local target uses a DuckDB file at dev-data/fabric_datalake.duckdb. On each run, the load_parquet_sources() macro bootstraps DuckDB by reading Parquet files from dev-data/. These Parquet files are representative samples of production data, committed to the repository.
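The bootstrap idea can be sketched as generating one DuckDB statement per Parquet file. The real load_parquet_sources() macro is Jinja inside the dbt project and may differ (for example, it may create tables rather than views, or use another naming rule); the function below is only an illustration and assumes one file per source, named after the source.

```python
from pathlib import Path


def parquet_view_statements(data_dir):
    """Return one CREATE VIEW statement per *.parquet file in data_dir.

    Each view is named after the file stem and reads the file through
    DuckDB's read_parquet() table function.
    """
    statements = []
    for parquet in sorted(Path(data_dir).glob("*.parquet")):
        view_name = parquet.stem
        statements.append(
            f'CREATE OR REPLACE VIEW "{view_name}" AS '
            f"SELECT * FROM read_parquet('{parquet.as_posix()}');"
        )
    return statements
```

Because the statements are regenerated from whatever is on disk, dropping a new Parquet file into the directory is enough for it to become queryable on the next run, which matches the "on each run" behaviour described above.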

The duckdb target (used by CI smoke tests) runs in-memory for even faster execution. Both targets produce the same results -- the difference is persistence.

What --profiles-dir . Does

The profiles.yml file lives in the dbt/ directory, not in the default ~/.dbt/ location. Every dbt command requires --profiles-dir . to find it. Forgetting this flag produces a "profile not found" error.
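One way to avoid forgetting the flag is to build every dbt invocation through a small helper. The sketch below is illustrative (dbt_command and run_dbt are not repository functions) and assumes it is run from the repository root with dbt on the PATH.

```python
import subprocess


def dbt_command(subcommand, target, profiles_dir="."):
    """Return the argv list for a dbt call that always carries --profiles-dir."""
    return ["dbt", subcommand, "--target", target, "--profiles-dir", profiles_dir]


def run_dbt(subcommand, target, project_dir="dbt"):
    """Run dbt from the project directory and return its exit code."""
    completed = subprocess.run(dbt_command(subcommand, target), cwd=project_dir)
    return completed.returncode
```

dbt also honours the DBT_PROFILES_DIR environment variable, so exporting it as the path to the dbt/ directory should have the same effect as passing the flag each time.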

Available Targets

| Target | Engine | Use Case |
| --- | --- | --- |
| local | DuckDB (file) | Local development, manual testing |
| duckdb | DuckDB (in-memory) | CI smoke tests |
| dev | Fabric Warehouse | DEV environment |
| ci | Fabric Warehouse | CI slim builds |
| uat | Fabric Warehouse | UAT environment |
| prod | Fabric Warehouse | Production (pipelines only) |
| feat-NAME | Fabric Warehouse | Feature environments (auto-generated) |
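The table above can be read as a simple lookup: two DuckDB targets that need no credentials, and Fabric targets that need an az login session. A hedged sketch of that rule follows; the function names are illustrative, and the feat- prefix check assumes feature targets are always named feat-NAME as listed.

```python
DUCKDB_TARGETS = {"local", "duckdb"}
FABRIC_TARGETS = {"dev", "ci", "uat", "prod"}


def engine_for(target):
    """Map a dbt target name to the engine listed in the targets table."""
    if target in DUCKDB_TARGETS:
        return "duckdb"
    if target in FABRIC_TARGETS or target.startswith("feat-"):
        return "fabric"
    raise ValueError(f"unknown dbt target: {target!r}")


def needs_az_login(target):
    """Fabric targets authenticate through the Azure CLI session."""
    return engine_for(target) == "fabric"
```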

MCP Server Setup

The project includes .mcp.json in the repository root with shared MCP server configurations. Claude Code and other MCP-compatible tools pick them up automatically. Nine servers are configured:

| Server | Runtime | Auth | Purpose |
| --- | --- | --- | --- |
| microsoft-learn | Remote HTTP | None | Microsoft Learn docs search/fetch |
| powerbi-modeling | npx | Browser login | Semantic model editing: TMDL, DAX |
| fabric-prodev | npx | None | Fabric API specs, schemas, best practices |
| azure-devops | npx | Browser login | Pipelines, work items, PRs, wiki |
| dbt-core | uvx | None | Lineage, impact analysis, column tracing |
| terraform | Docker | None | Terraform Registry docs and module search |
| azure | npx | az login | 276 tools across 57 Azure services |
| fabric-ops | uvx | az login | Read-only Fabric operational intel |
| duckdb | uvx | None | Query local DuckDB file |

First-use authentication: Servers that require browser login (powerbi-modeling, azure-devops) will open a browser on the first tool call. Credentials are cached after that. Servers that require az login (azure, fabric-ops) need an active Azure CLI session.
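Before starting an MCP client, it can be useful to see which servers cannot launch because their runtime is missing. The sketch below hard-codes the Runtime column from the table above rather than parsing .mcp.json, whose exact layout is not shown here; unavailable_servers is an illustrative name.

```python
import shutil

# Executable each server needs locally; None means no local runtime
# (microsoft-learn is a remote HTTP server).
SERVER_RUNTIME = {
    "microsoft-learn": None,
    "powerbi-modeling": "npx",
    "fabric-prodev": "npx",
    "azure-devops": "npx",
    "dbt-core": "uvx",
    "terraform": "docker",
    "azure": "npx",
    "fabric-ops": "uvx",
    "duckdb": "uvx",
}


def unavailable_servers(which=shutil.which):
    """Return the servers whose runtime executable is not on the PATH.

    `which` is injectable so the check can be tested without depending
    on what is installed on the current machine.
    """
    return sorted(
        name
        for name, exe in SERVER_RUNTIME.items()
        if exe is not None and which(exe) is None
    )
```

This only checks that the executable exists; it does not verify that the Docker daemon is running or that an az login session is active.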

Verifying MCP Servers

# No-auth servers (should start and exit cleanly):
npx -y @microsoft/fabric-mcp@latest server start --mode all < /dev/null
python -m uv tool run mcp-server-motherduck --db-path ./dev-data/fabric_datalake.duckdb < /dev/null

# Auth-required servers (may prompt browser login):
npx -y @azure/mcp@latest server start < /dev/null
npx -y @azure-devops/mcp geris-devops < /dev/null

Pre-Commit Hooks (Secret Scanning)

The repository uses gitleaks via the pre-commit framework. Set up once per clone:

pip install pre-commit
pre-commit install

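Every git commit then scans staged files for secret patterns automatically, and the CI pipeline runs gitleaks as a safety net. Note that pre-commit install refuses to run while core.hooksPath is set, which step 2 (Configure Git Hooks) does; in that case the gitleaks pre-commit hook enabled in step 2 already provides the commit-time scan.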

Common Issues

| Symptom | Cause | Fix |
| --- | --- | --- |
| "profile not found" | Missing --profiles-dir . | Add --profiles-dir . to every dbt command |
| "Env var required but not provided" | dbt parses ALL targets at startup | Add empty defaults: env_var('X', '') |
| DuckDB passes, Fabric fails | Dialect differences (case, datetime2) | See Dual-Dialect Patterns |
| "uvx not found" | uv not installed | Run pip install uv |
| "npx not found" | Node.js not installed | Install Node.js LTS v20+ |
| "docker not found" (terraform MCP only) | Docker not installed or not running | Install Docker Desktop, ensure daemon is running |
| Pipeline not triggering after push | YAML parse error in pipeline file | Try manual queue -- the API returns the parse error |
| "Cannot find the object" in security scripts | Fabric returns error code 15151 | Check for 15151, not "Invalid object name" |

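The last row of the table suggests matching on the numeric error code rather than on message text. A minimal sketch of such a check follows; the function name is illustrative, and it assumes the driver includes the native error code in the exception text, as ODBC-based drivers commonly do.

```python
import re

# Native error number Fabric returns when an object cannot be found.
MISSING_OBJECT_CODE = "15151"


def is_missing_object_error(message: str) -> bool:
    """True if the error text carries code 15151 as a standalone number."""
    return re.search(rf"\b{MISSING_OBJECT_CODE}\b", message) is not None
```

Matching on the code keeps the check independent of the message wording, which is what the fix in the table recommends.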