Coding Conventions
Consistent conventions reduce cognitive load and prevent entire categories of bugs. This page covers naming, project structure, dbt model patterns, semantic model rules, and the verification checklist that must pass before every PR.
Naming Conventions
| Element | Convention | Examples |
|---|---|---|
| Warehouse columns | snake_case | vendor_account_and_name, lf_loading_time, com_age |
| dbt models (dimensions) | dim_ prefix | dim_customer, dim_vendor, dim_item |
| dbt models (facts) | fact_ prefix | fact_delivery, fact_trade, fact_invoice |
| dbt models (staging) | stg_ prefix + source | stg_ax__inventtable, stg_dataverse__contacts |
| dbt models (intermediate) | int_ prefix + domain | int_finance__journal_entries, int_trading__positions |
| dbt sources | Grouped by origin system | Subdirs: ax, dataverse, bd, datacollect, market_data, monitoring, sharepoint, vesper |
| Pipeline YAML files | kebab-case | dbt-dev-build.yml, infra-deploy.yml |
| Deployment configs | deployment/ENV.yml | deployment/dev.yml, deployment/uat.yml |
| Parameter files | deployment/parameter-ENV.yml | deployment/parameter-dev.yml |
TMDL Naming Rules
Semantic model TMDL files have strict naming requirements because names must match the warehouse output exactly:
sourceColumnvalues must match warehouse column names exactly (snake_case)- Display names in TMDL are also snake_case -- no
&or special characters - Report visual titles and column headers use the Display Name from
data-dictionary.md(e.g.,vendor_account_namedisplays as "Vendor Account Name")
The data dictionary is auto-generated. Run python scripts/generate_data_dictionary.py to refresh it. Override individual column display names via meta.display_name in dbt schema.yml or scripts/display_name_overrides.yml.
Project Structure
fabric_monorepo/
dbt/
models/
staging/ <-- Views, 1:1 with sources, light rename/cast
ax/
dataverse/
bd/
datacollect/
market_data/
monitoring/
sharepoint/
vesper/
intermediate/ <-- Views, joining/transforming staged data
marts/ <-- Tables, business-ready dims and facts (70+)
views/ <-- Consumer views, tagged consumer_view
monitoring/ <-- ETL monitoring facts + dims, tagged monitoring
macros/
cross_db/ <-- Cross-dialect compatibility macros
profiles.yml <-- Connection targets (local, dev, ci, uat, prod)
workspaces/
semantic/ <-- TMDL files per semantic model
reports/ <-- Report definitions (definition.pbir, report.json)
security/
roles.sql <-- Database role definitions
grants/ <-- Per-table GRANT/DENY statements
rls/ <-- Row-level security predicates
ddl/ <-- Schema DDL
deployment/ <-- Per-environment config (dev.yml, uat.yml, prod.yml)
terraform/ <-- Infrastructure as code
pipelines/ <-- Azure DevOps pipeline YAML
scripts/ <-- Python utility scripts
functions/ <-- Azure Functions (ingestion handlers)
docs-site/ <-- Documentation site (Next.js, MDX content)
dbt Model Patterns
Layer Rules
| Layer | Materialization | Purpose | Business Logic |
|---|---|---|---|
| Staging | View | 1:1 source mapping, rename, cast | None -- pure projection |
| Intermediate | View | Cross-source joins, domain assembly | Light transformations |
| Marts | Table | Business-ready dimensions and facts | Full business logic |
| Marts/views | View | Consumer-facing slices | Filtered projections |
| Marts/monitoring | Table | Pipeline health metrics | Aggregations |
Exception: Monitoring staging models that cast non-varchar types must use materialized='table' to avoid DuckDB internal binding errors on empty stubs.
Dual-Dialect SQL
Every SQL file must work on both DuckDB and Fabric. The key patterns:
-- Recursive CTEs
{% if target.name in ('local', 'duckdb') %}WITH RECURSIVE{% else %}WITH{% endif %}
-- Date casting (always explicit precision)
cast(column as datetime2(6))
-- Left-padding (no lpad in T-SQL)
right('00' || cast(value as varchar), 2)
See Dual-Dialect Patterns for the complete reference.
Source Target Conditions
Source YAML files must exclude ALL local targets:
{% if target.name not in ('local', 'duckdb') %}
database: "Lakehouse_Bronze"
{% endif %}
Missing either target from the exclusion causes "Catalog does not exist" errors.
TMDL Conventions
sourceColumnmust match the warehouse column name exactly- Calculated columns are not supported in DirectLake mode -- use DAX measures instead
- Relationships are defined in
relationships.tmdl, not per-table files - Metrics live in
metrics.tmdl
Verification Checklist
This checklist must pass before presenting any code for review. It is the single most important quality gate in the project.
- Column names in dbt output match TMDL
sourceColumnvalues exactly - dbt contract in
_marts__models.ymlmatches actual model output columns - Source YAML target conditions exclude BOTH
localandduckdbtargets - No bare
datetime2-- alwaysdatetime2(6)with explicit precision - No
lpad()-- use portableright('00' || ...)pattern - Recursive CTEs use target-conditional
WITH RECURSIVE - Any column rename is grepped across ALL layers: dbt contract, TMDL sourceColumn, TMDL display name, report.json, security SQL, DAX measures
-
--profiles-dir .included in every dbt command -
dbt build --target local --profiles-dir .passes all tests (not just compiles)
Process Rules
Story completion: A story is NOT done until dbt build --target local --profiles-dir . has been explicitly run and all tests pass. "Compiles" does not mean "tests pass."
Epic completion: Per-story validation checklists are not sufficient. An epic is NOT done until an Epic Validation Runbook is produced in docs/ with step-by-step hands-on test instructions (exact commands, expected outputs, manual test scenarios).
Migration validation: When migrating from legacy PySpark to dbt, validate EVERY column's value logic -- not just names. Key divergence areas: CASE statement conditions, NULL handling (1900-01-01 to NULL), sign conventions, JOIN types (INNER vs LEFT), column assignment per row type, sort order values, aggregation formulas.
Report Conventions
- Report visual titles use Display Names from the data dictionary
- Colors and formatting follow brand guidelines (see
.mex/context/brand-guidelines.md) - Report definitions are stored in
workspaces/reports/ReportName.Report/
Related Pages
- Developer Workflow -- Where conventions fit in the development lifecycle
- Dual-Dialect Patterns -- Complete cross-dialect reference
- PR & Code Review -- How conventions are enforced in review
- Data Dictionary -- Column-level display name reference