Development Documentation
View as:

Coding Conventions

Consistent conventions reduce cognitive load and prevent entire categories of bugs. This page covers naming, project structure, dbt model patterns, semantic model rules, and the verification checklist that must pass before every PR.

Naming Conventions

ElementConventionExamples
Warehouse columnssnake_casevendor_account_and_name, lf_loading_time, com_age
dbt models (dimensions)dim_ prefixdim_customer, dim_vendor, dim_item
dbt models (facts)fact_ prefixfact_delivery, fact_trade, fact_invoice
dbt models (staging)stg_ prefix + sourcestg_ax__inventtable, stg_dataverse__contacts
dbt models (intermediate)int_ prefix + domainint_finance__journal_entries, int_trading__positions
dbt sourcesGrouped by origin systemSubdirs: ax, dataverse, bd, datacollect, market_data, monitoring, sharepoint, vesper
Pipeline YAML fileskebab-casedbt-dev-build.yml, infra-deploy.yml
Deployment configsdeployment/ENV.ymldeployment/dev.yml, deployment/uat.yml
Parameter filesdeployment/parameter-ENV.ymldeployment/parameter-dev.yml

TMDL Naming Rules

Semantic model TMDL files have strict naming requirements because names must match the warehouse output exactly:

  • sourceColumn values must match warehouse column names exactly (snake_case)
  • Display names in TMDL are also snake_case -- no & or special characters
  • Report visual titles and column headers use the Display Name from data-dictionary.md (e.g., vendor_account_name displays as "Vendor Account Name")

The data dictionary is auto-generated. Run python scripts/generate_data_dictionary.py to refresh it. Override individual column display names via meta.display_name in dbt schema.yml or scripts/display_name_overrides.yml.

Project Structure

fabric_monorepo/
  dbt/
    models/
      staging/       <-- Views, 1:1 with sources, light rename/cast
        ax/
        dataverse/
        bd/
        datacollect/
        market_data/
        monitoring/
        sharepoint/
        vesper/
      intermediate/  <-- Views, joining/transforming staged data
      marts/         <-- Tables, business-ready dims and facts (70+)
        views/       <-- Consumer views, tagged consumer_view
        monitoring/  <-- ETL monitoring facts + dims, tagged monitoring
    macros/
      cross_db/      <-- Cross-dialect compatibility macros
    profiles.yml     <-- Connection targets (local, dev, ci, uat, prod)
  workspaces/
    semantic/        <-- TMDL files per semantic model
    reports/         <-- Report definitions (definition.pbir, report.json)
  security/
    roles.sql        <-- Database role definitions
    grants/          <-- Per-table GRANT/DENY statements
    rls/             <-- Row-level security predicates
    ddl/             <-- Schema DDL
  deployment/        <-- Per-environment config (dev.yml, uat.yml, prod.yml)
  terraform/         <-- Infrastructure as code
  pipelines/         <-- Azure DevOps pipeline YAML
  scripts/           <-- Python utility scripts
  functions/         <-- Azure Functions (ingestion handlers)
  docs-site/         <-- Documentation site (Next.js, MDX content)

dbt Model Patterns

Layer Rules

LayerMaterializationPurposeBusiness Logic
StagingView1:1 source mapping, rename, castNone -- pure projection
IntermediateViewCross-source joins, domain assemblyLight transformations
MartsTableBusiness-ready dimensions and factsFull business logic
Marts/viewsViewConsumer-facing slicesFiltered projections
Marts/monitoringTablePipeline health metricsAggregations

Exception: Monitoring staging models that cast non-varchar types must use materialized='table' to avoid DuckDB internal binding errors on empty stubs.

Dual-Dialect SQL

Every SQL file must work on both DuckDB and Fabric. The key patterns:

-- Recursive CTEs
{% if target.name in ('local', 'duckdb') %}WITH RECURSIVE{% else %}WITH{% endif %}

-- Date casting (always explicit precision)
cast(column as datetime2(6))

-- Left-padding (no lpad in T-SQL)
right('00' || cast(value as varchar), 2)

See Dual-Dialect Patterns for the complete reference.

Source Target Conditions

Source YAML files must exclude ALL local targets:

{% if target.name not in ('local', 'duckdb') %}
database: "Lakehouse_Bronze"
{% endif %}

Missing either target from the exclusion causes "Catalog does not exist" errors.

TMDL Conventions

  • sourceColumn must match the warehouse column name exactly
  • Calculated columns are not supported in DirectLake mode -- use DAX measures instead
  • Relationships are defined in relationships.tmdl, not per-table files
  • Metrics live in metrics.tmdl

Verification Checklist

This checklist must pass before presenting any code for review. It is the single most important quality gate in the project.

  • Column names in dbt output match TMDL sourceColumn values exactly
  • dbt contract in _marts__models.yml matches actual model output columns
  • Source YAML target conditions exclude BOTH local and duckdb targets
  • No bare datetime2 -- always datetime2(6) with explicit precision
  • No lpad() -- use portable right('00' || ...) pattern
  • Recursive CTEs use target-conditional WITH RECURSIVE
  • Any column rename is grepped across ALL layers: dbt contract, TMDL sourceColumn, TMDL display name, report.json, security SQL, DAX measures
  • --profiles-dir . included in every dbt command
  • dbt build --target local --profiles-dir . passes all tests (not just compiles)

Process Rules

Story completion: A story is NOT done until dbt build --target local --profiles-dir . has been explicitly run and all tests pass. "Compiles" does not mean "tests pass."

Epic completion: Per-story validation checklists are not sufficient. An epic is NOT done until an Epic Validation Runbook is produced in docs/ with step-by-step hands-on test instructions (exact commands, expected outputs, manual test scenarios).

Migration validation: When migrating from legacy PySpark to dbt, validate EVERY column's value logic -- not just names. Key divergence areas: CASE statement conditions, NULL handling (1900-01-01 to NULL), sign conventions, JOIN types (INNER vs LEFT), column assignment per row type, sort order values, aggregation formulas.

Report Conventions

  • Report visual titles use Display Names from the data dictionary
  • Colors and formatting follow brand guidelines (see .mex/context/brand-guidelines.md)
  • Report definitions are stored in workspaces/reports/ReportName.Report/

Related Pages