dbt cheat sheet: the interactive dbt reference, updated for the AI era
Commands
Not the man page. The verbs and flags that actually earn a spot in muscle memory.
Core verbs: your daily drivers
dbt build- Run, test, seed and snapshot every selected resource in DAG order, the one command most teams run.
dbt run- Materialize models (view / table / incremental …) in the warehouse.
dbt test- Execute data tests and unit tests defined in the project.
dbt seed- Load CSV files from
seeds/into the warehouse. dbt snapshot- Capture slowly-changing-dimension history for sources.
Mnemonic: build = run + test + seed + snapshot, all interleaved in dependency order.
Worth remembering
dbt list- Show which resources a selector matches (no run).
dbt show- Preview a model’s rows after transformation.
dbt retry- Re-run the last invocation from the point of failure.
dbt clone- Copy selected nodes from a state into the target schema(s).
dbt compile- Render SQL without executing. Inspect the Jinja.
dbt source freshness- Check whether sources are within their declared freshness SLA.
dbt parse- Parse the project and emit timing, useful in CI.
Setup & utilities
dbt init- Scaffold a new project and set up your
profiles.ymlconnection. dbt deps- Install the packages declared in
packages.yml/dependencies.yml. dbt debug- Check the install, the active profile and the warehouse connection.
dbt clean- Delete the folders in
clean-targets:target/,dbt_packages/. dbt docs generate- Build the documentation site and the
catalog.jsonartifact. dbt docs serve- Serve the generated docs site locally in your browser.
dbt run-operation- Invoke a macro directly:
dbt run-operation my_macro --args '{k: v}'.
Flags worth remembering
--select/-s- Choose which resources to act on. See Node selection.
--exclude- Remove resources from the
--selectset; identical syntax. --target/-t- Pick a target from
profiles.yml:-t prod. --defer- Resolve unselected refs from another state instead of rebuilding them.
--state- Path to a
manifest.json, required by state-aware selectors and--defer. --full-refresh- Rebuild incremental models and seeds from scratch.
--vars- Pass project vars:
--vars '{"start_date": "2024-01-01"}'. --threads- Override the thread count from your profile.
--store-failures- Persist failing test rows to a table for debugging.
--empty- Build schema-only (zero-row) models: a fast structural check for CI.
--sql1.12+- On
dbt run-operation, execute ad-hoc SQL directly. No macro file needed.
How dbt build runs
dbt build runs each model, then its tests, in dependency order. Step through it, widen each tick with --threads, and fail a test to see what gets skipped.
- running
- passed
- failed
- skipped
Trivial / one-time flags: --help, --version, --debug, --log-level, --profiles-dir, --project-dir.
Essentials
Materializations, tests and packages: the building blocks every dbt project leans on.
Materializations
Set with {{ config(materialized='…') }} or in dbt_project.yml.
view- Default. A
CREATE VIEW: no storage, always fresh, recomputed on every read. tableCREATE TABLE AS: fully rebuilt from scratch each run.incremental- Built once, then each run only inserts / merges new or changed rows.
ephemeral- Not built at all. Inlined as a CTE into the models that
ref()it. materialized_view- A warehouse-managed materialized view: dbt owns the definition, the warehouse keeps the data refreshed.
Tests
generic- Reusable data tests declared in YAML:
unique,not_null,accepted_values,relationships. singular- A one-off
.sqlfile intests/: any query that returns failing rows. unit1.8+- Test model logic against mock inputs and expected output, declared with
unit_tests:.
dbt build runs each model’s tests right after it; failures stop dependents. Add --store-failures to keep the offending rows.
Packages worth installing
Add to packages.yml, then run dbt deps. Browse the rest on the dbt Package hub.
dbt_utils- The standard toolbox: cross-database SQL, generic tests and helper macros.
dbt_project_evaluator- Audits your project against dbt's best-practice rules.
dbt_expectations- Great Expectations-style data quality tests.
audit_helper- Compare two relations to validate a refactor or migration.
elementary- Anomaly-detection tests and data observability reports.
dbt_artifacts- Models your run results and metadata for monitoring.
Incremental models
The materialization that transforms only new rows, and the one most likely to bite you. Strategies, microbatch, and schema changes.
Incremental strategies
Pick with config(materialized='incremental', incremental_strategy='…').
merge- Upsert on
unique_key: updates existing rows, inserts new ones. append- Insert new rows only. Fastest, but no de-duplication.
delete+insert- Delete rows matching the new batch, then insert: a merge without native
MERGE. insert_overwrite- Replace whole partitions (by
partition_by), not rows. Nounique_key; warehouse-specific (BigQuery, Spark, Snowflake). microbatch1.9+- Split the load into time-based batches by
event_time: robust, retryable backfills.
is_incremental() gates a where filter; the incremental_strategy decides how the rows it lets through land in the table. Flip the switches to see what each run writes to fct_orders.
select * from {{ source('shop', 'orders') }}
{% if is_incremental() %}
where updated_at > (select max(updated_at) from {{ this }})
{% endif %} dbt run -s fct_orders
fct_orders: before
rows this run processes
fct_orders: after
is_incremental() is true only when the table already exists and you didn’t pass --full-refresh.
Microbatch
A strategy for large event tables: one run, many small dated queries.
Microbatch is for big time-series tables. Instead of one query with an is_incremental() filter, dbt splits the run into one bounded query per time period, here per day. Each batch is independent, so dbt can run them in parallel and retry just the ones that fail.
{{ config(
materialized='incremental',
incremental_strategy='microbatch',
event_time='session_start',
batch_size='day',
begin='2024-01-01',
) }}
select * from {{ ref('page_views') }} - already loaded
- this run's batches
- failed batch
When a model’s columns change
An incremental run doesn’t rebuild the table, so a changed column list needs handling. The on_schema_change config decides what happens:
on_schema_change | You add a column | You remove a column |
|---|---|---|
ignore | Default. The new column is silently dropped: it never reaches the table. | dbt run fails. |
fail | The run fails with a schema-change error. | The run fails with a schema-change error. |
append_new_columns | The column is added to the table. | The column is kept; it is not dropped. |
sync_all_columns | The column is added to the table. | The column is dropped from the table (data-type changes included). |
No mode backfills a new column for old rows: they stay NULL until a --full-refresh.
Common patterns
Copy-pasteable recipes for the things you do every week.
- Slim CI: build only what changed since prod
dbt build -s state:modified+ --defer --state path/to/prod - Production build, skipping unit tests and unmodified views
dbt build --exclude "test_type:unit state:unmodified,config.materialized:view" --state path/to/prod - Resume after a failure (pick up what failed / skipped)
dbt retry - Rebuild incremental models from scratch
dbt run --full-refresh -s config.materialized:incremental - Test only what you just touched
dbt test -s state:modified+ --state path/to/prod - Preview what a selector would match (no run)
dbt list -s "tag:nightly,config.materialized:table" - Refresh only what has newer source data
dbt source freshness && dbt build -s source_status:fresher+ --state path/ - Let Claude or Cursor operate your project safely (see dbt + AI)
uvx dbt-mcp
Node selection
The most-Googled corner of dbt. Type a selector below and watch it light up the DAG.
Specifying resources
The --select flag accepts one or more args. Each arg can be one of:
- A package’s name
- A model name
- A fully-qualified path to a directory of models
- A selection method (
path:,tag:,config:,test_type:,test_name:etc)
Examples:
$ dbt run --select my_dbt_project_nameall models in your project$ dbt run --select my_dbt_modela specific model$ dbt run --select path.to.my.modelsall models in a specific directory$ dbt run --select my_package.some_modela specific model in a specific package$ dbt run --select tag:nightlymodels with the “nightly” tag$ dbt run --select path/to/modelsmodels contained in path/to/models$ dbt run --select path/to/my_model.sqla specific model by its path
Graph operators
Plus operator (+)
$ dbt run --select my_model+selectmy_modeland all children$ dbt run --select +my_modelselectmy_modeland all parents$ dbt run --select +my_model+selectmy_model, and all of its parents and children
N-plus operator
$ dbt run --select my_model+1selectmy_modeland its first-degree children$ dbt run --select 2+my_modelselectmy_model, its first-degree parents, and its second-degree parents (“grandparents”)$ dbt run --select 3+my_model+4selectmy_model, its parents up to the 3rd degree, and its children down to the 4th degree
At operator (@)
$ dbt run -s @my_modelselectmy_model, all its descendants, and the ancestors of those descendants
Wildcards (*, ?, [abc]) work inside selector strings. They aren’t graph operators. Use them within a name or a method:value.
Methods examples
tagdbt run -s "tag:nightly"sourcedbt run -s "source:snowplow+"resource_typedbt list -s "resource_type:test"pathdbt run -s "models/staging/github"packagedbt run -s "package:snowplow"config.Xdbt run -s "config.materialized:incremental"fqndbt run -s "fqn:my_project.marts.finance.fct_orders"filedbt run -s "file:fct_orders.sql"semantic_modeldbt parse -s "semantic_model:orders"saved_querydbt sl list -s "saved_query:weekly_revenue"test_typedbt test -s "test_type:generic"test_namedbt test -s "test_name:unique"statedbt run -s "state:modified" --state path/to/artifactsexposuredbt run -s "+exposure:weekly_kpis"metricdbt build -s "+metric:weekly_active_users"resultdbt run -s "result:error" --state path/to/artifactssource_statusdbt build -s "source_status:fresher+"groupdbt run -s "group:finance"accessdbt list -s "access:public"versiondbt list -s "version:latest"unit_testdbt list -s "unit_test:*"selector1.12+dbt run -s "selector:nightly_marts"
Set operators
Unions (space-delimited)
$ dbt run --select +snowplow_sessions +fct_ordersrunsnowplow_sessions, all ancestors ofsnowplow_sessions,fct_orders, and all ancestors offct_orders
Intersections (comma-separated)
$ dbt run --select +snowplow_sessions,+fct_ordersrun all the common ancestors ofsnowplow_sessionsandfct_orders$ dbt run --select marts.finance,tag:nightlyrun models that are in themarts/financesubdirectory and taggednightly
Excluding models
dbt provides an --exclude flag with the same semantics as --select. Models specified with the --exclude flag will be removed from the set of models selected with --select.
Example:
$ dbt run --select path:models/marts/finance --exclude fct_orders+the finance marts, minusfct_ordersand everything downstream of it
State
Some methods compare the project against a manifest.json from another run (a previous invocation, or production), passed via the --state flag.
$ dbt build -s "state:modified+" --defer --state path/to/artifacts
Selectors
state:newabsent from the comparison manifeststate:modifiednew nodes, plus any changed existing nodestate:unmodifiedexisting nodes with no changesstate:oldpresent in the comparison manifest
Narrow state:modified to one kind of change with sub-selectors: .body, .configs, .relation, .macros, .contract.
Defer
--defer lets you build a model without first building its parents: a ref() to an unbuilt parent resolves to another environment’s state instead. It needs --state. Flip the switches to see where model_c’s ref('model_b') resolves.
dbt build -s model_c
No --defer: ref('model_b') resolves to your dev schema, where model_b was never built. model_c fails with a 'relation not found' error.
dbt + AI · 2026
The part the original cheat sheet couldn’t have: how dbt works in the agent era.
dbt MCP server
A Model Context Protocol server that lets AI agents (Claude, Cursor, VS Code) operate your dbt project, safely and with governance.
Run it locally:
uvx dbt-mcplocal server: full dbt CLI access
Tool groups it exposes:
dbt CLIrun,build,test,compile,list,parse,show…Discoveryget_model_details,get_lineage,get_model_health,get_mart_models…Semantic Layerlist_metrics,query_metrics,get_dimensions…SQLexecute_sql,text_to_sqlAdmin APItrigger_job_run,list_jobs,retry_job_run…Codegengenerate_model_yaml,generate_source,generate_staging_model
A remote MCP server (one endpoint per environment) is also available for data-consumption tools without any local setup.
Agents & Copilot
Developer Agent- Write and refactor models, generate tests, and validate changes from natural language inside the Studio IDE.
Analyst Agent- Answer natural-language questions with governed numbers, powered by the dbt Semantic Layer.
dbt Copilot- One-click inline generation of SQL, models, tests and documentation.
dbt Canvas- Visual, drag-and-drop model building. Governance and lineage maintained automatically.
dbt Insights- An AI query interface to explore, validate and share results with full lineage awareness.
Fusion engine
The dbt engine, rewritten in Rust: what changes when your project runs on it.
What Fusion brings
A ground-up rewrite of the dbt engine in Rust, shipped as a single binary (v2.0).
- SQL comprehension: Fusion actually parses your SQL, so it catches column & type errors before a run.
- Language server (LSP): go-to-definition, autocomplete and live errors in the editor.
- Faster parse / compile, with native ADBC warehouse drivers.
- Adds the session verbs
environment,invocation,cancel,reattach. - Deprecated flags are removed, most notably
--models/-m→ use--select.