
Scaling Analytics Without Chaos: A Modular dbt Mesh for Multi-Team Setups

As data platforms grow, teams often encounter the same challenge: multiple projects need the same foundational datasets, yet duplicating transformations across projects quickly becomes difficult to maintain.


A scalable approach is to separate shared data assets from domain-specific transformations. In this architecture, a shared dbt project produces reusable, governed models, while consumer projects build use-case specific logic on top of those shared assets.



(Fig 1: High-level architectural view of a dbt Mesh with two projects: a flowchart comparing the Common and Consumer projects, their stages, sources, and artifacts, with notes on governance and package use)

The diagram above illustrates the Mesh pattern, in which metadata artifacts expose selected models from the shared project to downstream consumer projects via dbt-loom. Mesh enables reuse of data assets across projects without compromising the governance independence of the consumer projects.


Architecture Overview


At a high level, the architecture consists of two types of dbt projects:


1. Common (Shared) Project


A centralized dbt project responsible for producing reusable models that can be consumed by multiple downstream teams or domains.


2. Consumer Projects


Independent dbt projects that implement domain-specific transformations while reusing the shared models exposed by the shared project.


3. dbt-loom


A dbt plugin that allows consumer projects to reference models from the shared project using its compiled metadata rather than importing its code directly.

This approach supports:

  • modular development

  • shared governance

  • independent deployment cycles

  • reusable, well-defined data assets


The Shared Project


Imagine you have a shared dbt project responsible for building commonly used datasets that multiple teams rely on.


For example, this project might produce shared entities such as:

  • dim_customer

  • dim_product

  • dim_item

  • shared inventory or transaction datasets


Rather than having each team implement these transformations independently, the shared project defines them once and makes them available to downstream projects.


Not every model in the shared project is necessarily meant to be consumed externally. Only a subset of models is exposed for reuse. These models are marked with:

access: public


Models with access: public become available to downstream projects, while other internal models remain private.
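As a sketch, a schema file in the shared project might mark which models are exposed (the model names here are illustrative, not from the original; dbt's default access level is protected):

```yaml
# models/schema.yml in the shared project (illustrative names)
models:
  - name: dim_item        # exposed for cross-project consumption
    access: public
  - name: stg_item_raw    # internal staging model, stays project-private
    access: protected
```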


Artifact Generation


When the shared project runs:

dbt build --target <environment>

dbt produces artifact metadata describing the project’s models and dependencies.

Important artifacts include:

manifest.json

run_results.json

catalog.json


The key artifact for cross-project consumption is:

manifest.json


This file contains metadata describing:

  • models

  • dependencies

  • lineage

  • documentation

  • access configuration

If you only need to refresh manifest.json without building any models, you can also run dbt parse.


Publishing Artifacts


The manifest generated by the shared project must be stored somewhere accessible to downstream projects.


Typical storage locations include:


Cloud Storage

For example:

  • Databricks Volumes

  • cloud object storage

  • shared filesystem


Local Development

For development environments, the artifact may also be stored in a local directory.

Only the manifest.json is required for downstream integration.
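As a minimal sketch of the publishing step (the paths are assumptions; in practice this might be a CI pipeline step or a cloud-storage upload rather than a local copy):

```python
import shutil
from pathlib import Path


def publish_manifest(target_dir: str, publish_dir: str) -> Path:
    """Copy the shared project's manifest.json to a location that
    downstream consumer projects can read."""
    src = Path(target_dir) / "manifest.json"
    dest_dir = Path(publish_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / "manifest.json"
    shutil.copy2(src, dest)  # only manifest.json is needed downstream
    return dest


# Example: after `dbt build`, publish from ./target to a shared volume
# publish_manifest("target", "/Volumes/shared_project/artifacts")
```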



Consumer Projects


Consumer projects are responsible for building domain-specific models tailored to a particular use case.


For example, a consumer project might implement models related to:

  • inventory analytics

  • logistics reporting

  • replenishment logic

  • operational dashboards


These projects typically include their own:

  • source definitions

  • staging models

  • intermediate transformations

  • final marts


Referencing Shared Models


Without a dedicated mechanism, consumer projects might reference shared tables as sources:

select *

from {{ source('item_database', 'dim_item') }}


While this works, it has a major limitation: lineage is lost. The downstream project cannot see how the table was created or which transformations contribute to it. What we want instead is for consumer projects to reuse the shared business entities produced by the shared project without losing that context.


Using dbt-loom


dbt-loom solves this problem by allowing consumer projects to reference upstream models using metadata from the shared project's manifest.


Instead of treating upstream tables as opaque sources, dbt-loom allows them to behave like normal dbt models.


Consumer projects can write:

select *

from {{ ref('common_project', 'dim_item') }}


Even though the model exists in another dbt project.


At compile time, dbt-loom:

  1. Reads the upstream manifest.json

  2. Identifies models marked with access: public

  3. Injects those models into the consumer project's dependency graph
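Conceptually, the public-model selection can be sketched like this (a simplified illustration of the idea, not dbt-loom's actual implementation; the manifest node fields shown are assumptions about the manifest schema):

```python
import json
from pathlib import Path


def public_models(manifest_path: str) -> dict:
    """Return the subset of models in an upstream manifest.json that
    are marked `access: public` and therefore eligible for
    cross-project references."""
    manifest = json.loads(Path(manifest_path).read_text())
    return {
        name: node
        for name, node in manifest.get("nodes", {}).items()
        if node.get("resource_type") == "model"
        and node.get("access") == "public"
    }
```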


As a result:

  • lineage becomes visible across projects

  • documentation integrates seamlessly

  • dependencies remain explicit


Installing dbt-loom


dbt-loom is distributed as a Python package (a dbt plugin), so it is installed into the same environment as dbt-core:

pip install dbt-loom

Note that it is not a dbt package: it does not go into packages.yml and is not installed with dbt deps.


Configuring dbt-loom


A configuration file points dbt-loom to the upstream manifest.


Example configuration (dbt_loom.config.yml):

manifests:

  - name: shared_project

    type: file

    config:

      path: /Volumes/shared_project/artifacts/manifest.json


Here it is assumed that the data warehouse/lakehouse you are working with is Databricks.


When developing locally with dbt-core, you can point at the manifest with a path like "dbfs:///Volumes/my_catalog/artifacts/dbt_target/manifest.json" in a separate local config file, for example one named dbt_loom_local.config.yml.

By default, dbt-loom reads its configuration from dbt_loom.config.yml; this can be overridden with the DBT_LOOM_CONFIG environment variable. For local runs, you need dbt-loom to use the file dbt_loom_local.config.yml instead.

On Windows, set the variable in PowerShell:

$env:DBT_LOOM_CONFIG="dbt_loom_local.config.yml"

Run $env:DBT_LOOM_CONFIG on its own to check that the variable is set correctly.
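On macOS or Linux, the equivalent (standard POSIX shell syntax, not taken from the original) is:

```shell
export DBT_LOOM_CONFIG="dbt_loom_local.config.yml"
echo "$DBT_LOOM_CONFIG"   # verify the variable is set
```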



Development Workflow


The development workflow typically looks like this:


Shared Project


  1. Build models

  2. Generate artifacts

  3. Publish manifest.json


Consumer Project


  1. Install dbt-loom

  2. Configure access to the shared manifest

  3. Reference shared models using ref()

  4. Build downstream transformations


Advantages of a dbt Mesh


This architecture provides several benefits. They are nicely summarized in the dbt Mesh FAQs on the dbt Developer Hub, but the most significant ones are worth restating here:


Model Reuse


Common datasets are defined once and reused across multiple projects.


Clear Ownership


The shared project owns canonical entities, while consumer projects own domain-specific logic.


Independent Deployment


Projects can be developed and deployed independently.


Improved Lineage


Because dbt-loom injects upstream models into the DAG, lineage remains visible across projects.


Reduced Duplication


Teams avoid recreating the same transformations in multiple repositories.


Example Scenario


Suppose multiple teams require a dataset called:

dim_item


Without a shared project, each team might implement its own version.


With this architecture:

Shared Project

   └── dim_item


Consumer projects simply reference it:

select *

from {{ ref('common_project', 'dim_item') }}


The transformation remains centralized, governed, and reusable.


Conclusion


Separating shared data assets from domain-specific transformations enables a modular and scalable dbt architecture. By exposing a subset of models from a shared project and allowing downstream projects to reference them through metadata, teams can maintain clear ownership boundaries while still promoting reuse.


Using dbt-loom to connect these projects preserves lineage, documentation, and dependency visibility across repositories, making the overall platform easier to understand, maintain, and extend as new use cases emerge.

