dbt Cloud
This page describes how to use Nagi with a dbt Cloud environment.
Recommended Approach
When using Nagi alongside dbt Cloud, we recommend starting with state evaluation and notifications. Setting autoSync: false on the Origin applies this to all automatically generated Assets.
yaml
kind: Origin
spec:
type: DBT
connection: my-bigquery
projectDir: ../dbt-project
autoSync: false
dbt Cloud jobs update models on their existing schedules, and Nagi periodically inspects the data state and sends notifications when drift is detected. For a gradual path toward automated convergence, see Concepts — From Monitoring to Automation.
Prerequisites
dbt CLI (dbt-core >= 1.0) must be installed in the environment where Nagi runs.
Init
Running nagi init generates Connection and Origin in resources/, the same as dbt Core.
When using dbt Cloud, add the dbtCloud field to the generated Connection.
yaml
apiVersion: nagi.io/v1alpha1
kind: Connection
metadata:
name: my-bigquery
spec:
dbtProfile:
profile: my_project
target: dev
dbtCloud:
credentialsFile: ~/.dbt/dbt_cloud.yml
credentialsFile is optional and defaults to ~/.dbt/dbt_cloud.yml.
This credentials file can be downloaded from the dbt Cloud UI (project settings CLI section -> "Download CLI configuration file"). Save it to ~/.dbt/dbt_cloud.yml.
Compile
Same as dbt Core: runs dbt compile against the local dbt project to generate manifest.json. For resource generation details, see Resource Generation.
Evaluate
Same as dbt Core.
Sync
When executing Sync, dbt CLI is run as a subprocess, the same as dbt Core.
In a dbt Cloud environment, Nagi's Sync and dbt Cloud jobs may update the same model simultaneously. To prevent this, Sync Control checks for running dbt Cloud jobs before executing a Sync.
Sync Control
When the Asset's Connection has dbtCloud configured, Nagi checks whether any dbt Cloud jobs are running before executing a Sync. At compile time, it parses the execute_steps of each dbt Cloud job to identify jobs that involve the target Asset. At Sync execution time, if that job is running, the Sync is aborted.
- At compile time, all jobs are fetched from the Jobs API, model names are extracted from
--selectinexecute_steps, and related jobs are identified for each Asset - At Sync execution time, running jobs are fetched from the Runs API, and it checks whether any job related to the target Asset is running
- If no related job is running, the Sync is executed
- If a related job is running, an error is returned
API
Nagi uses the following dbt Cloud Administrative API endpoints.
| Timing | API endpoint | Purpose |
|---|---|---|
| compile | GET /api/v2/accounts/{account_id}/jobs/ |
Fetch execute_steps of all jobs to build the mapping between Assets and jobs |
| Sync | GET /api/v2/accounts/{account_id}/runs/?status=3 |
Fetch running jobs (status=3: Running) to check whether any job related to the target Asset is running |
Required Permissions
The dbt Cloud API token requires the following permissions.
| Permission | Reason |
|---|---|
| Read access to Jobs | Fetching job definitions and execute_steps |
| Read access to Runs | Checking for running jobs |
The token-value included in ~/.dbt/dbt_cloud.yml is used for API authentication. The token is read from the file at the time of the API call but is not retained in memory.
Customization
Same as dbt Core.