Data models for Fivetran's Jira connector built using dbt.
Jira Transformation dbt Package (Docs)
- Produces modeled tables that leverage Jira data from Fivetran's connector in the format described by this ERD and builds off the output of our Jira source package.
- Enables you to better understand the workload, performance, and velocity of your team's work using Jira issues. It performs the following actions:
- Creates a daily issue history table so you can quickly create agile reports, such as burndown charts, along any issue field.
- Enriches the core issue table with relevant data regarding its workflow and current state.
- Aggregates bandwidth and issue velocity metrics along projects and users.
- Generates a comprehensive data dictionary of your source and modeled Jira data through the dbt docs site.
The following table provides a detailed list of all tables materialized within this package by default.
TIP: See more details about these tables in the package's dbt docs site.
Table | Description |
---|---|
jira__daily_issue_field_history | Each record represents a day in which an issue remained open, enriched with data about the issue's sprint, its status, and the values of any fields specified by the issue_field_history_columns variable. |
jira__issue_enhanced | Each record represents a Jira issue, enriched with data about its current assignee, reporter, sprint, epic, project, resolution, issue type, priority, and status. It also includes metrics reflecting assignments, sprint rollovers, and re-openings of the issue. Note that all epics are considered issues in Jira and are therefore included in this model (where issue_type='epic'). |
jira__project_enhanced | Each record represents a project, enriched with data about the users involved, how many issues have been opened or closed, the velocity of work, and the breadth of the project (i.e., its components and epics). |
jira__user_enhanced | Each record represents a user, enriched with metrics regarding their open issues, completed issues, the projects they work on, and the velocity of their work. |
Each Quickstart transformation job run materializes 43 models if all components of this data model are enabled. This count includes all staging, intermediate, and final models materialized as view, table, or incremental.
To use this dbt package, you must have the following:
- At least one Fivetran Jira connection syncing data into your destination.
- A BigQuery, Snowflake, Redshift, Databricks, or PostgreSQL destination.
If you are using a Databricks destination with this package, add the dispatch configuration below (or a variation of it) to your dbt_project.yml. This is required so that the package searches for macros in the dbt-labs/spark_utils package first and then in the dbt-labs/dbt_utils package.
dispatch:
  - macro_namespace: dbt_utils
    search_order: ['spark_utils', 'dbt_utils']
Models in this package that are materialized incrementally are configured to work with the different strategies available to each supported warehouse.
For BigQuery and Databricks All Purpose Cluster runtime destinations, we have chosen insert_overwrite as the default strategy, which benefits from the partitioning capability.
For Databricks SQL Warehouse destinations, models are materialized as tables without support for incremental runs.
For Snowflake, Redshift, and Postgres databases, we have chosen delete+insert as the default strategy.
Regardless of strategy, we recommend that users periodically run a --full-refresh to ensure a high level of data quality.
Include the following jira package version in your packages.yml file:
TIP: Check dbt Hub for the latest installation instructions or read the dbt docs for more information on installing packages.
packages:
  - package: fivetran/jira
    version: [">=0.19.0", "<0.20.0"]
By default, this package runs using your destination and the jira schema. If this is not where your Jira data is (for example, if your Jira schema is named jira_fivetran), add the following configuration to your root dbt_project.yml file:
vars:
  jira_database: your_destination_name
  jira_schema: your_schema_name
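As a concrete illustration of the example mentioned above, a minimal sketch for a Jira schema named jira_fivetran might look like the following. The database name my_warehouse is a placeholder assumed for illustration only; substitute the name of your own destination.

```yaml
# Root dbt_project.yml
vars:
  jira_database: my_warehouse   # hypothetical destination name, replace with your own
  jira_schema: jira_fivetran    # schema name taken from the example in the text
```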
Your Jira connection may not sync every table that this package expects. If you do not have the SPRINT, COMPONENT, or VERSION tables synced, add the respective variables to your root dbt_project.yml file. Additionally, if you want to remove comment aggregations from your jira__issue_enhanced model, add the jira_include_comments variable to your root dbt_project.yml:
vars:
  jira_using_sprints: false # Enabled by default. Disable if you do not have the sprint table or do not want sprint-related metrics reported.
  jira_using_components: false # Enabled by default. Disable if you do not have the component table or do not want component-related metrics reported.
  jira_using_versions: false # Enabled by default. Disable if you do not have the versions table or do not want versions-related metrics reported.
  jira_using_priorities: false # Enabled by default. Disable if you are not using priorities in Jira.
  jira_include_comments: false # Enabled by default. Disabling will remove the aggregation of comments via the `count_comments` and `conversations` columns in the `jira__issue_enhanced` table.
The dbt_jira package offers variables to enable or disable conversation aggregations in the jira__issue_enhanced table. These settings allow you to manage the amount of data processed and avoid potential performance or limit issues with large datasets.
jira_include_conversations: Controls only the conversation column in the jira__issue_enhanced table.
- Default: Disabled for Redshift due to string size constraints; enabled for other supported warehouses.
- Setting this to false removes the conversation column but retains the count_comments field if jira_include_comments is still enabled. This is useful if you want a comment count without the full conversation details.
In your dbt_project.yml file:
vars:
  jira_include_conversations: false # Set to false or true. Disabled by default for Redshift; enabled for other supported warehouses.
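The "comment count without the full conversation" case described above can be sketched by combining the two variables. This is a minimal illustration based only on the behavior stated in this section, not a required configuration; jira_include_comments is enabled by default, so setting it explicitly is optional.

```yaml
# Root dbt_project.yml
vars:
  jira_include_comments: true        # keep comment aggregations (already the default)
  jira_include_conversations: false  # drop only the conversation column
```

With these settings, jira__issue_enhanced should retain count_comments while omitting the conversation text.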
The jira__daily_issue_field_history model generates historical data for the columns specified by the issue_field_history_columns variable. By default, the only columns tracked are status, status_id, and sprint, but all fields found in the Jira FIELD table's field_name column can be included in this model. The most recent value of any tracked column is also captured in jira__issue_enhanced.
If you would like to change these columns, add the following configuration to your dbt_project.yml file. After adding the columns to your dbt_project.yml file, run the dbt run --full-refresh command to fully refresh any existing models:
IMPORTANT: If you wish to use a custom field, be sure to list the field_name and not the field_id. The corresponding field_name can be found in the stg_jira__field model.
vars:
  issue_field_history_columns: ['the', 'list', 'of', 'field', 'names']
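For instance, a filled-in version of this variable might look like the sketch below. The field names Story Points and Team are hypothetical and assumed purely for illustration; use the exact field_name values that appear in your own stg_jira__field model.

```yaml
# Root dbt_project.yml
vars:
  # Hypothetical field names; list field_name values, not field_id values.
  issue_field_history_columns: ['Story Points', 'Team']
```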
This package provides the option to use field_name instead of field_id as the field-grain for issue field history transformations. By default, the package strictly partitions and joins issue field data using field_id. However, this assumes that it is impossible to have fields with the same name in Jira. For instance, it is very easy to create another Sprint field, and different Jira users across your organization may choose the wrong or inconsistent version of the field. As such, the jira_field_grain variable may be adjusted to change the field-grain behavior of the issue field history models. You may adjust the variable using the following configuration in your root dbt_project.yml.
vars:
  jira_field_grain: 'field_name' # field_id by default
This package lets you use a buffer variable to bring in issues past their close date, because issues can be left unresolved past that date. The buffer ensures that the daily issue history does not cut off field updates to these issues.
You may adjust the variable using the following configuration in your root dbt_project.yml.
vars:
  jira_issue_history_buffer: insert_number_of_months # 1 by default
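As a minimal sketch, extending the buffer from the default of 1 month to 3 months would look like this. The value 3 is an arbitrary choice for illustration.

```yaml
# Root dbt_project.yml
vars:
  jira_issue_history_buffer: 3  # number of months; the default is 1
```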
By default, this package builds the Jira staging models within a schema titled (<target_schema> + _jira_source) and your Jira modeling models within a schema titled (<target_schema> + _jira) in your destination. If this is not where you would like your Jira data to be written to, add the following configuration to your root dbt_project.yml file:
models:
  jira_source:
    +schema: my_new_schema_name # leave blank for just the target_schema
  jira:
    +schema: my_new_schema_name # leave blank for just the target_schema
If an individual source table has a different name than the package expects, add the table name as it appears in your destination to the respective variable:
IMPORTANT: See this project's dbt_project.yml variable declarations to see the expected names.
vars:
  jira_<default_source_table_name>_identifier: your_table_name
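To make the naming pattern concrete, the sketch below assumes that issue is one of the default source table names, so the pattern expands to jira_issue_identifier. Both that expansion and the table name issue_custom are assumptions for illustration; confirm the expected names against the variable declarations in the project's dbt_project.yml.

```yaml
# Root dbt_project.yml
vars:
  # Assumed expansion of jira_<default_source_table_name>_identifier for the issue table.
  jira_issue_identifier: issue_custom  # hypothetical table name as it appears in your destination
```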
Records from the source may occasionally arrive late. To handle this, we implement a one-week lookback in our incremental models to capture late arrivals without requiring frequent full refreshes. The lookback is structured in weekly increments, as the incremental logic is based on weekly periods. While the frequency of full refreshes can be reduced, we still recommend running dbt run --full-refresh periodically to maintain data quality of the models.
To change the default lookback window, add the following variable to your dbt_project.yml file:
vars:
  jira:
    lookback_window: number_of_weeks # default is 1
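For example, widening the window to two weeks could be sketched as follows. The value 2 is arbitrary and chosen only to illustrate the weekly increments described above.

```yaml
# Root dbt_project.yml
vars:
  jira:
    lookback_window: 2  # number of weeks; the default is 1
```

A larger window reprocesses more history on each incremental run, so it trades run time for better coverage of late-arriving records.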
Fivetran offers the ability for you to orchestrate your dbt project through Fivetran Transformations for dbt Core™. Learn how to set up your project for orchestration through Fivetran in our Transformations for dbt Core setup guides.
This dbt package is dependent on the following dbt packages. These dependencies are installed by default within this package. For more information on the following packages, refer to the dbt hub site.
IMPORTANT: If you have any of these dependent packages in your own packages.yml file, we highly recommend that you remove them from your root packages.yml to avoid package version conflicts.
packages:
  - package: fivetran/jira_source
    version: [">=0.7.0", "<0.8.0"]
  - package: fivetran/fivetran_utils
    version: [">=0.4.0", "<0.5.0"]
  - package: dbt-labs/dbt_utils
    version: [">=1.0.0", "<2.0.0"]
  - package: dbt-labs/spark_utils
    version: [">=0.3.0", "<0.4.0"]
The Fivetran team maintaining this package only maintains the latest version of the package. We highly recommend you stay consistent with the latest version of the package and refer to the CHANGELOG and release notes for more information on changes across versions.
A small team of analytics engineers at Fivetran develops these dbt packages. However, the packages are made better by community contributions.
We highly encourage and welcome contributions to this package. Check out this dbt Discourse article on the best workflow for contributing to a package.
- If you have questions or want to reach out for help, see the GitHub Issue section to find the right avenue of support for you.
- If you would like to provide feedback to the dbt package team at Fivetran or would like to request a new dbt package, fill out our Feedback Form.