Movatterモバイル変換


[0]ホーム

URL:


DOCS >REPLICATION >LOADING >UNDERSTANDING TABLE STRUCTURAL CHANGES

Get started with a free trial today

Already have an account? Sign in

Understanding Table Structural Changes

From time to time, Stitch will encounter data that can’t be loaded losslessly into the destination table in your destination. When this happens, Stitch may have to alter the structure of the table in order to successfully load the data.


Reasons for table structural changes

Stitch may need to perform table alterations for several reasons, including:

In this guide are examples of how Stitch will behave in each of these scenarios for each currently supported destination type.


Examples in this guide

The examples in this guide will use an example table namedcustomers to demonstrate Stitch’s behavior for each scenario.

Excludingthe_sdc columns, Stitch determines this is the structure of thecustomers table:

Column nameData type
id BIGINT
name STRING
age BIGINT
has_magic BOOLEAN

VARCHAR column widening

To preserve your destination’s performance and reduce disk usage, Stitch uses the smallest possibleVARCHAR column when storing string data.

For example: If the maximum width of a string column across all records is currently 127, Stitch will type the destination column asVARCHAR(128).

As string data can vary in width, Stitch will take different actions to accommodate the data, depending on the destination in use. See below for an example.

VARCHAR column widening example: First replication job

During the first replication job, Stitch extracts the following records for thecustomers table:

id (BIGINT) name (STRING) age (BIGINT) has_magic (BOOLEAN)
1 Finn 15 false
2 Jake 7 true

VARCHAR column widening example: Second replication job

During the next replication job, Stitch extracts the records in the table below. In this example, thename column contains data that exceeds its previously known width:

id (BIGINT) name (STRING) age (BIGINT) has_magic (BOOLEAN)
3 PrincessBubblegumWhoIsTheCurrentIncarnationOfTheCandyElementalAndRulesOverTheCandyKingdomWasBornFromTheMotherGumAfterTheMushroomWarAndHasARivalWhoIsAVampireNamedMarcellineTheVampireQueen 16 true
4 BMO 11 false

VARCHAR column widening example: New table structure

How Stitch loads the data depends on the type destination being used. Click the tabs below to see how accommodating this data works for each destination.

When Stitch detects string data that exceeds its previous maximum width in Azure Synapse Analytics destinations, Stitch will automatically widen the column to accommodate the data.

In this example, thename column would have originally beenNVARCHAR(128). Stitch will widen the column toNVARCHAR(256).

The structure of thecustomers table would be altered to the following:

Column nameData type
id [pk] BIGINT
name NVARCHAR(256)
age BIGINT
has_magic BOOLEAN

When Stitch detects string data that exceeds its previous maximum width in BigQuery destinations, Stitch will not widen the column.

All string data is stored asTEXT in BigQuery destinations, which do not require setting a width.

The structure of thecustomers would remain unchanged:

Column nameData type
id [pk] BIGINT
name TEXT
age BIGINT
has_magic BOOLEAN

When Stitch detects string data that exceeds its previous maximum width in Panoply and Redshift destinations, Stitch will automatically widen the column to accommodate the data.

In this example, thename column would have originally beenVARCHAR(128). Stitch will widen the column toVARCHAR(256), which requires dropping and re-creating the column.

Note: Dropping and re-creating columns will affectviews dependent on the column.

The structure of thecustomers table would be altered to the following:

Column nameData type
id [pk] BIGINT
name VARCHAR(256)
age BIGINT
has_magic BOOLEAN

When Stitch detects string data that exceeds its previous maximum width in PostgreSQL destinations, Stitch will not widen the column.

All string data is stored asTEXT in PostgreSQL destinations, which do not require setting a width.

The structure of thecustomers would remain unchanged:

Column nameData type
id [pk] BIGINT
name TEXT
age BIGINT
has_magic BOOLEAN

When Stitch detects string data that exceeds its previous maximum width in Snowflake destinations, Stitch will automatically widen the column to accommodate the data.

In this example, thename column would have originally beenVARCHAR(128). Stitch will widen the column toVARCHAR(256).

The structure of thecustomers table would be altered to the following:

Column nameData type
id [pk] BIGINT
name VARCHAR(256)
age BIGINT
has_magic BOOLEAN

Columns with mixed data types

Stitch requires that there only be one data type per column to properly type, load, and store data. If a column contains multiple data types, Stitch will create additional columns and append the data type to the column name. See below for an example.

Mixed data types example: First replication job

During the first replication job, the following rows are extracted for thecustomers table:

id (BIGINT) name (STRING) age (BIGINT) has_magic (BOOLEAN)
1 Finn 15 false
2 Jake 7 true

Mixed data types example: Second replication job

During the next replication job, the following rows are extracted:

id (BIGINT) name (STRING) age (DOUBLE) has_magic (STRING)
3 Bubblegum 16.0 yes
4 BMO 11 false

Stitch will detect that the data types in these newly replicated rows differ than the ones from the initial replication job. In this case:

  • age was originally aBIGINT, but can sometimes be a decimal
  • has_magic was originally aBOOLEAN, but can sometimes be a string

Mixed data types example: New table structure

To accommodate the data, Stitch will create a new column for the newly detected data type and store the data for that data type in the new column.

How columns are named as a result of “splitting” mixed data types depends on the type of destination being used. Click the tabs below to see how accommodating this data works for each destination.

When mixed data types are detected in the same column in Azure Synapse Analytics destinations:

  • Original columns: The column name remains as-is. The column’s data type will be the first data type Stitch detected for the column. For example:age will only storeBIGINT data. If a different data type is detected forage, it will be stored in an additional column and this column will benull.
  • Additional columns: The column’s data type is appended to the column name. Only data of this type will be stored in this column. For example:age__fl will only store float/double data. If a different data type is detected forage__fl, it will be stored in the correctly typed column and this column will benull.

The structure of thecustomers table would be altered to the following:

Column nameData type
id [pk] BIGINT
name TEXT
age BIGINT
age__fl FLOAT
has_magic BOOLEAN
has_magic__st TEXT

When mixed data types are detected in the same column in BigQuery destinations:

  • Original columns: The column name remains as-is. The column’s data type will be the first data type Stitch detected for the column. For example:age will only storeBIGINT data. If a different data type is detected forage, it will be stored in an additional column and this column will benull.
  • Additional columns: The column’s data type is appended to the column name. Only data of this type will be stored in this column. For example:age__nu will only store numeric data. If a different data type is detected forage__nu, it will be stored in the correctly typed column and this column will benull.

The structure of thecustomers table would be altered to the following:

Column nameData type
id [pk] BIGINT
name TEXT
age BIGINT
age__nu NUMERIC
has_magic BOOLEAN
has_magic__st TEXT

When mixed data types are detected in the same column in Panoply and Redshift destinations:

  • Original columns: The column is re-named to include the original (first detected) data type and only data of the original type will be stored in this column. For example:age__bi will only storeBIGINT data. If a different data type is detected forage__bi, it will be stored in an additional column and this column will benull.
  • Additional columns: The column’s data type is appended to the column name. Only data of this type will be stored in this column. For example:age__do will only store double data. If a different data type is detected forage__do, it will be stored in the correctly typed column and this column will benull.

The structure of thecustomers table would be altered to the following:

Column nameData type
id [pk] BIGINT
name TEXT
age__bi BIGINT
age__do DOUBLE
has_magic__bo BOOLEAN
has_magic__st VARCHAR

When mixed data types are detected in the same column in PostgreSQL destinations:

  • Original columns: The column is re-named to include the original (first detected) data type and only data of the original type will be stored in this column. For example:age__bi will only storeBIGINT data. If a different data type is detected forage__bi, it will be stored in an additional column and this column will benull.
  • Additional columns: The column’s data type is appended to the column name. Only data of this type will be stored in this column. For example:age__do will only store double data. If a different data type is detected forage__do, it will be stored in the correctly typed column and this column will benull.

The structure of thecustomers table would be altered to the following:

Column nameData type
id [pk] BIGINT
name TEXT
age BIGINT
age__do DOUBLE
has_magic BOOLEAN
has_magic__st TEXT

When mixed data types are detected in the same column in Snowflake destinations:

  • Original columns: The column name remains as-is. The column’s data type will be the first data type Stitch detected for the column. For example:age will only storeBIGINT data. If a different data type is detected forage, it will be stored in an additional column and this column will benull.
  • Additional columns: The column’s data type is appended to the column name. Only data of this type will be stored in this column. For example:age__fl will only store numeric data. If a different data type is detected forage__fl, it will be stored in the correctly typed column and this column will benull.

The structure of thecustomers table would be altered to the following:

Column nameData type
ID [pk] BIGINT
name TEXT
age BIGINT
age__fl FLOAT
has_magic BOOLEAN
has_magic__st TEXT

Adding and removing columns

Note:This section is not applicable to Log-based Incremental Replication. This section only applies when Key-based Incremental or Full Table Replication is being used. Refer to theLog-based Incremental Replication guide for info about schema changes and Log-based Incremental Replication.
  • Adding columns: When a new column is added to a source table and selected for replication, Stitch will append the column to the end of the destination table.

    For Key-based Incremental tables, data for the column will be replicated onward from the savedReplication Key. DefaultNULLs will be placed in existing rows unless:

    1. A historical backfill in the source updated the records’ Replication Key values, or
    2. Atable-level reset is performed and a full re-replication is queued
  • Removing columns: When a column is removed in the source or de-selected from replication, Stitch will place defaultNULLs in the column going forward. Columns will not be removed from the destination.


RelatedTroubleshooting

Questions? Feedback?

Did this article help? If you have questions or feedback, feel free tosubmit a pull request with your suggestions,open an issue on GitHub, orreach out to us.


[8]ページ先頭

©2009-2026 Movatter.jp