Get started with a free trial today
Already have an account? Sign in
From time to time, Stitch will encounter data that can’t be loaded losslessly into the destination table in your destination. When this happens, Stitch may have to alter the structure of the table in order to successfully load the data.
Stitch may need to perform table alterations for several reasons, including:
VARCHAR data of varying widths,In this guide are examples of how Stitch will behave in each of these scenarios for each currently supported destination type.
The examples in this guide will use an example table namedcustomers to demonstrate Stitch’s behavior for each scenario.
Excludingthe_sdc columns, Stitch determines this is the structure of thecustomers table:
| Column name | Data type |
| id | BIGINT |
| name | STRING |
| age | BIGINT |
| has_magic | BOOLEAN |
To preserve your destination’s performance and reduce disk usage, Stitch uses the smallest possibleVARCHAR column when storing string data.
For example: If the maximum width of a string column across all records is currently 127, Stitch will type the destination column asVARCHAR(128).
As string data can vary in width, Stitch will take different actions to accommodate the data, depending on the destination in use. See below for an example.
During the first replication job, Stitch extracts the following records for thecustomers table:
| id (BIGINT) | name (STRING) | age (BIGINT) | has_magic (BOOLEAN) |
| 1 | Finn | 15 | false |
| 2 | Jake | 7 | true |
During the next replication job, Stitch extracts the records in the table below. In this example, thename column contains data that exceeds its previously known width:
| id (BIGINT) | name (STRING) | age (BIGINT) | has_magic (BOOLEAN) |
| 3 | PrincessBubblegumWhoIsTheCurrentIncarnationOfTheCandyElementalAndRulesOverTheCandyKingdomWasBornFromTheMotherGumAfterTheMushroomWarAndHasARivalWhoIsAVampireNamedMarcellineTheVampireQueen | 16 | true |
| 4 | BMO | 11 | false |
How Stitch loads the data depends on the type destination being used. Click the tabs below to see how accommodating this data works for each destination.
When Stitch detects string data that exceeds its previous maximum width in Azure Synapse Analytics destinations, Stitch will automatically widen the column to accommodate the data.
In this example, thename column would have originally beenNVARCHAR(128). Stitch will widen the column toNVARCHAR(256).
The structure of thecustomers table would be altered to the following:
| Column name | Data type |
| id [pk] | BIGINT |
| name | NVARCHAR(256) |
| age | BIGINT |
| has_magic | BOOLEAN |
When Stitch detects string data that exceeds its previous maximum width in BigQuery destinations, Stitch will not widen the column.
All string data is stored asTEXT in BigQuery destinations, which do not require setting a width.
The structure of thecustomers would remain unchanged:
| Column name | Data type |
| id [pk] | BIGINT |
| name | TEXT |
| age | BIGINT |
| has_magic | BOOLEAN |
When Stitch detects string data that exceeds its previous maximum width in Panoply and Redshift destinations, Stitch will automatically widen the column to accommodate the data.
In this example, thename column would have originally beenVARCHAR(128). Stitch will widen the column toVARCHAR(256), which requires dropping and re-creating the column.
Note: Dropping and re-creating columns will affectviews dependent on the column.
The structure of thecustomers table would be altered to the following:
| Column name | Data type |
| id [pk] | BIGINT |
| name | VARCHAR(256) |
| age | BIGINT |
| has_magic | BOOLEAN |
When Stitch detects string data that exceeds its previous maximum width in PostgreSQL destinations, Stitch will not widen the column.
All string data is stored asTEXT in PostgreSQL destinations, which do not require setting a width.
The structure of thecustomers would remain unchanged:
| Column name | Data type |
| id [pk] | BIGINT |
| name | TEXT |
| age | BIGINT |
| has_magic | BOOLEAN |
When Stitch detects string data that exceeds its previous maximum width in Snowflake destinations, Stitch will automatically widen the column to accommodate the data.
In this example, thename column would have originally beenVARCHAR(128). Stitch will widen the column toVARCHAR(256).
The structure of thecustomers table would be altered to the following:
| Column name | Data type |
| id [pk] | BIGINT |
| name | VARCHAR(256) |
| age | BIGINT |
| has_magic | BOOLEAN |
Stitch requires that there only be one data type per column to properly type, load, and store data. If a column contains multiple data types, Stitch will create additional columns and append the data type to the column name. See below for an example.
During the first replication job, the following rows are extracted for thecustomers table:
| id (BIGINT) | name (STRING) | age (BIGINT) | has_magic (BOOLEAN) |
| 1 | Finn | 15 | false |
| 2 | Jake | 7 | true |
During the next replication job, the following rows are extracted:
| id (BIGINT) | name (STRING) | age (DOUBLE) | has_magic (STRING) |
| 3 | Bubblegum | 16.0 | yes |
| 4 | BMO | 11 | false |
Stitch will detect that the data types in these newly replicated rows differ than the ones from the initial replication job. In this case:
age was originally aBIGINT, but can sometimes be a decimalhas_magic was originally aBOOLEAN, but can sometimes be a stringTo accommodate the data, Stitch will create a new column for the newly detected data type and store the data for that data type in the new column.
How columns are named as a result of “splitting” mixed data types depends on the type of destination being used. Click the tabs below to see how accommodating this data works for each destination.
When mixed data types are detected in the same column in Azure Synapse Analytics destinations:
age will only storeBIGINT data. If a different data type is detected forage, it will be stored in an additional column and this column will benull.age__fl will only store float/double data. If a different data type is detected forage__fl, it will be stored in the correctly typed column and this column will benull.The structure of thecustomers table would be altered to the following:
| Column name | Data type |
| id [pk] | BIGINT |
| name | TEXT |
| age | BIGINT |
| age__fl | FLOAT |
| has_magic | BOOLEAN |
| has_magic__st | TEXT |
When mixed data types are detected in the same column in BigQuery destinations:
age will only storeBIGINT data. If a different data type is detected forage, it will be stored in an additional column and this column will benull.age__nu will only store numeric data. If a different data type is detected forage__nu, it will be stored in the correctly typed column and this column will benull.The structure of thecustomers table would be altered to the following:
| Column name | Data type |
| id [pk] | BIGINT |
| name | TEXT |
| age | BIGINT |
| age__nu | NUMERIC |
| has_magic | BOOLEAN |
| has_magic__st | TEXT |
When mixed data types are detected in the same column in Panoply and Redshift destinations:
age__bi will only storeBIGINT data. If a different data type is detected forage__bi, it will be stored in an additional column and this column will benull.age__do will only store double data. If a different data type is detected forage__do, it will be stored in the correctly typed column and this column will benull.The structure of thecustomers table would be altered to the following:
| Column name | Data type |
| id [pk] | BIGINT |
| name | TEXT |
| age__bi | BIGINT |
| age__do | DOUBLE |
| has_magic__bo | BOOLEAN |
| has_magic__st | VARCHAR |
When mixed data types are detected in the same column in PostgreSQL destinations:
age__bi will only storeBIGINT data. If a different data type is detected forage__bi, it will be stored in an additional column and this column will benull.age__do will only store double data. If a different data type is detected forage__do, it will be stored in the correctly typed column and this column will benull.The structure of thecustomers table would be altered to the following:
| Column name | Data type |
| id [pk] | BIGINT |
| name | TEXT |
| age | BIGINT |
| age__do | DOUBLE |
| has_magic | BOOLEAN |
| has_magic__st | TEXT |
When mixed data types are detected in the same column in Snowflake destinations:
age will only storeBIGINT data. If a different data type is detected forage, it will be stored in an additional column and this column will benull.age__fl will only store numeric data. If a different data type is detected forage__fl, it will be stored in the correctly typed column and this column will benull.The structure of thecustomers table would be altered to the following:
| Column name | Data type |
| ID [pk] | BIGINT |
| name | TEXT |
| age | BIGINT |
| age__fl | FLOAT |
| has_magic | BOOLEAN |
| has_magic__st | TEXT |
Adding columns: When a new column is added to a source table and selected for replication, Stitch will append the column to the end of the destination table.
For Key-based Incremental tables, data for the column will be replicated onward from the savedReplication Key. DefaultNULLs will be placed in existing rows unless:
Removing columns: When a column is removed in the source or de-selected from replication, Stitch will place defaultNULLs in the column going forward. Columns will not be removed from the destination.
| Related | Troubleshooting |
Did this article help? If you have questions or feedback, feel free tosubmit a pull request with your suggestions,open an issue on GitHub, orreach out to us.