Load PostgreSQL data into BigQuery

Preview

This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

Note: To get support or provide feedback for this feature, contact dts-preview-support@google.com.

You can load data from PostgreSQL to BigQuery using the BigQuery Data Transfer Service for PostgreSQL connector. The connector supports PostgreSQL instances hosted in your on-premises environment, Cloud SQL, as well as other public cloud providers such as Amazon Web Services (AWS) and Microsoft Azure. With the BigQuery Data Transfer Service, you can schedule recurring transfer jobs that add your latest data from PostgreSQL to BigQuery.

Limitations

PostgreSQL data transfers are subject to the following limitations:

  • The maximum number of simultaneous transfer runs to a single PostgreSQL database is limited by the maximum number of concurrent connections that the PostgreSQL database supports. Keep the number of concurrent transfer runs below that connection limit (see the example query after this list).
  • A single transfer configuration supports only one data transfer run at a given time. If a second data transfer is scheduled to run before the first transfer completes, only the first data transfer completes, and any other data transfers that overlap with it are skipped.
    • To avoid skipped transfers within a single transfer configuration, we recommend that you increase the time between large data transfers by configuring the Repeat frequency.
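
For example, you can check the database's connection limit and current usage before deciding how many concurrent transfer runs to allow. This is a minimal sketch using psql; the host, database, and user values are illustrative and match the example used later on this page.

# Show the server-wide connection limit and how many connections are currently in use.
psql "host=192.168.0.1 port=5432 dbname=DB1 user=User1" \
  -c "SHOW max_connections;" \
  -c "SELECT count(*) AS connections_in_use FROM pg_stat_activity;"

Keep the number of concurrent transfer runs comfortably below the reported max_connections value, leaving headroom for your application's own connections.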

Data ingestion options

The following sections provide information about the data ingestion options when you set up a PostgreSQL data transfer.

TLS configuration

The PostgreSQL connector supports Transport Layer Security (TLS) configuration to encrypt your data transfers into BigQuery. The PostgreSQL connector supports the following TLS configurations (an optional psql pre-check example follows this list):

  • Encrypt data, and verify CA and hostname: This mode performs a full validation of the server using TLS. It encrypts all data in transit and verifies that the database server's certificate is signed by a trusted Certificate Authority (CA). This mode also checks that the hostname you're connecting to exactly matches the Common Name (CN) or a Subject Alternative Name (SAN) on the server's certificate. This mode prevents attackers from using a valid certificate for a different domain to impersonate your database server.
    • If your hostname does not match the certificate CN or SAN, the connection fails. You must configure DNS resolution to match the certificate or use a different security mode.
    • Use this mode for the most secure option to prevent person-in-the-middle (PITM) attacks.
  • Encrypt data, and verify CA only: This mode encrypts all data using TLS and verifies that the server's certificate is signed by a CA that the client trusts. However, this mode does not verify the server's hostname. This mode successfully connects as long as the certificate is valid and issued by a trusted CA, regardless of whether the hostname in the certificate matches the hostname you are connecting to.
    • Use this mode if you want to ensure that you are connecting to a server whose certificate is signed by a trusted CA, but the hostname is not verifiable or you don't have control over the hostname configuration.
  • Encryption only: This mode encrypts all data transferred between the client and the server. It does not perform any certificate or hostname validation.
    • This mode provides some level of security by protecting data in transit, but it can be vulnerable to PITM attacks.
    • Use this mode if you need to ensure all data is encrypted but can't or don't want to verify the server's identity. We recommend using this mode when working with private VPCs.
  • No encryption or verification: This mode does not encrypt any data and does not perform any certificate or hostname verification. All data is sent as plain text.
    • We don't recommend using this mode in an environment where sensitive data is handled.
    • We only recommend using this mode for testing purposes on an isolated network where security is not a concern.
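
If you want to sanity-check a TLS mode before you create the transfer, psql's sslmode settings correspond roughly to these options (verify-full is comparable to verifying CA and hostname, verify-ca to verifying CA only, require to encryption only, and disable to no encryption). The following is an illustrative check, not part of the transfer setup; the file name ca-chain.pem is a placeholder for your CA certificate chain.

# Test a full-verification TLS connection from any machine that can reach the database.
psql "host=192.168.0.1 port=5432 dbname=DB1 user=User1 sslmode=verify-full sslrootcert=ca-chain.pem" \
  -c "SELECT version();"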

Trusted Server Certificate (PEM)

If you are using either the Encrypt data, and verify CA and hostname mode or the Encrypt data, and verify CA only mode, then you can also provide one or more PEM-encoded certificates. These certificates are required in some scenarios where the BigQuery Data Transfer Service needs to verify the identity of your database server during the TLS connection:

  • If you are using a certificate signed by a private CA within your organization or a self-signed certificate, you must provide the full certificate chain or the single self-signed certificate. This is required for certificates issued by internal CAs of managed cloud provider services, such as the Amazon Relational Database Service (RDS).
  • If your database server certificate is signed by a public CA (for example, Let's Encrypt, DigiCert, or GlobalSign), you don't need to provide a certificate. The root certificates for these public CAs are pre-installed and trusted by the BigQuery Data Transfer Service.

You can provide PEM-encoded certificates in the Trusted PEM Certificate field when you create a PostgreSQL transfer configuration, with the following requirements (an example command for retrieving the server's certificate chain follows this list):

  • The certificate must be a valid PEM-encoded certificate chain.
  • The certificate must be entirely correct. Any missing certificates in the chain or incorrect content causes the TLS connection to fail.
  • For a single certificate, you can provide a single self-signed certificate from the database server.
  • For a full certificate chain issued by a private CA, you must provide the full chain of trust. This includes the certificate from the database server and any intermediate and root CA certificates.
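
One way to obtain the certificates that the server presents is to capture them with OpenSSL and save them as a PEM file. This is a sketch rather than an official procedure: it assumes OpenSSL 1.1.1 or later (for -starttls postgres), uses the example host from this page, and writes to a placeholder file name. For a private CA, your administrator might instead provide the CA chain directly.

# Capture the certificate chain presented by the PostgreSQL server and keep only the PEM blocks.
openssl s_client -starttls postgres -showcerts -connect 192.168.0.1:5432 </dev/null \
  | sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p' > server-chain.pem

You can then paste the contents of server-chain.pem into the Trusted PEM Certificate field, adding any missing intermediate or root CA certificates so that the chain is complete.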

Before you begin

Required roles

If you intend to set up transfer run notifications for Pub/Sub, ensure that you have the pubsub.topics.setIamPolicy Identity and Access Management (IAM) permission. Pub/Sub permissions are not required if you only set up email notifications. For more information, see BigQuery Data Transfer Service run notifications.

To get the permissions that you need to create a BigQuery Data Transfer Service data transfer, ask your administrator to grant you the BigQuery Admin (roles/bigquery.admin) IAM role on your project. For more information about granting roles, see Manage access to projects, folders, and organizations.

This predefined role contains the permissions required to create a BigQuery Data Transfer Service data transfer. To see the exact permissions that are required, expand the Required permissions section:

Required permissions

The following permissions are required to create a BigQuery Data Transfer Service data transfer:

  • BigQuery Data Transfer Service permissions:
    • bigquery.transfers.update
    • bigquery.transfers.get
  • BigQuery permissions:
    • bigquery.datasets.get
    • bigquery.datasets.getIamPolicy
    • bigquery.datasets.update
    • bigquery.datasets.setIamPolicy
    • bigquery.jobs.create

You might also be able to get these permissions with custom roles or other predefined roles.

For more information, see Grant bigquery.admin access.
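
For example, a project administrator could grant this role with the gcloud CLI; the project ID and principal shown are placeholders.

# Grant the BigQuery Admin role to the user who will create the transfer configuration.
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="user:USER_EMAIL" \
  --role="roles/bigquery.admin"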

Network connections

If a public IP address is not available for the PostgreSQL database connection, you must set up a network attachment.
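
As a rough sketch, you can create a network attachment with the gcloud CLI in a region and subnet that can reach your PostgreSQL instance; the attachment, region, and subnet names below are placeholders, and your environment may require additional firewall or routing configuration.

# Create a network attachment in the VPC subnet that has connectivity to the database.
gcloud compute network-attachments create postgres-attachment \
  --region=us-central1 \
  --connection-preference=ACCEPT_AUTOMATIC \
  --subnets=my-subnet

You then select this attachment in the transfer configuration, or pass it as the connector.networkAttachment parameter when you use the bq tool.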

For detailed instructions on the required network setup, refer to the following documents:

Set up a PostgreSQL data transfer

Add PostgreSQL data into BigQuery by setting up a transfer configuration using one of the following options:

Console

  1. Go to the Data transfers page.

    Go to Data transfers

  2. Click Create transfer.

  3. In the Source type section, for Source, select PostgreSQL.

  4. In the Transfer config name section, for Display name, enter a name for the transfer. The transfer name can be any value that lets you identify the transfer if you need to modify it later.

  5. In the Schedule options section, do the following:

    • Select a repeat frequency. If you select the Hours, Days (default), Weeks, or Months option, you must also specify a frequency. You can also select the Custom option to create a more specific repeat frequency. If you select the On-demand option, this data transfer only runs when you manually trigger the transfer.
    • If applicable, select either the Start now or Start at a set time option and provide a start date and run time.
  6. In the Destination settings section, for Dataset, select the dataset that you created to store your data, or click Create new dataset and create one to use as the destination dataset.

  7. In the Data source details section, do the following:

    • For Network attachment, select an existing network attachment or click Create Network Attachment. For more information, see the Network connections section of this document.
    • For Host, enter the hostname or IP address of the PostgreSQL database server.
    • For Port number, enter the port number for the PostgreSQL database server.
    • For Database name, enter the name of the PostgreSQL database.
    • For Username, enter the username of the PostgreSQL user initiating the PostgreSQL database connection.
    • For Password, enter the password of the PostgreSQL user initiating the PostgreSQL database connection.
    • For TLS Mode, select an option from the menu. For more information about TLS modes, see TLS configuration.
    • For Trusted PEM Certificate, enter the public certificate of the certificate authority (CA) that issued the TLS certificate of the database server. For more information, see Trusted Server Certificate (PEM).
    • For PostgreSQL objects to transfer, do one of the following:

      • Click Browse to select the PostgreSQL tables that are required for the transfer, and then click Select.
      • Manually enter the names of the tables in the PostgreSQL objects to transfer field.
  8. Optional: In the Notification options section, do the following:

    • To enable email notifications, click the Email notifications toggle to the on position. When you enable this option, the transfer administrator receives an email notification when a transfer run fails.
    • To configure Pub/Sub run notifications for your transfer, click the Pub/Sub notifications toggle to the on position. You can select your topic name or click Create a topic to create one.
  9. Click Save.

bq

Enter the bq mk command and supply the transfer creation flag --transfer_config:

bq mk \
  --transfer_config \
  --project_id=PROJECT_ID \
  --data_source=DATA_SOURCE \
  --display_name=DISPLAY_NAME \
  --target_dataset=DATASET \
  --params='PARAMETERS'

Replace the following:

  • PROJECT_ID (optional): your Google Cloud project ID. If the --project_id flag isn't supplied to specify a particular project, the default project is used.
  • DATA_SOURCE: the data source, which is postgresql.
  • DISPLAY_NAME: the display name for the data transfer configuration. The transfer name can be any value that lets you identify the transfer if you need to modify it later.
  • DATASET: the target dataset for the data transfer configuration.
  • PARAMETERS: the parameters for the created transfer configuration in JSON format. For example: --params='{"param":"param_value"}'. The following are the parameters for a PostgreSQL transfer:

    • connector.networkAttachment (optional): the name of the network attachment to connect to the PostgreSQL database.
    • connector.database: the name of the PostgreSQL database.
    • connector.endpoint.host: the hostname or IP address of the database.
    • connector.endpoint.port: the port number of the database.
    • connector.authentication.username: the username of the database user.
    • connector.authentication.password: the password of the database user.
    • connector.tls.mode: specify a TLS configuration to use with this transfer:
      • ENCRYPT_VERIFY_CA_AND_HOST to encrypt data, and verify CA and hostname
      • ENCRYPT_VERIFY_CA to encrypt data, and verify CA only
      • ENCRYPT_VERIFY_NONE for data encryption only
      • DISABLE for no encryption or verification
    • connector.tls.trustedServerCertificate (optional): provide one or more PEM-encoded certificates. Required only if connector.tls.mode is ENCRYPT_VERIFY_CA_AND_HOST or ENCRYPT_VERIFY_CA.
    • assets: a list of the names of the PostgreSQL tables to be transferred from the PostgreSQL database as part of the transfer.

For example, the following command creates a PostgreSQL transfer called My Transfer:

bq mk \
  --transfer_config \
  --target_dataset=mydataset \
  --data_source=postgresql \
  --display_name='My Transfer' \
  --params='{"assets":["DB1/PUBLIC/DEPARTMENT","DB1/PUBLIC/EMPLOYEES"],
      "connector.authentication.username": "User1",
      "connector.authentication.password": "ABC12345",
      "connector.database": "DB1",
      "connector.endpoint.host": "192.168.0.1",
      "connector.endpoint.port": 5432,
      "connector.tls.mode": "ENCRYPT_VERIFY_CA_AND_HOST",
      "connector.tls.trustedServerCertificate": "PEM-encoded certificate"}'
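
To confirm that the configuration was created, you can list your transfer configurations; adjust --transfer_location to the location of your destination dataset.

# List transfer configurations in the given location.
bq ls --transfer_config --transfer_location=us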

API

Use the projects.locations.transferConfigs.create method and supply an instance of the TransferConfig resource.
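
For example, a minimal request with curl might look like the following. This is an illustrative sketch rather than a complete reference: it assumes the us location, reuses the example values from the bq section, and omits optional fields such as schedule.

# Create a transfer configuration through the BigQuery Data Transfer Service REST API.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://bigquerydatatransfer.googleapis.com/v1/projects/PROJECT_ID/locations/us/transferConfigs" \
  -d '{
    "displayName": "My Transfer",
    "dataSourceId": "postgresql",
    "destinationDatasetId": "mydataset",
    "params": {
      "connector.database": "DB1",
      "connector.endpoint.host": "192.168.0.1",
      "connector.endpoint.port": 5432,
      "connector.authentication.username": "User1",
      "connector.authentication.password": "ABC12345",
      "connector.tls.mode": "ENCRYPT_VERIFY_CA_AND_HOST",
      "assets": ["DB1/PUBLIC/DEPARTMENT", "DB1/PUBLIC/EMPLOYEES"]
    }
  }'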

When you save the transfer configuration, the PostgreSQL connector automatically triggers a transfer run according to your schedule option. With every transfer run, the PostgreSQL connector transfers all available data from PostgreSQL into BigQuery.

To manually run a data transfer outside of your regular schedule, you can start a backfill run.
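
For example, with the bq tool you can schedule backfill runs over a past time window; the time values and the configuration resource name below are placeholders.

# Schedule backfill runs between START_TIME and END_TIME (RFC 3339 timestamps, for example 2025-01-01T00:00:00Z).
bq mk --transfer_run \
  --start_time='START_TIME' \
  --end_time='END_TIME' \
  projects/PROJECT_ID/locations/us/transferConfigs/CONFIG_ID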

Data type mapping

The following table maps PostgreSQL data types to the corresponding BigQuery data types, listed as PostgreSQL type: BigQuery type. An example command for inspecting a transferred table's schema follows the mapping.

  • array: STRING
  • bigint: INTEGER
  • bigserial: INTEGER
  • bit(n): STRING
  • bit varying(n): STRING
  • boolean: BOOLEAN
  • box: STRING
  • bytea: BYTES
  • character: STRING
  • character varying: STRING
  • cidr: STRING
  • circle: STRING
  • circularstring: STRING
  • compoundcurve: STRING
  • curvepolygon: STRING
  • date: DATE
  • double precision: FLOAT
  • enum: STRING
  • geometrycollection: STRING
  • inet: STRING
  • integer: INTEGER
  • interval: STRING
  • json: JSON
  • jsonb: JSON
  • line: STRING
  • linestring: STRING
  • lseg: STRING
  • macaddr: STRING
  • macaddr8: STRING
  • money: STRING
  • multicurve: STRING
  • multilinestring: STRING
  • multipoint: STRING
  • multipolygon: STRING
  • multisurface: STRING
  • numeric(precision, scale) / decimal(precision, scale): NUMERIC
  • path: STRING
  • point: STRING
  • polygon: STRING
  • polyhedralsurface: STRING
  • range: STRING
  • real: FLOAT
  • serial: INTEGER
  • smallint: INTEGER
  • smallserial: INTEGER
  • text: STRING
  • time [ (p) ] [ without time zone ]: TIMESTAMP
  • time [ (p) ] with time zone: TIMESTAMP
  • tin: STRING
  • timestamp [ (p) ] [ without time zone ]: TIMESTAMP
  • timestamp [ (p) ] with time zone: TIMESTAMP
  • triangle: STRING
  • tsquery: STRING
  • tsvector: STRING
  • uuid: STRING
  • xml: STRING
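
After a transfer run completes, you can check how the source columns were mapped by inspecting the destination table's schema; the table name below is illustrative.

# Print the destination table's schema to verify the mapped BigQuery types.
bq show --schema --format=prettyjson mydataset.EMPLOYEES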

Troubleshoot

If you are having issues setting up your data transfer, see PostgreSQL transfer issues.

What's next
