Migrate from AWS to Google Cloud: Migrate from Amazon RDS and Amazon Aurora for PostgreSQL to Cloud SQL and AlloyDB for PostgreSQL

Google Cloud provides tools, products, guidance, and professional services to migrate from Amazon Relational Database Service (RDS) or Amazon Aurora to Cloud SQL for PostgreSQL or AlloyDB for PostgreSQL.

This document is intended for cloud and database administrators who want to plan, implement, and validate a database migration project. It's also intended for decision makers who are evaluating the opportunity to migrate and want an example of what a migration might look like.

This document focuses on a homogeneous database migration, which is a migration where the source and destination databases are the same database technology. In this migration guide, the source is Amazon RDS or Amazon Aurora for PostgreSQL, and the destination is Cloud SQL for PostgreSQL or AlloyDB for PostgreSQL.

This document is part of a multi-part series about migrating from AWS to Google Cloud that includes the following documents:

For this migration to Google Cloud, we recommend that you follow the migration framework described in Migrate to Google Cloud: Get started.

The following diagram illustrates the path of your migration journey.

Migration path with four phases.

You might migrate from your source environment to Google Cloud in a series of iterations. For example, you might migrate some workloads first and others later. For each separate migration iteration, you follow the phases of the general migration framework:

  1. Assess and discover your workloads and data.
  2. Plan and build a foundation on Google Cloud.
  3. Migrate your workloads and data to Google Cloud.
  4. Optimize your Google Cloud environment.

For more information about the phases of this framework, see Migrate to Google Cloud: Get started.

To design an effective migration plan, we recommend that you validate each step of the plan, and ensure that you have a rollback strategy. To help you validate your migration plan, see Migrate to Google Cloud: Best practices for validating a migration plan.

Assess the source environment

In the assessment phase, you determine the requirements and dependencies to migrate your source environment to Google Cloud.

The assessment phase is crucial for the success of your migration. You need to gain deep knowledge about the workloads you want to migrate, their requirements and dependencies, and your current environment. You need to understand your starting point to successfully plan and execute a Google Cloud migration.

The assessment phase consists of the following tasks:

  1. Build a comprehensive inventory of your workloads.
  2. Catalog your workloads according to their properties and dependencies.
  3. Train and educate your teams on Google Cloud.
  4. Build experiments and proofs of concept on Google Cloud.
  5. Calculate the total cost of ownership (TCO) of the target environment.
  6. Choose the migration strategy for your workloads.
  7. Choose your migration tools.
  8. Define the migration plan and timeline.
  9. Validate your migration plan.

The database assessment phase helps you choose the size and specifications of your target Cloud SQL database instance that matches the source for similar performance needs. Pay special attention to disk size and throughput, IOPS, and number of vCPUs. Migrations might struggle or fail due to incorrect target database instance sizing. Incorrect sizing can lead to long migration times, database performance problems, database errors, and application performance problems. When deciding on the Cloud SQL instance, keep in mind that disk performance is based on the disk size and the number of vCPUs.

The following sections rely on Migrate to Google Cloud: Assess and discover your workloads, and integrate the information in that document.

Build an inventory of your Amazon RDS or Amazon Aurora instances

To define the scope of your migration, you create an inventory and collect information about your Amazon RDS and Amazon Aurora instances. Ideally, this should be an automated process, because manual approaches are prone to error and can lead to incorrect assumptions.

Amazon RDS or Amazon Aurora and Cloud SQL for PostgreSQL or AlloyDB for PostgreSQL might not have similar features, instance specifications, or operation. Some functionalities might be implemented differently or be unavailable. Areas of difference might include infrastructure, storage, authentication and security, replication, backup, high availability, resource capacity model, specific database engine feature integrations, and extensions. Depending on the database engine type, instance size, and architecture, there are also differences in the default values of database parameter settings.

Benchmarking can help you to better understand the workloads that are to be migrated, and contributes to defining the right architecture of the migration target environment. Collecting information about performance is important to help estimate the performance needs of the Google Cloud target environment. Benchmarking concepts and tools are detailed in the Perform testing and validation step of the migration process, but they also apply to the inventory building stage.

Tools for assessments

For an initial overview assessment of your current infrastructure, we recommend that you use Google Cloud Migration Center along with other specialized database assessment tools, such as migVisor and the Database Migration Assessment Tool (DMA).

With Migration Center, you can perform a complete assessment of your application and database landscape, including the technical fit for a database migration to Google Cloud. You receive size and configuration recommendations for each source database, and can create a total cost of ownership (TCO) report for servers and databases.

For more information about assessing your AWS environment by using Migration Center, see Import data from other cloud providers.

In addition to Migration Center, you can use the specialized tool migVisor. migVisor supports a variety of database engines and is particularly suitable for heterogeneous migrations. For an introduction to migVisor, see the migVisor overview.

migVisor can identify artifacts and incompatible proprietary database features that can block the migration, and can point to workarounds. migVisor can also recommend a target Google Cloud solution, including initial sizing and architecture.

The migVisor database assessment output provides the following:

  • Comprehensive discovery and analysis of current database deployments.
  • Detailed report of migration complexity, based on the proprietary features used by your database.
  • Financial report with details on cost savings post migration, migration costs, and new operating budget.
  • Phased migration plan to move databases and associated applications with minimal disruption to the business.

To see some examples of assessment outputs, see migVisor - Cloud migration assessment tool.

Note that migVisor temporarily increases database server utilization. Typically, this additional load is less than 3%, and the assessment can be run during non-peak hours.

The migVisor assessment output helps you to build a complete inventory of your RDS instances. The report includes generic properties (database engine version and edition, CPUs, and memory size), as well as details about database topology, backup policies, parameter settings, and special customizations in use.

If you prefer to use open source tools, you can use data collector scripts with (or instead of) the mentioned tools. These scripts can help you collect detailed information (about workloads, features, database objects, and database code) and build your database inventory. The scripts also usually provide a detailed database migration assessment, including a migration effort estimation.

We recommend the open source tool DMA, which was built by Google engineers. It offers a complete and accurate database assessment, including features in use, database logic, and database objects (including schemas, tables, views, functions, triggers, and stored procedures).

To use DMA, download the collection scripts for your database engine from the Git repository, and follow the instructions. Send the output files to Google Cloud for analysis. Google Cloud creates and delivers a database assessment readout, and provides the next steps in the migration journey.

Identify and document the migration scope and affordable downtime

At this stage, you document essential information that influences your migration strategy and tooling. By now, you can answer the following questions:

  • Are your databases larger than 5 TB?
  • Are there any large tables in your database? Are they larger than 16 TB?
  • Where are the databases located (regions and zones), and what's their proximity to applications?
  • How often does the data change?
  • What is your data consistency model?
  • What are the options for destination databases?
  • How compatible are the source and destination databases?
  • Does the data need to reside in some physical locations?
  • Is there data that can be compressed and archived, or is there data that doesn't need migration at all?

To define the migration scope, decide what data to keep and what to migrate. Migrating all your databases might take considerable time and effort. Some data might remain in your source database backups. For example, old logging tables or archival data might not be needed. Alternatively, you might decide to move data after the migration process, depending on your strategy and tools.

Establish data migration baselines that help you compare and evaluate your outcomes and impacts. These baselines are reference points that represent the state of your data before and after the migration and help you make decisions. It's important to take measurements on the source environment that can help you evaluate your data migration's success. Such measurements include the following:

  • The size and structure of your data.
  • The completeness and consistency of your data.
  • The duration and performance of the most important business transactions and processes.

Determine how much downtime you can afford. What are the business impacts of downtime? Are there periods of low database activity, during which fewer users are affected by downtime? If so, how long are such periods and when do they occur? Consider having a partial downtime that affects only writes, while read-only requests are still served.

Assess your deployment and administration process

After you build the inventories, assess the operational and deployment processes for your database to determine how they need to be adapted to facilitate your migration. These processes are fundamental to how you prepare and maintain your production environment.

Consider how you complete the following tasks:

  • Define and enforce security policies for your instances. For example, you might need to replace Amazon Security Groups. You can use Google IAM roles, VPC firewall rules, and VPC Service Controls to control access to your Cloud SQL for PostgreSQL instances and constrain the data within a VPC.

  • Patch and configure your instances. Your existing deployment tools might need to be updated. For example, you might be using custom configuration settings in Amazon parameter groups or Amazon option groups. Your provisioning tools might need to be adapted to use Cloud Storage or Secret Manager to read such custom configuration lists (see the sketch after this list).

  • Manage monitoring and alerting infrastructure. Metric categories for your Amazon source database instances provide observability during the migration process. Metric categories might include Amazon CloudWatch, Performance Insights, Enhanced Monitoring, and OS process lists.
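The following is a minimal sketch of how tasks like these might translate to Google Cloud tooling. The instance name, flag values, and IP range are illustrative assumptions, not values from your environment:

```bash
# Replace Amazon parameter group settings with Cloud SQL database flags.
# Note: --database-flags replaces the full set of flags on the instance.
gcloud sql instances patch target-postgres-instance \
  --database-flags=max_connections=500,log_min_duration_statement=1000

# Restrict network access, similar in intent to an Amazon security group rule.
gcloud sql instances patch target-postgres-instance \
  --authorized-networks=203.0.113.0/24
```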

Complete the assessment

After you build the inventories from your Amazon RDS or Amazon Aurora environment, complete the rest of the activities of the assessment phase as described in Migrate to Google Cloud: Assess and discover your workloads.

Plan and build your foundation

In the plan and build phase, you provision and configure the infrastructure to do the following:

  • Support your workloads in your Google Cloud environment.
  • Connect your source environment and your Google Cloud environment to complete the migration.

The plan and build phase is composed of the following tasks:

  1. Build a resource hierarchy.
  2. Configure Google Cloud's Identity and Access Management (IAM).
  3. Set up billing.
  4. Set up network connectivity.
  5. Harden your security.
  6. Set up logging, monitoring, and alerting.

For more information about each of these tasks, see Migrate to Google Cloud: Plan and build your foundation.

If you plan to use Database Migration Service for migration, see Networking methods for Cloud SQL for PostgreSQL to understand the networking configurations available for migration scenarios.

Monitoring and alerting

Use Google Cloud Monitoring, which includes predefined dashboards for several Google Cloud products, including a Cloud SQL monitoring dashboard. Alternatively, you can consider using third-party monitoring solutions that are integrated with Google Cloud, like Datadog and Splunk. For more information, see About database observability.

Migrate Amazon RDS and Amazon Aurora for PostgreSQL instances to Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL

To migrate your instances, you do the following:

  1. Choose the migration strategy: continuous replication or scheduled maintenance.

  2. Choose the migration tools, depending on your chosen strategy and requirements.

  3. Define the migration plan and timeline for each database migration, including preparation and execution tasks.

  4. Define the preparation tasks that must be done to ensure the migration tool can work properly.

  5. Define the execution tasks, which include work activities that implement the migration.

  6. Define fallback scenarios for each execution task.

  7. Perform testing and validation, which can be done in a separate staging environment.

  8. Perform the migration.

  9. Perform the production cut-over.

  10. Clean up the source environment and configure the target instance.

  11. Perform tuning and optimization.

Each phase is described in the following sections.

Choose your migration strategy

At this step, you have enough information to evaluate and select one of the following migration strategies that best suits your use case for each database:

  • Scheduled maintenance (also called one-time migration or big bang migration): This approach is ideal if you can afford downtime. This option is relatively lower in cost and complexity, because your workloads and services won't require much refactoring. However, if the migration fails before completion, you have to restart the process, which prolongs the downtime.
  • Continuous replication (also called online migration or trickle migration): For mission-critical databases, this option offers a lower risk of data loss and near-zero downtime. The efforts are split into several chunks, so if a failure occurs, rollback and repeat takes less time. However, the setup is more complex and takes more planning and time. Additional effort is also required to refactor the applications that connect to your database instances. Consider one of the following variations:

    • Using the Y (writing and reading) approach, which is a form of parallel migration that duplicates data in both source and destination instances during the migration.
    • Using a data-access microservice, which reduces the refactoring effort required by the Y (writing and reading) approach.

For more information about data migration strategies, see Evaluating data migration approaches.

The following diagram shows a flowchart based on example questions that you might have when deciding the migration strategy for a single database:

Flowchart to help you choose the migration strategy.

The preceding flowchart shows the following decision points:

  • Can you afford any downtime?

    • If no, adopt the continuous replication migration strategy.
    • If yes, continue to the next decision point.
  • Can you afford the downtime represented by the cut-over window while migrating data? The cut-over window represents the amount of time it takes to take a backup of the database, transfer it to Cloud SQL, restore it, and then switch over your applications.

    • If no, adopt the continuous replication migration strategy.
    • If yes, adopt the scheduled maintenance migration strategy.

Strategies might vary for different databases, even when they're located on the same instance. A mix of strategies can produce optimal results. For example, migrate small and infrequently used databases by using the scheduled maintenance approach, but use continuous replication for mission-critical databases where having downtime is expensive.

Usually, a migration is considered completed when the switch between the initial source instance and the target instance takes place. Any replication (if used) is stopped, and all reads and writes are done on the target instance. Switching when both instances are in sync means no data loss and minimal downtime.

For more information about data migration strategies and deployments, see Classification of database migrations.

Minimize downtime and migration-related impacts

Migration configurations that provide no application downtime require a more complicated setup. Find the right balance between setup complexity and downtime scheduled during low-traffic business hours.

Each migration strategy has a tradeoff and some impact associated with the migration process. For example, replication processes involve some additional load on your source instances, and your applications might be affected by replication lag. Applications (and customers) might have to wait during application downtime, at least as long as the replication lag lasts, before using the new database. In practice, the following factors might increase downtime:

  • Database queries can take a few seconds to complete. At the time of migration, in-flight queries might be aborted.
  • The cache might take some time to fill up if the database is large or has a substantial buffer memory.
  • Applications stopped in the source and restarted in Google Cloud might have a small lag until the connection to the Google Cloud database instance is established.
  • Network routes to the applications must be rerouted. Depending on how DNS entries are set up, this can take some time. When you update DNS records, reduce the TTL before the migration.

The following common practices can help minimize downtime and impact:

  • Find a time period when downtime would have a minimal impact on your workloads. For example, outside normal business hours, during weekends, or other scheduled maintenance windows.
  • Identify parts of your workloads that can undergo migration and production cut-over at different stages. Your applications might have different components that can be isolated, adapted, and migrated with no impact. For example, frontends, CRM modules, backend services, and reporting platforms. Such modules could have their own databases that can be scheduled for migration earlier or later in the process.
  • If you can afford some latency on the target database, consider implementing a gradual migration. Use incremental, batched data transfers, and adapt part of your workloads to work with the stale data on the target instance.
  • Consider refactoring your applications to support minimal migration impact. For example, adapt your applications to write to both source and target databases, and therefore implement an application-level replication.

Choose your migration tools

The most important factor for a successful migration is choosing the right migration tool. After you decide on the migration strategy, review and choose the migration tool.

There are many tools available, each optimized for certain migration use cases. Use cases can include the following:

  • Migration strategy (scheduled maintenance or continuous replication).
  • Source and target database engines and engine versions.
  • Environments in which source instances and target instances are located.
  • Database size. The larger the database, the more time it takes to migrate the initial backup.
  • Frequency of the database changes.
  • Availability to use managed services for migration.

To ensure a seamless migration and cut-over, you can use application deployment patterns, infrastructure orchestration, and custom migration applications. However, specialized tools called managed migration services can facilitate the process of moving data, applications, or even entire infrastructures from one environment to another. They run the data extraction from the source databases, securely transport data to the target databases, and can optionally modify the data during transit. With these capabilities, they encapsulate the complex logic of migration and offer migration monitoring capabilities.

Managed migration services provide the following advantages:

  • Minimize downtime: Services use the built-in replication capabilities of the database engines when available, and perform replica promotion.

  • Ensure data integrity and security: Data is securely and reliably transferred from the source to the destination database.

  • Provide a consistent migration experience: Different migration techniques and patterns can be consolidated into a consistent, common interface by using database migration executables, which you can manage and monitor.

  • Offer resilient and proven migration models: Database migrations are infrequent but critical events. To avoid beginner mistakes and issues with existing solutions, you can use tools from experienced experts, rather than building a custom solution.

  • Optimize costs: Managed migration services can be more cost effective than provisioning additional servers and resources for custom migration solutions.

The next sections describe the migration tool recommendations, which depend on the chosen migration strategy.

Tools for scheduled maintenance migrations

The following subsections describe the tools that can be used for one-time migrations, along with the limitations and best practices of each tool.

For one-time migrations to Cloud SQL for PostgreSQL or to AlloyDB for PostgreSQL, we recommend that you use database engine backups to both export the data from your source instance and import that data into your target instance. One-time migration jobs are not supported in Database Migration Service.

Built-in database engine backups

When significant downtime is acceptable, and your source databases are relatively static, you can use the database engine's built-in dump and load (also sometimes called backup and restore) capabilities.

Some effort is required for setup and synchronization, especially for a large number of databases, but database engine backups are usually readily available and straightforward to use. This approach is suitable for any database size, and it's usually more effective than other tools for large instances.

Database engine backups have the following general limitations:

  • Backups can be error prone, particularly if performed manually.
  • Data can be unsecured if the snapshots are not properly managed.
  • Backups lack proper monitoring capabilities.
  • Backups require effort to scale when migrating many databases.

You can use the PostgreSQL built-in backup utilities, pg_dump and pg_dumpall, to migrate to both Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL. However, the pg_dump and pg_dumpall utilities have the following general limitations:

  • The built-in backup utilities should be used to migrate databases that are 500 GB in size or less. Dumping and restoring large databases can take a long time and might require substantial disk space and memory.
  • The pg_dump utility doesn't include users and roles. To migrate these user accounts and roles, you can use the pg_dumpall utility.
  • Cloud Storage supports a maximum single-object size of up to 5 terabytes (TB). If you have databases larger than 5 TB, the export operation to Cloud Storage fails. In this case, you need to break your export files into smaller segments.

If you choose to use these utilities, consider the following restrictions and best practices (see the sketch after this list):

  • Compress data to reduce cost and transfer duration. Use the --jobs option with the pg_dump command to run a given number of dump jobs simultaneously.
  • Use the -Z option with the pg_dump command to specify the compression level to use. Acceptable values for this option range from 0 to 9. The default compression level is 6. Higher values decrease the size of the dump file, but can cause high resource consumption on the client host. If the client host has enough resources, higher compression levels can further lower the dump file size.
  • Use the correct flags when you create a SQL dump file.
  • Verify the imported database. Monitor the output of the pg_restore utility for any error messages during the restore process. Review the PostgreSQL logs for any warnings or errors during the restore process. Run basic queries and table counts to verify your database integrity.
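The following is a minimal sketch of a dump-and-restore run that applies these practices. The host names, database name, file paths, and parallelism values are illustrative assumptions:

```bash
# Parallel, compressed dump in directory format (directory format is required for --jobs).
pg_dump --host=source-rds.example.com --username=admin \
  --format=directory --jobs=4 --compress=6 \
  --file=/backup/mydb_dump mydb

# pg_dump doesn't include users and roles; export them with pg_dumpall.
pg_dumpall --host=source-rds.example.com --username=admin \
  --globals-only --file=/backup/globals.sql

# Restore the roles first, then the database, into the target instance in parallel.
psql --host=TARGET_IP --username=postgres --dbname=postgres \
  --file=/backup/globals.sql
pg_restore --host=TARGET_IP --username=postgres \
  --dbname=mydb --jobs=4 /backup/mydb_dump
```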

For further reading about limitations and best practices, see the following resources:

Other approaches for scheduled maintenance migration

Using approaches other than the built-in backup utilities might provide more control and flexibility in your scheduled maintenance migration. These other types of approaches let you perform transformations, checks, or other operations on your data while doing the migration. You can consolidate multiple instances or databases into a single instance or database. Consolidating instances can help mitigate operational costs and ease your scalability issues.

One such alternative to backup utilities is to use flat files to export and import your data. For more information about flat file import, see Export and import using CSV files for Cloud SQL for PostgreSQL.

Another alternative is to use a managed import to set up replication from an external PostgreSQL database. When you use a managed import, there is an initial data load from the external database into the Cloud SQL for PostgreSQL instance. This initial load uses a service that extracts data from the external server (the RDS or Aurora instance) and imports it into the Cloud SQL for PostgreSQL instance directly. For more information, see Use a managed import to set up replication from external databases.

An alternative way to do a one-time migration of your data is to export the tables from your source PostgreSQL database to CSV or SQL files. You can then import the CSV or SQL files into Cloud SQL for PostgreSQL or AlloyDB for PostgreSQL. To export the data from your source instance, you can use the aws_s3 extension for PostgreSQL. Alternatively, you can use AWS Database Migration Service and an S3 bucket as a target. For detailed information about this approach, see the following resources:

You can also manually import data into an AlloyDB for PostgreSQL instance. The supported formats are as follows (a sketch of each import path follows this list):

  • CSV: Each file in this format contains the contents of one table in the database. You can load the data from a CSV file by using the psql command-line program. For more information, see Import a CSV file.
  • DMP: This file format contains the archive of an entire PostgreSQL database. You import data from this file by using the pg_restore utility. For more information, see Import a DMP file.
  • SQL: This file format contains the text reconstruction of a PostgreSQL database. The data in this file is processed by using the psql command line. For more information, see Import a SQL file.
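As an illustration, the following sketch shows one way each format might be imported with the standard PostgreSQL client tools. The host, database, table, and file names are assumptions:

```bash
# SQL file: replay the plain-text SQL with psql.
psql --host=ALLOYDB_IP --username=postgres --dbname=mydb --file=mydb.sql

# CSV file: load one table with the \copy meta-command.
psql --host=ALLOYDB_IP --username=postgres --dbname=mydb \
  --command="\copy public.orders FROM 'orders.csv' WITH (FORMAT csv, HEADER true)"

# DMP file (custom-format archive): restore with pg_restore.
pg_restore --host=ALLOYDB_IP --username=postgres --dbname=mydb orders.dmp
```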

Tools for continuous replication migrations

The following diagram shows a flowchart with questions that can help you choose the migration tool for a single database when you use a continuous replication migration strategy:

Flowchart to help you choose a tool for continuous replication migrations.

The preceding flowchart shows the following decision points:

  • Do you prefer to use managed migration services?

    • If yes, can you afford a few seconds of write downtime while the replication step takes place?

      • If yes, use Database Migration Service.
      • If no, explore other migration options.
    • If no, you must evaluate whether database engine built-in replication is supported:

      • If yes, we recommend that you use built-in replication.
      • If no, we recommend that you explore other migration options.

The following sections describe the tools that can be used for continuous migrations, along with their limitations and best practices.

Database Migration Service for continuous replication migration

Database Migration Service provides a specific job type for continuous migrations. These continuous migration jobs support high-fidelity migrations to Cloud SQL for PostgreSQL and to AlloyDB for PostgreSQL.

When you use Database Migration Service to migrate to Cloud SQL for PostgreSQL or AlloyDB for PostgreSQL, there are prerequisites and limitations that are associated with each target instance:

Database engine built-in replication

Database engine built-in replication is an alternative option to Database Migration Service for a continuous migration.

Database replication is the process of copying and distributing data from a database, called the primary database, to other databases, called replicas. It's intended to increase data accessibility and improve the fault tolerance and reliability of a database system. Although database migration is not the primary purpose of database replication, it can be successfully used as a tool to achieve this goal. Database replication is usually an ongoing process that occurs in real time as data is inserted, updated, or deleted in the primary database. It can also be done as a one-time operation, or as a sequence of batch operations.

Most modern database engines implement different ways of achieving database replication. One type of replication can be achieved by copying and sending the write-ahead log or transaction log of the primary to its replicas. This approach is called physical or binary replication. Other replication types work by distributing the raw SQL statements that a primary database receives, instead of replicating file-system-level changes. These replication types are called logical replication. For PostgreSQL, there are also third-party extensions, such as pglogical, that implement logical replication relying on write-ahead logging (WAL).

Cloud SQL supports replication for PostgreSQL. However, there are some prerequisites and limitations.

The following are limitations and prerequisites for Cloud SQL for PostgreSQL:

  • The following Amazon versions are supported:

    • Amazon RDS 9.6.10 and later, 10.5 and later, 11.1 and later, 12, 13, and 14
    • Amazon Aurora 10.11 and later, 11.6 and later, 12.4 and later, and 13.3 and later
  • The firewall of the external server must be configured to allow the internal IP range that has been allocated for the private services access of the VPC network. The Cloud SQL for PostgreSQL replica database uses the VPC network as its private network.

  • The firewall of the source database must be configured to allow the entire internal IP range that has been allocated for the private service connection of the VPC network. The Cloud SQL for PostgreSQL destination instance uses this VPC network in the privateNetwork field of its IpConfiguration setting.

  • When you install the pglogical extension on a Cloud SQL for PostgreSQL instance, make sure that you have set the enable_pglogical flag to on (for example, cloudsql.enable_pglogical=on).

  • Make sure that the shared_preload_libraries parameter includes the pglogical extension on your source instance (for example, shared_preload_libraries=pg_stat_statements,pglogical). Set the rds.logical_replication parameter to 1. This setting enables write-ahead logs at the logical level. (See the sketch after this list.)

    These changes require a restart of the primary instance.
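As a rough sketch of these prerequisites, the following commands show one way the settings might be applied. The parameter group and instance names are assumptions, and your parameter group layout might differ:

```bash
# On the source Amazon RDS for PostgreSQL instance: enable logical decoding
# through its DB parameter group. The change applies after a restart.
aws rds modify-db-parameter-group \
  --db-parameter-group-name source-pg14-params \
  --parameters "ParameterName=rds.logical_replication,ParameterValue=1,ApplyMethod=pending-reboot"

# On the Cloud SQL for PostgreSQL destination: enable the pglogical flag.
gcloud sql instances patch target-postgres-instance \
  --database-flags=cloudsql.enable_pglogical=on
```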

For more information about using Cloud SQL for PostgreSQL for replication, see the external server checklist in the replication section for PostgreSQL, and also Set up logical replication and decoding for PostgreSQL in the Cloud SQL official documentation.

AlloyDB for PostgreSQL supports replication through the pglogical extension. To enable the pglogical extension for replication, you can use the CREATE EXTENSION command. Before using that command, you must first set the database flags alloydb.enable_pglogical and alloydb.logical_decoding to on in the AlloyDB for PostgreSQL instance where you want to use the extension. Setting these flags requires a restart of the instance.
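A minimal sketch of that AlloyDB for PostgreSQL preparation might look like the following. The cluster, region, instance, database names, and host are illustrative assumptions:

```bash
# Set the flags that pglogical requires; this restarts the instance.
gcloud alloydb instances update target-primary \
  --cluster=target-cluster --region=us-central1 \
  --database-flags=alloydb.enable_pglogical=on,alloydb.logical_decoding=on

# After the restart, create the extension in the target database.
psql --host=ALLOYDB_IP --username=postgres --dbname=mydb \
  --command="CREATE EXTENSION IF NOT EXISTS pglogical;"
```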

Other approaches for continuous replication migration

You can use Datastream to replicate near real-time changes from your source PostgreSQL instance. Datastream uses change data capture (CDC) and replication to replicate and synchronize data, so you can use it for continuous replication from Amazon RDS and Amazon Aurora. With Datastream, you replicate changes from your PostgreSQL instance to either BigQuery or Cloud Storage. That replicated data can then be brought into your Cloud SQL for PostgreSQL or AlloyDB for PostgreSQL instance with Dataflow, by using a Dataflow Flex Template, or with Dataproc.

For more information about using Datastream and Dataflow, see the following resources:

Third-party tools for continuous replication migration

In some cases, it might be better to use a third-party tool that works across most database engines. Such cases might be if you prefer to use a managed migration service and you need to ensure that the target database is always in near-real-time sync with the source, or if you need more complex transformations like data cleaning, restructuring, and adaptation during the migration process.

If you decide to use a third-party tool, choose one of the following recommendations, which you can use for most database engines.

Striim is an end-to-end, in-memory platform for collecting, filtering, transforming, enriching, aggregating, analyzing, and delivering data in real time:

  • Advantages:

    • Handles large data volumes and complex migrations.
    • Built-in change data capture for SQL Server.
    • Preconfigured connection templates and no-code pipelines.
    • Able to handle mission-critical, large databases that operate under heavy transactional load.
    • Exactly-once delivery.
  • Disadvantages:

For more information about Striim, see Running Striim in the Google Cloud.

Debezium is an open source distributed platform for CDC, and can stream data changes to external subscribers:

  • Advantages:

    • Open source.
    • Strong community support.
    • Cost effective.
    • Fine-grained control on rows, tables, or databases.
    • Specialized for change capture in real time from database transaction logs.
  • Disadvantages:

    • Requires specific experience with Kafka and ZooKeeper.
    • At-least-once delivery of data changes, which means that you need to handle duplicates.
    • Manual monitoring setup using Grafana and Prometheus.
    • No support for incremental batch replication.

For more information about Debezium migrations, see Near Real Time Data Replication using Debezium.

Fivetran is an automated data movement platform for moving data out of and across cloud data platforms.

  • Advantages:

    • Preconfigured connection templates and no-code pipelines.
    • Propagates any schema changes from your source to the target database.
    • Exactly-once delivery of your data changes, which means that you don't need to handle duplicates.
  • Disadvantages:

    • Not open source.
    • Support for complex data transformation is limited.

Define the migration plan and timeline

For a successful database migration and production cut-over, we recommend that you prepare a well-defined, comprehensive migration plan. To help reduce the impact on your business, we recommend that you create a list of all the necessary work items.

Defining the migration scope reveals the work tasks that you must do before, during, and after the database migration process. For example, if you decide not to migrate certain tables from a database, you might need pre-migration or post-migration tasks to implement this filtering. You also ensure that your database migration doesn't affect your existing service-level agreement (SLA) and business continuity plan.

We recommend that your migration planning documentation include the following documents:

  • Technical design document (TDD)
  • RACI matrix
  • Timeline (such as a T-Minus plan)

Database migrations are an iterative process, and first migrations are often slower than later ones. Usually, well-planned migrations run without issues, but unplanned issues can still arise. We recommend that you always have a rollback plan. As a best practice, follow the guidance from Migrate to Google Cloud: Best practices for validating a migration plan.

TDD

The TDD documents all technical decisions to be made for the project. Include the following in the TDD:

  • Business requirements and criticality
  • Recovery time objective (RTO)
  • Recovery point objective (RPO)
  • Database migration details
  • Migration effort estimates
  • Migration validation recommendations

RACI matrix

Some migration projects require a RACI matrix, which is a common project management document that defines which individuals or groups are responsible for tasks and deliverables within the migration project.

Timeline

Prepare a timeline for each database that needs to be migrated. Include all work tasks that must be performed, and define start dates and estimated end dates.

For each migration environment, we recommend that you create a T-minus plan. A T-minus plan is structured as a countdown schedule, and lists all the tasks required to complete the migration project, along with the responsible groups and estimated duration.

The timeline should account not only for the execution of pre-migration preparation tasks, but also for the validation, auditing, and testing tasks that happen after the data transfer takes place.

The duration of migration tasks typically depends on database size, but there are also other aspects to consider, like business logic complexity, application usage, and team availability.

A T-Minus plan might look like the following:

| Date | Phase | Category | Tasks | Role | T-minus | Status |
|---|---|---|---|---|---|---|
| 11/1/2023 | Pre-migration | Assessment | Create assessment report | Discovery team | -21 | Complete |
| 11/7/2023 | Pre-migration | Target preparation | Design target environment as described by the design document | Migration team | -14 | Complete |
| 11/15/2023 | Pre-migration | Company governance | Migration date and T-minus approval | Leadership | -6 | Complete |
| 11/18/2023 | Migration | Set up DMS | Build connection profiles | Cloud migration engineer | -3 | Complete |
| 11/19/2023 | Migration | Set up DMS | Build and start migration jobs | Cloud migration engineer | -2 | Not started |
| 11/19/2023 | Migration | Monitor DMS | Monitor DMS jobs and DDL changes in the source instance | Cloud migration engineer | -2 | Not started |
| 11/21/2023 | Migration | Cutover DMS | Promote DMS replica | Cloud migration engineer | 0 | Not started |
| 11/21/2023 | Migration | Migration validation | Database migration validation | Migration team | 0 | Not started |
| 11/21/2023 | Migration | Application test | Run capabilities and performance tests | Migration team | 0 | Not started |
| 11/22/2023 | Migration | Company governance | Migration validation GO or NO GO | Migration team | 1 | Not started |
| 11/23/2023 | Post-migration | Validate monitoring | Configure monitoring | Infrastructure team | 2 | Not started |
| 11/25/2023 | Post-migration | Security | Remove DMS user account | Security team | 4 | Not started |

Multiple database migrations

If you have multiple databases to migrate, your migration plan should contain tasks for all of the migrations.

We recommend that you start the process by migrating a smaller, ideally non-mission-critical database. This approach can help you to build your knowledge and confidence in the migration process and tooling. You can also detect any flaws in the process in the early stages of the overall migration schedule.

If you have multiple databases to migrate, the timelines can be parallelized. For example, to speed up the migration process, you might choose to migrate a group of small, static, or less mission-critical databases at the same time, as shown in the following diagram.

Parallel database migration tasks.

In the example shown in the diagram, databases 1-4 are a group of small databases that are migrated at the same time.

Define the preparation tasks

The preparation tasks are all the activities that you need to complete to fulfill the migration prerequisites. Without the preparation tasks, the migration can't take place, or it results in the migrated database being in an unusable state.

Preparation tasks can be categorized as follows:

  • Preparations and prerequisites for an Amazon RDS or Amazon Aurora instance
  • Source database preparation and prerequisites
  • Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL instance setup
  • Migration-specific setup

Amazon RDS or Amazon Aurora instance preparation and prerequisites

Consider the following common setup and prerequisite tasks:

  • Depending on your migration path, you might need to allow remote connections on your RDS instances. If your RDS instance is configured to be private in your VPC, private RFC 1918 connectivity must exist between Amazon and Google Cloud.
  • You might need to configure a new security group to allow remote connections on required ports and apply the security group to your Amazon RDS or Amazon Aurora instance:

    • By default, in AWS, network access is turned off for database instances.
    • You can specify rules in a security group that allow access from an IP address range, port, or security group. The same rules apply to all database instances that are associated with that security group.
  • If you are migrating from Amazon RDS, make sure that you have enough free disk space to buffer write-ahead logs for the duration of the full load operation on your Amazon RDS instance.

  • For ongoing replication (streaming changes through CDC), you must use a full RDS instance and not a read replica.

  • If you're using built-in replication, you need to set up your Amazon RDS or Amazon Aurora instance for replication for PostgreSQL. Built-in replication, and tools that use built-in replication, need logical replication for PostgreSQL.

  • If you're using third-party tools, upfront settings and configurations are usually required before using the tool. Check the documentation from the third-party tool.

For more information about instance preparation and prerequisites, see Set up the external server for replication for PostgreSQL.

Source database preparation and prerequisites

  • If you choose to use Database Migration Service, configure your source database before connecting to it. For more information, see Configure your source for PostgreSQL and Configure your source for PostgreSQL to AlloyDB for PostgreSQL.

  • For tables that don't have primary keys, after Database Migration Service migrates the initial backup, only INSERT statements are migrated to the target database during the CDC phase. DELETE and UPDATE statements are not migrated for those tables.

  • Consider that large objects can't be replicated by Database Migration Service, because the logical decoding facility in PostgreSQL doesn't support decoding changes to large objects.

  • If you choose to use built-in replication, consider that logical replication has certain limitations with respect to data definition language (DDL) commands, sequences, and large objects. Primary keys must exist or be added on tables that are to be enabled for CDC and that go through many updates.

  • Some third-party migration tools require that all large object columns are nullable. Any large object columns that are NOT NULL need to have that constraint removed during migration.

  • Take baseline measurements on your source environment in production use. Consider the following:

    • Measure the size of your data, as well as your workload's performance. How long do important queries or transactions take, on average? How long during peak times?
    • Document the baseline measurements for later comparison, to help you decide if the fidelity of your database migration is satisfactory. Decide if you can switch your production workloads and decommission your source environment, or if you still need it for fallback purposes.

Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL instance setup

To have your target instance achieve performance levels that are similar to those of your source instance, choose the size and specifications of your target PostgreSQL database instance to match those of the source instance. Pay special attention to disk size and throughput, input/output operations per second (IOPS), and number of virtual CPUs (vCPUs). Incorrect sizing can lead to long migration times, database performance problems, database errors, and application performance problems. When deciding on the Cloud SQL for PostgreSQL or AlloyDB for PostgreSQL instance, keep in mind that disk performance is based on the disk size and the number of vCPUs.

You must confirm the following properties and requirements before you create your Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL instances. If you want to change these properties and requirements later, you need to recreate the instances.

  • Choose the project and region of your target Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL instances carefully. Instances can't be migrated between Google Cloud projects and regions without data transfer.

  • Migrate to a matching major version on Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL. For example, if you are using PostgreSQL 14.x on Amazon RDS or Amazon Aurora, you should migrate to Cloud SQL for PostgreSQL or AlloyDB for PostgreSQL version 14.x.

  • Replicate user information separately if you are using built-in database engine backups or Database Migration Service. For details, review the limitations in the Database engine specific backups section.

  • Review the database engine specific configuration flags and compare their source and target instance values. Make sure you understand their impact and whether they need to be the same or not. For example, when working with PostgreSQL, we recommend comparing the values from the pg_settings table on your source database to those on the Cloud SQL and AlloyDB for PostgreSQL instance. Update flag settings as needed on the target database instance to replicate the source settings (see the sketch after this list).

  • Depending on the nature of your workload, you might need to enable specific extensions to support your database. If your workload requires these extensions, review the supported PostgreSQL extensions and how to enable them in Cloud SQL and AlloyDB for PostgreSQL.
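To make the sizing and flag guidance concrete, the following sketch creates a target instance and compares parameter values. The version, region, capacity values, instance name, and file names are illustrative assumptions:

```bash
# Create a Cloud SQL for PostgreSQL instance sized to match the source.
gcloud sql instances create target-postgres-instance \
  --database-version=POSTGRES_14 --region=us-central1 \
  --cpu=8 --memory=32GB --storage-size=500GB --storage-type=SSD

# Export parameter values from the target and compare them with the
# equivalent pg_settings output captured on the source.
psql --host=TARGET_IP --username=postgres --dbname=postgres \
  --command="SELECT name, setting FROM pg_settings ORDER BY name;" > target_settings.txt
diff source_settings.txt target_settings.txt
```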

For more information about Cloud SQL for PostgreSQL setup, see Instance configuration options, Database engine specific flags, and supported extensions.

For more information about AlloyDB for PostgreSQL setup, see Supported database flags and supported extensions.

Migration-specific setup

If you can afford downtime, you can import SQL dump files to Cloud SQL and AlloyDB for PostgreSQL. In such cases, you might need to create a Cloud Storage bucket to store the database dump.
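For example, a scheduled maintenance import might stage the dump in Cloud Storage and load it as follows. The bucket, instance, database, and file names are assumptions:

```bash
# Stage the dump file in Cloud Storage.
gsutil mb -l us-central1 gs://my-migration-dumps
gsutil cp mydb.sql gs://my-migration-dumps/mydb.sql

# Import the dump into the Cloud SQL for PostgreSQL instance.
# The instance's service account needs read access to the bucket.
gcloud sql import sql target-postgres-instance \
  gs://my-migration-dumps/mydb.sql --database=mydb
```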

If you use replication, you must ensure that the Cloud SQL and AlloyDB for PostgreSQL replica has access to your primary (source) database. This access can be gained by using the documented connectivity options.

Depending on your use case and criticality, you might need to implement a fallback scenario, which usually includes reversing the direction of the replication. In this case, you might need an additional replication mechanism from your target Cloud SQL and AlloyDB for PostgreSQL back to your source Amazon instance.

You can decommission the resources that connect your Amazon and Google Cloud environments after the migration is completed and validated.

If you're migrating to AlloyDB for PostgreSQL, consider using a Cloud SQL for PostgreSQL instance as a potential destination for your fallback scenarios. Use the pglogical extension to set up logical replication to that Cloud SQL instance.

For more information, see the following resources:

Define the execution tasks

Execution tasks implement the migration work itself. The tasks depend on your chosen migration tool, as described in the following subsections.

Built-in database engine backups

Use the pg_dump utility to create a backup. For more information about using this utility to import and export data, see the following resources:

Database Migration Service migration jobs

Define and configure migration jobs in Database Migration Service to migrate data from a source instance to the destination database. Migration jobs connect to the source database instance through user-defined connection profiles.

Test all the prerequisites to ensure the job can run successfully. Choose a time when your workloads can afford a small downtime for the migration and production cut-over.

In Database Migration Service, the migration begins with the initial schema dump and restore without indexes and constraints, followed by the data copy operation. After the data copy completes, indexes and constraints are restored. Finally, the continuous replication of changes from the source to the destination database instance starts.

Database Migration Service uses the pglogical extension for replication from your source to the target database instance. At the beginning of the migration, this extension sets up replication by requiring exclusive short-term locks on all the tables in your source Amazon RDS or Amazon Aurora instance. For this reason, we recommend starting the migration when the database is least busy, and avoiding DDL statements on the source during the dump and replication phase, because DDL statements are not replicated. If you must perform DDL operations, use the pglogical functions to run DDL statements on your source instance, or manually run the same DDL statements on the migration target instance, but only after the initial dump stage finishes.
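If a DDL change is unavoidable during the dump and replication phase, the following sketch shows how it might be run through a pglogical function so that it reaches the target. The host, database, table, and replication set names are assumptions; verify which replication set your migration job uses:

```bash
# Run the DDL statement through pglogical on the source instance so that
# it is propagated along with the replicated data changes.
psql --host=SOURCE_RDS_HOST --username=admin --dbname=mydb --command="
  SELECT pglogical.replicate_ddl_command(
    'ALTER TABLE public.orders ADD COLUMN note text',
    '{default}');"
```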

The migration process with Database Migration Service ends with the promotion operation. Promoting a database instance disconnects the destination database from the flow of changes coming from the source database, and then the now-standalone destination instance is promoted to a primary instance. This approach is also called a production switch.

For a fully detailed migration setup process, see the quick start guides for PostgreSQL and PostgreSQL to AlloyDB for PostgreSQL.

Database engine built-in replication

Cloud SQL supports two types of logical replication: the built-in logical replication of PostgreSQL, and logical replication through the pglogical extension. For AlloyDB for PostgreSQL, we recommend using the pglogical extension for replication. Each type of logical replication has its own features and limitations.

Built-in logical replication has the following features and limitations:

  • It's available in PostgreSQL 10 and later.
  • All changes and columns are replicated. Publications are defined at the table level.
  • Data is only replicated from base tables to base tables.
  • It doesn't perform sequence replication.
  • It's the recommended replication type when there are many tables that have no primary key. Set up built-in PostgreSQL logical replication (see the sketch after this list). For tables without a primary key, apply the REPLICA IDENTITY FULL form so that built-in replication uses the entire row as the unique identifier instead of a primary key.
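The following is a minimal sketch of built-in logical replication from an external source to a Cloud SQL for PostgreSQL target. The host names, credentials, publication and subscription names, and the table without a primary key are assumptions:

```bash
# On the source (Amazon RDS or Aurora) database: publish the tables and make
# rows identifiable for a table that has no primary key.
psql --host=SOURCE_HOST --username=admin --dbname=mydb <<'SQL'
ALTER TABLE public.events REPLICA IDENTITY FULL;
CREATE PUBLICATION migration_pub FOR ALL TABLES;
SQL

# On the Cloud SQL for PostgreSQL target: subscribe to the publication.
psql --host=TARGET_IP --username=postgres --dbname=mydb --command="
  CREATE SUBSCRIPTION migration_sub
    CONNECTION 'host=SOURCE_HOST dbname=mydb user=admin password=REPLACE_ME'
    PUBLICATION migration_pub;"
```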

pglogical logical replication has the following features and limitations (a setup sketch follows this list):

  • It's available in all PostgreSQL versions and offers cross version support.
  • Row filtering is available on the source.
  • It doesn't replicate UNLOGGED and TEMPORARY tables.
  • A primary key or unique key is required on tables to capture UPDATE and DELETE changes.
  • Sequence replication is available.
  • Delayed replication is possible.
  • It provides conflict detection and configurable resolution if there are multiple publishers or conflicts between replicated data and local changes.
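A minimal pglogical setup sketch follows. The node names, connection strings, and schema list are assumptions, and your environment might need additional options:

```bash
# On the source (provider): create the pglogical node and add the tables
# in the public schema to the default replication set.
psql --host=SOURCE_HOST --username=admin --dbname=mydb <<'SQL'
SELECT pglogical.create_node(
  node_name := 'provider',
  dsn := 'host=SOURCE_HOST dbname=mydb user=admin password=REPLACE_ME');
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);
SQL

# On the Cloud SQL or AlloyDB target (subscriber): create the node and subscription.
psql --host=TARGET_IP --username=postgres --dbname=mydb <<'SQL'
SELECT pglogical.create_node(
  node_name := 'subscriber',
  dsn := 'host=TARGET_IP dbname=mydb user=postgres password=REPLACE_ME');
SELECT pglogical.create_subscription(
  subscription_name := 'migration_sub',
  provider_dsn := 'host=SOURCE_HOST dbname=mydb user=admin password=REPLACE_ME');
SQL
```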

For instructions on how to set up built-in PostgreSQL logical replication from an external server like Amazon RDS or Amazon Aurora for PostgreSQL, see the following resources:

Third-party tools

Define any execution tasks for the third-party tool you've chosen.

This section focuses on Striim as an example. Striim uses applications that acquire data from various sources, process the data, and then deliver the data to other applications or targets.

You use one or more flows to organize these migration processes within your custom applications. To code your custom applications, you have a choice of using a graphical programming tool or the Tungsten Query Language (TQL) programming language.

For more information about how to use Striim, see the following resources:

If you decide to use Striim to migrate your data, see the following guides on how to use Striim to migrate data into Google Cloud:

Define fallback scenarios

Define fallback action items for each migration execution task, to safeguard against unforeseen issues that might occur during the migration process. The fallback tasks usually depend on the migration strategy and tools used.

Fallback might require significant effort. As a best practice, don't perform the production cut-over until your test results are satisfactory. Both the database migration and the fallback scenario should be properly tested to avoid a severe outage.

Define success criteria and timebox all your migration execution tasks. Doing a migration dry run helps collect information about the expected times for each task. For example, for a scheduled maintenance migration, you can afford the downtime represented by the cut-over window. However, it's important to plan your next action in case the one-time migration job or the restore of the backup fails midway. Depending on how much of your planned downtime has elapsed, you might have to postpone the migration if the migration task doesn't finish in the expected amount of time.

A fallback plan usually refers to rolling back the migration after you perform the production cut-over, if issues appear on the target instance. If you implement a fallback plan, remember that it must be treated as a full database migration, including planning and testing.

If you choose not to have a fallback plan, make sure you understand the possible consequences. Having no fallback plan can add unforeseen effort and cause avoidable disruptions in your migration process.

Although a fallback is a last resort, and most database migrations don't end up using it, we recommend that you always have a fallback strategy.

Simple fallback

In this fallback strategy, you switch your applications back to the original source database instance. Adopt this strategy if you can afford downtime when you fall back, or if you don't need the transactions committed on the new target system.

If you do need all the written data on your target database, and you can afford some downtime, you can consider stopping writes to your target database instance, taking built-in backups and restoring them on your source instance, and then reconnecting your applications to the initial source database instance. Depending on the nature of your workload and the amount of data written on the target database instance, you could bring that data into your initial source database system at a later time, especially if your workloads aren't dependent on any specific record creation time or any time ordering constraints.

Reverse replication

In this strategy, you replicate the writes that happen on your new target database after the production cut-over back to your initial source database. In this way, you keep the original source in sync with the new target database and have the writes happening on the new target database instance. Its main disadvantage is that you can't test the replication stream until after you cut over to the target database instance, so it doesn't allow end-to-end testing, and there is a small period with no fallback.

Choose this approach when you can still keep your source instance for some time and you migrate by using the continuous replication migration strategy.

Forward replication

This strategy is a variation of reverse replication. You replicate the writes on your new target database to a third database instance of your choice. You can point your applications to this third database, which connects to the server and runs read-only queries while the server is unavailable. You can use any replication mechanism, depending on your needs. The advantage of this approach is that it can be fully end-to-end tested.

Take this approach when you want to be covered by a fallback at all times, or when you must discard your initial source database shortly after the production cut-over.

Duplicate writes

If you choose a Y (writing and reading) or data-access microservice migrationstrategy, this fallback plan is already set. This strategy is more complicated,because you need to refactor applications or develop tools that connect to yourdatabase instances.

Your applications write to both the initial source and target database instances, which lets you perform a gradual production cut-over until you are using only your target database instances. If there are any issues, you connect your applications back to the initial source with no downtime. You can discard the initial source and the duplicate writing mechanism when you consider the migration complete and no issues have been observed.

We recommend this approach when it's critical to have no migration downtime, when you need a reliable fallback in place, and when you have the time and resources to perform application refactoring.
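
As an illustration only, the following minimal sketch shows the core of a duplicate-write layer in Python with the psycopg2 driver. The connection strings, class, and table are hypothetical, and a production implementation also needs retry, ordering, and reconciliation logic, because the two commits are not atomic across databases.

    # Minimal sketch of a dual-write wrapper for the duplicate writes strategy.
    # DSNs and names are hypothetical. The two commits are not atomic across
    # databases; strict consistency needs two-phase commit or an outbox pattern.
    import psycopg2

    class DualWriter:
        def __init__(self, source_dsn, target_dsn):
            self.source = psycopg2.connect(source_dsn)
            self.target = psycopg2.connect(target_dsn)

        def execute(self, sql, params=None):
            """Apply the same write to both databases; roll back both on failure."""
            try:
                with self.source.cursor() as src_cur, self.target.cursor() as tgt_cur:
                    src_cur.execute(sql, params)
                    tgt_cur.execute(sql, params)
                self.source.commit()
                self.target.commit()
            except Exception:
                self.source.rollback()
                self.target.rollback()
                raise

    writer = DualWriter("host=old-rds dbname=appdb user=app password=secret",
                        "host=cloudsql-primary dbname=appdb user=app password=secret")
    writer.execute("INSERT INTO orders (customer_id, total) VALUES (%s, %s)", (42, 99.90))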

Perform testing and validation

The goals of this step are to test and validate the following:

  • Successful migration of the data in the database.
  • Integration with existing applications after they are switched to use the new target instance.

Define the key success factors, which are specific to your migration. The following are examples of such factors:

  • Which data to migrate. For some workloads, it might not be necessary to migrate all of the data. You might not want to migrate data that is already aggregated, redundant, archived, or old. You might archive that data in a Cloud Storage bucket, as a backup.
  • An acceptable percentage of data loss. This particularly applies to data used for analytics workloads, where losing part of the data does not affect general trends or performance of your workloads.
  • Data quality and quantity criteria, which you can apply to your source environment and compare to the target environment after the migration.
  • Performance criteria. Some business transactions might be slower in the target environment, but the processing time is still within defined expectations.

The storage configurations in your source environment might not map directly to Google Cloud environment targets. For example, your source instances might use General Purpose SSD (gp2 and gp3) volumes with IOPS burst performance, or Provisioned IOPS SSD volumes. To compare and properly size the target instances, benchmark your source instances in both the assessment and validation phases.

In the benchmarking process, you apply production-like sequences of operations to the database instances. During this time, you capture and process metrics to measure and compare the relative performance of both source and target systems.

For conventional, server-based configurations, use relevant measurements observed during peak loads. For flexible resource capacity models like Aurora Serverless, consider looking at historical metric data to observe your scaling needs.
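
For a simple, repeatable latency comparison between the source and target instances, a short script can run the same statements against both and report percentiles. The following sketch uses the psycopg2 driver; the queries, connection strings, and iteration count are hypothetical, and a dedicated tool such as HammerDB is better suited to full workload benchmarks.

    # Minimal sketch: compare query latency between source and target instances.
    # Queries, DSNs, and iteration counts are hypothetical.
    import statistics
    import time
    import psycopg2

    QUERIES = [
        "SELECT count(*) FROM orders WHERE created_at > now() - interval '1 day'",
        "SELECT * FROM customers ORDER BY id LIMIT 100",
    ]

    def measure(dsn, queries, iterations=50):
        latencies = []
        with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
            for _ in range(iterations):
                for query in queries:
                    start = time.perf_counter()
                    cur.execute(query)
                    cur.fetchall()
                    latencies.append(time.perf_counter() - start)
        return latencies

    for name, dsn in [("source", "host=rds-source dbname=appdb user=bench password=secret"),
                      ("target", "host=cloudsql-target dbname=appdb user=bench password=secret")]:
        lat = measure(dsn, QUERIES)
        p95 = statistics.quantiles(lat, n=20)[18]
        print(f"{name}: median={statistics.median(lat) * 1000:.1f} ms, p95={p95 * 1000:.1f} ms")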

The following tools can be used for testing, validation, and database benchmarking:

  • HammerDB: an open source database benchmarking and load testing tool. It supports complex transactional and analytic workloads, based on industry standards (TPROC-C and TPROC-H), on multiple database engines. HammerDB has detailed documentation and a wide community of users. You can share and compare results across several database engines and storage configurations. For more information, see Load testing SQL Server using HammerDB and Benchmark Amazon RDS SQL Server performance using HammerDB.
  • DBT2 Benchmark Tool: benchmarking specialized for MySQL. A set of database workload kits mimics an application for a company that owns warehouses and involves a mix of read and write transactions. Use this tool if you want to use a ready-made online transaction processing (OLTP) load test.
  • DbUnit: an open source unit testing tool used to test relational database interactions in Java. The setup and use is straightforward, and it supports multiple database engines (MySQL, PostgreSQL, SQL Server, and others). However, test execution can sometimes be slow, depending on the size and complexity of the database. We recommend this tool when simplicity is important.
  • DbFit: an open source database testing framework that supports test-driven code development and automated testing. It uses a basic syntax for creating test cases and features data-driven testing, version control, and test result reporting. However, support for complex queries and transactions is limited, and it doesn't have large community support or extensive documentation compared to other tools. We recommend this tool if your queries are not complex and you want to perform automated tests and integrate them with your continuous integration and delivery process.

To run an end-to-end test, including testing of the migration plan, always perform a migration dry run exercise. A dry run performs the full-scope database migration without switching any production workloads, and it offers the following advantages:

  • Lets you ensure that all objects and configurations are properly migrated.
  • Helps you define and execute your migration test cases.
  • Offers insights into the time needed for the actual migration, so you can calibrate your timeline.
  • Represents an occasion to test, validate, and adapt the migration plan. Sometimes you can't plan for everything in advance, so this helps you to spot any gaps.

Data testing can be performed on a small set of the databases to be migrated or on the entire set. Depending on the total number of databases and the tools used to implement their migration, you can decide to adopt a risk-based approach. With this approach, you perform data validation on a subset of the databases migrated through the same tool, especially if this tool is a managed migration service.

For testing, you should have access to both source and target databases and do the following tasks:

  • Compare source and target schemas. Check that all tables and executables exist. Check row counts and compare data at the database level.
  • Run custom data validation scripts (see the sketch after this list).
  • Test that the migrated data is also visible in the applications that switched to use the target database (migrated data is read through the application).
  • Perform integration testing between the switched applications and the target database by testing various use cases. This testing includes both reading and writing data to the target databases through the applications, so that the workloads fully support migrated data together with newly created data.
  • Test the performance of the most used database queries to observe if there's any degradation due to misconfigurations or wrong sizing.
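
As an example of a custom data validation script, the following sketch compares row counts and a simple order-independent checksum for each table on the source and the target. The table list, connection strings, and checksum approach are hypothetical; for large tables, prefer sampling or a dedicated tool such as the Data Validation Tool described later in this document.

    # Minimal sketch: compare row counts and a per-table checksum between the
    # source and target databases. Table names and DSNs are hypothetical.
    import psycopg2

    TABLES = ["customers", "orders", "order_items"]
    SOURCE_DSN = "host=rds-source dbname=appdb user=validator password=secret"
    TARGET_DSN = "host=cloudsql-target dbname=appdb user=validator password=secret"

    def table_stats(dsn, table):
        with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
            cur.execute(f"SELECT count(*) FROM {table}")
            row_count = cur.fetchone()[0]
            # Order-independent checksum over the text form of each row.
            cur.execute(f"SELECT coalesce(sum(hashtext(t::text)), 0) FROM {table} AS t")
            checksum = cur.fetchone()[0]
        return row_count, checksum

    for table in TABLES:
        src = table_stats(SOURCE_DSN, table)
        tgt = table_stats(TARGET_DSN, table)
        status = "OK" if src == tgt else "MISMATCH"
        print(f"{table}: source={src} target={tgt} -> {status}")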

Ideally, all these migration test scenarios are automated and repeatable on any source system. The suite of automated test cases is adapted to perform against the switched applications.

If you're using Database Migration Service as your migration tool, see either the PostgreSQL or PostgreSQL to AlloyDB for PostgreSQL version of the "Verify a migration" topic.

Data Validation Tool

To perform data validation, we recommend that you use the Data Validation Tool (DVT). The DVT is an open source Python CLI tool, backed by Google, that provides an automated and repeatable solution for validation across different environments.

The DVT can help streamline the data validation process by offering customized, multi-level validation functions to compare source and target tables on the table, column, and row level. You can also add validation rules.

The DVT covers many Google Cloud data sources, including AlloyDB for PostgreSQL, BigQuery, Cloud SQL, Spanner, JSON, and CSV files on Cloud Storage. It can also be integrated with Cloud Run functions and Cloud Run for event-based triggering and orchestration.

The DVT supports the following types of validations:

  • Schema-level comparisons
  • Column-level validations (including COUNT, SUM, AVG, MIN, MAX, GROUP BY, and STRING_AGG)
  • Row-level validations (including hash and exact match in field comparisons)
  • Custom query results comparison

For more information about the DVT, see the Git repository and Data validation made easy with Google Cloud's Data Validation Tool.
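
The DVT is typically run as a CLI after you register the source and target connections. The following sketch invokes it from Python; the connection names are hypothetical, and the command-line flags are assumptions based on the DVT documentation, so check the repository for the current syntax.

    # Minimal sketch: run a DVT column validation from Python. The connection
    # names are hypothetical and the flags are assumptions based on the DVT
    # documentation; verify them against the version that you install.
    import subprocess

    subprocess.run(
        [
            "data-validation", "validate", "column",
            "--source-conn", "rds_source",        # created with "data-validation connections add"
            "--target-conn", "cloudsql_target",
            "--tables-list", "public.customers,public.orders",
        ],
        check=True,
    )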

Perform the migration

The migration tasks include the activities to support the transfer from one system to another.

Consider the following best practices for your data migration:

  • Inform the involved teams whenever a plan step begins and finishes.
  • If any of the steps take longer than expected, compare the elapsed time with the maximum amount of time allotted for that step. When this happens, issue regular interim updates to the involved teams. A timeboxing sketch follows this list.
  • If the time span is greater than the maximum amount of time reserved for each step in the plan, consider rolling back.
  • Make "go or no-go" decisions for every step of the migration and cut-over plan.
  • Consider rollback actions or alternative scenarios for each of the steps.
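
To make the timeboxing concrete, the following minimal sketch wraps each execution task in a helper that logs when the step starts, finishes, and overruns its allotted time. The step names and durations are hypothetical; in practice, the body of each step calls your migration tooling.

    # Minimal sketch: a timeboxing helper for migration execution tasks.
    # Step names and allotted durations are hypothetical.
    import time
    from contextlib import contextmanager

    @contextmanager
    def timeboxed_step(name, max_minutes):
        print(f"START  {name} (allotted {max_minutes} min)")
        start = time.monotonic()
        try:
            yield
        finally:
            elapsed = (time.monotonic() - start) / 60
            status = "OVERRUN - evaluate rollback" if elapsed > max_minutes else "OK"
            print(f"FINISH {name}: {elapsed:.1f} min [{status}]")

    with timeboxed_step("Promote the target and stop replication", max_minutes=15):
        pass  # call your migration tooling here

    with timeboxed_step("Switch application connection strings", max_minutes=30):
        pass  # deploy the application versions that point to the new target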

Perform the migration by following your defined execution tasks, and refer to the documentation for your selected migration tool.

Perform the production cut-over

The high-level production cut-over process can differ depending on your chosen migration strategy. If you can have downtime on your workloads, then your migration cut-over begins by stopping writes to your source database.

For continuous replication migrations, you typically do the following high-level steps in the cut-over process:

  • Stop writing to the source database.
  • Drain the source and verify that replication has caught up (see the sketch after this list).
  • Stop the replication process.
  • Deploy the applications that point to the new target database.
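
One way to confirm that the source is drained is to check, after writes are stopped, that every replication consumer on the source has replayed up to the current WAL position. The following sketch queries pg_stat_replication with the psycopg2 driver; the connection string is hypothetical, and managed tools such as Database Migration Service report replication lag through their own interfaces.

    # Minimal sketch: after stopping writes, wait until replication consumers on
    # the source have replayed all WAL. The DSN is hypothetical.
    import time
    import psycopg2

    SOURCE_DSN = "host=rds-source dbname=appdb user=repl_admin password=secret"

    conn = psycopg2.connect(SOURCE_DSN)
    conn.autocommit = True  # each poll runs in its own transaction
    with conn.cursor() as cur:
        while True:
            cur.execute(
                """
                SELECT coalesce(max(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)), 0)
                FROM pg_stat_replication
                """
            )
            lag_bytes = cur.fetchone()[0]
            print(f"Remaining replication lag: {lag_bytes} bytes")
            if lag_bytes == 0:
                break
            time.sleep(5)
    conn.close()
    print("Source drained: safe to stop replication and switch applications.")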

After the data has been migrated by using the chosen migration tool, you validate the data in the target database. You confirm that the source database and the target databases are in sync and the data in the target instance adheres to your chosen migration success standards.

Once the data validation passes your criteria, you can perform the application-level cut-over. Deploy the workloads that have been refactored to use the new target instance, that is, the versions of your applications that point to the new target database instance. The deployments can be performed through rolling updates, staged releases, or a blue-green deployment pattern. Some application downtime might be incurred.

Follow the best practices for your production cut-over:

  • Monitor your applications that work with the target database after the cut-over.
  • Define a time period of monitoring to consider whether or not you need to implement your fallback plan.
  • Note that your Cloud SQL or AlloyDB for PostgreSQL instance might need a restart if you change some database flags.
  • Consider that the effort of rolling back the migration might be greater than fixing issues that appear on the target environment.

Clean up the source environment and configure the Cloud SQL or AlloyDB for PostgreSQL instance

After the cut-over is completed, you can delete the source databases. We recommend performing the following important actions before the cleanup of your source instance:

  • Create a final backup of each source database. These backups provide you with an end state of the source databases. The backups might also be required in some cases for compliance with some data regulations.

  • Collect the database parameter settings of your source instance. Alternatively, check that they match the ones you've gathered in the inventory building phase. Adjust the target instance parameters to match the ones from the source instance.

  • Collect database statistics from the source instance and compare them to the ones in the target instance. If the statistics are disparate, it's hard to compare the performance of the source instance and target instance.

In a fallback scenario, you might want to implement the replication of your writes on the Cloud SQL instance back to your original source database. The setup resembles the migration process but would run in reverse: the initial source database would become the new target.

As a best practice to keep the source instances up to date after the cut-over, replicate the writes performed on the target Cloud SQL instances back to the source database. If you need to roll back, you can fall back to your old source instances with minimal data loss.

Alternatively, you can use another instance and replicate your changes to that instance. For example, when AlloyDB for PostgreSQL is a migration destination, consider setting up replication to a Cloud SQL for PostgreSQL instance for fallback scenarios.

In addition to the source environment cleanup, the following critical configurations for your Cloud SQL for PostgreSQL instances must be done:

  • Configure a maintenance window for your primary instance to control when disruptive updates can occur.
  • Configure the storage on the instance so that you have at least 20% available space to accommodate any critical database maintenance operations that Cloud SQL might perform. To receive an alert if available disk space gets lower than 20%, create a metrics-based alerting policy for the disk utilization metric.

Don't start an administrative operation before the previous operation has completed.

For more information about maintenance and best practices on Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL instances, see About maintenance on Cloud SQL instances.

Optimize your environment after migration

Optimization is the last phase of your migration. In this phase, you iterate on optimization tasks until your target environment meets your optimization requirements. The steps of each iteration are as follows:

  1. Assess your current environment, teams, and optimization loop.
  2. Establish your optimization requirements and goals.
  3. Optimize your environment and your teams.
  4. Tune the optimization loop.

You repeat this sequence until you've achieved your optimization goals.

For more information about optimizing your Google Cloud environment, see Migrate to Google Cloud: Optimize your environment and Google Cloud Well-Architected Framework: Performance optimization.

Establish your optimization requirements

Review the following optimization requirements for your Google Cloud environment and choose the ones that best fit your workloads:

Increase the reliability and availability of your database

With Cloud SQL, you can implement a high availability and disaster recovery strategy that aligns with your recovery time objective (RTO) and recovery point objective (RPO). To increase reliability and availability, consider the following:

Increase the cost effectiveness of your database infrastructure

To have a positive economic impact, your workloads must use the available resources and services efficiently. Consider the following options:

Increase the performance of your database infrastructure

Minor database-related performance issues frequently have the potential to impact the entire operation. To maintain and increase your Cloud SQL instance performance, consider the following guidelines:

  • If you have a large number of database tables, they can affect instance performance and availability, and cause the instance to lose its SLA coverage.
  • Ensure that your instance isn't constrained on memory or CPU.

    • For performance-intensive workloads, ensure that your instance has at least 60 GB of memory.
    • For slow database inserts, updates, or deletes, check the locations of the writer and database; sending data over long distances introduces latency.
  • Improve query performance by using the predefined Query Insights dashboard in Cloud Monitoring (or similar database engine built-in features). Identify the most expensive commands and try to optimize them.

  • Prevent database files from becoming unnecessarily large. Set autogrow in MBs rather than as a percentage, using increments appropriate to the requirement.

  • Check reader and database location. Latency affects read performance more than write performance.

When migrating from Amazon RDS and Aurora for PostgreSQL to Cloud SQL for PostgreSQL, consider the following guidelines:

  • Use caching to improve read performance. Inspect the various statistics from the pg_stat_database view. For example, the blks_hit / (blks_hit + blks_read) ratio should be greater than 99%. If this ratio isn't greater than 99%, consider increasing the size of your instance's RAM. For more information, see PostgreSQL statistics collector. A sketch that computes this ratio follows this list.
  • Reclaim space and prevent poor index performance. Depending on how often your data is changing, set a schedule to run the VACUUM command on your Cloud SQL for PostgreSQL instance.
  • Use Cloud SQL Enterprise Plus edition for increased machine configuration limits and data cache. For more information about Cloud SQL Enterprise Plus, see Introduction to Cloud SQL editions.
  • Switch to AlloyDB for PostgreSQL. If you switch, you can have full PostgreSQL compatibility, better transactional processing, and fast transactional analytics workloads supported on your processing database. You also get a recommendation for new indexes through the use of the index advisor feature.
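
The following minimal sketch computes the cache hit ratio mentioned in the first guideline from the pg_stat_database view, using the psycopg2 driver. The connection string is hypothetical.

    # Minimal sketch: compute the buffer cache hit ratio for the current
    # database from pg_stat_database. The DSN is hypothetical.
    import psycopg2

    DSN = "host=cloudsql-target dbname=appdb user=monitor password=secret"

    with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT blks_hit::float / nullif(blks_hit + blks_read, 0)
            FROM pg_stat_database
            WHERE datname = current_database()
            """
        )
        hit_ratio = cur.fetchone()[0]
        if hit_ratio is None:
            print("No block reads recorded yet.")
        elif hit_ratio < 0.99:
            print(f"Cache hit ratio {hit_ratio:.4f} is below 99%: consider more RAM.")
        else:
            print(f"Cache hit ratio {hit_ratio:.4f} looks healthy.")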

For more information about increasing the performance of your Cloud SQL for PostgreSQL database infrastructure, see the Cloud SQL performance improvement documentation for PostgreSQL.

When migrating from Amazon RDS and Aurora for PostgreSQL to AlloyDB for PostgreSQL, consider the following guidelines to increase the performance of your AlloyDB for PostgreSQL database instance:

Increase database observability capabilities

Diagnosing and troubleshooting issues in applications that connect to database instances can be challenging and time-consuming. For this reason, a centralized place where all team members can see what's happening at the database and instance level is essential. You can monitor Cloud SQL instances in the following ways:

  • Cloud SQL uses built-in memory custom agents to collect query telemetry.

    • Use Cloud Monitoring to collect measurements of your service and the Google Cloud resources that you use. Cloud Monitoring includes predefined dashboards for several Google Cloud products, including a Cloud SQL monitoring dashboard.
    • You can create custom dashboards that help you monitor metrics and set up alert policies so that you can receive timely notifications.
    • Alternatively, you can consider using third-party monitoring solutions that are integrated with Google Cloud, such as Datadog and Splunk.
  • Cloud Logging collects logging data from common application components.

  • Cloud Trace collects latency data and executed query plans from applications to help you track how requests propagate through your application.

  • Database Center provides an AI-assisted, centralized database fleet overview. You can monitor the health of your databases, availability configuration, data protection, security, and industry compliance.

For more information about increasing the observability of your database infrastructure, see the following documentation sections:

General Cloud SQL and AlloyDB for PostgreSQL best practices and operational guidelines

Apply the best practices for Cloud SQL to configure and tune the database.

Some important Cloud SQL general recommendations are as follows:

  • If you have large instances, we recommend that you split them into smaller instances, when possible.
  • Configure storage to accommodate critical database maintenance. Ensure that you have at least 20% available space to accommodate any critical database maintenance operations that Cloud SQL might perform.
  • Having too many database tables can affect database upgrade time. Ideally, aim to have under 10,000 tables per instance. A sketch for checking the table count follows this list.
  • Choose the appropriate size for your instances to account for transaction (binary) log retention, especially for high write activity instances.
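
The following sketch counts the user tables in one database, which you can compare against the guideline above. The connection string is hypothetical; run it against each database on the instance.

    # Minimal sketch: count user tables in the current database. The DSN is
    # hypothetical; ordinary ('r') and partitioned ('p') tables are included.
    import psycopg2

    DSN = "host=cloudsql-target dbname=appdb user=monitor password=secret"

    with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT count(*)
            FROM pg_class c
            JOIN pg_namespace n ON n.oid = c.relnamespace
            WHERE c.relkind IN ('r', 'p')
              AND n.nspname NOT IN ('pg_catalog', 'information_schema')
            """
        )
        print(f"User tables in this database: {cur.fetchone()[0]}")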

To efficiently handle any database performance issues that you might encounter, use the following guidelines until your issue is resolved:

Scale up infrastructure: Increase resources (such as disk throughput, vCPU, and RAM). Depending on the urgency and your team's availability and experience, vertically scaling your instance can resolve most performance issues. Later, you can further investigate the root cause of the issue in a test environment and consider options to eliminate it.

Perform and schedule database maintenance operations: Index defragmentation, statistics updates, vacuum analyze, and reindexing heavily updated tables. Check if and when these maintenance operations were last performed, especially on the affected objects (tables, indexes). Find out if there was a change from normal database activities, for example, recently adding a new column or having lots of updates on a table. The sketch that follows shows one way to check when maintenance last ran.
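
The following minimal sketch reports dead tuples and the last manual and automatic vacuum and analyze times for the most heavily updated tables, using the pg_stat_user_tables view and the psycopg2 driver. The connection string is hypothetical.

    # Minimal sketch: show when maintenance last ran on the tables with the most
    # dead tuples. The DSN is hypothetical.
    import psycopg2

    DSN = "host=cloudsql-target dbname=appdb user=monitor password=secret"

    with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT relname, n_dead_tup,
                   last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
            FROM pg_stat_user_tables
            ORDER BY n_dead_tup DESC
            LIMIT 10
            """
        )
        for relname, dead, vac, autovac, analyze, autoanalyze in cur.fetchall():
            print(f"{relname}: dead_tuples={dead} last_vacuum={vac or autovac} "
                  f"last_analyze={analyze or autoanalyze}")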

Perform database tuning and optimization: Are the tables in your database properly structured? Do the columns have the correct data types? Is your data model right for the type of workload? Investigate your slow queries and their execution plans. Are they using the available indexes? Check for index scans, locks, and waits on other resources. Consider adding indexes to support your critical queries. Eliminate non-critical indexes and foreign keys. Consider rewriting complex queries and joins. The time it takes to resolve your issue depends on the experience and availability of your team and can range from hours to days. The sketch that follows shows one way to find the most expensive statements.
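
The following minimal sketch lists the most expensive statements from the pg_stat_statements extension, which must be enabled on the instance. The connection string is hypothetical, and the timing column names vary by PostgreSQL version (total_exec_time and mean_exec_time on PostgreSQL 13 and later, total_time and mean_time earlier).

    # Minimal sketch: list the most expensive statements from pg_stat_statements.
    # The DSN is hypothetical; column names assume PostgreSQL 13 or later.
    import psycopg2

    DSN = "host=cloudsql-target dbname=appdb user=monitor password=secret"

    with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT query, calls, total_exec_time, mean_exec_time
            FROM pg_stat_statements
            ORDER BY total_exec_time DESC
            LIMIT 10
            """
        )
        for query, calls, total_ms, mean_ms in cur.fetchall():
            print(f"{total_ms:10.1f} ms total | {mean_ms:8.2f} ms avg | "
                  f"{calls:8d} calls | {query[:80]}")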

Scale out your reads: Consider having read replicas. When scaling vertically isn't sufficient for your needs, and database tuning and optimization measures aren't helping, consider scaling horizontally. Routing read queries from your applications to a read replica improves the overall performance of your database workload. However, it might require additional effort to change your applications to connect to the read replica.

Database re-architecture: Consider partitioning and indexing the database. This operation requires significantly more effort than database tuning and optimization, and it might involve a data migration, but it can be a long-term fix. Sometimes, poor data model design can lead to performance issues, which can be partially compensated by vertical scale-up. However, a proper data model is a long-term fix. Consider partitioning your tables. Archive data that isn't needed anymore, if possible. Normalize your database structure, but remember that denormalizing can also improve performance.

Database sharding: You can scale out your writes by sharding your database. Sharding is a complicated operation that involves re-architecting your database and applications in a specific way and performing a data migration. You split your database instance into multiple smaller instances by using specific partitioning criteria. The criteria can be based on customer or subject. This option lets you horizontally scale both your writes and reads. However, it increases the complexity of your database and application workloads. It might also lead to unbalanced shards called hotspots, which would outweigh the benefit of sharding.

For Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL, consider the following best practices:

  • To offload traffic from the primary instance, add read replicas. You can also use a load balancer such as HAProxy to manage traffic to the replicas. However, avoid too many replicas, because this hinders the VACUUM operation. For more information on using HAProxy, see the official HAProxy website.
  • Optimize the VACUUM operation by increasing system memory and the maintenance_work_mem parameter. Increasing system memory means that more tuples can be batched in each iteration.
  • Because larger indexes consume a significant amount of time for the index scan, set the INDEX_CLEANUP parameter to OFF to quickly clean up and freeze the table data. A sketch that runs a manual VACUUM with these options follows this list.
  • When using AlloyDB for PostgreSQL, use the AlloyDB for PostgreSQL System Insights dashboard and audit logs. The AlloyDB for PostgreSQL System Insights dashboard displays metrics of the resources that you use, and lets you monitor them. For more details, see the guidelines from the Monitor instances topic in the AlloyDB for PostgreSQL documentation.
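
The following minimal sketch runs a manual VACUUM with the options discussed above, using the psycopg2 driver. The connection string, table name, and memory setting are hypothetical; VACUUM cannot run inside a transaction block, so the connection uses autocommit, and the INDEX_CLEANUP option requires PostgreSQL 12 or later.

    # Minimal sketch: manual VACUUM with a larger maintenance_work_mem and index
    # cleanup disabled. The DSN, table name, and memory setting are hypothetical.
    import psycopg2

    conn = psycopg2.connect("host=cloudsql-target dbname=appdb user=dba password=secret")
    conn.autocommit = True  # VACUUM cannot run inside a transaction block
    with conn.cursor() as cur:
        # More memory lets VACUUM process more dead tuples per pass.
        cur.execute("SET maintenance_work_mem = '2GB'")
        # Skip index cleanup to finish quickly and freeze the table data.
        cur.execute("VACUUM (FREEZE, INDEX_CLEANUP OFF, VERBOSE) orders")
    conn.close()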

For more details, see the following resources:

What's next

Contributors

Authors:

Other contributor: Somdyuti Paul | Data Management Specialist
