Import metadata into Dataproc Metastore
This page explains how to import metadata into a Dataproc Metastore service.
The import metadata feature lets you populate an existing Dataproc Metastore service with metadata that's stored in a portable storage format.
This portable metadata is typically exported from another Dataproc Metastore service or from a self-managed Hive Metastore (HMS).
About importing metadata
You can import the following file formats into Dataproc Metastore:
- A set of Avro files stored in a folder.
- A single MySQL dump file stored in a Cloud Storage folder.
The MySQL or Avro files you're importing must be generated from a relational database.
If your files are in a different format, such as PostgreSQL, you must convert them to the Avro or MySQL format. After the conversion, you can import them into Dataproc Metastore.
Avro
Avro-based imports are only supported for Hive versions 2.3.6 and 3.1.2. When importing Avro files, Dataproc Metastore expects a series of <table-name>.avro files, one for each table in your database.
To import Avro files, your Dataproc Metastore service can use the MySQL or Spanner database type.
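For illustration only, an Avro import folder might be laid out as follows. The bucket path and the second and third filenames are hypothetical placeholders; AUX_TABLE.avro is the example filename used later on this page:

gs://example-bucket/avro-dump/
  AUX_TABLE.avro
  TABLE_NAME_2.avro
  TABLE_NAME_3.avro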
MySQL
MySQL-based imports are supported for all Hive versions. When importing MySQL files, Dataproc Metastore expects a single SQL file containing all of your table information. MySQL dumps obtained from a Dataproc cluster using Native SQL are also supported.
To import MySQL files, your Dataproc Metastore service must use the MySQL database type. The Spanner database type doesn't support MySQL imports.
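To confirm which database type your service uses before attempting a MySQL import, you can describe the service. This is a sketch: the service name and region are placeholders, and the databaseType field name is an assumption based on the Dataproc Metastore v1 API; if it differs, run a plain describe and inspect the output.

# Print the backend database type of the service.
gcloud metastore services describe my-service \
  --location=us-central1 \
  --format="value(databaseType)"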
Import considerations
Importing overwrites all existing metadata stored in a Dataproc Metastore service.
The metadata import feature only imports metadata. Data that's created by Apache Hive in internal tables isn't replicated in the import.
Importing doesn't transform database content and doesn't handle file migration. If you move your data to a different location, you must manually update your table data locations and schema in your Dataproc Metastore service.
Importing doesn't restore or replace fine-grained IAM policies.
If you're using VPC Service Controls, then you can only import data from a Cloud Storage bucket that resides in the same service perimeter as the Dataproc Metastore service.
Before you begin
- Enable Dataproc Metastore in your project.
- Understand networking requirements specific to your project.
- Create a Dataproc Metastore service.
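If you still need to complete these prerequisites, the following gcloud sketch shows one way to enable the API and create a basic service. The service name, region, and reliance on default settings (tier, network) are assumptions; adjust them for your project:

# Enable the Dataproc Metastore API in the current project.
gcloud services enable metastore.googleapis.com

# Create a service named my-service in us-central1 with default settings.
gcloud metastore services create my-service \
  --location=us-central1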
Required roles
To get the permissions that you need to import metadata into Dataproc Metastore, ask your administrator to grant you the following IAM roles:
- To import metadata:
  - Dataproc Metastore Editor (roles/metastore.editor) on the metadata service.
  - Dataproc Metastore Administrator (roles/metastore.admin) on the project.
- For MySQL, to use the Cloud Storage object (SQL dump file) for import: grant your user account and the Dataproc Metastore service agent the Storage Object Viewer role (roles/storage.objectViewer) on the Cloud Storage bucket containing the metadata dump being imported.
- For Avro, to use the Cloud Storage bucket for import: grant your user account and the Dataproc Metastore service agent the Storage Object Viewer role (roles/storage.objectViewer) on the Cloud Storage bucket containing the metadata dump being imported.
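As a hedged example, the following gcloud commands grant the roles listed above. The project ID, bucket name, and account emails are placeholders, and the address shown for the service agent (service-PROJECT_NUMBER@gcp-sa-metastore.iam.gserviceaccount.com) is the typical Dataproc Metastore service agent format; verify the actual agent in your project before granting:

# Grant your user account the Dataproc Metastore Administrator role on the project.
gcloud projects add-iam-policy-binding my-project \
  --member="user:you@example.com" \
  --role="roles/metastore.admin"

# Grant your user account read access to the bucket holding the dump.
gcloud storage buckets add-iam-policy-binding gs://my-dump-bucket \
  --member="user:you@example.com" \
  --role="roles/storage.objectViewer"

# Grant the Dataproc Metastore service agent the same read access.
gcloud storage buckets add-iam-policy-binding gs://my-dump-bucket \
  --member="serviceAccount:service-PROJECT_NUMBER@gcp-sa-metastore.iam.gserviceaccount.com" \
  --role="roles/storage.objectViewer"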
For more information about granting roles, see Manage access to projects, folders, and organizations.
These predefined roles contain the permissions required to import metadata into Dataproc Metastore. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to import metadata into Dataproc Metastore:
- To import metadata: metastore.imports.create on the metastore service.
- For MySQL, to use the Cloud Storage object (SQL dump file) for import, grant your user account and the Dataproc Metastore service agent: storage.objects.get on the Cloud Storage bucket containing the metadata dump being imported.
- For Avro, to use the Cloud Storage bucket for import, grant your user account and the Dataproc Metastore service agent: storage.objects.get on the Cloud Storage bucket containing the metadata dump being imported.
You might also be able to get these permissions with custom roles or other predefined roles.
For more information about specific Dataproc Metastore roles and permissions, see Dataproc Metastore IAM overview.
Import your metadata
The import operation is a two-step process. First, you prepare your import files and then you import them into Dataproc Metastore.
When you start an import, Dataproc Metastore performs a Hive metadata schema validation. This validation verifies the tables in the SQL dump file and the filenames for Avro. If a table is missing, the import fails with an error message describing the missing table.
To check Hive metadata compatibility before an import, you can use the Dataproc Metastore Toolkit.
Prepare the import files before import
Before you can import your files into Dataproc Metastore, you must copy your metadata dump files into Cloud Storage, such as your artifacts Cloud Storage bucket.
Move your files to Cloud Storage
Create a database dump of the external database that you want to import into Dataproc Metastore.
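As one hedged illustration for a self-managed HMS backed by MySQL, a dump can be produced with mysqldump. The host, user, and database name (hive_metastore) below are assumptions; use the values from your own Hive configuration:

# Dump the Hive metastore database to a single SQL file.
mysqldump --host=HMS_DB_HOST --user=HMS_DB_USER --password \
  --single-transaction \
  --databases hive_metastore > hive-metastore-dump.sql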
For instructions on creating a database dump, see the following pages:
Upload the files into Cloud Storage.
Make sure you write down the Cloud Storage path that you upload your files to; you need it later to perform the import.
If you're importing MySQL files, upload the SQL file into a Cloud Storage bucket.
If you're importing Avro files, upload the files into a Cloud Storage folder.
- Your Avro import should include an Avro file for each Hive table, even if the table is empty.
- The Avro filenames must follow the format <table-name>.avro. The <table-name> must be all caps. For example, AUX_TABLE.avro.
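For example, the uploads might look like the following sketch; the bucket path and file names are hypothetical:

# MySQL: copy the single SQL dump file into a bucket.
gcloud storage cp hive-metastore-dump.sql gs://my-dump-bucket/mysql/

# Avro: copy one <TABLE_NAME>.avro file per table into a folder.
gcloud storage cp *.avro gs://my-dump-bucket/avro-dump/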
Import the files into Dataproc Metastore
Before importing metadata, review the import considerations.
While an import is running, you can't update a Dataproc Metastore service, for example, by changing configuration settings. However, you can still use the service for normal operations, such as accessing its metadata from attached Dataproc or self-managed clusters.
Caution: Import overwrites existing data.
Console
In the Google Cloud console, open the Dataproc Metastore page.
On the Dataproc Metastore page, click the name of the service you want to import metadata into.
The Service detail page opens.

Figure 1. The Dataproc Metastore Service detail page.
In the navigation bar, click Import.
The Import dialog opens.
Enter the Import name.
In the Destination section, choose either MySQL or Avro.
In the Destination URI field, click Browse and select the Cloud Storage URI of the files that you want to import.
You can also manually enter your bucket location in the provided text field. Use the following format: bucket/object or bucket/folder/object.
Optional: Enter a Description of the import.
You can edit the description on the Service detail page.
To update the service, click Import.
After the import completes, it appears in a table on the Service detail page on the Import/Export tab.
gcloud CLI
To import metadata, run the following gcloud metastore services import gcs command:

gcloud metastore services import gcs SERVICE_ID \
  --location=LOCATION \
  --import-id=IMPORT_ID \
  --description=DESCRIPTION \
  --dump-type=DUMP_TYPE \
  --database-dump=DATABASE_DUMP
Replace the following:
- SERVICE_ID: the ID or fully qualified name of your Dataproc Metastore service.
- LOCATION: the Google Cloud region in which your Dataproc Metastore service resides.
- IMPORT_ID: an ID or fully qualified name for your metadata import. For example, import1.
- DESCRIPTION: optional: a description of your import. You can edit this later using gcloud metastore services imports update IMPORT.
- DUMP_TYPE: the type of the external database you're importing. Accepted values include mysql and avro. The default value is mysql.
- DATABASE_DUMP: the Cloud Storage path containing the database files. This path must begin with gs://. For Avro, provide the path to the folder where the Avro files are stored (the Cloud Storage folder). For MySQL, provide the path to the MySQL file (the Cloud Storage object).
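For example, a MySQL import might look like the following; the service name, region, and dump path are placeholders:

gcloud metastore services import gcs my-service \
  --location=us-central1 \
  --import-id=import1 \
  --description="Initial HMS import" \
  --dump-type=mysql \
  --database-dump=gs://my-dump-bucket/mysql/hive-metastore-dump.sql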
Verify that the import was successful.
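One way to verify is to check that the service has returned to the ACTIVE state, for example with a describe call. This is a sketch; the state field path is an assumption based on the Dataproc Metastore v1 API, and the service name and region are placeholders:

# The service returns to ACTIVE after a successful import.
gcloud metastore services describe my-service \
  --location=us-central1 \
  --format="value(state)"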
REST
Follow the API instructions to import metadata into a service by using the APIs Explorer.
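As a hedged sketch outside the APIs Explorer, the import corresponds to the metadataImports.create method. The request shape below (the databaseDump.gcsUri and databaseDump.type fields and the metadataImportId query parameter) is my reading of the Dataproc Metastore v1 API; check the API reference before relying on it, and treat the project, region, service name, and dump path as placeholders:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  -d '{"databaseDump": {"gcsUri": "gs://my-dump-bucket/mysql/hive-metastore-dump.sql", "type": "MYSQL"}}' \
  "https://metastore.googleapis.com/v1/projects/my-project/locations/us-central1/services/my-service/metadataImports?metadataImportId=import1"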
Using the API, you can create, list, describe, and update imports, but you can't delete imports. However, deleting a Dataproc Metastore service deletes all stored nested imports.
When the import succeeds, Dataproc Metastore automatically returns to the active state. If the import fails, then Dataproc Metastore rolls back to its previous healthy state.
View import history
To view the import history of a Dataproc Metastore service in the Google Cloud console, complete the following steps:
- In the Google Cloud console, open the Dataproc Metastore page.
In the navigation bar, click Import/Export.
Your import history appears in the Import history table.
The history displays up to the last 25 imports.
Deleting a Dataproc Metastore service also deletes all associated import history.
Troubleshoot common issues
Some common issues include the following:
- Import fails because the Hive versions don't match.
- The service agent or user account doesn't have necessary permissions.
- Job fails because the database file is too large.
For more help solving common troubleshooting issues, see Import and export error scenarios.
What's next