- Enabling AWS Integration
- Catalogs
- DynamoDb Lock Manager
- S3 FileIO
- Progressive Multipart Upload
- S3 Server Side Encryption
- S3 Access Control List
- Object Store File Layout
- S3 Retries
- S3 Strong Consistency
- Hadoop S3A FileSystem
- S3 Write Checksum Verification
- S3 Tags
- S3 Access Points
- S3 Access Grants
- S3 Cross-Region Access
- S3 Acceleration
- S3 Analytics Accelerator
- S3 Dual-stack
- AWS Client Customization
- Run Iceberg on AWS
Iceberg AWS Integrations🔗
Iceberg provides integration with different AWS services through the `iceberg-aws` module. This section describes how to use Iceberg with AWS.
Enabling AWS Integration🔗
The `iceberg-aws` module is bundled with Spark and Flink engine runtimes for all versions from 0.11.0 onwards. However, the AWS clients are not bundled so that you can use the same client version as your application. You will need to provide the AWS v2 SDK because that is what Iceberg depends on. You can choose to use the AWS SDK bundle, or individual AWS client packages (Glue, S3, DynamoDB, KMS, STS) if you would like to have a minimal dependency footprint.
All the default AWS clients use the Apache HTTP Client for HTTP connection management. This dependency is not part of the AWS SDK bundle and needs to be added separately. To choose a different HTTP client library such as the URL Connection HTTP Client, see the client customization section for more details.
All the AWS module features can be loaded through custom catalog properties; see the documentation of each engine for how to load a custom catalog. Here are some examples.
Spark🔗
For example, to use AWS features with Spark 3.4 (with Scala 2.12) and the AWS clients (which are packaged in the `iceberg-aws-bundle`), you can start the Spark SQL shell with:
```sh
# start Spark SQL client shell
spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.9.1,org.apache.iceberg:iceberg-aws-bundle:1.9.1 \
    --conf spark.sql.defaultCatalog=my_catalog \
    --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket/my/key/prefix \
    --conf spark.sql.catalog.my_catalog.type=glue \
    --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO
```
As you can see, in the shell command we use `--packages` to specify the additional `iceberg-aws-bundle` that contains all relevant AWS dependencies.
Flink🔗
To use the AWS module with Flink, you can download the necessary dependencies and specify them when starting the Flink SQL client:
```sh
# download Iceberg dependency
ICEBERG_VERSION=1.9.1
MAVEN_URL=https://repo1.maven.org/maven2
ICEBERG_MAVEN_URL=$MAVEN_URL/org/apache/iceberg
wget $ICEBERG_MAVEN_URL/iceberg-flink-runtime/$ICEBERG_VERSION/iceberg-flink-runtime-$ICEBERG_VERSION.jar
wget $ICEBERG_MAVEN_URL/iceberg-aws-bundle/$ICEBERG_VERSION/iceberg-aws-bundle-$ICEBERG_VERSION.jar

# start Flink SQL client shell
/path/to/bin/sql-client.sh embedded \
    -j iceberg-flink-runtime-$ICEBERG_VERSION.jar \
    -j iceberg-aws-bundle-$ICEBERG_VERSION.jar \
    shell
```
With those dependencies, you can create a Flink catalog like the following:
```sql
CREATE CATALOG my_catalog WITH (
    'type'='iceberg',
    'warehouse'='s3://my-bucket/my/key/prefix',
    'catalog-type'='glue',
    'io-impl'='org.apache.iceberg.aws.s3.S3FileIO'
);
```
You can also specify the catalog configurations in `sql-client-defaults.yaml` to preload it:
```yaml
catalogs:
  - name: my_catalog
    type: iceberg
    warehouse: s3://my-bucket/my/key/prefix
    catalog-type: glue
    io-impl: org.apache.iceberg.aws.s3.S3FileIO
```
Hive🔗
To use the AWS module with Hive, you can download the necessary dependencies similar to the Flink example, and then add them to the Hive classpath or add the jars at runtime in the CLI:
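```sql
-- example only: jar names and paths are placeholders for the
-- iceberg-hive-runtime and iceberg-aws-bundle jars downloaded above
ADD JAR /path/to/iceberg-hive-runtime.jar;
ADD JAR /path/to/iceberg-aws-bundle.jar;
```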
With those dependencies, you can register a Glue catalog and create external tables in Hive at runtime in CLI by:
```sql
SET iceberg.engine.hive.enabled=true;
SET hive.vectorized.execution.enabled=false;
SET iceberg.catalog.glue.type=glue;
SET iceberg.catalog.glue.warehouse=s3://my-bucket/my/key/prefix;

-- suppose you have an Iceberg table database_a.table_a created by GlueCatalog
CREATE EXTERNAL TABLE database_a.table_a
STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
TBLPROPERTIES ('iceberg.catalog'='glue');
```
You can also preload the catalog by setting the configurations above in `hive-site.xml`.
Catalogs🔗
There are multiple options that users can choose from to build an Iceberg catalog with AWS.
Glue Catalog🔗
Iceberg enables the use of AWS Glue as the `Catalog` implementation. When used, an Iceberg namespace is stored as a Glue Database, an Iceberg table is stored as a Glue Table, and every Iceberg table version is stored as a Glue TableVersion. You can start using the Glue catalog by specifying the `catalog-impl` as `org.apache.iceberg.aws.glue.GlueCatalog` or by setting `catalog-type` as `glue`, just like what is shown in the enabling AWS integration section above. More details about loading the catalog can be found in individual engine pages, such as Spark and Flink.
Glue Catalog ID🔗
There is a unique Glue metastore in each AWS account and each AWS region. By default, `GlueCatalog` chooses the Glue metastore to use based on the user's default AWS client credential and region setup. You can specify the Glue catalog ID through the `glue.id` catalog property to point to a Glue catalog in a different AWS account. The Glue catalog ID is your numeric AWS account ID. If the Glue catalog is in a different region, you should configure your AWS client to point to the correct region; see more details in AWS client customization.
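For example, assuming the `my_catalog` Spark catalog from the earlier examples, the Glue catalog ID could be pointed at another account with a property like the following (the account ID is a placeholder):

```sh
--conf spark.sql.catalog.my_catalog.glue.id=123456789012
```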
Skip Archive🔗
AWS Glue has the ability to archive older table versions, and a user can roll back the table to any historical version if needed. By default, the Iceberg Glue catalog skips the archival of older table versions. If a user wishes to archive older table versions, they can set `glue.skip-archive` to `false`. Note that for streaming ingestion into Iceberg tables, setting `glue.skip-archive` to `false` will quickly create a lot of Glue table versions. For more details, please read Glue Quotas and the UpdateTable API.
Skip Name Validation🔗
This setting allows users to skip name validation for table names and namespaces. It is recommended to stick to Glue best practices to make sure operations are Hive compatible. This is only added for users that have existing conventions using non-standard characters. When database name and table name validation are skipped, there is no guarantee that downstream systems will all support the names.
Optimistic Locking🔗
By default, Iceberg uses Glue's optimistic locking for concurrent updates to a table. With optimistic locking, each table has a version id. When users retrieve the table metadata, Iceberg records the version id of that table. Users can update the table as long as the version id on the server side remains unchanged. A version mismatch occurs if someone else modified the table before you did, causing an update failure. Iceberg then refreshes metadata and checks if there is a conflict. If there is no commit conflict, the operation will be retried. Optimistic locking guarantees atomic transactions of Iceberg tables in Glue. It also prevents others from accidentally overwriting your changes.
Info
Please use AWS SDK version >= 2.17.131 to leverage Glue's optimistic locking. If the AWS SDK version is below 2.17.131, only an in-memory lock is used. To ensure atomic transactions, you need to set up a DynamoDb Lock Manager.
Warehouse Location🔗
Similar to all other catalog implementations, `warehouse` is a required catalog property to determine the root path of the data warehouse in storage. By default, Glue only allows a warehouse location in S3 because of the use of `S3FileIO`. To store data in a different local or cloud store, the Glue catalog can switch to use `HadoopFileIO` or any custom FileIO by setting the `io-impl` catalog property. Details about this feature can be found in the custom FileIO section.
Table Location🔗
By default, the root location for a table `my_table` of namespace `my_ns` is at `my-warehouse-location/my-ns.db/my-table`. This default root location can be changed at both the namespace and table level.

To use a different path prefix for all tables under a namespace, use the AWS console or any AWS Glue client SDK you like to update the `locationUri` attribute of the corresponding Glue database. For example, you can update the `locationUri` of `my_ns` to `s3://my-ns-bucket`, then any newly created table will have a default root location under the new prefix. For instance, a new table `my_table_2` will have its root location at `s3://my-ns-bucket/my_table_2`.
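For instance, one way to update the `locationUri` is through the AWS CLI; the following is a sketch using the namespace and bucket names from the example above (assuming default credentials and region):

```sh
aws glue update-database --name my_ns \
    --database-input '{"Name": "my_ns", "LocationUri": "s3://my-ns-bucket"}'
```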
To use a completely different root path for a specific table, set the `location` table property to the desired root path. For example, in Spark SQL you can do:
```sql
CREATE TABLE my_catalog.my_ns.my_table (
    id bigint,
    data string,
    category string)
USING iceberg
OPTIONS ('location'='s3://my-special-table-bucket')
PARTITIONED BY (category);
```
For engines like Spark that support the `LOCATION` keyword, the above SQL statement is equivalent to:
```sql
CREATE TABLE my_catalog.my_ns.my_table (
    id bigint,
    data string,
    category string)
USING iceberg
LOCATION 's3://my-special-table-bucket'
PARTITIONED BY (category);
```
DynamoDB Catalog🔗
Iceberg supports using a DynamoDB table to record and manage database and table information.
Configurations🔗
The DynamoDB catalog supports the following configurations:
Property | Default | Description |
---|---|---|
dynamodb.table-name | iceberg | name of the DynamoDB table used by DynamoDbCatalog |
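For example, a minimal Spark 3.5 configuration for the DynamoDB catalog might look like the following sketch (the lock table name is a placeholder; the properties follow the same catalog-property pattern used throughout this page):

```sh
spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.my_catalog.catalog-impl=org.apache.iceberg.aws.dynamodb.DynamoDbCatalog \
    --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket/my/key/prefix \
    --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
    --conf spark.sql.catalog.my_catalog.dynamodb.table-name=my_iceberg_catalog
```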
Internal Table Design🔗
The DynamoDB table is designed with the following columns:
Column | Key | Type | Description |
---|---|---|---|
identifier | partition key | string | table identifier such as `db1.table1`, or string `NAMESPACE` for namespaces |
namespace | sort key | string | namespace name. A global secondary index (GSI) is created with namespace as partition key, identifier as sort key, and no other projected columns |
v | | string | row version, used for optimistic locking |
updated_at | | number | timestamp (millis) of the last update |
created_at | | number | timestamp (millis) of the table creation |
p.<property_key> | | string | Iceberg-defined table properties including `table_type`, `metadata_location` and `previous_metadata_location`, or namespace properties |
This design has the following benefits:
- it avoids a potential hot partition issue if there is heavy write traffic to the tables within the same namespace, because the partition key is at the table level
- namespace operations are clustered in a single partition to avoid affecting table commit operations
- a sort key to partition key reverse GSI is used for the list table operation, and all other operations are single row ops or single partition queries. No full table scan is needed for any operation in the catalog.
- a string UUID version field `v` is used instead of `updated_at` to avoid 2 processes committing at the same millisecond
- multi-row transactions are used for `catalog.renameTable` to ensure idempotency
- properties are flattened as top level columns so that users can add a custom GSI on any property field to customize the catalog. For example, users can store owner information as the table property `owner`, and search tables by owner by adding a GSI on the `p.owner` column.
RDS JDBC Catalog🔗
Iceberg also supports the JDBC catalog, which uses a table in a relational database to manage Iceberg tables. You can configure the JDBC catalog with relational database services like AWS RDS. Read the JDBC integration page for guides and examples about using the JDBC catalog. Read this AWS documentation for more details about configuring the JDBC catalog with IAM authentication.
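As a sketch, a JDBC catalog backed by a PostgreSQL database in RDS might be configured like this (the endpoint, database and credentials are placeholders; see the JDBC integration page for the authoritative property list):

```sh
spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.my_catalog.catalog-impl=org.apache.iceberg.jdbc.JdbcCatalog \
    --conf spark.sql.catalog.my_catalog.uri=jdbc:postgresql://my-rds-endpoint:5432/mydb \
    --conf spark.sql.catalog.my_catalog.jdbc.user=my_user \
    --conf spark.sql.catalog.my_catalog.jdbc.password=my_password \
    --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket/my/key/prefix \
    --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO
```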
Which catalog to choose?🔗
With all the available options, we offer the following guidelines when choosing the right catalog to use for your application:
- if your organization has an existing Glue metastore or plans to use the AWS analytics ecosystem including Glue, Athena, EMR, Redshift and LakeFormation, the Glue catalog provides the easiest integration.
- if your application requires frequent updates to tables or high read and write throughput (e.g. streaming write), the Glue and DynamoDB catalogs provide the best performance through optimistic locking.
- if you would like to enforce access control for tables in a catalog, Glue tables can be managed as an IAM resource, whereas DynamoDB catalog tables can only be managed through item-level permission, which is much more complicated.
- if you would like to query tables based on table property information without the need to scan the entire catalog, the DynamoDB catalog allows you to build secondary indexes for any arbitrary property field and provides efficient query performance.
- if you would like to have the benefit of the DynamoDB catalog while also connecting to Glue, you can enable a DynamoDB stream with a Lambda trigger to asynchronously update your Glue metastore with table information in the DynamoDB catalog.
- if your organization already maintains an existing relational database in RDS or uses serverless Aurora to manage tables, the JDBC catalog provides the easiest integration.
DynamoDb Lock Manager🔗
Amazon DynamoDB can be used by `HadoopCatalog` or `HadoopTables` so that for every commit, the catalog first obtains a lock using a helper DynamoDB table and then tries to safely modify the Iceberg table. This is necessary for a file system-based catalog to ensure atomic transactions in storage services like S3 that do not provide file write mutual exclusion.
This feature requires the following lock related catalog properties:
- Set `lock-impl` as `org.apache.iceberg.aws.dynamodb.DynamoDbLockManager`.
- Set `lock.table` as the DynamoDB table name you would like to use. If the lock table with the given name does not exist in DynamoDB, a new table is created with billing mode set as pay-per-request.
Other lock related catalog properties can also be used to adjust locking behaviors such as the heartbeat interval. For more details, please refer to Lock catalog properties.
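For example, a `HadoopCatalog` with the DynamoDB lock manager could be configured in Spark as follows (a sketch; the warehouse path and lock table name are placeholders):

```sh
spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.my_catalog.type=hadoop \
    --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket/my/key/prefix \
    --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
    --conf spark.sql.catalog.my_catalog.lock-impl=org.apache.iceberg.aws.dynamodb.DynamoDbLockManager \
    --conf spark.sql.catalog.my_catalog.lock.table=my_iceberg_lock_table
```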
S3 FileIO🔗
Iceberg allows users to write data to S3 through `S3FileIO`. `GlueCatalog` uses this `FileIO` by default, and other catalogs can load this `FileIO` using the `io-impl` catalog property.
Progressive Multipart Upload🔗
`S3FileIO` implements a customized progressive multipart upload algorithm to upload data. Data files are uploaded by parts in parallel as soon as each part is ready, and each file part is deleted as soon as its upload process completes. This provides maximized upload speed and minimized local disk usage during uploads. Here are the configurations that users can tune related to this feature:
Property | Default | Description |
---|---|---|
s3.multipart.num-threads | the available number of processors in the system | number of threads to use for uploading parts to S3 (shared across all output streams) |
s3.multipart.part-size-bytes | 32MB | the size of a single part for multipart upload requests |
s3.multipart.threshold | 1.5 | the threshold expressed as a factor times the multipart size at which to switch from uploading using a single put object request to uploading using multipart upload |
s3.staging-dir | java.io.tmpdir property value | the directory to hold temporary files |
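For example, to raise the part size and cap the number of upload threads for the catalog configured earlier (the values are illustrative, not recommendations):

```sh
--conf spark.sql.catalog.my_catalog.s3.multipart.part-size-bytes=67108864 \
--conf spark.sql.catalog.my_catalog.s3.multipart.num-threads=8
```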
S3 Server Side Encryption🔗
`S3FileIO` supports the following S3 server side encryption modes:
- SSE-S3: When you use Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3), each object is encrypted with a unique key. As an additional safeguard, it encrypts the key itself with a master key that it regularly rotates. Amazon S3 server-side encryption uses one of the strongest block ciphers available, 256-bit Advanced Encryption Standard (AES-256), to encrypt your data.
- SSE-KMS: Server-Side Encryption with Customer Master Keys (CMKs) Stored in AWS Key Management Service (SSE-KMS) is similar to SSE-S3, but with some additional benefits and charges for using this service. There are separate permissions for the use of a CMK that provides added protection against unauthorized access of your objects in Amazon S3. SSE-KMS also provides you with an audit trail that shows when your CMK was used and by whom. Additionally, you can create and manage customer managed CMKs or use AWS managed CMKs that are unique to you, your service, and your Region.
- DSSE-KMS: Dual-layer Server-Side Encryption with AWS Key Management Service keys (DSSE-KMS) is similar to SSE-KMS, but applies two layers of encryption to objects when they are uploaded to Amazon S3. DSSE-KMS can be used to fulfill compliance standards that require you to apply multilayer encryption to your data and have full control of your encryption keys.
- SSE-C: With Server-Side Encryption with Customer-Provided Keys (SSE-C), you manage the encryption keys and Amazon S3 manages the encryption, as it writes to disks, and decryption when you access your objects.
To enable server side encryption, use the following configuration properties:
Property | Default | Description |
---|---|---|
s3.sse.type | none | `none`, `s3`, `kms`, `dsse-kms` or `custom` |
s3.sse.key | `aws/s3` for `kms` and `dsse-kms` types, null otherwise | A KMS Key ID or ARN for `kms` and `dsse-kms` types, or a custom base-64 AES256 symmetric key for `custom` type. |
s3.sse.md5 | null | If SSE type is `custom`, this value must be set as the base-64 MD5 digest of the symmetric key to ensure integrity. |
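For example, to enable SSE-KMS with a specific key for the catalog configured earlier (the key ARN is a placeholder):

```sh
--conf spark.sql.catalog.my_catalog.s3.sse.type=kms \
--conf spark.sql.catalog.my_catalog.s3.sse.key=arn:aws:kms:us-east-1:123456789012:key/<KEY_ID>
```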
S3 Access Control List🔗
`S3FileIO` supports S3 access control lists (ACL) for detailed access control. Users can choose the ACL level by setting the `s3.acl` property. For more details, please read the S3 ACL Documentation.
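For example, assuming the canned ACL `bucket-owner-full-control` fits your bucket setup, it could be applied with:

```sh
--conf spark.sql.catalog.my_catalog.s3.acl=bucket-owner-full-control
```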
Object Store File Layout🔗
S3 and many other cloud storage services throttle requests based on object prefix. Data stored in S3 with a traditional Hive storage layout can face S3 request throttling as objects are stored under the same file path prefix.
Iceberg by default uses the Hive storage layout but can be switched to use the `ObjectStoreLocationProvider`. With `ObjectStoreLocationProvider`, a deterministic hash is generated for each stored file, with the hash appended directly after the `write.data.path`. This ensures files written to S3 are equally distributed across multiple prefixes in the S3 bucket, resulting in minimized throttling and maximized throughput for S3-related IO operations. When using `ObjectStoreLocationProvider`, having a shared `write.data.path` across your Iceberg tables will improve performance.
For more information on how S3 scales API QPS, check out the 2018 re:Invent session on Best Practices for Amazon S3 and Amazon S3 Glacier. At 53:39 it covers how S3 scales/partitions, and at 54:50 it discusses the 30-60 minute wait time before new partitions are created.
To use the `ObjectStoreLocationProvider`, add `'write.object-storage.enabled'=true` in the table's properties. Below is an example Spark SQL command to create a table using the `ObjectStoreLocationProvider`:
```sql
CREATE TABLE my_catalog.my_ns.my_table (
    id bigint,
    data string,
    category string)
USING iceberg
OPTIONS (
    'write.object-storage.enabled'=true,
    'write.data.path'='s3://my-table-data-bucket/my_table')
PARTITIONED BY (category);
```
We can then insert a single row into this new table:
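```sql
-- example row; the category value determines the partition path shown below
INSERT INTO my_catalog.my_ns.my_table VALUES (1, 'Pizza', 'orders');
```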
This will write the data to S3 with a 20-bit base2 hash (`01010110100110110010`) appended directly after the `write.data.path`, ensuring reads to the table are spread evenly across S3 bucket prefixes and improving performance. The previously provided base64 hash was updated to base2 in order to provide improved auto-scaling behavior on S3 General Purpose Buckets.
As part of this update, we have also divided the entropy into multiple directories in order to improve the efficiency of the orphan clean up process for Iceberg, since directories are used as a means to divide the work across workers for faster traversal. You can see from the example below that we divide the hash to create 4-bit directories with a depth of 3 and attach the final part of the hash to the end.
s3://my-table-data-bucket/my_ns.db/my_table/0101/0110/1001/10110010/category=orders/00000-0-5affc076-96a4-48f2-9cd2-d5efbc9f0c94-00001.parquet
Note that the path resolution logic for `ObjectStoreLocationProvider` is `write.data.path` then `<tableLocation>/data`.
However, for the older versions up to 0.12.0, the logic is as follows:
- before 0.12.0, `write.object-storage.path` must be set.
- at 0.12.0, `write.object-storage.path` then `write.folder-storage.path` then `<tableLocation>/data`.
- at 2.0.0, `write.object-storage.path` and `write.folder-storage.path` will be removed.
For more details, please refer to the LocationProvider Configuration section.
We have also added a new table property `write.object-storage.partitioned-paths` that, if set to false (default is true), will omit the partition values from the file path. Iceberg does not need these values in the file path, and setting this value to false can further reduce the key size. In this case, we also append the final 8 bits of entropy directly to the file name. An inserted key would look like the following with this config set; note that `category=orders` is removed:
s3://my-table-data-bucket/my_ns.db/my_table/1101/0100/1011/00111010-00000-0-5affc076-96a4-48f2-9cd2-d5efbc9f0c94-00001.parquet
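For example, the partition-value omission can be enabled on an existing table with a property update; a sketch in Spark SQL:

```sql
ALTER TABLE my_catalog.my_ns.my_table
SET TBLPROPERTIES ('write.object-storage.partitioned-paths'='false');
```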
S3 Retries🔗
Workloads which encounter S3 throttling should persistently retry, with exponential backoff, to make progress while S3 automatically scales. We provide the configurations below to adjust S3 retries for this purpose. For workloads that encounter throttling and fail due to retry exhaustion, we recommend setting the retry count to 32 in order to allow S3 to auto-scale. Note that for workloads with exceptionally high throughput against tables that S3 has not yet scaled, it may be necessary to increase the retry count further.
Property | Default | Description |
---|---|---|
s3.retry.num-retries | 5 | Number of times to retry S3 operations. Recommended 32 for high-throughput workloads. |
s3.retry.min-wait-ms | 2s | Minimum wait time to retry a S3 operation. |
s3.retry.max-wait-ms | 20s | Maximum wait time to retry a S3 read operation. |
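For example, to apply the recommended retry count for a high-throughput workload on the catalog configured earlier:

```sh
--conf spark.sql.catalog.my_catalog.s3.retry.num-retries=32
```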
S3 Strong Consistency🔗
In November 2020, S3 announced strong consistency for all read operations, and Iceberg has been updated to fully leverage this feature. There is no redundant consistency wait and check which might negatively impact performance during IO operations.
Hadoop S3A FileSystem🔗
Before `S3FileIO` was introduced, many Iceberg users chose to use `HadoopFileIO` to write data to S3 through the S3A FileSystem. As introduced in the previous sections, `S3FileIO` adopts the latest AWS clients and S3 features for optimized security and performance and is thus recommended for S3 use cases rather than the S3A FileSystem.

`S3FileIO` writes data with the `s3://` URI scheme, but it is also compatible with schemes written by the S3A FileSystem. This means that for any table manifests containing `s3a://` or `s3n://` file paths, `S3FileIO` is still able to read them. This feature allows people to easily switch from S3A to `S3FileIO`.
If for any reason you have to use S3A, here are the instructions:
- To store data using S3A, specify the `warehouse` catalog property to be an S3A path, e.g. `s3a://my-bucket/my-warehouse`.
- For `HiveCatalog`, to also store metadata using S3A, specify the Hadoop config property `hive.metastore.warehouse.dir` to be an S3A path.
- Add hadoop-aws as a runtime dependency of your compute engine.
- Configure AWS settings based on the hadoop-aws documentation (make sure you check the version; S3A configuration varies a lot based on the version you use).
S3 Write Checksum Verification🔗
To ensure integrity of uploaded objects, checksum validations for S3 writes can be turned on by setting the catalog property `s3.checksum-enabled` to `true`. This is turned off by default.
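For example, for the catalog configured earlier:

```sh
--conf spark.sql.catalog.my_catalog.s3.checksum-enabled=true
```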
S3 Tags🔗
Custom tags can be added to S3 objects while writing and deleting. For example, to write S3 tags with Spark 3.5, you can start the Spark SQL shell with:
```sh
spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket/my/key/prefix \
    --conf spark.sql.catalog.my_catalog.type=glue \
    --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
    --conf spark.sql.catalog.my_catalog.s3.write.tags.my_key1=my_val1 \
    --conf spark.sql.catalog.my_catalog.s3.write.tags.my_key2=my_val2
```
For the above example, the objects in S3 will be saved with tags: `my_key1=my_val1` and `my_key2=my_val2`. Note that the specified write tags will be saved only during object creation.

When the catalog property `s3.delete-enabled` is set to `false`, the objects are not hard-deleted from S3. This is expected to be used in combination with S3 delete tagging, so objects are tagged and removed using an S3 lifecycle policy. The property is set to `true` by default.
With the `s3.delete.tags` config, objects are tagged with the configured key-value pairs before deletion. Users can configure a tag-based object lifecycle policy at the bucket level to transition objects to different tiers. For example, to add S3 delete tags with Spark 3.5, you can start the Spark SQL shell with:
```sh
spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.my_catalog.warehouse=s3://iceberg-warehouse/s3-tagging \
    --conf spark.sql.catalog.my_catalog.type=glue \
    --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
    --conf spark.sql.catalog.my_catalog.s3.delete.tags.my_key3=my_val3 \
    --conf spark.sql.catalog.my_catalog.s3.delete-enabled=false
```
For the above example, the objects in S3 will be saved with the tag `my_key3=my_val3` before deletion. Users can also use the catalog property `s3.delete.num-threads` to specify the number of threads to be used for adding delete tags to the S3 objects.
When the catalog properties `s3.write.table-tag-enabled` and `s3.write.namespace-tag-enabled` are set to `true`, the objects in S3 will be saved with the tags `iceberg.table=<table-name>` and `iceberg.namespace=<namespace-name>`. Users can define access and data retention policies per namespace or table based on these tags. For example, to write table and namespace names as S3 tags with Spark 3.5, you can start the Spark SQL shell with:
```sh
spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.my_catalog.warehouse=s3://iceberg-warehouse/s3-tagging \
    --conf spark.sql.catalog.my_catalog.type=glue \
    --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
    --conf spark.sql.catalog.my_catalog.s3.write.table-tag-enabled=true \
    --conf spark.sql.catalog.my_catalog.s3.write.namespace-tag-enabled=true
```
S3 Access Points🔗
Access Points can be used to perform S3 operations by specifying a mapping of bucket to access points. This is useful for multi-region access, cross-region access, disaster recovery, etc.
For using cross-region access points, we need to additionally set the `s3.use-arn-region-enabled` catalog property to `true` to enable `S3FileIO` to make cross-region calls; it's not required for same-region or multi-region access points.
For example, to use S3 access-point with Spark 3.5, you can start the Spark SQL shell with:
```sh
spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket2/my/key/prefix \
    --conf spark.sql.catalog.my_catalog.type=glue \
    --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
    --conf spark.sql.catalog.my_catalog.s3.use-arn-region-enabled=false \
    --conf spark.sql.catalog.my_catalog.s3.access-points.my-bucket1=arn:aws:s3::<ACCOUNT_ID>:accesspoint/<MRAP_ALIAS> \
    --conf spark.sql.catalog.my_catalog.s3.access-points.my-bucket2=arn:aws:s3::<ACCOUNT_ID>:accesspoint/<MRAP_ALIAS>
```
For the above example, all S3 operations for the `my-bucket1` and `my-bucket2` buckets will use the `arn:aws:s3::<ACCOUNT_ID>:accesspoint/<MRAP_ALIAS>` access point. For more details on using access points, please refer to Using access points with compatible Amazon S3 operations and the Sample notebook.
S3 Access Grants🔗
S3 Access Grants can be used to grant access to S3 data using IAM Principals. In order to enable S3 Access Grants to work in Iceberg, you can set the `s3.access-grants.enabled` catalog property to `true` after you add the S3 Access Grants Plugin jar to your classpath. A link to the Maven listing for this plugin can be found here.
In addition, we allow the fallback-to-IAM configuration, which allows you to fall back to using your IAM role (and its permission sets directly) to access your S3 data in the case that S3 Access Grants is unable to authorize your S3 call. This can be done using the `s3.access-grants.fallback-to-iam` boolean catalog property. By default, this property is set to `false`.
For example, to add the S3 Access Grants Integration with Spark 3.5, you can start the Spark SQL shell with:
```sh
spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket2/my/key/prefix \
    --conf spark.sql.catalog.my_catalog.catalog-impl=org.apache.iceberg.aws.glue.GlueCatalog \
    --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
    --conf spark.sql.catalog.my_catalog.s3.access-grants.enabled=true \
    --conf spark.sql.catalog.my_catalog.s3.access-grants.fallback-to-iam=true
```
For more details on using S3 Access Grants, please refer to Managing access with S3 Access Grants.
S3 Cross-Region Access🔗
S3 cross-region bucket access can be turned on by setting the catalog property `s3.cross-region-access-enabled` to `true`. This is turned off by default to avoid adding latency to the first S3 API call.
For example, to enable S3 Cross-Region bucket access with Spark 3.5, you can start the Spark SQL shell with:
```sh
spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket2/my/key/prefix \
    --conf spark.sql.catalog.my_catalog.type=glue \
    --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
    --conf spark.sql.catalog.my_catalog.s3.cross-region-access-enabled=true
```
For more details, please refer to Cross-Region access for Amazon S3.
S3 Acceleration🔗
S3 Acceleration can be used to speed up transfers to and from Amazon S3 by as much as 50-500% for long-distance transfer of larger objects.
To use S3 Acceleration, we need to set the `s3.acceleration-enabled` catalog property to `true` to enable `S3FileIO` to make accelerated S3 calls.
For example, to use S3 Acceleration with Spark 3.5, you can start the Spark SQL shell with:
```sh
spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket2/my/key/prefix \
    --conf spark.sql.catalog.my_catalog.type=glue \
    --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
    --conf spark.sql.catalog.my_catalog.s3.acceleration-enabled=true
```
For more details on using S3 Acceleration, please refer to Configuring fast, secure file transfers using Amazon S3 Transfer Acceleration.
S3 Analytics Accelerator🔗
The Analytics Accelerator Library for Amazon S3 helps you accelerate access to Amazon S3 data from your applications. This open-source solution reduces processing times and compute costs for your data analytics workloads.
In order to enable the S3 Analytics Accelerator Library to work in Iceberg, you can set the `s3.analytics-accelerator.enabled` catalog property to `true`. By default, this property is set to `false`.
For example, to use S3 Analytics Accelerator with Spark, you can start the Spark SQL shell with:
```sh
spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket2/my/key/prefix \
    --conf spark.sql.catalog.my_catalog.type=glue \
    --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
    --conf spark.sql.catalog.my_catalog.s3.analytics-accelerator.enabled=true
```
The Analytics Accelerator Library can work with either the S3 CRT client or the S3AsyncClient. The library recommends that you use the S3 CRT client due to its enhanced connection pool management and higher throughput on downloads.
Client Configuration🔗
Property | Default | Description |
---|---|---|
s3.crt.enabled | true | Controls if the S3 Async clients should be created using CRT |
s3.crt.max-concurrency | 500 | Max concurrency for S3 CRT clients |
Additional library specific configurations are organized into the following sections:
Logical IO Configuration🔗
Property | Default | Description |
---|---|---|
s3.analytics-accelerator.logicalio.prefetch.footer.enabled | true | Controls whether footer prefetching is enabled |
s3.analytics-accelerator.logicalio.prefetch.page.index.enabled | true | Controls whether page index prefetching is enabled |
s3.analytics-accelerator.logicalio.prefetch.file.metadata.size | 32KB | Size of metadata to prefetch for regular files |
s3.analytics-accelerator.logicalio.prefetch.large.file.metadata.size | 1MB | Size of metadata to prefetch for large files |
s3.analytics-accelerator.logicalio.prefetch.file.page.index.size | 1MB | Size of page index to prefetch for regular files |
s3.analytics-accelerator.logicalio.prefetch.large.file.page.index.size | 8MB | Size of page index to prefetch for large files |
s3.analytics-accelerator.logicalio.large.file.size | 1GB | Threshold to consider a file as large |
s3.analytics-accelerator.logicalio.small.objects.prefetching.enabled | true | Controls prefetching for small objects |
s3.analytics-accelerator.logicalio.small.object.size.threshold | 3MB | Size threshold for small object prefetching |
s3.analytics-accelerator.logicalio.parquet.metadata.store.size | 45 | Size of the parquet metadata store |
s3.analytics-accelerator.logicalio.max.column.access.store.size | 15 | Maximum size of column access store |
s3.analytics-accelerator.logicalio.parquet.format.selector.regex | ^.*.(parquet\|par)$ | Regex pattern to identify parquet files |
s3.analytics-accelerator.logicalio.prefetching.mode | ROW_GROUP | Prefetching mode (valid values: `OFF`, `ALL`, `ROW_GROUP`, `COLUMN_BOUND`) |
Physical IO Configuration🔗
Property | Default | Description |
---|---|---|
s3.analytics-accelerator.physicalio.metadatastore.capacity | 50 | Capacity of the metadata store |
s3.analytics-accelerator.physicalio.blocksizebytes | 8MB | Size of blocks for data transfer |
s3.analytics-accelerator.physicalio.readaheadbytes | 64KB | Number of bytes to read ahead |
s3.analytics-accelerator.physicalio.maxrangesizebytes | 8MB | Maximum size of range requests |
s3.analytics-accelerator.physicalio.partsizebytes | 8MB | Size of individual parts for transfer |
s3.analytics-accelerator.physicalio.sequentialprefetch.base | 2.0 | Base factor for sequential prefetch sizing |
s3.analytics-accelerator.physicalio.sequentialprefetch.speed | 1.0 | Speed factor for sequential prefetch growth |
Telemetry Configuration🔗
Property | Default | Description |
---|---|---|
s3.analytics-accelerator.telemetry.level | STANDARD | Telemetry detail level (valid values: `CRITICAL`, `STANDARD`, `VERBOSE`) |
s3.analytics-accelerator.telemetry.std.out.enabled | false | Enable stdout telemetry output |
s3.analytics-accelerator.telemetry.logging.enabled | true | Enable logging telemetry output |
s3.analytics-accelerator.telemetry.aggregations.enabled | false | Enable telemetry aggregations |
s3.analytics-accelerator.telemetry.aggregations.flush.interval.seconds | -1 | Interval to flush aggregated telemetry |
s3.analytics-accelerator.telemetry.logging.level | INFO | Log level for telemetry |
s3.analytics-accelerator.telemetry.logging.name | com.amazon.connector.s3.telemetry | Logger name for telemetry |
s3.analytics-accelerator.telemetry.format | default | Telemetry output format (valid values: `json`, `default`) |
Object Client Configuration🔗
Property | Default | Description |
---|---|---|
s3.analytics-accelerator.useragentprefix | null | Custom prefix to add to the `User-Agent` string in S3 requests |
S3 Dual-stack🔗
S3 Dual-stack allows a client to access an S3 bucket through a dual-stack endpoint. When clients request a dual-stack endpoint, the bucket URL resolves to an IPv6 address if possible, otherwise it falls back to IPv4.
To use S3 Dual-stack, we need to set the `s3.dualstack-enabled` catalog property to `true` to enable `S3FileIO` to make dual-stack S3 calls.
For example, to use S3 Dual-stack with Spark 3.5, you can start the Spark SQL shell with:
```sh
spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket2/my/key/prefix \
    --conf spark.sql.catalog.my_catalog.type=glue \
    --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
    --conf spark.sql.catalog.my_catalog.s3.dualstack-enabled=true
```
For more details on using S3 Dual-stack, please refer to Using dual-stack endpoints from the AWS CLI and the AWS SDKs.
AWS Client Customization🔗
Many organizations have customized their way of configuring AWS clients with their own credential provider, access proxy, retry strategy, etc. Iceberg allows users to plug in their own implementation of `org.apache.iceberg.aws.AwsClientFactory` by setting the `client.factory` catalog property.
Cross-Account and Cross-Region Access🔗
It is a common use case for organizations to have a centralized AWS account for the Glue metastore and S3 buckets, and use different AWS accounts and regions for different teams to access those resources. In this case, a cross-account IAM role is needed to access those centralized resources. Iceberg provides an AWS client factory `AssumeRoleAwsClientFactory` to support this common use case. This also serves as an example for users who would like to implement their own AWS client factory.
This client factory has the following configurable catalog properties:
Property | Default | Description |
---|---|---|
client.assume-role.arn | null, requires user input | ARN of the role to assume, e.g. arn:aws:iam::123456789:role/myRoleToAssume |
client.assume-role.region | null, requires user input | All AWS clients except the STS client will use the given region instead of the default region chain |
client.assume-role.external-id | null | An optional external ID |
client.assume-role.timeout-sec | 1 hour | Timeout of each assume role session. At the end of the timeout, a new set of role session credentials will be fetched through an STS client. |
By using this client factory, an STS client is initialized with the default credential and region to assume the specified role. The Glue, S3 and DynamoDB clients are then initialized with the assume-role credential and region to access resources. Here is an example to start the Spark SQL shell with this client factory:
```sh
spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.9.1,org.apache.iceberg:iceberg-aws-bundle:1.9.1 \
    --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket/my/key/prefix \
    --conf spark.sql.catalog.my_catalog.type=glue \
    --conf spark.sql.catalog.my_catalog.client.factory=org.apache.iceberg.aws.AssumeRoleAwsClientFactory \
    --conf spark.sql.catalog.my_catalog.client.assume-role.arn=arn:aws:iam::123456789:role/myRoleToAssume \
    --conf spark.sql.catalog.my_catalog.client.assume-role.region=ap-northeast-1
```
HTTP Client Configurations🔗
AWS clients support two types of HTTP Client: the URL Connection HTTP Client and the Apache HTTP Client. By default, AWS clients use the Apache HTTP Client to communicate with the service. This HTTP client supports various functionalities and customized settings, such as the expect-continue handshake and TCP KeepAlive, at the cost of an extra dependency and additional startup latency. In contrast, the URL Connection HTTP Client optimizes for minimum dependencies and startup latency but supports less functionality than other implementations.

For more configuration details, see the URL Connection HTTP Client Configurations and Apache HTTP Client Configurations sections.
Configurations for the HTTP client can be set via catalog properties. Below is an overview of available configurations:
Property | Default | Description |
---|---|---|
http-client.type | apache | Type of HTTP Client. `urlconnection`: URL Connection HTTP Client; `apache`: Apache HTTP Client |
http-client.proxy-endpoint | null | An optional proxy endpoint to use for the HTTP client. |
URL Connection HTTP Client Configurations🔗
URL Connection HTTP Client has the following configurable properties:
Property | Default | Description |
---|---|---|
http-client.urlconnection.socket-timeout-ms | null | An optional socket timeout in milliseconds |
http-client.urlconnection.connection-timeout-ms | null | An optional connection timeout in milliseconds |
Users can use catalog properties to override the defaults. For example, to configure the socket timeout for URL Connection HTTP Client when starting a spark shell, one can add:
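```sh
# illustrative value for a catalog named my_catalog
--conf spark.sql.catalog.my_catalog.http-client.urlconnection.socket-timeout-ms=80
```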
Apache HTTP Client Configurations🔗
Apache HTTP Client has the following configurable properties:
Property | Default | Description |
---|---|---|
http-client.apache.socket-timeout-ms | null | An optional socket timeout in milliseconds |
http-client.apache.connection-timeout-ms | null | An optional connection timeout in milliseconds |
http-client.apache.connection-acquisition-timeout-ms | null | An optional connection acquisition timeout in milliseconds |
http-client.apache.connection-max-idle-time-ms | null | An optional connection max idle time in milliseconds |
http-client.apache.connection-time-to-live-ms | null | An optional connection time to live in milliseconds |
http-client.apache.expect-continue-enabled | null, disabled by default | An optional true/false setting that controls whether expect continue is enabled |
http-client.apache.max-connections | null | An optional max connections in integer |
http-client.apache.tcp-keep-alive-enabled | null, disabled by default | An optional true/false setting that controls whether tcp keep alive is enabled |
http-client.apache.use-idle-connection-reaper-enabled | null, enabled by default | An optional true/false setting that controls whether the idle connection reaper is used |
Users can use catalog properties to override the defaults. For example, to configure the max connections for Apache HTTP Client when starting a spark shell, one can add:
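```sh
# illustrative value for a catalog named my_catalog
--conf spark.sql.catalog.my_catalog.http-client.apache.max-connections=100
```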
Run Iceberg on AWS🔗
Amazon Athena🔗
Amazon Athena provides a serverless query engine that could be used to perform read, write, update and optimization tasks against Iceberg tables. More details could be found here.
Amazon EMR🔗
Amazon EMR can provision clusters with Spark (EMR 6 for Spark 3, EMR 5 for Spark 2), Hive, Flink, and Trino that can run Iceberg.
Starting with EMR version 6.5.0, EMR clusters can be configured to have the necessary Apache Iceberg dependencies installed without requiring bootstrap actions. Please refer to the official documentation on how to create a cluster with Iceberg installed.

For versions before 6.5.0, you can use a bootstrap action similar to the following to pre-install all necessary dependencies:
```bash
#!/bin/bash

ICEBERG_VERSION=1.9.1
MAVEN_URL=https://repo1.maven.org/maven2
ICEBERG_MAVEN_URL=$MAVEN_URL/org/apache/iceberg

# NOTE: this is just an example shared class path between Spark and Flink,
# please choose a proper class path for production.
LIB_PATH=/usr/share/aws/aws-java-sdk/

ICEBERG_PACKAGES=(
  "iceberg-spark-runtime-3.5_2.12"
  "iceberg-flink-runtime"
  "iceberg-aws-bundle"
)

install_dependencies () {
  install_path=$1
  download_url=$2
  version=$3
  # drop the first three arguments so that only package names remain
  shift 3
  pkgs=("$@")
  for pkg in "${pkgs[@]}"; do
    sudo wget -P $install_path $download_url/$pkg/$version/$pkg-$version.jar
  done
}

install_dependencies $LIB_PATH $ICEBERG_MAVEN_URL $ICEBERG_VERSION "${ICEBERG_PACKAGES[@]}"
```
AWS Glue🔗
AWS Glue provides a serverless data integration service that could be used to perform read, write and update tasks against Iceberg tables. More details could be found here.
AWS EKS🔗
AWS Elastic Kubernetes Service (EKS) can be used to start any Spark, Flink, Hive, Presto or Trino clusters to work with Iceberg. Search the Iceberg blogs page for tutorials around running Iceberg with Docker and Kubernetes.
Amazon Kinesis🔗
Amazon Kinesis Data Analytics provides a platform to run fully managed Apache Flink applications. You can include Iceberg in your application Jar and run it in the platform.
AWS Redshift🔗
AWS Redshift Spectrum or Redshift Serverless supports querying Apache Iceberg tables cataloged in the AWS Glue Data Catalog.
Amazon Data Firehose🔗
You can use Firehose to directly deliver streaming data to Apache Iceberg tables in Amazon S3. With this feature, you can route records from a single stream into different Apache Iceberg tables, and automatically apply insert, update, and delete operations to records in those tables. This feature requires using the AWS Glue Data Catalog.