Movatterモバイル変換


[0]ホーム

URL:


Skip to content
DEV Community
Log in Create account

DEV Community

Aerospike profile imageEugene R.
Eugene R. forAerospike

Posted on • Originally published atdeveloper.aerospike.com

     

Aerospike Trino Connector - Chapter Two

Photo byDanny Sleeuwenhoek onUnsplash

Speed up your SQL queries using Aerospike secondary indexes

WithAerospike Server 6.0, we have enhanced secondary index queries to allow for querying by partition, throttling, etc. The Trino connector now leverages this functionality to accelerate your secondary key queries at scale.

What is a secondary index?

Asecondary index is a data structure used to quickly locate all the records in a namespace, or a set within it, based on a bin value in the record. When a value is updated in the indexed record, the secondary index automatically updates. You can retrieve records whose indexed value matches specified criteria using a secondary index query.

A query in Trino against a non-primary key bin that does not have a secondary index would invariably trigger a set (table) scan leading to an inefficient query. You can imagine the performance penalty that you would incur when dealing with a billion record table.Creating a secondary index (sindex) on a high-cardinality bin can help significantly speed up your Trino queries.

As shown in Figure 1, Aerospike secondary indexes are stored in DRAM for fast look-up, built on every partition in each cluster node, and are co-located with the primary index. Each secondary index entry contains only references to records local to the node.

Figure 1: Secondary index query

How do I get started?

Creating a secondary index on a bin in Aerospike is easy. You can use an Aerospike tool, such as asadm, or the API to dynamically create and remove indexes based on bins and data types you want to index. For an indexed bin, updating the record to include the bin updates the index. Seehow to create a secondary index for more information.

The Trino connector automatically analyzes the query and applies a secondary index to it, if one is available. If you know the cardinality of the secondary indexes, you can declare which sindex should be used for your query using thesindex_namesession property. The__sindex table is created for each schema and provides details on available secondary indexes. You can change its name using theaerospike.index-table-name configuration property.

Here is an example of secondary indexes that we’ll see in action later:

We recommend that you provide the sindex to use in the query. The statement below shows an example of declaring a sindex to use:

set session aerospike.sindex_name=idx1;

If you do not declare a sindex for your query when multiple sindexes are available, the connector will pick the sindex with the highest lexical order from the list of available secondary indexes.

Will this really improve my Trino SQL query performance?

Prior to supporting secondary indexes, each query that was not a primary key search ended up with a full table scan on the Aerospike side. To illustrate my point, I will use a large (~20M records) data set to compare query response times with and without a secondary index.

A query for a particularpkup_datetime:

With a string secondary index:

Without a secondary index:

Another example of a range query on theid column:

Using a numeric secondary index:

Without a secondary index:

The response time is in seconds and we can see that the performance gain is significant. The primary index (PI) query was ~80 times slower than the secondary index (SI) query.

Summary

Aerospike secondary indexes can make a huge performance difference to your SQL queries. Make sure you read the official documentation regarding the compatibility and installation.

And finally, process your data faster than ever with the newAerospike Connect for Presto!

Top comments(0)

Subscribe
pic
Create template

Templates let you quickly answer FAQs or store snippets for re-use.

Dismiss

Are you sure you want to hide this comment? It will become hidden in your post, but will still be visible via the comment'spermalink.

For further actions, you may consider blocking this person and/orreporting abuse

NoSQL Data Platfom

Are you developing on Aerospike and want to contribute to our dev community?
Our community repo is on GitHub and we’d love to hear how you’re using Aerospike by contributing articles to our community.

More fromAerospike

DEV Community

We're a place where coders share, stay up-to-date and grow their careers.

Log in Create account

[8]ページ先頭

©2009-2025 Movatter.jp