Movatterモバイル変換


[0]ホーム

URL:


Skip to content
DEV Community
Log in Create account

DEV Community

Cover image for Amazon DynamoDB: pointer strategy
AWS Community Builders  profile imagePawel Zubkiewicz
Pawel Zubkiewicz forAWS Community Builders

Posted on

     

Amazon DynamoDB: pointer strategy

The transition from relational databases to the world of NoSQL is not easy. Literally, everything is different. Sometimes this can be so overwhelming that you don't know how to solve a problem so trivial that virtually nonexistent in the world of SQL databases.

Today, I want to tell you about such a problem.

Imagine that in your database you storeOrder records in the table. And you need information about the last (newest) order number. (Why? It's not relevant now.)

Last Order number in SQL

When using SQL this problem is trivial. All you need to do is write:

SELECTorderIdFROMOrdersORDERBYorderIdDESCLIMIT1;
Enter fullscreen modeExit fullscreen mode

The query is simple and very efficient (especially when we have indexes). Well done SQL 👏

Last Order number in DynamoDB

And this is where it starts to be problematic. I won't give you a solution right away because to fully understand it some knowledge about DynamoDB is required.

A bit of theory about DynamoDB

Let's start by saying that the equivalent of SQL'sSELECT in DynamoDB are two commandsscan andquery. Both are used to retrieve information from a table (in DynamoDB we only have the concept of a table, not the entire database). However, they are significantly different from each other. Thescan method scans the contents of thewhole table and returns it to us as a collection of elements.

For the following tablescan returns6 items because there are six orders in the whole table.
Table 1

When usingscan we could retrieve information about all orders and process them in thecode, e.g. in the Lambda function, filter the data and return the last order number.

This approach is obviously theworst possible, as we are retrieving much more data from the database than we want (collection of Orders vs single Order number). This will affect the running time of our application, but also the cost becausein DynamoDB we pay for each query / amount of data returned. To make matters worse, when dealing with large amounts of data, we have to page through the results.

So let us consider other options.

Thequery method is used to retrieve data from alocal collection, that is, elements in the database that share a common (same value)Partition Key.

I'm trying to make this article concise and simple, so I'm not going to discuss how Amazon DynamoDB is internally designed. However, I strongly encourage you to learn it on your own, because it is simply a good architecture school and it will help you understand why this database is used the way it is.

In our order table (above), each order has a differentPartition Key value, so using thequery method simply doesn't make sense here since we don't have local collections.

On a side note, a local collection for an Order could be a list of goods in the single order. In that case, each item'sPrimary Key would consist of two values (a composite key):

  1. Partition Key
  2. Sort Key

The table would change to the state presented below. Then callingquery with the parameterorder#2 would return us two items because in this order someone bought two products: DataLake Training and Consulting.

Table 2

However, this is a different data access pattern in DynamoDB, and I've presented it to you as a side note.

Let's move back to the main problem.

What can we do since neitherscan norquery are suitable to return the latest Order number?

The GetItem method

There is alsoGetItem, a method that returns us a single item from the database when given a specificPrimary Key value. It's kind of likegetById(id) or in SQL:

SELECT*FROMOrdersWHEREorderId='orderNumber';
Enter fullscreen modeExit fullscreen mode

That's cool, but how could we get thisorderNumber if this is what we want to pull from the database in the first place? 🤔

Pointer strategy

This is where the pointer strategy comes in.

We can replace an unknown value with something known, some constant, and refer to the database through that constant. This element with the constantPartition Key will beonly one in the entire table. Thus we can use it as a pointer to store the value of the most recent order.

To our first table, we add another element with aPartition Key ofalways equal toLAST_ORDER (ourconstant string). Such an element has one attribute namedOrderId with the value of the most recent order. Each time we add a new Order to the table, we also update the value of theLAST_ORDER element to the value of the neworderId.

Table 3

Now, all we need to do is call a simplegetItem('LAST_ORDER') method that returns the last order number for us in anoptimal way.

Advanced techniques

Taking into account that the DynamoDB table is usually part of a larger system, let's consider what happens when multiple processes write to the database in parallel (directly or through the SQS queue - doesn't matter). Certainly, at some point it will happen that the last item written to the database, will not be the last order. Then theLAST_ORDER pointer will wrongly point to an older Order.

We could solve this problem by using theSQS FIFO queue, but there is a much simpler and cheaper solution for this.

We just need to apply aConditionExpression when saving a new version of theLAST_ORDER element, which will check if the neworderId value is greater than the one currently stored in the database. If it is it will update it, if not, it will not.

Thanks to this, using one write to the database, we can update the value without downloading and checking iton the code side. In addition, this method isidempotent (in case we get the same event multiple times, it will not change the database state on subsequent calls).

How to do it?

It's time for some code. My sample implementation in #"http://www.w3.org/2000/svg" width="20px" height="20px" viewbox="0 0 24 24">Enter fullscreen modeExit fullscreen mode

Code listing explanation:

  • Line 4 - the Pointer class implements a method that converts aPointer object into the JSON that the DynamoDB API expects. Below is the implementation of this method.
  • Line 7 - conditional expression: when to write to the database and when not to? Theattribute_not_exists(#orderId) is needed so that when the database is empty, the code also executes and writes the item for the first time. From this point on, only the second part#orderId < :newId of the condition will have meaning.
  • Line 19 - if the condition is not met, the DynamoDB API returns aConditionalCheckFailedException, which in our case is expected sooner or later.

And here is aPointer class implementation.

classPointer{constructor({orderId,createdAt=newDate()}){this.orderId=parseInt(orderId)this.createdAt=createdAtinstanceofDate?createdAt:newDate(createdAt)}key(){return{PK:{S:'LAST_ORDER'}}}staticfromItem(item){returnnewPointer({orderId:item.orderId.N,createdAt:item.createdAt.S})}toItem(){return{...this.key(),orderId:{N:this.orderId.toString()},createdAt:{S:this.createdAt.toISOString()},}}}
Enter fullscreen modeExit fullscreen mode

Please note, the style of this class isripped off from Alex DeBrie 😃

He is an author ofThe DynamoDB Book which is the best thing to teach yourself how to use this database. Since I read that book I stared using many advanced techniques that he described. I highly recommend his book.

Summary

I hope this simplified, but taken from my real system, example helped you better understand how to use DynamoDB. I've only covered a small area of knowledge, but at the same time I've shown you how to solve a specific problem you may encounter in your projects.

I'm aware that DynamoDB has a steep learning curve - there's no denying that - but honestly,I can't imagine serverless systems without Amazon DynamoDB database. In 9 out of 10 cases, I would choose DynamoDB over AWS RDS (including Aurora Serverless).

Top comments(0)

Subscribe
pic
Create template

Templates let you quickly answer FAQs or store snippets for re-use.

Dismiss

Are you sure you want to hide this comment? It will become hidden in your post, but will still be visible via the comment'spermalink.

For further actions, you may consider blocking this person and/orreporting abuse

AWS Community Builders
Build On!

Would you like to become an AWS Community Builder? Learn more about the program and apply to join when applications are open next.

Learn more

More fromAWS Community Builders

Serverless Meets Secure Networking: Payment Gateway building block with VPC Lattice
#webdev#aws#serverless#vpclattice
Serverless FastAPI Testing: Use Moto and Just Mock It!
#python#fastapi#pytest#dynamodb
How to Crater Your Database, Part Five - Summary
#aws#serverless#dynamodb#architecture
DEV Community

We're a place where coders share, stay up-to-date and grow their careers.

Log in Create account

[8]ページ先頭

©2009-2025 Movatter.jp