Posted onApr 6, 2022

Database Indexes, please be careful when using it!

#database #sql

Lets do disclaimer before we start 😉.

This is my current understanding of Database Index.
If there is any mistake that write that lead to misunderstanding, please let me know and I will update this post ASAP.

Let's begin 🚀

What is database index?

Database index is a built-in tools that will help us for searching data inside database. Index will quickly locate record of our queries.

The problem

To help to understand, I will make up some use case. Example, we have ausers table that haveid, full_name & gender

Thatusers table will have 1000 rows. 2 of them ismale and the rest (998) of them isfemale.
Say that we need to find ourmale user, we will create query something like thisSELECT * FROM users WHERE gender = 'male'.
After we executed that query, we will have result of 2 users.

Behind the scene, database engine willsearch all of 1000 rows to find that 2 rows.
Yes, that sound inefficient. But that is how database engine works. 😵

What is the solution?

The solution is to use Database Index. 💡
With index, database engine will make some of algorithm that map our table based on the index we chose.

Okay, so we create an index ongender column insideusers table.
Back again to our query, now if we executeSELECT * FROM users WHERE gender = 'male', we will get 2 users again as result (of course!).

The difference is, right now database engine willsearch on 2 rows to find that 2 rows.
Now that sound efficient❗

But wait...

After we know that, maybe there is a thinking like this:

"let's create index for all of our column, so we can get quick result every time we search something 🤔"

Thats right, you will get a quick query every search.
But please don't over-used it. Even though Index have upside, Index also have downside.

If we create an index, behind the scene database engine will write additional data to our database.

Lets back again to our users table. The calculation is like this:

2 index (1 for ID, 1 for gender) * 1000 rows = 2000# PK is always get index

Imagine if we put index to all of our column

4 index (ID, gender, full_name, created_at) * 1000 rows = 4000# Thats only 4 columns.# The more column we create, the more usage it will take when CREATE, UPDATE, DELETE, etc.

the conclusion is: please use it wisely. 😇

Bonus: How to create index

PostgreSql

CREATETABLE"customers"("id"SERIALPRIMARYKEY,"gender"varchar,"full_name"varchar,"created_at"timestamp);CREATEINDEXON"customers"("gender");

MySql

CREATETABLE`customers`(`id`intPRIMARYKEYAUTO_INCREMENT,`gender`varchar(255),`full_name`varchar(255),`created_at`timestamp);CREATEINDEX`customers_index_0`ON`customers`(`gender`);

Top comments(2)

Vincent Milum Jr

I've been coding for over 20 years now! (WOAH, do I feel old)I've touched just about every resource imaginable under the Sun (too bad they were bought out by Oracle)

Email
dev.to@darkain.com
Location
Seattle, WA
Work
Software Engineer / DBA
Joined
Oct 17, 2018

• Apr 10 '22

Copy link

If you have low cardinality of your data, in PostgreSQL, you can use a partial index instead.

In your example, you have 2 items on one side, and 998 on the other. This is the perfect scenario for a partial index. This has the ability to store an index for the 2 items, while ignoring the other 998 rows. So instead of creating 1000 index entries, only 2 are created, keeping storage usage minimal, AND still retaining the performance benefits!

postgresql.org/docs/current/indexe...