
Lets do disclaimer before we start 😉.
This is my current understanding of Database Index.
If there is any mistake that write that lead to misunderstanding, please let me know and I will update this post ASAP.
Let's begin 🚀
What is database index?
Database index is a built-in tools that will help us for searching data inside database. Index will quickly locate record of our queries.
The problem
To help to understand, I will make up some use case. Example, we have ausers table that haveid, full_name & gender
Thatusers table will have 1000 rows. 2 of them ismale and the rest (998) of them isfemale.
Say that we need to find ourmale user, we will create query something like thisSELECT * FROM users WHERE gender = 'male'
.
After we executed that query, we will have result of 2 users.
Behind the scene, database engine willsearch all of 1000 rows to find that 2 rows.
Yes, that sound inefficient. But that is how database engine works. 😵
What is the solution?
The solution is to use Database Index. 💡
With index, database engine will make some of algorithm that map our table based on the index we chose.
Okay, so we create an index ongender
column insideusers table.
Back again to our query, now if we executeSELECT * FROM users WHERE gender = 'male'
, we will get 2 users again as result (of course!).
The difference is, right now database engine willsearch on 2 rows to find that 2 rows.
Now that sound efficient❗
But wait...
After we know that, maybe there is a thinking like this:
"let's create index for all of our column, so we can get quick result every time we search something 🤔"
Thats right, you will get a quick query every search.
But please don't over-used it. Even though Index have upside, Index also have downside.
If we create an index, behind the scene database engine will write additional data to our database.
Lets back again to our users table. The calculation is like this:
2 index (1 for ID, 1 for gender) * 1000 rows = 2000# PK is always get index
Imagine if we put index to all of our column
4 index (ID, gender, full_name, created_at) * 1000 rows = 4000# Thats only 4 columns.# The more column we create, the more usage it will take when CREATE, UPDATE, DELETE, etc.
the conclusion is: please use it wisely. 😇
Bonus: How to create index
- PostgreSql
CREATETABLE"customers"("id"SERIALPRIMARYKEY,"gender"varchar,"full_name"varchar,"created_at"timestamp);CREATEINDEXON"customers"("gender");
- MySql
CREATETABLE`customers`(`id`intPRIMARYKEYAUTO_INCREMENT,`gender`varchar(255),`full_name`varchar(255),`created_at`timestamp);CREATEINDEX`customers_index_0`ON`customers`(`gender`);
Top comments(2)

- Email
- LocationSeattle, WA
- WorkSoftware Engineer / DBA
- Joined
If you have low cardinality of your data, in PostgreSQL, you can use a partial index instead.
In your example, you have 2 items on one side, and 998 on the other. This is the perfect scenario for a partial index. This has the ability to store an index for the 2 items, while ignoring the other 998 rows. So instead of creating 1000 index entries, only 2 are created, keeping storage usage minimal, AND still retaining the performance benefits!

- Email
- LocationIndonesia
- WorkSoftware Engineer
- Joined
Thanks for the addition Vincent!
I didn't know before that we could create conditions on index
For further actions, you may consider blocking this person and/orreporting abuse