Mark Burggraf for Supabase


Speeding Up Bulk Loading in PostgreSQL


Testing 4 ways to bulk load data into PostgreSQL

The Need For Speed

If you only need to load a few hundred records into your database, you probably aren't too concerned about efficiency. But what happens when you try to insert thousands, or even millions, of records? At that scale, data-loading efficiency can mean the difference between success and failure for your project, or at the very least the difference between a project that's delivered on time and one that's woefully overdue.

PostgreSQL has a great copy command that's optimized for this task: https://www.postgresql.org/docs/current/sql-copy.html. But that's only a good solution if your data is already in a CSV (or binary) file. What if you need to load data from pure SQL? What's the fastest way then?
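For reference, here's a minimal sketch of what a copy load can look like, assuming a hypothetical CSV file path (server-side copy needs file access on the database server; from psql, the client-side \copy variant takes the same options):

copy users (id, firstname, lastname)
from '/path/to/users.csv'           -- hypothetical path
with (format csv, header true);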

Four Ways to Insert Data

Basic Insert Commands

Let’s look at the structure for some basic SQL insert commands:

create table users (id integer, firstname text, lastname text);

insert into users (id, firstname, lastname) values (1, 'George', 'Washington');
insert into users (id, firstname, lastname) values (2, 'John', 'Adams');
insert into users (id, firstname, lastname) values (3, 'Thomas', 'Jefferson');

Now we have some basic SQL for inserting records into our users table. This will get the data in, all right, but it's the slowest way to do it. Let's look at some ways we can speed things up.

Transactions

A quick and easy way to speed things up is simply to put large batches of insert statements inside a transaction:

begin transaction;
insert into users (id, firstname, lastname) values (1, 'George', 'Washington');
insert into users (id, firstname, lastname) values (2, 'John', 'Adams');
insert into users (id, firstname, lastname) values (3, 'Thomas', 'Jefferson');
commit;

In my Windows test, this doubled the speed of inserting 100k user records. Under macOS, the speed tripled. While you can technically create batches with billions of records in them, you'll probably want to experiment with batch sizes of, say, 1,000, 10,000, or 100,000 records to see what works best for your hardware, bandwidth, and record size.
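If you want to compare batch sizes yourself, psql's built-in timing is a quick way to benchmark runs; the file names here are hypothetical stand-ins for scripts generated at different batch sizes:

\timing on
\i inserts_batch_1000.sql
\i inserts_batch_10000.sql
\i inserts_batch_100000.sql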

Batched Inserts

Another way to speed things up is to use the SQL batch insert syntax when doing your insert. For example:

insert into users (id, firstname, lastname) values
  (1, 'George', 'Washington'),
  (2, 'John', 'Adams'),
  (3, 'Thomas', 'Jefferson');

This method speeds things up considerably. In my tests, it was about 6 times faster. The same rules apply to batch sizes as with transactions -- you'll want to test different batch sizes to optimize things. I generally start with a batch of around 10,000 records for most applications, and if that works well enough, I leave it there.

What About Both?

Can you combine transactions and batched insert statements for even more speed? Well, yes and no. You certainly can combine them, but the speed increase is negligible (in my Windows test it even slowed things down a bit).

begin transaction;
insert into users (id, firstname, lastname) values
  (1, 'George', 'Washington'),
  (2, 'John', 'Adams'),
  (3, 'Thomas', 'Jefferson');
commit;

So, while using both techniques here is perfectly valid, it may not be the fastest way to load data.

The Downsides

What are the potential downsides of using transactions or batched inserts? Error handling is the main one. If any one of the records in your batch fails, the entire batch will fail and no data will be inserted into your table from that batch. So you'll need to make sure your data is valid or else have some way to break up and fix failed batches.

If the failure is caused by a unique constraint violation, you can use the on conflict clause in your insert statement, but if your insert fails for any other reason, it'll throw out the whole batch.
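As a sketch, assuming the id column has a unique constraint (the on conflict target must match a unique index), duplicate rows can be skipped instead of sinking the whole batch:

insert into users (id, firstname, lastname) values
  (1, 'George', 'Washington'),
  (2, 'John', 'Adams')
on conflict (id) do nothing;  -- requires a unique constraint or index on id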

Other Speed Considerations

There are many other factors that can affect insert speed, and ways to make things even faster: removing indexes until after the data is loaded, using unlogged tables, and avoiding unnecessary unique keys are just a few. These optimizations will improve performance, but probably not as dramatically as the basic techniques described here.
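Here's a minimal sketch of two of those ideas; the staging table and index names are hypothetical:

-- unlogged tables skip write-ahead logging, which speeds up bulk loads,
-- but their contents are lost if the server crashes
create unlogged table users_staging (id integer, firstname text, lastname text);

-- drop an index before loading and rebuild it once afterwards
drop index if exists users_lastname_idx;
-- ... bulk load here ...
create index users_lastname_idx on users (lastname);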

Conclusion

If you need to deal with large amounts of data, it pays to plan ahead when you're writing your SQL insert code. A few small changes can potentially save you hours (or sometimes even days) of processing time.

Appendix: Sample Test Results

See my GitHub repo postgresql-bulk-load-tests for code to test these methods. My test run results are listed below.

Results

===========================
Windows VM (UTM Windows 11)
===========================
create 100k users with individual insert statements
30.0 seconds
create 100k users with individual insert statements in a transaction
14.0 seconds
create 100k users with batch insert statement
4.3 seconds
create 100k users with batch insert statement in a transaction
4.6 seconds

====================
MacBook Pro (M1 Max)
====================
create 100k users with individual insert statements
real    0m9.112s
user    0m0.509s
sys     0m0.337s
create 100k users with individual insert statements in a transaction
real    0m2.540s
user    0m0.457s
sys     0m0.325s
create 100k users with batch insert statement
real    0m1.360s
user    0m0.179s
sys     0m0.042s
create 100k users with batch insert statement in a transaction
real    0m1.154s
user    0m0.189s
sys     0m0.041s
