forked from postgres/postgres
Commit a45c78e
Rearrange pg_dump's handling of large objects for better efficiency.
Commit c0d5be5 caused pg_dump to create a separate BLOB metadata TOC entry for each large object (blob), but it did not touch the ancient decision to put all the blobs' data into a single "BLOBS" TOC entry. This is bad for a few reasons: for databases with millions of blobs, the TOC becomes unreasonably large, causing performance issues; selective restore of just some blobs is quite impossible; and we cannot parallelize either dump or restore of the blob data, since our architecture for that relies on farming out whole TOC entries to worker processes.

To improve matters, let's group multiple blobs into each blob metadata TOC entry, and then make corresponding per-group blob data TOC entries. Selective restore using pg_restore's -l/-L switches is then possible, though only at the group level. (Perhaps we should provide a switch to allow forcing one-blob-per-group for users who need precise selective restore and don't have huge numbers of blobs. This patch doesn't do that, instead just hard-wiring the maximum number of blobs per entry at 1000.)

The blobs in a group must all have the same owner, since the TOC entry format only allows one owner to be named. In this implementation we also require them to all share the same ACL (grants); the archive format wouldn't require that, but pg_dump's representation of DumpableObjects does. It seems unlikely that either restriction will be problematic for databases with huge numbers of blobs.

The metadata TOC entries now have a "desc" string of "BLOB METADATA", and their "defn" string is just a newline-separated list of blob OIDs. The restore code has to generate creation commands, ALTER OWNER commands, and drop commands (for --clean mode) from that.
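
To make the "defn" format concrete, here is a minimal Python sketch of how a restore step might expand a newline-separated OID list into per-blob commands. It is an illustration only, not pg_restore's C code: the function names `parse_blob_oids`, `quote_ident`, and `restore_commands` are hypothetical, and the exact SQL text that pg_restore emits may differ from the statements shown here.

```python
from typing import List


def parse_blob_oids(defn: str) -> List[int]:
    """Split a newline-separated OID list into integers, skipping blank lines."""
    return [int(line) for line in defn.splitlines() if line.strip()]


def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def restore_commands(defn: str, owner: str, clean: bool = False) -> List[str]:
    """Build drop (optional), create, and ALTER OWNER commands for one group.

    Hypothetical helper: the SQL strings are plausible examples, not a copy
    of what pg_restore generates.
    """
    oids = parse_blob_oids(defn)
    cmds: List[str] = []
    if clean:
        # Drop only blobs that exist, so a missing blob is not an error.
        cmds.extend(
            "SELECT pg_catalog.lo_unlink(oid) "
            "FROM pg_catalog.pg_largeobject_metadata "
            f"WHERE oid = {oid};"
            for oid in oids
        )
    for oid in oids:
        cmds.append(f"SELECT pg_catalog.lo_create({oid});")
        cmds.append(f"ALTER LARGE OBJECT {oid} OWNER TO {quote_ident(owner)};")
    return cmds


if __name__ == "__main__":
    for cmd in restore_commands("16401\n16402\n", "alice", clean=True):
        print(cmd)
```

Because the archive stores only the OIDs and a single owner, every command is derived at restore time, which is what keeps the entry small.
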
We would need special-case code for ALTER OWNER and drop in any case, so the alternative of keeping the "defn" as directly executable SQL code for creation wouldn't buy much, and it seems like it'd bloat the archive to little purpose.

Since we require the blobs of a metadata group to share the same ACL, we can furthermore store only one copy of that ACL, and then make pg_restore regenerate the appropriate commands for each blob. This saves space in the dump file not only by removing duplicative SQL command strings, but by not needing a separate TOC entry for each blob's ACL. In turn, that reduces client-side memory requirements for handling many blobs.

ACL TOC entries that need this special processing are labeled as "ACL"/"LARGE OBJECTS nnn..nnn". If we have a blob with a unique ACL, continue to label it as "ACL"/"LARGE OBJECT nnn". We don't actually have to make such a distinction, but it saves a few cycles during restore for the easy case, and it seems like a good idea to not change the TOC contents unnecessarily.

The data TOC entries ("BLOBS") are exactly the same as before, except that now there can be more than one, so we'd better give them identifying tag strings.

Also, commit c0d5be5 put the new BLOB metadata TOC entries into SECTION_PRE_DATA, which perhaps is defensible in some ways, but it's a rather odd choice considering that we go out of our way to treat blobs as data. Moreover, because parallel restore handles the PRE_DATA section serially, this means we'd only get part of the parallelism speedup we could hope for. Move these entries into SECTION_DATA, letting us parallelize the lo_create calls not just the data loading when there are many blobs. Add dependencies to ensure that we won't try to load data for a blob we've not yet created.

As this stands, we still generate a separate TOC entry for any comment or security label attached to a blob.
I feel comfortable in believing that comments and security labels on blobs are rare, so this patch should be enough to get most of the useful TOC compression for blobs.

We have to bump the archive file format version number, since existing versions of pg_restore wouldn't know they need to do something special for BLOB METADATA, plus they aren't going to work correctly with multiple BLOBS entries or multiple-large-object ACL entries.

The directory and tar-file format handlers need some work for multiple BLOBS entries: they used to hard-wire the file name as "blobs.toc", which is replaced here with "blobs_<dumpid>.toc". The 002_pg_dump.pl test script also knows about that and requires minor updates. (I had to drop the test for manually-compressed blobs.toc files with LZ4, because lz4's obtuse command line design requires explicit specification of the output file name which seems impractical here. I don't think we're losing any useful test coverage thereby; that test stanza seems completely duplicative with the gzip and zstd cases anyway.)

In passing, centralize management of the lo_buf used to hold data while restoring blobs. The code previously had each format handler create lo_buf, which seems rather pointless given that the format handlers all make it the same way. Moreover, the format handlers never use lo_buf directly, making this setup a failure from a separation-of-concerns standpoint. Let's move the responsibility into pg_backup_archiver.c, which is the only module concerned with lo_buf. The reason to do this in this patch is that it allows a centralized fix for the now-false assumption that we never restore blobs in parallel. Also, get rid of dead code in DropLOIfExists: it's been a long time since we had any need to be able to restore to a pre-9.0 server.

Discussion: https://postgr.es/m/a9f9376f1c3343a6bb319dce294e20ac@EX13D05UWC001.ant.amazon.com

1 parent 5eac8ce, commit a45c78e
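
The grouping rule described in the commit message (one owner and one ACL per group, at most 1000 blobs per entry) can be sketched in Python as follows. This is a hedged illustration rather than pg_dump's C implementation: the `Blob` and `BlobGroup` types, the `group_blobs` function, and the choice to sort by owner, ACL, and OID before chunking are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from itertools import groupby
from typing import Iterable, List

# Cap stated in the commit message; hard-wired there as well.
MAX_BLOBS_PER_ENTRY = 1000


@dataclass(frozen=True)
class Blob:
    """Hypothetical stand-in for one large object's metadata."""
    oid: int
    owner: str
    acl: str  # textual ACL; empty string means default privileges


@dataclass
class BlobGroup:
    """Hypothetical stand-in for one BLOB METADATA TOC entry."""
    owner: str
    acl: str
    oids: List[int] = field(default_factory=list)


def group_blobs(blobs: Iterable[Blob],
                max_per_entry: int = MAX_BLOBS_PER_ENTRY) -> List[BlobGroup]:
    """Group blobs sharing owner and ACL, splitting at max_per_entry."""
    def key(blob: Blob):
        return (blob.owner, blob.acl)

    # Sorting first makes blobs with the same owner and ACL adjacent.
    ordered = sorted(blobs, key=lambda b: (b.owner, b.acl, b.oid))
    groups: List[BlobGroup] = []
    for (owner, acl), members in groupby(ordered, key=key):
        oids = [b.oid for b in members]
        # Split an oversized run into chunks of at most max_per_entry.
        for start in range(0, len(oids), max_per_entry):
            groups.append(BlobGroup(owner, acl, oids[start:start + max_per_entry]))
    return groups


if __name__ == "__main__":
    sample = [Blob(i, "alice", "") for i in range(1, 2501)]
    sample.append(Blob(9000, "bob", ""))
    for g in group_blobs(sample):
        print(g.owner, len(g.oids), g.oids[0], g.oids[-1])
```

Under this scheme a blob with a unique owner or ACL simply ends up in a group of one, which matches the commit message's remark that such blobs keep a single-object ACL label.
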
File tree
11 files changed, +530 -261 lines changed
- src/bin/pg_dump
  - t

Lines changed: 26 additions & 0 deletions
Lines changed: 80 additions & 26 deletions
Lines changed: 6 additions & 1 deletion
Lines changed: 2 additions & 9 deletions