forked frompostgres/postgres
- Notifications
You must be signed in to change notification settings - Fork6
Commit9a9db08
committed
Fix replica backward scan race condition.
It was possible for the logic used by backward scans (which must reasonabout concurrent page splits/deletions in its own peculiar way) tobecome confused when running on a replica. Concurrent replay of a WALrecord that describes the second phase of page deletion could cause_bt_walk_left() to get confused. btree_xlog_unlink_page() simply failedto adhere to the same locking protocol that we use on the primary, whichis obviously wrong once you consider these two disparate functionstogether. This bug is present in all stable branches.More concretely, the problem was that nothing stopped _bt_walk_left()from observing inconsistencies between the deletion's target page andits original sibling pages when running on a replica. This is true eventhough the second phase of page deletion is supposed to work as a singleatomic action. Queries running on replicas raised "could not find leftsibling of block %u in index %s" can't-happen errors when they went backto their scan's "original" page and observed that the page has not beenmarked deleted (even though it really was concurrently deleted).There is no evidence that this actually happened in the real world. Theissue came to light during unrelated feature development work. Notethat _bt_walk_left() is the only code that cares about the differencebetween a half-dead page and a fully deleted page that isn't alsoexclusively used by nbtree VACUUM (unless you include contrib/amcheckcode). It seems very likely that backward scans are the only thing thatcould become confused by the inconsistency. Even amcheck's complexbt_right_page_check_scankey() dance was unaffected.To fix, teach btree_xlog_unlink_page() to lock the left sibling, target,and right sibling pages in that order before releasing any locks (justlike _bt_unlink_halfdead_page()). This is the simplest possibleapproach. There doesn't seem to be any opportunity to be more cleverabout lock acquisition in the REDO routine, and it hardly seems worththe trouble in any case.This fix might enable contrib/amcheck verification of leaf page siblinglinks with only an AccessShareLock on the relation. An amcheck patchfrom Andrey Borodin was rejected back in January because it clashed withbtree_xlog_unlink_page()'s lax approach to locking pages. It now seemslikely that the real problem was with btree_xlog_unlink_page(), not thepatch.This is a low severity, low likelihood bug, so no backpatch.Author: Michail NikolaevDiagnosed-By: Michail NikolaevDiscussion:https://postgr.es/m/CANtu0ohkR-evAWbpzJu54V8eCOtqjJyYp3PQ_SGoBTRGXWhWRw@mail.gmail.com1 parenta451b7d commit9a9db08
2 files changed
+72
-34
lines changedLines changed: 18 additions & 0 deletions
Original file line number | Diff line number | Diff line change | |
---|---|---|---|
| |||
572 | 572 |
| |
573 | 573 |
| |
574 | 574 |
| |
| 575 | + | |
| 576 | + | |
| 577 | + | |
| 578 | + | |
| 579 | + | |
| 580 | + | |
| 581 | + | |
| 582 | + | |
| 583 | + | |
| 584 | + | |
| 585 | + | |
| 586 | + | |
| 587 | + | |
| 588 | + | |
| 589 | + | |
| 590 | + | |
| 591 | + | |
| 592 | + | |
575 | 593 |
| |
576 | 594 |
| |
577 | 595 |
| |
|
Lines changed: 54 additions & 34 deletions
Original file line number | Diff line number | Diff line change | |
---|---|---|---|
| |||
774 | 774 |
| |
775 | 775 |
| |
776 | 776 |
| |
777 |
| - | |
| 777 | + | |
| 778 | + | |
| 779 | + | |
778 | 780 |
| |
779 | 781 |
| |
780 | 782 |
| |
| |||
783 | 785 |
| |
784 | 786 |
| |
785 | 787 |
| |
786 |
| - | |
787 |
| - | |
788 |
| - | |
789 |
| - | |
| 788 | + | |
| 789 | + | |
| 790 | + | |
| 791 | + | |
| 792 | + | |
| 793 | + | |
| 794 | + | |
| 795 | + | |
| 796 | + | |
| 797 | + | |
790 | 798 |
| |
791 | 799 |
| |
792 |
| - | |
793 |
| - | |
794 |
| - | |
795 |
| - | |
796 |
| - | |
797 |
| - | |
798 |
| - | |
799 |
| - | |
800 |
| - | |
801 |
| - | |
802 |
| - | |
803 |
| - | |
804 |
| - | |
805 | 800 |
| |
806 | 801 |
| |
807 | 802 |
| |
808 |
| - | |
| 803 | + | |
809 | 804 |
| |
810 |
| - | |
| 805 | + | |
811 | 806 |
| |
812 | 807 |
| |
813 | 808 |
| |
814 | 809 |
| |
815 |
| - | |
| 810 | + | |
816 | 811 |
| |
817 |
| - | |
818 |
| - | |
819 | 812 |
| |
| 813 | + | |
| 814 | + | |
820 | 815 |
| |
821 | 816 |
| |
822 |
| - | |
823 |
| - | |
| 817 | + | |
| 818 | + | |
824 | 819 |
| |
825 |
| - | |
| 820 | + | |
826 | 821 |
| |
827 | 822 |
| |
828 | 823 |
| |
| |||
832 | 827 |
| |
833 | 828 |
| |
834 | 829 |
| |
835 |
| - | |
836 |
| - | |
| 830 | + | |
| 831 | + | |
| 832 | + | |
| 833 | + | |
| 834 | + | |
| 835 | + | |
| 836 | + | |
| 837 | + | |
| 838 | + | |
| 839 | + | |
| 840 | + | |
| 841 | + | |
| 842 | + | |
| 843 | + | |
| 844 | + | |
| 845 | + | |
| 846 | + | |
| 847 | + | |
| 848 | + | |
| 849 | + | |
| 850 | + | |
837 | 851 |
| |
838 | 852 |
| |
839 | 853 |
| |
| |||
845 | 859 |
| |
846 | 860 |
| |
847 | 861 |
| |
| 862 | + | |
| 863 | + | |
| 864 | + | |
| 865 | + | |
| 866 | + | |
848 | 867 |
| |
849 |
| - | |
| 868 | + | |
| 869 | + | |
850 | 870 |
| |
851 |
| - | |
852 |
| - | |
| 871 | + | |
| 872 | + | |
853 | 873 |
| |
854 |
| - | |
| 874 | + | |
855 | 875 |
| |
856 | 876 |
| |
857 | 877 |
| |
| |||
870 | 890 |
| |
871 | 891 |
| |
872 | 892 |
| |
873 |
| - | |
874 |
| - | |
| 893 | + | |
| 894 | + | |
875 | 895 |
| |
876 | 896 |
| |
877 | 897 |
| |
|
0 commit comments
Comments
(0)