[PGPRO-12159] Added functions for exploring the pages of the rum index. #150


Open
arseny114 wants to merge 19 commits into postgrespro:master from arseny114:PGPRO-12159

19 commits
a3229ed
[PGPRO-12159] Added pageinspect functions for rum.
arseny114 Feb 19, 2025
49d811d
[PGPRO-12159] Added three missing pageinspect functions for rum.
arseny114 Feb 21, 2025
15b54fd
[PGPRO-12159] Added the output of tsv lexemes positions.
Apr 21, 2025
499ba6f
[PGPRO-12159] Added the output of weights.
Apr 28, 2025
84ab8c6
[PGPRO-12159] Fixed test crashes on version 15 of PostgreSQL.
May 14, 2025
343ffb5
[PGPRO-12159] Fixed test crashes on version 14 of PostgreSQL.
May 14, 2025
05d2f09
[PGPRO-12159] Fixed test crashes on version 13 of PostgreSQL.
May 14, 2025
65dab05
[PGPRO-12159] Code review.
Jun 23, 2025
8bf1ee8
[PGPRO-12159] Code review.
Jun 25, 2025
299d621
[PGPRO-12159] Fixed incorrect behaviour on 32-bit machines.
Jun 30, 2025
fd455a8
[PGPRO-12159] Added tests for rum_debug_funcs.
Jun 26, 2025
2071875
[PGPRO-12159] Code review.
Jul 4, 2025
a4c02b4
[PGPRO-12159] Fixed the search for the key attribute number.
Aug 22, 2025
1b142f1
[PGPRO-12159] Added a script for updating RUM to version 1.4.
Sep 18, 2025
8e305a5
[PGPRO-12159] Cosmetic fixes.
Sep 19, 2025
a75d3ed
[PGPRO-12159] Fixed the build on version 12 of PostgreSQL.
Sep 22, 2025
6fdc6eb
[PGPRO-12159] Added a description of rum_debug_funcs in README.md
Sep 22, 2025
6189098
[PGPRO-12159] A typo has been fixed.
Oct 7, 2025
b4016c5
[PGPRO-12159] Added perl test for rum_debug_funcs.
Oct 7, 2025
6 changes: 3 additions & 3 deletions Makefile
@@ -2,17 +2,17 @@

MODULE_big = rum
EXTENSION = rum
-EXTVERSION = 1.3
+EXTVERSION = 1.4
PGFILEDESC = "RUM index access method"

OBJS = src/rumsort.o src/rum_ts_utils.o src/rumtsquery.o \
src/rumbtree.o src/rumbulk.o src/rumdatapage.o \
src/rumentrypage.o src/rumget.o src/ruminsert.o \
src/rumscan.o src/rumutil.o src/rumvacuum.o src/rumvalidate.o \
-src/btree_rum.o src/rum_arr_utils.o $(WIN32RES)
+src/btree_rum.o src/rum_arr_utils.o src/rum_debug_funcs.o $(WIN32RES)

DATA = rum--1.0--1.1.sql rum--1.1--1.2.sql \
-rum--1.2--1.3.sql
+rum--1.2--1.3.sql rum--1.3--1.4.sql

DATA_built = $(EXTENSION)--$(EXTVERSION).sql

128 changes: 128 additions & 0 deletions README.md
@@ -302,6 +302,134 @@ For type: `anyarray`
This operator class stores `anyarray` elements together with a field of any
type supported by the module.

## Functions for low-level inspection of RUM index pages

RUM provides several functions for low-level inspection of every type of index page:

### `rum_metapage_info(rel_name text, blk_num int4) returns record`

`rum_metapage_info` returns information about a RUM index metapage. For example:

```SQL
SELECT * FROM rum_metapage_info('rum_index', 0);
-[ RECORD 1 ]----+-----------
pending_head | 4294967295
pending_tail | 4294967295
tail_free_size | 0
n_pending_pages | 0
n_pending_tuples | 0
n_total_pages | 87
n_entry_pages | 80
n_data_pages | 6
n_entries | 1650
version | 0xC0DE0002
```

### `rum_page_opaque_info(rel_name text, blk_num int4) returns record`

`rum_page_opaque_info` returns the contents of a page's opaque area: the `leftlink` and `rightlink` sibling links, `maxoff` (the number of elements stored on the page; its exact meaning depends on the page type), `freespace` (the free space on the page), and the page `flags`.

For example:

```SQL
SELECT * FROM rum_page_opaque_info('rum_index', 10);
leftlink | rightlink | maxoff | freespace | flags
----------+-----------+--------+-----------+--------
6 | 11 | 0 | 0 | {leaf}
```
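
The opaque-area function can be combined with the metapage counters to get a quick overview of every page in an index. The query below is only a sketch under two assumptions that the text above does not confirm: that `n_total_pages` includes the metapage (block 0), so the remaining blocks are `1 .. n_total_pages - 1`, and that `rum_page_opaque_info` accepts every block in that range. The index name `rum_index` is the one used in the examples.

```SQL
-- Sketch: list the flags and fill level of every non-meta page.
-- Assumes block 0 is the metapage and is counted in n_total_pages.
SELECT blk, o.flags, o.maxoff, o.freespace
FROM rum_metapage_info('rum_index', 0) AS m
     CROSS JOIN LATERAL generate_series(1, m.n_total_pages - 1) AS blk
     CROSS JOIN LATERAL rum_page_opaque_info('rum_index', blk::int4) AS o
ORDER BY blk;
```

The `flags` column can then be used to decide which of the `*_page_items` functions is appropriate for a given block.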

### `rum_internal_entry_page_items(rel_name text, blk_num int4) returns set of record`

`rum_internal_entry_page_items` returns information that is stored on the internal pages of the entry tree (it is extracted from `IndexTuples`). For example:

```SQL
SELECT * FROM rum_internal_entry_page_items('rum_index', 1);
key | attrnum | category | down_link
---------------------------------+---------+------------------+-----------
3d | 1 | RUM_CAT_NORM_KEY | 3
6k | 1 | RUM_CAT_NORM_KEY | 2
a8 | 1 | RUM_CAT_NORM_KEY | 4
...
Tue May 10 21:21:22.326724 2016 | 2 | RUM_CAT_NORM_KEY | 83
Sat May 14 19:21:22.326724 2016 | 2 | RUM_CAT_NORM_KEY | 84
Wed May 18 17:21:22.326724 2016 | 2 | RUM_CAT_NORM_KEY | 85
+inf | | | 86
(79 rows)
```

On internal pages of the entry tree, RUM (like GIN) stores downlinks and keys in pairs of the form `(P_n, K_{n+1})`. Consequently `P_0` has no key of its own (its lower bound is taken to be `-inf`), and the last key `K_{n+1}` has no downlink of its own (it is taken to be the largest key, or high key, of the subtree that `P_n` points to). In the example above, page 1 is the rightmost page on its level of the tree, so the key on the last line is shown as `+inf`.
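
To see what each downlink leads to, the item list can be joined with the opaque-area information of the child pages. This is a minimal sketch, assuming that block 1 is an internal entry page of `rum_index` as in the example above; both functions are STRICT, so a row without a `down_link` simply yields NULL child columns.

```SQL
-- Sketch: show the page flags of every child referenced from entry page 1.
SELECT i.key, i.down_link, o.flags AS child_flags
FROM rum_internal_entry_page_items('rum_index', 1) AS i
     CROSS JOIN LATERAL rum_page_opaque_info('rum_index', i.down_link) AS o
ORDER BY i.down_link;
```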

### `rum_leaf_entry_page_items(rel_name text, blk_num int4) returns set of record`

`rum_leaf_entry_page_items` returns the items stored on leaf pages of the entry tree (they are extracted from compressed posting lists). For example:

```SQL
SELECT * FROM rum_leaf_entry_page_items('rum_index', 10);
key | attrnum | category | tuple_id | add_info_is_null | add_info | is_posting_tree | posting_tree_root
-----+---------+------------------+----------+------------------+----------+------------------+--------------------
ay | 1 | RUM_CAT_NORM_KEY | (0,16) | t | | f |
ay | 1 | RUM_CAT_NORM_KEY | (0,23) | t | | f |
ay | 1 | RUM_CAT_NORM_KEY | (2,1) | t | | f |
...
az | 1 | RUM_CAT_NORM_KEY | (0,15) | t | | f |
az | 1 | RUM_CAT_NORM_KEY | (0,22) | t | | f |
az | 1 | RUM_CAT_NORM_KEY | (1,4) | t | | f |
...
b9 | 1 | RUM_CAT_NORM_KEY | | | | t | 7
...
(1602 rows)
```

Each posting list is an `IndexTuple` that stores the key value and a compressed list of `tids`. For convenience, `rum_leaf_entry_page_items` repeats the key value next to each `tid`, although on the page the key is stored only once.

If a key has too many `tids`, a posting tree is used for storage instead of a posting list. In the example above, a posting tree (whose keys are `tids`) was created for the key `b9`. In that case the `IndexTuple` stores a magic number and the number of the posting tree's root page instead of the posting list.
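
A leaf entry page can be filtered to find the keys that were moved to a posting tree. The following is a sketch only: it assumes the output columns are named `is_posting_tree` and `posting_tree_root`, as in the sample output above, and reuses block 10 of `rum_index` from that example.

```SQL
-- Sketch: keys on entry leaf page 10 whose tids live in a posting tree,
-- together with the root block of that tree.
SELECT key, attrnum, posting_tree_root
FROM rum_leaf_entry_page_items('rum_index', 10)
WHERE is_posting_tree;
```

The returned `posting_tree_root` values are the block numbers to pass to the posting tree functions described below.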

### `rum_internal_data_page_items(rel_name text, blk_num int4) returns set of record`

`rum_internal_data_page_items` returns information that is stored on the internal pages of the posting tree (it is extracted from arrays of `PostingItem` structures). For example:

```SQL
SELECT * FROM rum_internal_data_page_items('rum_index', 7);
is_high_key | block_number | tuple_id | add_info_is_null | add_info
-------------+--------------+----------+------------------+----------
t | | (0,0) | t |
f | 9 | (138,79) | t |
f | 8 | (0,0) | t |
(3 rows)
```

Each element on the internal pages of the posting tree contains the high key (`tid`) value for the child page and a link to this child page (as well as additional information if it was added when creating the index).

Every internal page of the posting tree begins with the high key of the page itself. A value of `(0,0)` is equivalent to `+inf`; this is always the case when the page is the rightmost one.
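
The child links can be followed one level down to count the `tids` stored under each of them. This is a sketch under the assumption that the children of block 7 are posting tree leaf pages (a two-level tree, as the examples in this section suggest); for a deeper tree the children would be internal pages and a different function would be needed.

```SQL
-- Sketch: for each child of internal posting tree page 7, report its
-- high key and the number of tids stored on the child leaf page.
SELECT p.block_number,
       p.tuple_id AS child_high_key,
       (SELECT count(*)
          FROM rum_leaf_data_page_items('rum_index', p.block_number) AS l
         WHERE NOT l.is_high_key) AS n_tids
FROM rum_internal_data_page_items('rum_index', 7) AS p
WHERE NOT p.is_high_key;
```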

RUM does not currently support storing pass-by-reference data types as additional information on the internal pages of the posting tree. Output such as the following is therefore possible:

```SQL
is_high_key | block_number | tuple_id | add_info_is_null | add_info
-------------+--------------+----------+------------------+------------------------------------------------
...
f | 23 | (39,43) | f | varlena types in posting tree is not supported
f | 22 | (74,9) | f | varlena types in posting tree is not supported
...
```

### `rum_leaf_data_page_items(rel_name text, blk_num int4) returns set of record`

`rum_leaf_data_page_items` returns the items stored on leaf pages of the posting tree (they are extracted from compressed posting lists). For example:

```SQL
SELECT * FROM rum_leaf_data_page_items('rum_idx', 9);
is_high_key | tuple_id | add_info_is_null | add_info
-------------+-----------+------------------+----------
t | (138,79) | t |
f | (0,9) | t |
f | (1,23) | t |
f | (3,5) | t |
f | (3,22) | t |
```

Unlike entry tree leaf pages, posting tree leaf pages do not store their compressed posting lists inside an `IndexTuple`. The high key is the largest key on the page.

## Todo

- Allow multiple additional information (lexemes positions + timestamp).
3 changes: 2 additions & 1 deletion meson.build
@@ -4,7 +4,7 @@
# of the contrib source tree.

extension = 'rum'
-extversion = '1.3'
+extversion = '1.4'

rum_sources = files(
'src/btree_rum.c',
@@ -49,6 +49,7 @@ install_data(
'rum--1.0--1.1.sql',
'rum--1.1--1.2.sql',
'rum--1.2--1.3.sql',
+'rum--1.3--1.4.sql',
kwargs: contrib_data_args,
)

131 changes: 131 additions & 0 deletions rum--1.3--1.4.sql
@@ -0,0 +1,131 @@
/*
* RUM version 1.4
*/

/*--------------------RUM debug functions-----------------------*/

CREATE FUNCTION rum_metapage_info(
IN rel_name text,
IN blk_num int4,
OUT pending_head bigint,
OUT pending_tail bigint,
OUT tail_free_size int4,
OUT n_pending_pages bigint,
OUT n_pending_tuples bigint,
OUT n_total_pages bigint,
OUT n_entry_pages bigint,
OUT n_data_pages bigint,
OUT n_entries bigint,
OUT version varchar)
AS 'MODULE_PATHNAME', 'rum_metapage_info'
LANGUAGE C STRICT PARALLEL SAFE;

CREATE FUNCTION rum_page_opaque_info(
IN rel_name text,
IN blk_num int4,
OUT leftlink bigint,
OUT rightlink bigint,
OUT maxoff int4,
OUT freespace int4,
OUT flags text[])
AS 'MODULE_PATHNAME', 'rum_page_opaque_info'
LANGUAGE C STRICT PARALLEL SAFE;

CREATE OR REPLACE FUNCTION
rum_page_items_info(rel_name text, blk_num int4, page_type int4)
RETURNS SETOF record
AS 'MODULE_PATHNAME', 'rum_page_items_info'
LANGUAGE C STRICT;

CREATE FUNCTION rum_leaf_data_page_items(
rel_name text,
blk_num int4
)
RETURNS TABLE(
is_high_key bool,
tuple_id tid,
add_info_is_null bool,
add_info varchar
)
AS $$
SELECT *
FROM rum_page_items_info(rel_name, blk_num, 0)
AS rum_page_items_info(
is_high_key bool,
tuple_id tid,
add_info_is_null bool,
add_info varchar
);
$$ LANGUAGE sql;

CREATE FUNCTION rum_internal_data_page_items(
rel_name text,
blk_num int4
)
RETURNS TABLE(
is_high_key bool,
block_number int4,
tuple_id tid,
add_info_is_null bool,
add_info varchar
)
AS $$
SELECT *
FROM rum_page_items_info(rel_name, blk_num, 1)
AS rum_page_items_info(
is_high_key bool,
block_number int4,
tuple_id tid,
add_info_is_null bool,
add_info varchar
);
$$ LANGUAGE sql;

CREATE FUNCTION rum_leaf_entry_page_items(
rel_name text,
blk_num int4
)
RETURNS TABLE(
key varchar,
attrnum int4,
category varchar,
tuple_id tid,
add_info_is_null bool,
add_info varchar,
is_posting_tree bool,
posting_tree_root int4
)
AS $$
SELECT *
FROM rum_page_items_info(rel_name, blk_num, 2)
AS rum_page_items_info(
key varchar,
attrnum int4,
category varchar,
tuple_id tid,
add_info_is_null bool,
add_info varchar,
is_posting_tree bool,
posting_tree_root int4
);
$$ LANGUAGE sql;

CREATE FUNCTION rum_internal_entry_page_items(
rel_name text,
blk_num int4
)
RETURNS TABLE(
key varchar,
attrnum int4,
category varchar,
down_link int4)
AS $$
SELECT *
FROM rum_page_items_info(rel_name, blk_num, 3)
AS rum_page_items_info(
key varchar,
attrnum int4,
category varchar,
down_link int4
);
$$ LANGUAGE sql;
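
The four wrappers in this script differ only in the `page_type` code they pass to `rum_page_items_info` (0 for leaf data pages, 1 for internal data pages, 2 for leaf entry pages, 3 for internal entry pages) and in the column list they attach to its `SETOF record` result. As an illustration, the generic function could also be called directly, provided the caller supplies a matching column definition list. This is a sketch: the index name `rum_index` and the block number are placeholders, and the alias `t` is arbitrary.

```SQL
-- Sketch: equivalent of rum_internal_entry_page_items('rum_index', 1),
-- calling the generic function with page_type = 3.
SELECT *
FROM rum_page_items_info('rum_index', 1, 3)
     AS t(key varchar, attrnum int4, category varchar, down_link int4);
```

Using the wrappers avoids having to repeat the column definition list on every call.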
2 changes: 1 addition & 1 deletion rum.control
@@ -1,5 +1,5 @@
# RUM extension
comment = 'RUM index access method'
-default_version = '1.3'
+default_version = '1.4'
module_pathname = '$libdir/rum'
relocatable = true
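
With `default_version` bumped to 1.4 and the `rum--1.3--1.4.sql` script installed, a database that already has RUM 1.3 would be upgraded with the standard PostgreSQL extension commands. A minimal sketch, assuming the new shared library and script files are already in place on the server:

```SQL
-- Upgrade an existing installation to 1.4 (runs rum--1.3--1.4.sql).
ALTER EXTENSION rum UPDATE TO '1.4';

-- Check which version is now installed in the current database.
SELECT extname, extversion FROM pg_extension WHERE extname = 'rum';
```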