PostgreSQL 9.4.1 Documentation, Appendix F. Additional Supplied Modules
F.21. ltree
This module implements a data type ltree for representing labels of data stored in a hierarchical tree-like structure. Extensive facilities for searching through label trees are provided.
F.21.1. Definitions
A label is a sequence of alphanumeric characters and underscores (for example, in C locale the characters A-Za-z0-9_ are allowed). Labels must be less than 256 bytes long.
Examples: 42, Personal_Services
A label path is a sequence of zero or more labels separated by dots, for example L1.L2.L3, representing a path from the root of a hierarchical tree to a particular node. The length of a label path must be less than 65 kB, but keeping it under 2 kB is preferable. In practice this is not a major limitation; for example, the longest label path in the DMOZ catalog (http://www.dmoz.org) is about 240 bytes.
Example: Top.Countries.Europe.Russia
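As a minimal sketch of these definitions, assuming the module has already been installed in the current database (for example with CREATE EXTENSION ltree), a text literal can be cast to ltree and inspected with the nlevel function:

```sql
-- Assumption: the ltree module is installed in this database.
-- A four-label path from the root (Top) to a particular node (Russia).
SELECT 'Top.Countries.Europe.Russia'::ltree;

-- nlevel counts the labels in the path; here it returns 4.
SELECT nlevel('Top.Countries.Europe.Russia'::ltree);

-- A label may contain only alphanumeric characters and underscores,
-- so a path with a space inside a label is rejected with an error.
SELECT 'Top.Personal Services'::ltree;
```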
The ltree module provides several data types:
ltree stores a label path.
lquery represents a regular-expression-like pattern for matching ltree values. A simple word matches that label within a path. A star symbol (*) matches zero or more labels. For example:

    foo         Match the exact label path foo
    *.foo.*     Match any label path containing the label foo
    *.foo       Match any label path whose last label is foo
Star symbols can also be quantified to restrict how many labels they can match:

    *{n}        Match exactly n labels
    *{n,}       Match at least n labels
    *{n,m}      Match at least n but not more than m labels
    *{,m}       Match at most m labels (same as *{0,m})

There are several modifiers that can be put at the end of a non-star label in lquery to make it match more than just the exact match:

    @           Match case-insensitively, for example a@ matches A
    *           Match any label with this prefix, for example foo* matches foobar
    %           Match initial underscore-separated words

The behavior of % is a bit complicated. It tries to match words rather than the entire label. For example foo_bar% matches foo_bar_baz but not foo_barbaz. If combined with *, prefix matching applies to each word separately, for example foo_bar%* matches foo1_bar2_baz but not foo1_br2_baz.
Also, you can write several possibly-modified labels separated with | (OR) to match any of those labels, and you can put ! (NOT) at the start to match any label that doesn't match any of the alternatives.
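The matching rules above can be tried directly with the ~ operator (ltree ~ lquery). The following is a sketch rather than an exhaustive test; the paths are made-up illustrations, and the results in the comments follow from the rules as stated in the text:

```sql
-- Star symbols and quantifiers.
SELECT 'a.foo.b'::ltree ~ '*.foo.*';   -- true: the path contains the label foo
SELECT 'a.foo.b'::ltree ~ '*.foo';     -- false: the last label is b, not foo
SELECT 'a.b.foo'::ltree ~ '*{2}.foo';  -- true: exactly two labels precede foo

-- Modifiers on a non-star label.
SELECT 'A'::ltree ~ 'a@';                    -- true: case-insensitive match
SELECT 'foobar'::ltree ~ 'foo*';             -- true: prefix match
SELECT 'foo_bar_baz'::ltree ~ 'foo_bar%';    -- true: initial words foo and bar
SELECT 'foo_barbaz'::ltree ~ 'foo_bar%';     -- false: barbaz is not the word bar
SELECT 'foo1_bar2_baz'::ltree ~ 'foo_bar%*'; -- true: prefix match on each word

-- Alternatives (OR) and negation (NOT).
SELECT 'a.tennis'::ltree ~ 'a.football|tennis';  -- true: second label is one of the alternatives
SELECT 'a.golf'::ltree ~ 'a.!football|tennis';   -- true: second label is neither alternative
```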
Here's an annotated example of lquery:

    Top.*{0,2}.sport*@.!football|tennis.Russ*|Spain
    a.  b.     c.      d.               e.

This query will match any label path that:
a. begins with the label Top
b. and next has zero to two labels before
c. a label beginning with the case-insensitive prefix sport
d. then a label not matching football nor tennis
e. and then ends with a label beginning with Russ or exactly matching Spain.
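To see the annotated query in action, it can be applied to a couple of hand-written paths. The paths below are hypothetical and chosen only for illustration; the results in the comments are what the description above implies:

```sql
-- Top, then zero intermediate labels, Sports (prefix sport, any case),
-- Golf (neither football nor tennis), Russia (prefix Russ): expected to match.
SELECT 'Top.Sports.Golf.Russia'::ltree
       ~ 'Top.*{0,2}.sport*@.!football|tennis.Russ*|Spain';

-- The label football sits where the query excludes it: expected not to match.
SELECT 'Top.sports.football.Russia'::ltree
       ~ 'Top.*{0,2}.sport*@.!football|tennis.Russ*|Spain';
```

Note that the exclusion in step d has no @ modifier, so it compares labels case-sensitively.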
ltxtquery represents a full-text-search-like pattern for matching ltree values. An ltxtquery value contains words, possibly with the modifiers @, *, % at the end; the modifiers have the same meanings as in lquery. Words can be combined with & (AND), | (OR), ! (NOT), and parentheses. The key difference from lquery is that ltxtquery matches words without regard to their position in the label path.
Here's an example ltxtquery:

    Europe & Russia*@ & !Transportation

This will match paths that contain the label Europe and any label beginning with Russia (case-insensitive), but not paths containing the label Transportation. The location of these words within the path is not important. Also, when % is used, the word can be matched to any underscore-separated word within a label, regardless of position.
Note: ltxtquery allows whitespace between symbols, but ltree and lquery do not.
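A short sketch of ltxtquery matching with the @ operator (ltree @ ltxtquery). The first two paths are invented for illustration; the results in the comments follow from the description above:

```sql
-- Contains Europe and a label starting with Russia, and no Transportation label.
SELECT 'Top.Europe.Russia_Federation'::ltree
       @ 'Europe & Russia*@ & !Transportation';   -- true

-- Same query, but the path contains the excluded label.
SELECT 'Top.Europe.Russia.Transportation'::ltree
       @ 'Europe & Russia*@ & !Transportation';   -- false

-- Position does not matter: the words may appear in any order in the path.
SELECT 'Russia.Europe'::ltree @ 'Europe & Russia'; -- true

-- With %, a word can match any underscore-separated word inside a label.
SELECT 'Top.Hobbies.Amateurs_Astronomy'::ltree @ 'Astro*%'; -- true
```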
F.21.2. Operators and Functions
Type ltree has the usual comparison operators =, <>, <, >, <=, >=. Comparison sorts in the order of a tree traversal, with the children of a node sorted by label text. In addition, the specialized operators shown in Table F-12 are available.
Table F-12. ltree Operators
| Operator | Returns | Description |
|---|---|---|
| ltree @> ltree | boolean | is left argument an ancestor of right (or equal)? |
| ltree <@ ltree | boolean | is left argument a descendant of right (or equal)? |
| ltree ~ lquery | boolean | does ltree match lquery? |
| lquery ~ ltree | boolean | does ltree match lquery? |
| ltree ? lquery[] | boolean | does ltree match any lquery in array? |
| lquery[] ? ltree | boolean | does ltree match any lquery in array? |
| ltree @ ltxtquery | boolean | does ltree match ltxtquery? |
| ltxtquery @ ltree | boolean | does ltree match ltxtquery? |
| ltree \|\| ltree | ltree | concatenate ltree paths |
| ltree \|\| text | ltree | convert text to ltree and concatenate |
| text \|\| ltree | ltree | convert text to ltree and concatenate |
| ltree[] @> ltree | boolean | does array contain an ancestor of ltree? |
| ltree <@ ltree[] | boolean | does array contain an ancestor of ltree? |
| ltree[] <@ ltree | boolean | does array contain a descendant of ltree? |
| ltree @> ltree[] | boolean | does array contain a descendant of ltree? |
| ltree[] ~ lquery | boolean | does array contain any path matching lquery? |
| lquery ~ ltree[] | boolean | does array contain any path matching lquery? |
| ltree[] ? lquery[] | boolean | does ltree array contain any path matching any lquery? |
| lquery[] ? ltree[] | boolean | does ltree array contain any path matching any lquery? |
| ltree[] @ ltxtquery | boolean | does array contain any path matching ltxtquery? |
| ltxtquery @ ltree[] | boolean | does array contain any path matching ltxtquery? |
| ltree[] ?@> ltree | ltree | first array entry that is an ancestor of ltree; NULL if none |
| ltree[] ?<@ ltree | ltree | first array entry that is a descendant of ltree; NULL if none |
| ltree[] ?~ lquery | ltree | first array entry that matches lquery; NULL if none |
| ltree[] ?@ ltxtquery | ltree | first array entry that matches ltxtquery; NULL if none |
The operators <@, @>, @ and ~ have analogues ^<@, ^@>, ^@, ^~, which are the same except they do not use indexes. These are useful only for testing purposes.
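A few of the operators from Table F-12 can be illustrated with literal values. This is only a sketch; the paths are illustrative, and the results in the comments follow from the operator descriptions in the table:

```sql
-- Ancestor and descendant tests; equal paths satisfy both.
SELECT 'Top.Science'::ltree @> 'Top.Science.Astronomy'::ltree;  -- true
SELECT 'Top.Science'::ltree <@ 'Top.Science'::ltree;            -- true

-- Concatenation, with an ltree or a text right-hand side.
SELECT 'Top.Science'::ltree || 'Astronomy'::ltree;  -- Top.Science.Astronomy
SELECT 'Top.Science'::ltree || 'Astronomy'::text;   -- Top.Science.Astronomy

-- Match against any lquery in an array.
SELECT 'Top.Science.Astronomy'::ltree
       ? ARRAY['*.Hobbies.*', '*.Astronomy']::lquery[];  -- true

-- First array entry that is an ancestor of the given path.
SELECT ARRAY['Top.Science', 'Top.Hobbies']::ltree[]
       ?@> 'Top.Science.Astronomy'::ltree;               -- Top.Science
```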
The available functions are shown in Table F-13.
Table F-13. ltree Functions
| Function | Return Type | Description | Example | Result |
|---|---|---|---|---|
| subltree(ltree, int start, int end) | ltree | subpath of ltree from position start to position end-1 (counting from 0) | subltree('Top.Child1.Child2',1,2) | Child1 |
| subpath(ltree, int offset, int len) | ltree | subpath of ltree starting at position offset, length len. If offset is negative, subpath starts that far from the end of the path. If len is negative, leaves that many labels off the end of the path. | subpath('Top.Child1.Child2',0,2) | Top.Child1 |
| subpath(ltree, int offset) | ltree | subpath of ltree starting at position offset, extending to end of path. If offset is negative, subpath starts that far from the end of the path. | subpath('Top.Child1.Child2',1) | Child1.Child2 |
| nlevel(ltree) | integer | number of labels in path | nlevel('Top.Child1.Child2') | 3 |
| index(ltree a, ltree b) | integer | position of first occurrence of b in a; -1 if not found | index('0.1.2.3.5.4.5.6.8.5.6.8','5.6') | 6 |
| index(ltree a, ltree b, int offset) | integer | position of first occurrence of b in a, searching starting at offset; negative offset means start -offset labels from the end of the path | index('0.1.2.3.5.4.5.6.8.5.6.8','5.6',-4) | 9 |
| text2ltree(text) | ltree | cast text to ltree | | |
| ltree2text(ltree) | text | cast ltree to text | | |
| lca(ltree, ltree, ...) | ltree | lowest common ancestor, i.e., longest common prefix of paths (up to 8 arguments supported) | lca('1.2.2.3','1.2.3.4.5.6') | 1.2 |
| lca(ltree[]) | ltree | lowest common ancestor, i.e., longest common prefix of paths | lca(array['1.2.2.3'::ltree,'1.2.3']) | 1.2 |
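The negative-argument forms of subpath and the two cast functions have no example in the table, so here is a small sketch using the same sample path. The results in the comments follow from the descriptions above:

```sql
-- Negative offset: start that many labels from the end of the path.
SELECT subpath('Top.Child1.Child2', -1);     -- Child2

-- Negative length: leave that many labels off the end of the path.
SELECT subpath('Top.Child1.Child2', 0, -1);  -- Top.Child1

-- subltree takes start and end positions rather than offset and length.
SELECT subltree('Top.Child1.Child2', 1, 3);  -- Child1.Child2

-- Converting between text and ltree with the cast functions.
SELECT text2ltree('Top.Child1');             -- Top.Child1, as ltree
SELECT ltree2text('Top.Child1'::ltree);      -- Top.Child1, as text
```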
F.21.3. Indexes
ltree supports several types of indexes that can speed up the indicated operators:
B-tree index over ltree: <, <=, =, >=, >
GiST index over ltree: <, <=, =, >=, >, @>, <@, @, ~, ?
Example of creating such an index:

    CREATE INDEX path_gist_idx ON test USING GIST (path);

GiST index over ltree[]: ltree[] <@ ltree, ltree @> ltree[], @, ~, ?
Example of creating such an index:

    CREATE INDEX path_gist_idx ON test USING GIST (array_path);

Note: This index type is lossy.
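One way to check whether a query can use such an index is to look at its plan with EXPLAIN. The sketch below assumes a table test with an ltree column path and a GiST index on it, as in the CREATE INDEX example above; the plan output depends on the data and server settings, so none is shown:

```sql
-- Assumption: table test(path ltree) with a GiST index on path exists.
EXPLAIN SELECT path FROM test WHERE path <@ 'Top.Science';

-- On a very small table the planner may still prefer a sequential scan.
-- For experimentation only, sequential scans can be discouraged in this session:
SET enable_seqscan = off;
EXPLAIN SELECT path FROM test WHERE path <@ 'Top.Science';
RESET enable_seqscan;
```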
F.21.4. Example
This example uses the following data (also available in file contrib/ltree/ltreetest.sql in the source distribution):

    CREATE TABLE test (path ltree);
    INSERT INTO test VALUES ('Top');
    INSERT INTO test VALUES ('Top.Science');
    INSERT INTO test VALUES ('Top.Science.Astronomy');
    INSERT INTO test VALUES ('Top.Science.Astronomy.Astrophysics');
    INSERT INTO test VALUES ('Top.Science.Astronomy.Cosmology');
    INSERT INTO test VALUES ('Top.Hobbies');
    INSERT INTO test VALUES ('Top.Hobbies.Amateurs_Astronomy');
    INSERT INTO test VALUES ('Top.Collections');
    INSERT INTO test VALUES ('Top.Collections.Pictures');
    INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy');
    INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Stars');
    INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Galaxies');
    INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Astronauts');
    CREATE INDEX path_gist_idx ON test USING gist(path);
    CREATE INDEX path_idx ON test USING btree(path);

Now, we have a table test populated with data describing the hierarchy shown below:

                            Top
                         /   |  \
                 Science Hobbies Collections
                     /       |              \
            Astronomy   Amateurs_Astronomy Pictures
               /  \                            |
    Astrophysics  Cosmology                Astronomy
                                            /  |  \
                                     Galaxies Stars Astronauts

We can do inheritance:

    ltreetest=> SELECT path FROM test WHERE path <@ 'Top.Science';
                    path
    ------------------------------------
     Top.Science
     Top.Science.Astronomy
     Top.Science.Astronomy.Astrophysics
     Top.Science.Astronomy.Cosmology
    (4 rows)

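A related sketch, not part of the original example set: to restrict the result to the direct children of a node rather than all descendants, the descendant test can be combined with nlevel, or expressed as an lquery with a quantified star. With the sample data above, Top.Science has one direct child, Top.Science.Astronomy:

```sql
-- Direct children of Top.Science: descendants exactly one level deeper.
SELECT path FROM test
WHERE path <@ 'Top.Science'
  AND nlevel(path) = nlevel('Top.Science') + 1;

-- The same idea as an lquery: exactly one label after Top.Science.
SELECT path FROM test WHERE path ~ 'Top.Science.*{1}';
```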
Here are some examples of path matching:

    ltreetest=> SELECT path FROM test WHERE path ~ '*.Astronomy.*';
                         path
    -----------------------------------------------
     Top.Science.Astronomy
     Top.Science.Astronomy.Astrophysics
     Top.Science.Astronomy.Cosmology
     Top.Collections.Pictures.Astronomy
     Top.Collections.Pictures.Astronomy.Stars
     Top.Collections.Pictures.Astronomy.Galaxies
     Top.Collections.Pictures.Astronomy.Astronauts
    (7 rows)

    ltreetest=> SELECT path FROM test WHERE path ~ '*.!pictures@.*.Astronomy.*';
                    path
    ------------------------------------
     Top.Science.Astronomy
     Top.Science.Astronomy.Astrophysics
     Top.Science.Astronomy.Cosmology
    (3 rows)

Here are some examples of full text search:

    ltreetest=> SELECT path FROM test WHERE path @ 'Astro*% & !pictures@';
                    path
    ------------------------------------
     Top.Science.Astronomy
     Top.Science.Astronomy.Astrophysics
     Top.Science.Astronomy.Cosmology
     Top.Hobbies.Amateurs_Astronomy
    (4 rows)

    ltreetest=> SELECT path FROM test WHERE path @ 'Astro* & !pictures@';
                    path
    ------------------------------------
     Top.Science.Astronomy
     Top.Science.Astronomy.Astrophysics
     Top.Science.Astronomy.Cosmology
    (3 rows)

Path construction using functions:

    ltreetest=> SELECT subpath(path,0,2)||'Space'||subpath(path,2) FROM test WHERE path <@ 'Top.Science.Astronomy';
                     ?column?
    ------------------------------------------
     Top.Science.Space.Astronomy
     Top.Science.Space.Astronomy.Astrophysics
     Top.Science.Space.Astronomy.Cosmology
    (3 rows)

We could simplify this by creating a SQL function that inserts a label at a specified position in a path:

    CREATE FUNCTION ins_label(ltree, int, text) RETURNS ltree
        AS 'select subpath($1,0,$2) || $3 || subpath($1,$2);'
        LANGUAGE SQL IMMUTABLE;

    ltreetest=> SELECT ins_label(path,2,'Space') FROM test WHERE path <@ 'Top.Science.Astronomy';
                    ins_label
    ------------------------------------------
     Top.Science.Space.Astronomy
     Top.Science.Space.Astronomy.Astrophysics
     Top.Science.Space.Astronomy.Cosmology
    (3 rows)

F.21.5. Authors
All work was done by Teodor Sigaev (<teodor@stack.net>) and Oleg Bartunov (<oleg@sai.msu.su>). See http://www.sai.msu.su/~megera/postgres/gist/ for additional information. The authors would like to thank Eugeny Rodichev for helpful discussions. Comments and bug reports are welcome.