Commit 113bb9b

XML conversion utility, requires expat library.
John Gray committed
1 parent d4cafeb, commit 113bb9b

8 files changed: +764, -1 lines changed


‎contrib/README

Lines changed: 5 additions & 1 deletion
@@ -1,6 +1,6 @@
 
 The PostgreSQL contrib tree
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
+---------------------------
 
 This subtree contains tools, modules, and examples that are not
 maintained as part of the core PostgreSQL system, mainly because
@@ -177,3 +177,7 @@ userlock -
 vacuumlo -
 	Remove orphaned large objects
 	by Peter T Mount <peter@retep.org.uk>
+
+xml -
+	Storing XML in PostgreSQL
+	by John Gray <jgray@beansindustry.co.uk>

‎contrib/xml/Makefile

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+#-------------------------------------------------------------------------
+#
+# Makefile--
+# Adapted from tutorial makefile
+#-------------------------------------------------------------------------
+
+subdir = contrib/xml
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+
+override CFLAGS+= $(CFLAGS_SL)
+
+
+#
+# DLOBJS is the dynamically-loaded object files. The "funcs" queries
+# include CREATE FUNCTIONs that load routines from these files.
+#
+DLOBJS= pgxml$(DLSUFFIX)
+
+
+QUERIES= pgxml.sql
+
+all: $(DLOBJS) $(QUERIES)
+
+# Requires the expat library
+
+%.so: %.o
+	$(CC) -shared -lexpat -o $@ $<
+
+
+%.sql: %.source
+	if [ -z "$$USER" ]; then USER=$$LOGNAME; fi; \
+	if [ -z "$$USER" ]; then USER=`whoami`; fi; \
+	if [ -z "$$USER" ]; then echo 'Cannot deduce $$USER.'; exit 1; fi; \
+	rm -f $@; \
+	C=`pwd`; \
+	sed -e "s:_CWD_:$$C:g" \
+	    -e "s:_OBJWD_:$$C:g" \
+	    -e "s:_DLSUFFIX_:$(DLSUFFIX):g" \
+	    -e "s/_USER_/$$USER/g" < $< > $@
+
+clean:
+	rm -f $(DLOBJS) $(QUERIES)
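
Note on the %.sql rule: it turns a .source template into a .sql file by replacing four placeholders (_CWD_, _OBJWD_, _DLSUFFIX_, _USER_). The Python sketch below mirrors those replacements for illustration only. The function name and the template line are hypothetical, since pgxml.source is not shown in this part of the commit, and sed's special handling of characters such as & in the replacement text is not reproduced.

```python
def render_sql_template(source: str, cwd: str, dlsuffix: str, user: str) -> str:
    """Apply the same four placeholder substitutions as the sed command."""
    replacements = {
        "_CWD_": cwd,
        "_OBJWD_": cwd,  # the Makefile uses the build directory for both
        "_DLSUFFIX_": dlsuffix,
        "_USER_": user,
    }
    for placeholder, value in replacements.items():
        source = source.replace(placeholder, value)
    return source


# Hypothetical template line; the real pgxml.source may differ.
TEMPLATE = (
    "CREATE FUNCTION pgxml_parse(text) RETURNS bool\n"
    "    AS '_OBJWD_/pgxml_DLSUFFIX_' LANGUAGE 'c';\n"
)

if __name__ == "__main__":
    print(render_sql_template(TEMPLATE, "/tmp/xml", ".so", "postgres"))
```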

‎contrib/xml/README

Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
+This package contains a couple of simple routines for hooking the
+expat XML parser up to PostgreSQL. This is a work-in-progress and all
+very basic at the moment (see the file TODO for some outline of what
+remains to be done).
+
+At present, two functions are defined, one which checks
+well-formedness, and the other which performs very simple XPath-type
+queries.
+
+Prerequisite:
+
+expat parser 1.95.0 or newer (http://expat.sourceforge.net)
+
+I used a shared library version -I'm sure you could use a static
+library if you wished though. I had no problems compiling from source.
+
+Function documentation and usage:
+---------------------------------
+
+pgxml_parse(text) returns bool
+  parses the provided text and returns true or false if it is
+  well-formed or not. It returns NULL if the parser couldn't be
+  created for any reason.
+
+pgxml_xpath(text doc, text xpath, int n) returns text
+  parses doc and returns the cdata of the nth occurence of
+  the "XPath" listed. See below for details on the syntax.
+
+
+Example:
+
+Given a table docstore:
+
+ Attribute |  Type   | Modifier
+-----------+---------+----------
+ docid     | integer |
+ document  | text    |
+
+containing documents such as (these are archaeological site
+descriptions, in case anyone is wondering):
+
+<?XML version="1.0"?>
+<site provider="Foundations" sitecode="ak97" version="1">
+<name>Church Farm, Ashton Keynes</name>
+<invtype>watching brief</invtype>
+<location scheme="osgb">SU04209424</location>
+</site>
+
+one can type:
+
+select docid,
+pgxml_xpath(document,'/site/name',1) as sitename,
+pgxml_xpath(document,'/site/location',1) as location
+from docstore;
+
+and get as output:
+
+ docid |          sitename           |  location
+-------+-----------------------------+------------
+     1 | Church Farm, Ashton Keynes  | SU04209424
+     2 | Glebe Farm, Long Itchington | SP41506500
+(2 rows)
+
+
+"XPath" syntax supported
+------------------------
+
+At present it only supports paths of the form:
+'tag1/tag2' or '/tag1/tag2'
+
+The first case will find any <tag2> within a <tag1>, the second will
+find any <tag2> within a <tag1> at the top level of the document.
+
+The real XPath is much more complex (see TODO file).
+
+
+John Gray <jgray@azuli.co.uk> 26 July 2001
+
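
Note on the README: the behaviour it describes for the two functions can be tried outside the database. The sketch below uses Python's xml.parsers.expat binding, not the C code from this commit. The names is_well_formed and xpath_text are hypothetical stand-ins for pgxml_parse and pgxml_xpath. Several details are assumptions: paths are matched by comparing text against the stack of open element names, only character data directly inside the matched element is returned, and a malformed document yields None. The sample document writes the declaration as lowercase <?xml, which is the form XML requires.

```python
import xml.parsers.expat
from typing import List, Optional


def is_well_formed(doc: str) -> bool:
    """Return True if expat parses doc without error (compare pgxml_parse)."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(doc, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


def xpath_text(doc: str, path: str, n: int = 1) -> Optional[str]:
    """Return the character data of the nth element matching path, or None."""
    stack: List[str] = []    # names of the currently open elements
    chunks: List[str] = []   # text collected for the selected element
    state = {"seen": 0, "depth": None, "done": False}

    def matches() -> bool:
        current = "/" + "/".join(stack)
        if path.startswith("/"):
            return current == path           # '/tag1/tag2': anchored at the top
        return current.endswith("/" + path)  # 'tag1/tag2': matches at any depth

    def start(name, attrs):
        stack.append(name)
        if not state["done"] and state["depth"] is None and matches():
            state["seen"] += 1
            if state["seen"] == n:
                state["depth"] = len(stack)  # start capturing at this depth

    def chars(data):
        if state["depth"] == len(stack):
            chunks.append(data)

    def end(name):
        if state["depth"] == len(stack):
            state["depth"] = None
            state["done"] = True
        stack.pop()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    try:
        parser.Parse(doc, True)
    except xml.parsers.expat.ExpatError:
        return None
    return "".join(chunks) if state["done"] else None


SAMPLE = (
    '<?xml version="1.0"?>\n'
    '<site provider="Foundations" sitecode="ak97" version="1">\n'
    "<name>Church Farm, Ashton Keynes</name>\n"
    "<invtype>watching brief</invtype>\n"
    '<location scheme="osgb">SU04209424</location>\n'
    "</site>\n"
)

if __name__ == "__main__":
    print(is_well_formed(SAMPLE))
    print(xpath_text(SAMPLE, "/site/name", 1))
    print(xpath_text(SAMPLE, "site/location", 1))
```

Like the approach described in the README, the sketch creates a new parser for every call.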

‎contrib/xml/TODO

Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
+PGXML TODO List
+===============
+
+Some of these items still require much more thought! The data model
+for XML documents and the parsing model of expat don't really fit so
+well with a standard SQL model.
+
+1. Generalised XML parsing support
+
+Allow a user to specify handlers (in any PL) to be used by the parser.
+This must permit distinct sets of parser settings -user may want some
+documents in a database to parsed with one set of handlers, others
+with a different set.
+
+i.e. the pgxml_parse function would take as parameters (document,
+parsername) where parsername was the identifier for a collection of
+handler etc. settings.
+
+"Stub" handlers in the pgxml code would invoke the functions through
+the standard fmgr interface. The parser interface would define the
+prototype for these functions. How does the handler function know
+which document/context has resulted it in being called?
+
+Mechanism for defining collection of parser settings (in a table? -but
+maybe copied for efficiency into a structure when first required by a
+query?)
+
+2. Support for other parsers
+
+Expat may not be the best choice as a parser because a new parser
+instance is needed for each document i.e. all the handlers must be set
+again for each document. Another parser may have a more efficient way
+of parsing a set of documents identically.
+
+3. XPath support
+
+Proper XPath support. I really need to sit down and plough
+through the specification...
+
+The very simple text comparison system currently used is too
+basic. Need to convert the path to an ordered list of nodes. Each node
+is an element qualifier, and may have a list of attribute
+qualifications attached. This probably requires lexx/yacc combination.
+(James Clark has written a yacc grammar for XPath). Not all the
+features of XPath are necessarily relevant.
+
+An option to return subdocuments (i.e. subelements AND cdata, not just
+cdata). This should maybe be the default.
+
+4. Multiple occurences of elements.
+
+This section is all very sketchy, and has various weaknesses.
+
+Is there a good way to optimise/index the results of certain XPath
+operations to make them faster?:
+
+select docid, pgxml_xpath(document,'/site/location',1) as location
+where pgxml_xpath(document,'/site/name',1) = 'Church Farm';
+
+and with multiple element occurences in a document?
+
+select d.docid, pgxml_xpath(d.document,'/site/location',1)
+from docstore d,
+pgxml_xpaths('docstore','document','feature/type','docid') ft
+where ft.key = d.docid and ft.value ='Limekiln';
+
+pgxml_xpaths params are relname, attrname, xpath, returnkey. It would
+return a set of two-element tuples (key,value) consisting of the value of
+returnkey, and the cdata value of the xpath. The XML document would be
+defined by relname and attrname.
+
+The pgxml_xpaths function could be the basis of a functional index,
+which could speed up the above query very substantially, working
+through the normal query planner mechanism. Syntax above is fragile
+through using names rather than OID.
+
+John Gray <jgray@azuli.co.uk>
+
+
+
+
+
+
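
Note on TODO item 4: the proposed pgxml_xpaths function does not exist in this commit, so the following is only a sketch of the (key, value) result shape it describes, written in Python with xml.parsers.expat. The names path_values and xpaths are hypothetical, an in-memory list of (docid, document) pairs stands in for the relname, attrname and returnkey parameters, and the <feature><type> documents are invented to fit the 'feature/type' path used in the TODO query.

```python
import xml.parsers.expat
from typing import Iterable, Iterator, List, Tuple


def path_values(doc: str, path: str) -> List[str]:
    """Collect the character data of every element whose path matches."""
    stack: List[str] = []                          # currently open element names
    open_hits: List[Tuple[int, List[str]]] = []    # (depth, text chunks) per open match
    values: List[str] = []

    def start(name, attrs):
        stack.append(name)
        current = "/" + "/".join(stack)
        if path.startswith("/"):
            hit = current == path
        else:
            hit = current.endswith("/" + path)
        if hit:
            open_hits.append((len(stack), []))

    def chars(data):
        # Keep only text that sits directly inside the innermost open match.
        if open_hits and open_hits[-1][0] == len(stack):
            open_hits[-1][1].append(data)

    def end(name):
        if open_hits and open_hits[-1][0] == len(stack):
            values.append("".join(open_hits.pop()[1]))
        stack.pop()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(doc, True)  # raises ExpatError on a malformed document
    return values


def xpaths(rows: Iterable[Tuple[int, str]], path: str) -> Iterator[Tuple[int, str]]:
    """Yield one (key, value) pair per matching element in each document."""
    for key, document in rows:
        for value in path_values(document, path):
            yield key, value


# Hypothetical documents with repeated <feature> elements.
ROWS = [
    (1, "<site><feature><type>Limekiln</type></feature>"
        "<feature><type>Ditch</type></feature></site>"),
    (2, "<site><feature><type>Pit</type></feature></site>"),
]

if __name__ == "__main__":
    for key, value in xpaths(ROWS, "feature/type"):
        print(key, value)
```

Filtering the pairs on value == 'Limekiln' and keeping the key corresponds to the join condition in the second TODO query.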
