PostgreSQL 9.4.1 Documentation
F.32. postgres_fdw
The postgres_fdw module provides the foreign-data wrapper postgres_fdw, which can be used to access data stored in external PostgreSQL servers.
The functionality provided by this module overlaps substantially with the functionality of the older dblink module. However, postgres_fdw provides more transparent and standards-compliant syntax for accessing remote tables, and can give better performance in many cases.
To prepare for remote access using postgres_fdw:
1. Install the postgres_fdw extension using CREATE EXTENSION.
2. Create a foreign server object, using CREATE SERVER, to represent each remote database you want to connect to. Specify connection information, except user and password, as options of the server object.
3. Create a user mapping, using CREATE USER MAPPING, for each database user you want to allow to access each foreign server. Specify the remote user name and password to use as user and password options of the user mapping.
4. Create a foreign table, using CREATE FOREIGN TABLE, for each remote table you want to access. The columns of the foreign table must match the referenced remote table. You can, however, use table and/or column names different from the remote table's, if you specify the correct remote names as options of the foreign table object.
Now you need only SELECT from a foreign table to access the data stored in its underlying remote table. You can also modify the remote table using INSERT, UPDATE, or DELETE. (Of course, the remote user you have specified in your user mapping must have privileges to do these things.)
It is generally recommended that the columns of a foreign table be declared with exactly the same data types, and collations if applicable, as the referenced columns of the remote table. Although postgres_fdw is currently rather forgiving about performing data type conversions at need, surprising semantic anomalies may arise when types or collations do not match, due to the remote server interpreting WHERE clauses slightly differently from the local server.
Note that a foreign table can be declared with fewer columns, or with a different column order, than its underlying remote table has. Matching of columns to the remote table is by name, not position.
F.32.1. FDW Options of postgres_fdw
F.32.1.1. Connection Options
A foreign server using the postgres_fdw foreign data wrapper can have the same options that libpq accepts in connection strings, as described in Section 31.1.2, except that these options are not allowed:
- user and password (specify these for a user mapping, instead)
- client_encoding (this is automatically set from the local server encoding)
- fallback_application_name (always set to postgres_fdw)
Only superusers may connect to foreign servers without password authentication, so always specify the password option for user mappings belonging to non-superusers.
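As an illustrative sketch, other libpq connection parameters can be given alongside host and dbname. The server name secure_server, the host name, and the chosen values are assumptions made for this example; sslmode and connect_timeout are ordinary libpq parameters.

```sql
-- Hypothetical server definition; names and values are placeholders.
CREATE SERVER secure_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'db.example.com', port '5432', dbname 'foreign_db',
             sslmode 'require', connect_timeout '10');

-- Credentials belong to the user mapping, not to the server object.
CREATE USER MAPPING FOR local_user
    SERVER secure_server
    OPTIONS (user 'foreign_user', password 'password');
```

Keeping user and password out of the server definition lets several local roles share one server object while each maps to its own remote role.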
F.32.1.2. Object Name Options
These options can be used to control the names used in SQL statements sent to the remote PostgreSQL server. These options are needed when a foreign table is created with names different from the underlying remote table's names.
- schema_name
This option, which can be specified for a foreign table, gives the schema name to use for the foreign table on the remote server. If this option is omitted, the name of the foreign table's schema is used.
- table_name
This option, which can be specified for a foreign table, gives the table name to use for the foreign table on the remote server. If this option is omitted, the foreign table's name is used.
- column_name
This option, which can be specified for a column of a foreign table, gives the column name to use for the column on the remote server. If this option is omitted, the column's name is used.
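The three name options can be combined, as in the following minimal sketch. It assumes a hypothetical remote table sales.orders with columns order_id and total (among others) and a foreign server named foreign_server that already exists; none of these names come from the remote side automatically.

```sql
-- Hypothetical remote table: sales.orders (order_id, customer, total, ...)
-- Only two of its columns are exposed locally, under different names.
CREATE FOREIGN TABLE remote_orders (
    id     integer OPTIONS (column_name 'order_id'),
    amount numeric OPTIONS (column_name 'total')
)
    SERVER foreign_server
    OPTIONS (schema_name 'sales', table_name 'orders');

-- A column mapping can also be changed after the table exists
-- (SET replaces an option that is already present).
ALTER FOREIGN TABLE remote_orders
    ALTER COLUMN amount OPTIONS (SET column_name 'total');
```

Because columns are matched by name, the remote columns that are not declared here are simply never fetched.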
F.32.1.3. Cost Estimation Options
postgres_fdw retrieves remote data by executing queries against remote servers, so ideally the estimated cost of scanning a foreign table should be whatever it costs to be done on the remote server, plus some overhead for communication. The most reliable way to get such an estimate is to ask the remote server and then add something for overhead, but for simple queries it may not be worth the cost of an additional remote query to get a cost estimate. So postgres_fdw provides the following options to control how cost estimation is done:
- use_remote_estimate
This option, which can be specified for a foreign table or a foreign server, controls whether postgres_fdw issues remote EXPLAIN commands to obtain cost estimates. A setting for a foreign table overrides any setting for its server, but only for that table. The default is false.
- fdw_startup_cost
This option, which can be specified for a foreign server, is a numeric value that is added to the estimated startup cost of any foreign-table scan on that server. This represents the additional overhead of establishing a connection, parsing and planning the query on the remote side, etc. The default value is 100.
- fdw_tuple_cost
This option, which can be specified for a foreign server, is a numeric value that is used as extra cost per-tuple for foreign-table scans on that server. This represents the additional overhead of data transfer between servers. You might increase or decrease this number to reflect higher or lower network delay to the remote server. The default value is 0.01.
When use_remote_estimate is true, postgres_fdw obtains row count and cost estimates from the remote server and then adds fdw_startup_cost and fdw_tuple_cost to the cost estimates. When use_remote_estimate is false, postgres_fdw performs local row count and cost estimation and then adds fdw_startup_cost and fdw_tuple_cost to the cost estimates. This local estimation is unlikely to be very accurate unless local copies of the remote table's statistics are available. Running ANALYZE on the foreign table is the way to update the local statistics; this will perform a scan of the remote table and then calculate and store statistics just as though the table were local. Keeping local statistics can be a useful way to reduce per-query planning overhead for a remote table, but if the remote table is frequently updated, the local statistics will soon be obsolete.
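The cost estimation options are set with ordinary option syntax. The sketch below assumes the foreign_server and foreign_table objects from the Examples section (F.32.6); the numeric values are arbitrary illustrations, not recommendations.

```sql
-- Ask the remote server for estimates for every table on this server.
-- ADD fails if the option is already set; use SET in that case.
ALTER SERVER foreign_server
    OPTIONS (ADD use_remote_estimate 'true');

-- Override the server-level setting for one table only.
ALTER FOREIGN TABLE foreign_table
    OPTIONS (ADD use_remote_estimate 'false');

-- With local estimation in effect for this table, refresh its local
-- statistics. This scans the remote table.
ANALYZE foreign_table;

-- Reflect a slower link by raising the overhead figures (example values).
ALTER SERVER foreign_server
    OPTIONS (ADD fdw_startup_cost '200', ADD fdw_tuple_cost '0.05');
```

Which combination works best depends on how often the remote data changes and on how costly an extra remote EXPLAIN round trip is relative to the query itself.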
F.32.1.4. Updatability Options
By default all foreign tables using postgres_fdw are assumed to be updatable. This may be overridden using the following option:
- updatable
This option controls whether postgres_fdw allows foreign tables to be modified using INSERT, UPDATE and DELETE commands. It can be specified for a foreign table or a foreign server. A table-level option overrides a server-level option. The default is true.
Of course, if the remote table is not in fact updatable, an error would occur anyway. Use of this option primarily allows the error to be thrown locally without querying the remote server. Note, however, that the information_schema views will report a postgres_fdw foreign table to be updatable (or not) according to the setting of this option, without any check of the remote server.
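A minimal sketch of the override rule, again assuming the foreign_server and foreign_table objects from the Examples section (F.32.6):

```sql
-- Treat every table on this server as read-only by default.
ALTER SERVER foreign_server
    OPTIONS (ADD updatable 'false');

-- Re-enable modification for one table; the table-level setting wins.
ALTER FOREIGN TABLE foreign_table
    OPTIONS (ADD updatable 'true');

-- The information_schema views reflect the option, not the remote state.
SELECT table_name, is_insertable_into
FROM information_schema.tables
WHERE table_name = 'foreign_table';
```

Setting the server-level default to false and opting individual tables back in is one way to avoid accidental writes through a mostly read-only link.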
F.32.2. Connection Management
postgres_fdw establishes a connection to a foreign server during the first query that uses a foreign table associated with the foreign server. This connection is kept and re-used for subsequent queries in the same session. However, if multiple user identities (user mappings) are used to access the foreign server, a connection is established for each user mapping.
F.32.3. Transaction Management
During a query that references any remote tables on a foreign server, postgres_fdw opens a transaction on the remote server if one is not already open corresponding to the current local transaction. The remote transaction is committed or aborted when the local transaction commits or aborts. Savepoints are similarly managed by creating corresponding remote savepoints.
The remote transaction uses SERIALIZABLE isolation level when the local transaction has SERIALIZABLE isolation level; otherwise it uses REPEATABLE READ isolation level. This choice ensures that if a query performs multiple table scans on the remote server, it will get snapshot-consistent results for all the scans. A consequence is that successive queries within a single transaction will see the same data from the remote server, even if concurrent updates are occurring on the remote server due to other activities. That behavior would be expected anyway if the local transaction uses SERIALIZABLE or REPEATABLE READ isolation level, but it might be surprising for a READ COMMITTED local transaction. A future PostgreSQL release might modify these rules.
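The READ COMMITTED case can be pictured with the following sketch. It assumes the foreign_table object from the Examples section (F.32.6) and some other session that commits changes to the remote table at the point marked in the comments.

```sql
BEGIN;                                -- local transaction, READ COMMITTED by default

SELECT count(*) FROM foreign_table;   -- opens the remote transaction (REPEATABLE READ)

-- ... meanwhile, another session commits new rows to the remote table ...

SELECT count(*) FROM foreign_table;   -- same remote snapshot, so the same count

COMMIT;                               -- the remote transaction is committed too

SELECT count(*) FROM foreign_table;   -- new remote transaction; the new rows are visible
```

A local table queried twice in the same READ COMMITTED transaction would show the newly committed rows on the second query, which is why the remote behavior can be surprising.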
F.32.4. Remote Query Optimization
postgres_fdw attempts to optimize remote queries to reduce the amount of data transferred from foreign servers. This is done by sending query WHERE clauses to the remote server for execution, and by not retrieving table columns that are not needed for the current query. To reduce the risk of misexecution of queries, WHERE clauses are not sent to the remote server unless they use only built-in data types, operators, and functions. Operators and functions in the clauses must be IMMUTABLE as well.
The query that is actually sent to the remote server for execution can be examined using EXPLAIN VERBOSE.
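As a sketch of how to check what is sent, the two statements below use the foreign_table object from the Examples section (F.32.6). The plan output is not reproduced here; the remote statement should appear in the foreign scan node of the verbose plan.

```sql
-- Built-in, immutable comparison on a built-in type: the condition is a
-- candidate for remote execution, and only the data column is needed.
EXPLAIN VERBOSE
SELECT data FROM foreign_table WHERE id = 42;

-- random() is volatile, not IMMUTABLE, so this condition is expected to be
-- evaluated locally after the rows have been fetched.
EXPLAIN VERBOSE
SELECT data FROM foreign_table WHERE data = md5(random()::text);
```

Comparing the two plans shows whether a given condition reduces the rows transferred or is applied only after transfer.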
F.32.5. Cross-Version Compatibility
postgres_fdw can be used with remote servers dating back to PostgreSQL 8.3. Read-only capability is available back to 8.1. A limitation, however, is that postgres_fdw generally assumes that immutable built-in functions and operators are safe to send to the remote server for execution, if they appear in a WHERE clause for a foreign table. Thus, a built-in function that was added since the remote server's release might be sent to it for execution, resulting in "function does not exist" or a similar error. This type of failure can be worked around by rewriting the query, for example by embedding the foreign table reference in a sub-SELECT with OFFSET 0 as an optimization fence, and placing the problematic function or operator outside the sub-SELECT.
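The rewrite might look like the following sketch, which uses the foreign_table object from the Examples section (F.32.6). The choice of left() is an assumption for illustration: it stands in for any built-in function that exists locally but not on an older remote server.

```sql
-- May fail if the condition is sent to a remote server that lacks left().
SELECT id, data
FROM foreign_table
WHERE left(data, 3) = 'abc';

-- Workaround: OFFSET 0 acts as an optimization fence, so the condition
-- stays outside the sub-SELECT and is evaluated locally.
SELECT ft.id, ft.data
FROM (SELECT id, data FROM foreign_table OFFSET 0) AS ft
WHERE left(ft.data, 3) = 'abc';
```

The cost of the workaround is that all rows of the sub-SELECT are transferred before the condition is applied.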
F.32.6. Examples
Here is an example of creating a foreign table with postgres_fdw. First install the extension:
    CREATE EXTENSION postgres_fdw;
Then create a foreign server using CREATE SERVER. In this example we wish to connect to a PostgreSQL server on host 192.83.123.89 listening on port 5432. The database to which the connection is made is named foreign_db on the remote server:
    CREATE SERVER foreign_server
        FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host '192.83.123.89', port '5432', dbname 'foreign_db');
A user mapping, defined with CREATE USER MAPPING, is needed as well to identify the role that will be used on the remote server:
    CREATE USER MAPPING FOR local_user
        SERVER foreign_server
        OPTIONS (user 'foreign_user', password 'password');
Now it is possible to create a foreign table with CREATE FOREIGN TABLE. In this example we wish to access the table named some_schema.some_table on the remote server. The local name for it will be foreign_table:
    CREATE FOREIGN TABLE foreign_table (
        id serial NOT NULL,
        data text
    )
        SERVER foreign_server
        OPTIONS (schema_name 'some_schema', table_name 'some_table');
It's essential that the data types and other properties of the columns declared in CREATE FOREIGN TABLE match the actual remote table. Column names must match as well, unless you attach column_name options to the individual columns to show how they are named in the remote table.
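Once these objects exist, the foreign table can be used like a local one. The statements below are a sketch only: the literal values are made up, and the remote role foreign_user is assumed to hold the necessary privileges on some_schema.some_table.

```sql
-- Read from the remote table.
SELECT id, data FROM foreign_table WHERE id < 100;

-- Modify the remote table through the foreign table.
INSERT INTO foreign_table (id, data) VALUES (1001, 'added through postgres_fdw');
UPDATE foreign_table SET data = 'changed' WHERE id = 1001;
DELETE FROM foreign_table WHERE id = 1001;
```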
F.32.7. Author
Shigeru Hanada <shigeru.hanada@gmail.com>