Movatterモバイル変換

43.11. PL/pgSQL Under the Hood
Prev	Up	Chapter 43. PL/pgSQL -SQL Procedural Language	Home	Next

43.11. PL/pgSQL Under the Hood

43.11.1. Variable Substitution
43.11.2. Plan Caching

This section discusses some implementation details that are frequently important forPL/pgSQL users to know.

43.11.1. Variable Substitution

SQL statements and expressions within aPL/pgSQL function can refer to variables and parameters of the function. Behind the scenes,PL/pgSQL substitutes query parameters for such references. Parameters will only be substituted in places where a parameter or column reference is syntactically allowed. As an extreme case, consider this example of poor programming style:

INSERT INTO foo (foo) VALUES (foo);

The first occurrence offoo must syntactically be a table name, so it will not be substituted, even if the function has a variable namedfoo. The second occurrence must be the name of a column of the table, so it will not be substituted either. Only the third occurrence is a candidate to be a reference to the function's variable.

Note

PostgreSQL versions before 9.0 would try to substitute the variable in all three cases, leading to syntax errors.

Since the names of variables are syntactically no different from the names of table columns, there can be ambiguity in statements that also refer to tables: is a given name meant to refer to a table column, or a variable? Let's change the previous example to

INSERT INTO dest (col) SELECT foo + bar FROM src;

Here,dest andsrc must be table names, andcol must be a column ofdest, butfoo andbar might reasonably be either variables of the function or columns ofsrc.

By default,PL/pgSQL will report an error if a name in a SQL statement could refer to either a variable or a table column. You can fix such a problem by renaming the variable or column, or by qualifying the ambiguous reference, or by tellingPL/pgSQL which interpretation to prefer.

The simplest solution is to rename the variable or column. A common coding rule is to use a different naming convention forPL/pgSQL variables than you use for column names. For example, if you consistently name function variablesv_something while none of your column names start withv_, no conflicts will occur.

Alternatively you can qualify ambiguous references to make them clear. In the above example,src.foo would be an unambiguous reference to the table column. To create an unambiguous reference to a variable, declare it in a labeled block and use the block's label (seeSection 43.2). For example,

<<block>>DECLARE    foo int;BEGIN    foo := ...;    INSERT INTO dest (col) SELECT block.foo + bar FROM src;

Hereblock.foo means the variable even if there is a columnfoo insrc. Function parameters, as well as special variables such asFOUND, can be qualified by the function's name, because they are implicitly declared in an outer block labeled with the function's name.

Sometimes it is impractical to fix all the ambiguous references in a large body ofPL/pgSQL code. In such cases you can specify thatPL/pgSQL should resolve ambiguous references as the variable (which is compatible withPL/pgSQL's behavior beforePostgreSQL 9.0), or as the table column (which is compatible with some other systems such asOracle).

To change this behavior on a system-wide basis, set the configuration parameterplpgsql.variable_conflict to one oferror,use_variable, oruse_column (whereerror is the factory default). This parameter affects subsequent compilations of statements inPL/pgSQL functions, but not statements already compiled in the current session. Because changing this setting can cause unexpected changes in the behavior ofPL/pgSQL functions, it can only be changed by a superuser.

You can also set the behavior on a function-by-function basis, by inserting one of these special commands at the start of the function text:

#variable_conflict error#variable_conflict use_variable#variable_conflict use_column

These commands affect only the function they are written in, and override the setting ofplpgsql.variable_conflict. An example is

CREATE FUNCTION stamp_user(id int, comment text) RETURNS void AS $$    #variable_conflict use_variable    DECLARE        curtime timestamp := now();    BEGIN        UPDATE users SET last_modified = curtime, comment = comment          WHERE users.id = id;    END;$$ LANGUAGE plpgsql;

In theUPDATE command,curtime,comment, andid will refer to the function's variable and parameters whether or notusers has columns of those names. Notice that we had to qualify the reference tousers.id in theWHERE clause to make it refer to the table column. But we did not have to qualify the reference tocomment as a target in theUPDATE list, because syntactically that must be a column ofusers. We could write the same function without depending on thevariable_conflict setting in this way:

CREATE FUNCTION stamp_user(id int, comment text) RETURNS void AS $$    <<fn>>    DECLARE        curtime timestamp := now();    BEGIN        UPDATE users SET last_modified = fn.curtime, comment = stamp_user.comment          WHERE users.id = stamp_user.id;    END;$$ LANGUAGE plpgsql;

Variable substitution does not happen in the command string given toEXECUTE or one of its variants. If you need to insert a varying value into such a command, do so as part of constructing the string value, or useUSING, as illustrated inSection 43.5.4.

Variable substitution currently works only inSELECT,INSERT,UPDATE, andDELETE commands, because the main SQL engine allows query parameters only in these commands. To use a non-constant name or value in other statement types (generically called utility statements), you must construct the utility statement as a string andEXECUTE it.

43.11.2. Plan Caching

ThePL/pgSQL interpreter parses the function's source text and produces an internal binary instruction tree the first time the function is called (within each session). The instruction tree fully translates thePL/pgSQL statement structure, but individualSQL expressions and SQL commands used in the function are not translated immediately.

As each expression andSQL command is first executed in the function, thePL/pgSQL interpreter parses and analyzes the command to create a prepared statement, using the SPI manager'sSPI_prepare function. Subsequent visits to that expression or command reuse the prepared statement. Thus, a function with conditional code paths that are seldom visited will never incur the overhead of analyzing those commands that are never executed within the current session. A disadvantage is that errors in a specific expression or command cannot be detected until that part of the function is reached in execution. (Trivial syntax errors will be detected during the initial parsing pass, but anything deeper will not be detected until execution.)

PL/pgSQL (or more precisely, the SPI manager) can furthermore attempt to cache the execution plan associated with any particular prepared statement. If a cached plan is not used, then a fresh execution plan is generated on each visit to the statement, and the current parameter values (that is,PL/pgSQL variable values) can be used to optimize the selected plan. If the statement has no parameters, or is executed many times, the SPI manager will consider creating ageneric plan that is not dependent on specific parameter values, and caching that for re-use. Typically this will happen only if the execution plan is not very sensitive to the values of thePL/pgSQL variables referenced in it. If it is, generating a plan each time is a net win. See PREPARE for more information about the behavior of prepared statements.

BecausePL/pgSQL saves prepared statements and sometimes execution plans in this way, SQL commands that appear directly in aPL/pgSQL function must refer to the same tables and columns on every execution; that is, you cannot use a parameter as the name of a table or column in an SQL command. To get around this restriction, you can construct dynamic commands using thePL/pgSQLEXECUTE statement — at the price of performing new parse analysis and constructing a new execution plan on every execution.

The mutable nature of record variables presents another problem in this connection. When fields of a record variable are used in expressions or statements, the data types of the fields must not change from one call of the function to the next, since each expression will be analyzed using the data type that is present when the expression is first reached.EXECUTE can be used to get around this problem when necessary.

If the same function is used as a trigger for more than one table,PL/pgSQL prepares and caches statements independently for each such table — that is, there is a cache for each trigger function and table combination, not just for each function. This alleviates some of the problems with varying data types; for instance, a trigger function will be able to work successfully with a column namedkey even if it happens to have different types in different tables.

Likewise, functions having polymorphic argument types have a separate statement cache for each combination of actual argument types they have been invoked for, so that data type differences do not cause unexpected failures.

Statement caching can sometimes have surprising effects on the interpretation of time-sensitive values. For example there is a difference between what these two functions do:

CREATE FUNCTION logfunc1(logtxt text) RETURNS void AS $$    BEGIN        INSERT INTO logtable VALUES (logtxt, 'now');    END;$$ LANGUAGE plpgsql;

and:

CREATE FUNCTION logfunc2(logtxt text) RETURNS void AS $$    DECLARE        curtime timestamp;    BEGIN        curtime := 'now';        INSERT INTO logtable VALUES (logtxt, curtime);    END;$$ LANGUAGE plpgsql;

In the case oflogfunc1, thePostgreSQL main parser knows when analyzing theINSERT that the string'now' should be interpreted astimestamp, because the target column oflogtable is of that type. Thus,'now' will be converted to atimestamp constant when theINSERT is analyzed, and then used in all invocations oflogfunc1 during the lifetime of the session. Needless to say, this isn't what the programmer wanted. A better idea is to use thenow() orcurrent_timestamp function.

In the case oflogfunc2, thePostgreSQL main parser does not know what type'now' should become and therefore it returns a data value of typetext containing the stringnow. During the ensuing assignment to the local variablecurtime, thePL/pgSQL interpreter casts this string to thetimestamp type by calling thetextout andtimestamp_in functions for the conversion. So, the computed time stamp is updated on each execution as the programmer expects. Even though this happens to work as expected, it's not terribly efficient, so use of thenow() function would still be a better idea.

Prev	Up	Next
43.10. Trigger Functions	Home	43.12. Tips for Developing inPL/pgSQL

epub pdf