This PEP describes the addition of statically nested scoping (lexical scoping) for Python 2.2, and as a source level option for Python 2.1. In addition, Python 2.1 will issue warnings about constructs whose meaning may change when this feature is enabled.
The old language definition (2.0 and before) defines exactly three namespaces that are used to resolve names – the local, global, and built-in namespaces. The addition of nested scopes allows resolution of unbound local names in enclosing functions’ namespaces.
The most visible consequence of this change is that lambdas (and other nested functions) can reference variables defined in the surrounding namespace. Currently, lambdas must often use default arguments to explicitly create bindings in the lambda’s namespace.
This proposal changes the rules for resolving free variables in Python functions. The new name resolution semantics will take effect with Python 2.2. These semantics will also be available in Python 2.1 by adding “from __future__ import nested_scopes” to the top of a module. (See PEP 236.)
The Python 2.0 definition specifies exactly three namespaces to check for each name – the local namespace, the global namespace, and the builtin namespace. According to this definition, if a function A is defined within a function B, the names bound in B are not visible in A. The proposal changes the rules so that names bound in B are visible in A (unless A contains a name binding that hides the binding in B).
This specification introduces rules for lexical scoping that are common in Algol-like languages. The combination of lexical scoping and existing support for first-class functions is reminiscent of Scheme.
The changed scoping rules address two problems – the limited utility of lambda expressions (and nested functions in general), and the frequent confusion of new users familiar with other languages that support nested lexical scopes, e.g. the inability to define recursive functions except at the module level.
The lambda expression yields an unnamed function that evaluates a single expression. It is often used for callback functions. In the example below (written using the Python 2.0 rules), any name used in the body of the lambda must be explicitly passed as a default argument to the lambda.
    from Tkinter import *
    root = Tk()
    Button(root, text="Click here",
           command=lambda root=root: root.test.configure(text="..."))
This approach is cumbersome, particularly when there are several names used in the body of the lambda. The long list of default arguments obscures the purpose of the code. The proposed solution, in crude terms, implements the default argument approach automatically. The “root=root” argument can be omitted.
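As a rough illustration of the difference, the sketch below builds the same callback both ways. It assumes an interpreter with nested scopes enabled (Python 2.2 or later, including Python 3). The make_callbacks helper and the label text are hypothetical, used in place of the Tkinter example so the code runs without a display.

```python
# Illustrative sketch; make_callbacks and the label text are hypothetical.
def make_callbacks(label):
    # Python 2.0 style: the name is smuggled in through a default argument.
    old_style = lambda label=label: "clicked " + label
    # Nested-scope style: the lambda reads label from the enclosing function.
    new_style = lambda: "clicked " + label
    return old_style, new_style


old_style, new_style = make_callbacks("here")
print(old_style())
print(new_style())
```

The two forms are not strictly equivalent: a default argument captures the value when the lambda is created, while the nested-scope form looks the name up when the lambda is called.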
The new name resolution semantics will cause some programs to behave differently than they did under Python 2.0. In some cases, programs will fail to compile. In other cases, names that were previously resolved using the global namespace will be resolved using the local namespace of an enclosing function. In Python 2.1, warnings will be issued for all statements that will behave differently.
Python is a statically scoped language with block structure, in the tradition of Algol. A code block or region, such as a module, class definition, or function body, is the basic unit of a program.
Names refer to objects. Names are introduced by name binding operations. Each occurrence of a name in the program text refers to the binding of that name established in the innermost function block containing the use.
The name binding operations are argument declaration, assignment, class and function definition, import statements, for statements, and except clauses. Each name binding occurs within a block defined by a class or function definition or at the module level (the top-level code block).
If a name is bound anywhere within a code block, all uses of the name within the block are treated as references to the current block. (Note: This can lead to errors when a name is used within a block before it is bound.)
If the global statement occurs within a block, all uses of the name specified in the statement refer to the binding of that name in the top-level namespace. Names are resolved in the top-level namespace by searching the global namespace, i.e. the namespace of the module containing the code block, and in the builtin namespace, i.e. the namespace of the __builtin__ module. The global namespace is searched first. If the name is not found there, the builtin namespace is searched. The global statement must precede all uses of the name.
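A minimal sketch of the global rule, assuming the code is run as a module or script on a current Python 3 interpreter. The names counter, outer and bump are hypothetical. The global statement in the nested function reaches the top-level binding and bypasses the enclosing function's local of the same name.

```python
# Illustrative sketch; counter, outer and bump are hypothetical names.
counter = 0


def outer():
    counter = 100  # local to outer; unrelated to the module-level counter

    def bump():
        global counter  # refers to the top-level binding, not outer's local
        counter = counter + 1
        return counter

    # bump() rebinds the module-level name; outer's own counter is unchanged.
    return bump(), counter


result = outer()
print(result, counter)
```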
If a name is used within a code block, but it is not bound there and is not declared global, the use is treated as a reference to the nearest enclosing function region. (Note: If a region is contained within a class definition, the name bindings that occur in the class block are not visible to enclosed functions.)
A class definition is an executable statement that may contain uses and definitions of names. These references follow the normal rules for name resolution. The namespace of the class definition becomes the attribute dictionary of the class.
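The interaction between class blocks and enclosed functions can be sketched as follows, assuming an interpreter with nested scopes enabled (written here for Python 3). The make_greeter function and the Greeter class are hypothetical names chosen for illustration.

```python
# Illustrative sketch; make_greeter and Greeter are hypothetical names.
def make_greeter():
    greeting = "function-level"

    class Greeter:
        greeting = "class-level"  # becomes the class attribute Greeter.greeting

        def by_name(self):
            # The class block is skipped during name resolution, so the bare
            # name resolves in the nearest enclosing function, make_greeter.
            return greeting

        def by_attribute(self):
            # An attribute reference is needed to reach the class binding.
            return self.greeting

    return Greeter()


g = make_greeter()
print(g.by_name())
print(g.by_attribute())
```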
The following operations are name binding operations. If they occur within a block, they introduce new local names in the current block unless there is also a global declaration.
    Function definition: def name ...
    Argument declaration: def f(...name...), lambda ...name...
    Class definition: class name ...
    Assignment statement: name = ...
    Import statement: import name, import module as name, from module import name
    Implicit assignment: names are bound by for statements and except clauses
There are several cases where Python statements are illegal when used in conjunction with nested scopes that contain free variables.
If a variable is referenced in an enclosed scope, it is an error to delete the name. The compiler will raise a SyntaxError for ‘del name’.
If the wild card form of import (import *) is used in a function and the function contains a nested block with free variables, the compiler will raise a SyntaxError.
If exec is used in a function and the function contains a nested block with free variables, the compiler will raise a SyntaxError unless the exec explicitly specifies the local namespace for the exec. (In other words, “exec obj” would be illegal, but “exec obj in ns” would be legal.)
If a name bound in a function scope is also the name of a module global name or a standard builtin name, and the function contains a nested function scope that references the name, the compiler will issue a warning. The name resolution rules will result in different bindings under Python 2.0 than under Python 2.2. The warning indicates that the program may not run correctly with all versions of Python.
The specified rules allow names defined in a function to be referenced in any nested function defined within that function. The name resolution rules are typical for statically scoped languages, with three primary exceptions:
Names in class scope are not accessible. Names are resolved in the innermost enclosing function scope. If a class definition occurs in a chain of nested scopes, the resolution process skips class definitions. This rule prevents odd interactions between class attributes and local variable access. If a name binding operation occurs in a class definition, it creates an attribute on the resulting class object. To access this variable in a method, or in a function nested within a method, an attribute reference must be used, either via self or via the class name.
An alternative would have been to allow name binding in class scope to behave exactly like name binding in function scope. This rule would allow class attributes to be referenced either via attribute reference or simple name. This option was ruled out because it would have been inconsistent with all other forms of class and instance attribute access, which always use attribute references. Code that used simple names would have been obscure.
The global statement short-circuits the normal rules. Under the proposal, the global statement has exactly the same effect that it does for Python 2.0. It is also noteworthy because it allows name binding operations performed in one block to change bindings in another block (the module).
Variables are not declared. If a name binding operation occurs anywhere in a function, then that name is treated as local to the function and all references refer to the local binding. If a reference occurs before the name is bound, a NameError is raised. The only kind of declaration is the global statement, which allows programs to be written using mutable global variables. As a consequence, it is not possible to rebind a name defined in an enclosing scope. An assignment operation can only bind a name in the current scope or in the global scope. The lack of declarations and the inability to rebind names in enclosing scopes are unusual for lexically scoped languages; there is typically a mechanism to create name bindings (e.g. lambda and let in Scheme) and a mechanism to change the bindings (set! in Scheme).
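The third exception can be sketched as below. This is an illustration only, written for a current Python 3 interpreter. The make_counter function and its inner functions are hypothetical. It shows that an assignment in a nested function creates a new local rather than rebinding the enclosing name, and that reading such a name before it is bound fails.

```python
# Illustrative sketch; make_counter and its inner functions are hypothetical.
def make_counter():
    count = 0

    def peek():
        return count  # free variable: reads make_counter's binding

    def shadow():
        count = 99  # binds a new local; make_counter's count is untouched
        return count

    def broken_increment():
        # The assignment makes count local to this function, so the read on
        # the right-hand side happens before the local is bound.
        count = count + 1
        return count

    return peek, shadow, broken_increment


peek, shadow, broken_increment = make_counter()
print(shadow(), peek())
try:
    broken_increment()
except NameError as exc:  # UnboundLocalError is a subclass of NameError
    print(type(exc).__name__)
```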
A few examples are included to illustrate the way the rules work.
    >>> def make_adder(base):
    ...     def adder(x):
    ...         return base + x
    ...     return adder
    >>> add5 = make_adder(5)
    >>> add5(6)
    11

    >>> def make_fact():
    ...     def fact(n):
    ...         if n == 1:
    ...             return 1L
    ...         else:
    ...             return n * fact(n - 1)
    ...     return fact
    >>> fact = make_fact()
    >>> fact(7)
    5040L

    >>> def make_wrapper(obj):
    ...     class Wrapper:
    ...         def __getattr__(self, attr):
    ...             if attr[0] != '_':
    ...                 return getattr(obj, attr)
    ...             else:
    ...                 raise AttributeError, attr
    ...     return Wrapper()
    >>> class Test:
    ...     public = 2
    ...     _private = 3
    >>> w = make_wrapper(Test())
    >>> w.public
    2
    >>> w._private
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AttributeError: _private
An example from Tim Peters demonstrates the potential pitfalls of nested scopes in the absence of declarations:
    i = 6
    def f(x):
        def g():
            print i
        # ...
        # skip to the next page
        # ...
        for i in x:  # ah, i *is* local to f, so this is what g sees
            pass
        g()
The call to g() will refer to the variable i bound in f() by the for loop. If g() is called before the loop is executed, a NameError will be raised.
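Both outcomes can be observed with a variant of the example above, adapted here as a sketch for a current Python 3 interpreter. The function returns values instead of printing, and the early flag is a hypothetical addition that forces the call to g() before the loop runs.

```python
# Illustrative sketch adapted from the example above; the early flag is a
# hypothetical addition used only to trigger the failure case.
i = 6


def f(x, early=False):
    def g():
        return i  # i is local to f because of the for loop below

    if early:
        return g()  # f's i is not bound yet, so this raises NameError
    for i in x:
        pass
    return g()


print(f([1, 2, 3]))
try:
    f([1, 2, 3], early=True)
except NameError as exc:
    print("NameError:", exc)
```

In neither case does g() see the module-level i.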
There are two kinds of compatibility problems caused by nested scopes. In one case, code that behaved one way in earlier versions behaves differently because of nested scopes. In the other case, certain constructs interact badly with nested scopes and will trigger SyntaxErrors at compile time.
The following example from Skip Montanaro illustrates the first kind of problem:
    x = 1
    def f1():
        x = 2
        def inner():
            print x
        inner()
Under the Python 2.0 rules, the print statement inside inner() refers to the global variable x and will print 1 if f1() is called. Under the new rules, it refers to f1()’s namespace, the nearest enclosing scope with a binding.
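The new behavior can be checked with a small variant of the example, sketched here for a current Python 3 interpreter, where nested scopes are always in effect. The only change from the original is that inner() returns the value instead of printing it.

```python
# Illustrative sketch: the example above with return in place of print.
x = 1


def f1():
    x = 2

    def inner():
        return x  # resolves to f1's x under the nested-scope rules

    return inner()


print(f1())
```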
The problem occurs only when a global variable and a local variable share the same name and a nested function uses that name to refer to the global variable. This is poor programming practice, because readers will easily confuse the two different variables. One example of this problem was found in the Python standard library during the implementation of nested scopes.
To address this problem, which is unlikely to occur often, the Python 2.1 compiler (when nested scopes are not enabled) issues a warning.
The other compatibility problem is caused by the use of import * and ‘exec’ in a function body, when that function contains a nested scope and the contained scope has free variables. For example:
    y = 1
    def f():
        exec "y = 'gotcha'"  # or from module import *
        def g():
            return y
        ...
At compile-time, the compiler cannot tell whether an exec that operates on the local namespace or an import * will introduce name bindings that shadow the global y. Thus, it is not possible to tell whether the reference to y in g() should refer to the global or to a local name in f().
In discussion on the python-list, people argued for both possible interpretations. On the one hand, some thought that the reference in g() should be bound to a local y if one exists. One problem with this interpretation is that it is impossible for a human reader of the code to determine the binding of y by local inspection. It seems likely to introduce subtle bugs. The other interpretation is to treat exec and import * as dynamic features that do not affect static scoping. Under this interpretation, the exec and import * would introduce local names, but those names would never be visible to nested scopes. In the specific example above, the code would behave exactly as it did in earlier versions of Python.
Since each interpretation is problematic and the exact meaning ambiguous, the compiler raises an exception. The Python 2.1 compiler issues a warning when nested scopes are not enabled.
A brief review of three Python projects (the standard library, Zope, and a beta version of PyXPCOM) found four backwards compatibility issues in approximately 200,000 lines of code. There was one example of case #1 (subtle behavior change) and two examples of import * problems in the standard library.
(The interpretation of the import * and exec restriction that was implemented in Python 2.1a2 was much more restrictive, based on language in the reference manual that had never been enforced. These restrictions were relaxed following the release.)
The implementation causes several Python C API functions to change, including PyCode_New(). As a result, C extensions may need to be updated to work correctly with Python 2.1.
The locals() and vars() functions return a dictionary containing the current scope’s local variables. Modifications to the dictionary do not affect the values of variables. Under the current rules, the use of locals() and globals() allows the program to gain access to all the namespaces in which names are resolved.
An analogous function will not be provided for nested scopes. Under this proposal, it will not be possible to gain dictionary-style access to all visible scopes.
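The statement that modifying the dictionary does not affect variables can be illustrated with a short sketch, written for a current CPython 3 interpreter. The show_locals function is a hypothetical name, and the exact identity of the returned dictionary is an implementation detail that this sketch does not rely on.

```python
# Illustrative sketch; show_locals is a hypothetical name.
def show_locals():
    total = 10
    snapshot = locals()      # dictionary view of the local variables so far
    snapshot["total"] = 99   # changes the dictionary only
    return total, snapshot["total"]


result = show_locals()
print(result)
```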
The compiler will issue warnings in Python 2.1 to help identify programs that may not compile or run correctly under future versions of Python. Under Python 2.2, or under Python 2.1 when the nested_scopes future statement is used (collectively referred to as “future semantics” in this section), the compiler will issue SyntaxErrors in some cases.
The warnings typically apply when a function contains a nested function that has free variables. For example, if function F contains a function G and G uses the builtin len(), then F is a function that contains a nested function (G) with a free variable (len). The label “free-in-nested” will be used to describe these functions.
The language reference specifies that import * may only occur in a module scope. (Sec. 6.11) The implementation of CPython has supported import * at the function scope.
If import * is used in the body of a free-in-nested function, the compiler will issue a warning. Under future semantics, the compiler will raise a SyntaxError.
The exec statement allows two optional expressions following the keyword “in” that specify the namespaces used for locals and globals. An exec statement that omits both of these namespaces is a bare exec.
If a bare exec is used in the body of a free-in-nested function, the compiler will issue a warning. Under future semantics, the compiler will raise a SyntaxError.
If a free-in-nested function has a binding for a local variable that (1) is used in a nested function and (2) is the same as a global variable, the compiler will issue a warning.
There are technical issues that make it difficult to support rebinding of names in enclosing scopes, but the primary reason that it is not allowed in the current proposal is that Guido is opposed to it. His motivation: it is difficult to support, because it would require a new mechanism that would allow the programmer to specify that an assignment in a block is supposed to rebind the name in an enclosing block; presumably a keyword or special syntax (x := 3) would make this possible. Given that this would encourage the use of local variables to hold state that is better stored in a class instance, it’s not worth adding new syntax to make this possible (in Guido’s opinion).
The proposed rules allow programmers to achieve the effect of rebinding, albeit awkwardly. The name that will be effectively rebound by enclosed functions is bound to a container object. In place of assignment, the program uses modification of the container to achieve the desired effect:
    def bank_account(initial_balance):
        balance = [initial_balance]
        def deposit(amount):
            balance[0] = balance[0] + amount
            return balance
        def withdraw(amount):
            balance[0] = balance[0] - amount
            return balance
        return deposit, withdraw
Support for rebinding in nested scopes would make this code clearer. A class that defines deposit() and withdraw() methods and the balance as an instance variable would be clearer still. Since classes seem to achieve the same effect in a more straightforward manner, they are preferred.
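One possible shape for the class-based alternative is sketched below. The BankAccount name is hypothetical, and unlike the closure version the methods here return the balance as a number rather than the one-element list.

```python
# Illustrative sketch; the BankAccount class name is hypothetical.
class BankAccount:
    def __init__(self, initial_balance):
        # State lives on the instance instead of in a container object.
        self.balance = initial_balance

    def deposit(self, amount):
        self.balance = self.balance + amount
        return self.balance

    def withdraw(self, amount):
        self.balance = self.balance - amount
        return self.balance


account = BankAccount(100)
print(account.deposit(50))
print(account.withdraw(30))
```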
The implementation for C Python uses flat closures [1]. Each def or lambda expression that is executed will create a closure if the body of the function or any contained function has free variables. Using flat closures, the creation of closures is somewhat expensive but lookup is cheap.
The implementation adds several new opcodes and two new kinds of names in code objects. A variable can be either a cell variable or a free variable for a particular code object. A cell variable is referenced by containing scopes; as a result, the function where it is defined must allocate separate storage for it on each invocation. A free variable is referenced via a function’s closure.
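The distinction between cell and free variables can be inspected from Python code. The sketch below assumes a current CPython 3 interpreter and uses its attribute spellings (__code__, __closure__, cell_contents), which differ from the func_code and func_closure names used in the 2.x series. It reuses the make_adder function from the examples earlier in this PEP.

```python
# Illustrative sketch using CPython 3 attribute names.
def make_adder(base):
    def adder(x):
        return base + x
    return adder


add5 = make_adder(5)

# base is a cell variable of make_adder: it is referenced by a nested scope.
print(make_adder.__code__.co_cellvars)
# base is a free variable of adder: it is reached through the closure.
print(add5.__code__.co_freevars)
# The closure holds one cell per free variable.
print(add5.__closure__[0].cell_contents)
```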
The choice of flat closures was made based on three factors. First, nested functions are presumed to be used infrequently, and deeply nested functions (several levels of nesting) still less frequently. Second, lookup of names in a nested scope should be fast. Third, the use of nested scopes, particularly where a function that accesses an enclosing scope is returned, should not prevent unreferenced objects from being reclaimed by the garbage collector.
Source: https://github.com/python/peps/blob/main/peps/pep-0227.rst
Last modified: 2025-02-01 08:55:40 GMT