Movatterモバイル変換


[0]ホーム

URL:


Following system colour schemeSelected dark colour schemeSelected light colour scheme

Python Enhancement Proposals

PEP 335 – Overloadable Boolean Operators

Author:
Gregory Ewing <greg.ewing at canterbury.ac.nz>
Status:
Rejected
Type:
Standards Track
Created:
29-Aug-2004
Python-Version:
3.3
Post-History:
05-Sep-2004, 30-Sep-2011, 25-Oct-2011

Table of Contents

Rejection Notice

This PEP was rejected.Seehttps://mail.python.org/pipermail/python-dev/2012-March/117510.html

Abstract

This PEP proposes an extension to permit objects to define their ownmeanings for the boolean operators ‘and’, ‘or’ and ‘not’, and suggestsan efficient strategy for implementation. A prototype of thisimplementation is available for download.

Background

Python does not currently provide any ‘__xxx__’ special methodscorresponding to the ‘and’, ‘or’ and ‘not’ boolean operators. In thecase of ‘and’ and ‘or’, the most likely reason is that these operatorshave short-circuiting semantics, i.e. the second operand is notevaluated if the result can be determined from the first operand. Theusual technique of providing special methods for these operatorstherefore would not work.

There is no such difficulty in the case of ‘not’, however, and itwould be straightforward to provide a special method for thisoperator. The rest of this proposal will therefore concentrate mainlyon providing a way to overload ‘and’ and ‘or’.

Motivation

There are many applications in which it is natural to provide custommeanings for Python operators, and in some of these, having booleanoperators excluded from those able to be customised can beinconvenient. Examples include:

  1. NumPy, in which almost all the operators are defined onarrays so as to perform the appropriate operation betweencorresponding elements, and return an array of the results. Forconsistency, one would expect a boolean operation between twoarrays to return an array of booleans, but this is not currentlypossible.

    There is a precedent for an extension of this kind: comparisonoperators were originally restricted to returning boolean results,and rich comparisons were added so that comparisons of NumPyarrays could return arrays of booleans.

  2. A symbolic algebra system, in which a Python expression isevaluated in an environment which results in it constructing a treeof objects corresponding to the structure of the expression.
  3. A relational database interface, in which a Python expression isused to construct an SQL query.

A workaround often suggested is to use the bitwise operators ‘&’, ‘|’and ‘~’ in place of ‘and’, ‘or’ and ‘not’, but this has somedrawbacks:

  • The precedence of these is different in relation to the other operators,and they may already be in use for other purposes (as in example 1).
  • It is aesthetically displeasing to force users to use something otherthan the most obvious syntax for what they are trying to express. Thiswould be particularly acute in the case of example 3, considering thatboolean operations are a staple of SQL queries.
  • Bitwise operators do not provide a solution to the problem ofchained comparisons such as ‘a < b < c’ which involve an implicit‘and’ operation. Such expressions currently cannot be used at allon data types such as NumPy arrays where the result of a comparisoncannot be treated as having normal boolean semantics; they must beexpanded into something like (a < b) & (b < c), losing a considerableamount of clarity.

Rationale

The requirements for a successful solution to the problem of allowingboolean operators to be customised are:

  1. In the default case (where there is no customisation), the existingshort-circuiting semantics must be preserved.
  2. There must not be any appreciable loss of speed in the defaultcase.
  3. Ideally, the customisation mechanism should allow the object toprovide either short-circuiting or non-short-circuiting semantics,at its discretion.

One obvious strategy, that has been previously suggested, is to passinto the special method the first argument and a function forevaluating the second argument. This would satisfy requirements 1 and3, but not requirement 2, since it would incur the overhead ofconstructing a function object and possibly a Python function call onevery boolean operation. Therefore, it will not be considered furtherhere.

The following section proposes a strategy that addresses all threerequirements. Aprototype implementation of this strategy isavailable for download.

Specification

Special Methods

At the Python level, objects may define the following special methods.

UnaryBinary, phase 1Binary, phase 2
  • __not__(self)
  • __and1__(self)
  • __or1__(self)
  • __and2__(self, other)
  • __or2__(self, other)
  • __rand2__(self, other)
  • __ror2__(self, other)

The __not__ method, if defined, implements the ‘not’ operator. If itis not defined, or it returns NotImplemented, existing semantics areused.

To permit short-circuiting, processing of the ‘and’ and ‘or’ operatorsis split into two phases. Phase 1 occurs after evaluation of the firstoperand but before the second. If the first operand defines therelevant phase 1 method, it is called with the first operand asargument. If that method can determine the result without needing thesecond operand, it returns the result, and further processing isskipped.

If the phase 1 method determines that the second operand is needed, itreturns the special value NeedOtherOperand. This triggers theevaluation of the second operand, and the calling of a relevantphase 2 method. During phase 2, the __and2__/__rand2__ and__or2__/__ror2__ method pairs work as for other binary operators.

Processing falls back to existing semantics if at any stage a relevantspecial method is not found or returns NotImplemented.

As a special case, if the first operand defines a phase 2 method butno corresponding phase 1 method, the second operand is alwaysevaluated and the phase 2 method called. This allows an object whichdoes not want short-circuiting semantics to simply implement thephase 2 methods and ignore phase 1.

Bytecodes

The patch adds four new bytecodes, LOGICAL_AND_1, LOGICAL_AND_2,LOGICAL_OR_1 and LOGICAL_OR_2. As an example of their use, thebytecode generated for an ‘and’ expression looks like this:

...evaluatefirstoperandLOGICAL_AND_1LevaluatesecondoperandLOGICAL_AND_2L:...

The LOGICAL_AND_1 bytecode performs phase 1 processing. If itdetermines that the second operand is needed, it leaves the firstoperand on the stack and continues with the following code. Otherwiseit pops the first operand, pushes the result and branches to L.

The LOGICAL_AND_2 bytecode performs phase 2 processing, popping bothoperands and pushing the result.

Type Slots

At the C level, the new special methods are manifested as five newslots in the type object. In the patch, they are added to thetp_as_number substructure, since this allows making use of someexisting code for dealing with unary and binary operators. Theirexistence is signalled by a new type flag,Py_TPFLAGS_HAVE_BOOLEAN_OVERLOAD.

The new type slots are:

unaryfuncnb_logical_not;unaryfuncnb_logical_and_1;unaryfuncnb_logical_or_1;binaryfuncnb_logical_and_2;binaryfuncnb_logical_or_2;

Python/C API Functions

There are also five new Python/C API functions corresponding to thenew operations:

PyObject*PyObject_LogicalNot(PyObject*);PyObject*PyObject_LogicalAnd1(PyObject*);PyObject*PyObject_LogicalOr1(PyObject*);PyObject*PyObject_LogicalAnd2(PyObject*,PyObject*);PyObject*PyObject_LogicalOr2(PyObject*,PyObject*);

Alternatives and Optimisations

This section discusses some possible variations on the proposal,and ways in which the bytecode sequences generated for booleanexpressions could be optimised.

Reduced special method set

For completeness, the full version of this proposal includes amechanism for types to define their own customised short-circuitingbehaviour. However, the full mechanism is not needed to address themain use cases put forward here, and it would be possible todefine a simplified version that only includes the phase 2methods. There would then only be 5 new special methods (__and2__,__rand2__, __or2__, __ror2__, __not__) with 3 associated type slotsand 3 API functions.

This simplified version could be expanded to the full versionlater if desired.

Additional bytecodes

As defined here, the bytecode sequence for code that branches onthe result of a boolean expression would be slightly longer thanit currently is. For example, in Python 2.7,

ifaandb:statement1else:statement2

generates

LOAD_GLOBALaPOP_JUMP_IF_FALSEfalse_branchLOAD_GLOBALbPOP_JUMP_IF_FALSEfalse_branch<codeforstatement1>JUMP_FORWARDend_branchfalse_branch:<codeforstatement2>end_branch:

Under this proposal as described so far, it would become something like

LOAD_GLOBALaLOGICAL_AND_1testLOAD_GLOBALbLOGICAL_AND_2test:POP_JUMP_IF_FALSEfalse_branch<codeforstatement1>JUMP_FORWARDend_branchfalse_branch:<codeforstatement2>end_branch:

This involves executing one extra bytecode in the short-circuitingcase and two extra bytecodes in the non-short-circuiting case.

However, by introducing extra bytecodes that combine the logicaloperations with testing and branching on the result, it can bereduced to the same number of bytecodes as the original:

LOAD_GLOBALaAND1_JUMPtrue_branch,false_branchLOAD_GLOBALbAND2_JUMP_IF_FALSEfalse_branchtrue_branch:<codeforstatement1>JUMP_FORWARDend_branchfalse_branch:<codeforstatement2>end_branch:

Here, AND1_JUMP performs phase 1 processing as above,and then examines the result. If there is a result, it is poppedfrom the stack, its truth value is tested and a branch taken toone of two locations.

Otherwise, the first operand is left on the stack and executioncontinues to the next bytecode. The AND2_JUMP_IF_FALSE bytecodeperforms phase 2 processing, pops the result and branches ifit tests false

For the ‘or’ operator, there would be corresponding OR1_JUMPand OR2_JUMP_IF_TRUE bytecodes.

If the simplified version without phase 1 methods is used, thenearly exiting can only occur if the first operand is false for‘and’ and true for ‘or’. Consequently, the two-target AND1_JUMP andOR1_JUMP bytecodes can be replaced with AND1_JUMP_IF_FALSE andOR1_JUMP_IF_TRUE, these being ordinary branch instructions withonly one target.

Optimisation of ‘not’

Recent versions of Python implement a simple optimisation inwhich branching on a negated boolean expression is implementedby reversing the sense of the branch, saving a UNARY_NOT opcode.

Taking a strict view, this optimisation should no longer beperformed, because the ‘not’ operator may be overridden to producequite different results from usual. However, in typical use cases,it is not envisaged that expressions involving customised booleanoperations will be used for branching – it is much more likelythat the result will be used in some other way.

Therefore, it would probably do little harm to specify that thecompiler is allowed to use the laws of boolean algebra tosimplify any expression that appears directly in a booleancontext. If this is inconvenient, the result can always be assignedto a temporary name first.

This would allow the existing ‘not’ optimisation to remain, andwould permit future extensions of it such as using De Morgan’s lawsto extend it deeper into the expression.

Usage Examples

Example 1: NumPy Arrays

#-----------------------------------------------------------------##   This example creates a subclass of numpy array to which#   'and', 'or' and 'not' can be applied, producing an array#   of booleans.##-----------------------------------------------------------------fromnumpyimportarray,ndarrayclassBArray(ndarray):def__str__(self):return"barray(%s)"%ndarray.__str__(self)def__and2__(self,other):return(self&other)def__or2__(self,other):return(self&other)def__not__(self):return(self==0)defbarray(*args,**kwds):returnarray(*args,**kwds).view(type=BArray)a0=barray([0,1,2,4])a1=barray([1,2,3,4])a2=barray([5,6,3,4])a3=barray([5,1,2,4])print"a0:",a0print"a1:",a1print"a2:",a2print"a3:",a3print"not a0:",nota0print"a0 == a1 and a2 == a3:",a0==a1anda2==a3print"a0 == a1 or a2 == a3:",a0==a1ora2==a3

Example 1 Output

a0:barray([0124])a1:barray([1234])a2:barray([5634])a3:barray([5124])nota0:barray([TrueFalseFalseFalse])a0==a1anda2==a3:barray([FalseFalseFalseTrue])a0==a1ora2==a3:barray([FalseFalseFalseTrue])

Example 2: Database Queries

#-----------------------------------------------------------------##   This example demonstrates the creation of a DSL for database#   queries allowing 'and' and 'or' operators to be used to#   formulate the query.##-----------------------------------------------------------------classSQLNode(object):def__and2__(self,other):returnSQLBinop("and",self,other)def__rand2__(self,other):returnSQLBinop("and",other,self)def__eq__(self,other):returnSQLBinop("=",self,other)classTable(SQLNode):def__init__(self,name):self.__tablename__=namedef__getattr__(self,name):returnSQLAttr(self,name)def__sql__(self):returnself.__tablename__classSQLBinop(SQLNode):def__init__(self,op,opnd1,opnd2):self.op=op.upper()self.opnd1=opnd1self.opnd2=opnd2def__sql__(self):return"(%s%s%s)"%(sql(self.opnd1),self.op,sql(self.opnd2))classSQLAttr(SQLNode):def__init__(self,table,name):self.table=tableself.name=namedef__sql__(self):return"%s.%s"%(sql(self.table),self.name)classSQLSelect(SQLNode):def__init__(self,targets):self.targets=targetsself.where_clause=Nonedefwhere(self,expr):self.where_clause=exprreturnselfdef__sql__(self):result="SELECT%s"%", ".join([sql(target)fortargetinself.targets])ifself.where_clause:result="%s WHERE%s"%(result,sql(self.where_clause))returnresultdefsql(expr):ifisinstance(expr,SQLNode):returnexpr.__sql__()elifisinstance(expr,str):return"'%s'"%expr.replace("'","''")else:returnstr(expr)defselect(*targets):returnSQLSelect(targets)#-----------------------------------------------------------------dishes=Table("dishes")customers=Table("customers")orders=Table("orders")query=select(customers.name,dishes.price,orders.amount).where(customers.cust_id==orders.cust_idandorders.dish_id==dishes.dish_idanddishes.name=="Spam, Eggs, Sausages and Spam")printrepr(query)printsql(query)

Example 2 Output

<__main__.SQLSelectobjectat0x1cc830>SELECTcustomers.name,dishes.price,orders.amountWHERE(((customers.cust_id=orders.cust_id)AND(orders.dish_id=dishes.dish_id))AND(dishes.name='Spam, Eggs, Sausages and Spam'))

Copyright

This document has been placed in the public domain.


Source:https://github.com/python/peps/blob/main/peps/pep-0335.rst

Last modified:2025-02-01 08:59:27 GMT


[8]ページ先頭

©2009-2025 Movatter.jp