What’s New in SQLAlchemy 1.4?¶
About this Document
This document describes changes between SQLAlchemy version 1.3 and SQLAlchemy version 1.4.
Version 1.4 is taking on a different focus than other SQLAlchemy releases in that it is in many ways attempting to serve as a potential migration point for a more dramatic series of API changes currently planned for release 2.0 of SQLAlchemy. The focus of SQLAlchemy 2.0 is a modernized and slimmed down API that removes lots of usage patterns that have long been discouraged, as well as mainstreams the best ideas in SQLAlchemy as first class API features, with the goal being that there is much less ambiguity in how the API is to be used, as well as that a series of implicit behaviors and rarely-used API flags that complicate the internals and hinder performance will be removed.
For the current status of SQLAlchemy 2.0, see Migrating to SQLAlchemy 2.0.
Major API changes and features - General¶
Python 3.6 is the minimum Python 3 version; Python 2.7 still supported¶
As Python 3.5 reached EOL in September of 2020, SQLAlchemy 1.4 now places version 3.6 as the minimum Python 3 version. Python 2.7 is still supported, however the SQLAlchemy 1.4 series will be the last series to support Python 2.
ORM Query is internally unified with select, update, delete; 2.0 style execution available¶
The biggest conceptual change to SQLAlchemy for version 2.0 and essentially in 1.4 as well is that the great separation between the Select construct in Core and the Query object in the ORM has been removed, as well as between the Query.update() and Query.delete() methods in how they relate to Update and Delete.
With regards to Select and Query, these two objects have for many versions had similar, largely overlapping APIs and even some ability to change between one and the other, while remaining very different in their usage patterns and behaviors. The historical background for this was that the Query object was introduced to overcome shortcomings in the Select object which used to be at the core of how ORM objects were queried, except that they had to be queried in terms of Table metadata only. However Query had only a simplistic interface for loading objects, and only over the course of many major releases did it eventually gain most of the flexibility of the Select object, which then led to the ongoing awkwardness that these two objects became highly similar yet still largely incompatible with each other.
In version 1.4, all Core and ORM SELECT statements are rendered from a Select object directly; when the Query object is used, at statement invocation time it copies its state to a Select which is then invoked internally using 2.0 style execution. Going forward, the Query object will become legacy only, and applications will be encouraged to move to 2.0 style execution which allows Core constructs to be used freely against ORM entities:
with Session(engine, future=True) as sess:
    stmt = (
        select(User)
        .where(User.name == "sandy")
        .join(User.addresses)
        .where(Address.email_address.like("%gmail%"))
    )

    result = sess.execute(stmt)

    for user in result.scalars():
        print(user)
Things to note about the above example:
- The Session and sessionmaker objects now feature full context manager (i.e. the with: statement) capability; see the revised documentation at Opening and Closing a Session for an example.
- Within the 1.4 series, all 2.0 style ORM invocation uses a Session that includes the Session.future flag set to True; this flag indicates the Session should have 2.0-style behaviors, which include that ORM queries can be invoked from execute as well as some changes in transactional features. In version 2.0 this flag will always be True.
- The select() construct no longer needs brackets around the columns clause; see select(), case() now accept positional expressions for background on this improvement.
- The select() / Select object has a Select.join() method that acts like that of the Query and even accommodates an ORM relationship attribute (without breaking the separation between Core and ORM!) - see select().join() and outerjoin() add JOIN criteria to the current query, rather than creating a subquery for background on this.
- Statements that work with ORM entities and are expected to return ORM results are invoked using Session.execute(). See Querying (2.0 style) for a primer. See also the following note at ORM Session.execute() uses “future” style Result sets in all cases.
- A Result object is returned, rather than a plain list, which itself is a much more sophisticated version of the previous ResultProxy object; this object is now used both for Core and ORM results. See New Result object, RowProxy is no longer a “proxy”; is now called Row and behaves like an enhanced named tuple, and The “KeyedTuple” object returned by Query is replaced by Row for information on this.
Throughout SQLAlchemy’s documentation, there will be many references to 1.x style and 2.0 style execution. This is to distinguish between the two querying styles and to attempt to forwards-document the new calling style going forward. In SQLAlchemy 2.0, while the Query object may remain as a legacy construct, it will no longer be featured in most documentation.
Similar adjustments have been made to “bulk updates and deletes” such that Core update() and delete() can be used for bulk operations. A bulk update like the following:
session.query(User).filter(User.name == "sandy").update(
    {"password": "foobar"}, synchronize_session="fetch"
)
can now be achieved in 2.0 style (and indeed the above runs internally in this way) as follows:
with Session(engine, future=True) as sess:
    stmt = (
        update(User)
        .where(User.name == "sandy")
        .values(password="foobar")
        .execution_options(synchronize_session="fetch")
    )

    sess.execute(stmt)
Note the use of the Executable.execution_options() method to pass ORM-related options. The use of “execution options” is now much more prevalent within both Core and ORM, and many ORM-related methods from Query are now implemented as execution options (see Query.execution_options() for some examples).
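As an illustration, here is a minimal sketch (reusing the User mapping and engine from the examples above) in which the behavior formerly spelled as Query.populate_existing() is passed as an execution option on the statement itself:

from sqlalchemy import select
from sqlalchemy.orm import Session

# populate_existing, formerly Query.populate_existing(), refreshes any
# already-loaded objects from the rows returned by this statement
stmt = (
    select(User)
    .where(User.name == "sandy")
    .execution_options(populate_existing=True)
)

with Session(engine, future=True) as sess:
    for user in sess.execute(stmt).scalars():
        print(user)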
ORM Session.execute() uses “future” style Result sets in all cases¶
As noted in RowProxy is no longer a “proxy”; is now called Row and behaves like an enhanced named tuple, the Result and Row objects now feature “named tuple” behavior, when used with an Engine that includes the create_engine.future parameter set to True. These “named tuple” rows in particular include a behavioral change which is that Python containment expressions using in, such as:
>>> engine = create_engine("...", future=True)
>>> conn = engine.connect()
>>> row = conn.execute(text("select 'name' as some_name")).first()
>>> "name" in row
True
The above containment test will use value containment, not key containment; the row would need to have a value of “name” to return True.
Under SQLAlchemy 1.4, when the create_engine.future parameter is set to False, legacy-style LegacyRow objects are returned which feature the partial-named-tuple behavior of prior SQLAlchemy versions, where containment checks continue to use key containment; "name" in row would return True if the row had a column named “name”, rather than a value.
When using Session.execute(), full named-tuple style is enabled unconditionally, meaning "name" in row will use value containment as the test, and not key containment. This is to accommodate that Session.execute() now returns a Result that also accommodates for ORM results, where even legacy ORM result rows such as those returned by Query.all() use value containment.
This is a behavioral change from SQLAlchemy 1.3 to 1.4. To continue receiving key-containment collections, use the Result.mappings() method to receive a MappingResult that returns rows as dictionaries:
for dict_row in session.execute(text("select id from table")).mappings():
    assert "id" in dict_row
Transparent SQL Compilation Caching added to All DQL, DML Statements in Core, ORM¶
One of the most broadly encompassing changes to ever land in a single SQLAlchemy version, a many-month reorganization and refactoring of all querying systems from the base of Core all the way through ORM now allows the majority of Python computation involved in producing SQL strings and related statement metadata from a user-constructed statement to be cached in memory, such that subsequent invocations of an identical statement construct will use 35-60% fewer CPU resources.
This caching goes beyond the construction of the SQL string to also include the construction of result fetching structures that link the SQL construct to the result set, and in the ORM it includes the accommodation of ORM-enabled attribute loaders, relationship eager loaders and other options, and object construction routines that must be built up each time an ORM query seeks to run and construct ORM objects from result sets.
To introduce the general idea of the feature, given code from the Performance suite as follows, which will invoke a very simple query “n” times, for a default value of n=10000. The query returns only a single row, as the overhead we are looking to decrease is that of many small queries. The optimization is not as significant for queries that return many rows:
session = Session(bind=engine)
for id_ in random.sample(ids, n):
    result = session.query(Customer).filter(Customer.id == id_).one()
This example in the 1.3 release of SQLAlchemy on a Dell XPS13 running Linux completes as follows:

test_orm_query : (10000 iterations); total time 3.440652 sec
In 1.4, the code above without modification completes:
test_orm_query : (10000 iterations); total time 2.367934 sec
This first test indicates that regular ORM queries when using caching can run over many iterations in the range of 30% faster.
A second variant of the feature is the optional use of Python lambdas to defer the construction of the query itself. This is a more sophisticated variant of the approach used by the “Baked Query” extension, which was introduced in version 1.0.0. The “lambda” feature may be used in a style very similar to that of baked queries, except that it is available in an ad-hoc way for any SQL construct. It additionally includes the ability to scan each invocation of the lambda for bound literal values that change on every invocation, as well as changes to other constructs, such as querying from a different entity or column each time, while still not having to run the actual code each time.
Using this API looks as follows:
session = Session(bind=engine)
for id_ in random.sample(ids, n):
    stmt = lambda_stmt(lambda: future_select(Customer))
    stmt += lambda s: s.where(Customer.id == id_)
    session.execute(stmt).scalar_one()
The code above completes:
test_orm_query_newstyle_w_lambdas : (10000 iterations); total time 1.247092 sec
This test indicates that using the newer “select()” style of ORM querying, in conjunction with a full “baked” style invocation that caches the entire construction, can run over many iterations in the range of 60% faster and grants performance about the same as the baked query system which is now superseded by the native caching system.
The new system makes use of the existing Connection.execution_options.compiled_cache execution option and also adds a cache to the Engine directly, which is configured using the create_engine.query_cache_size parameter.
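As a minimal sketch of configuring the cache (the in-memory SQLite URL is only for illustration), the size may be passed when the engine is created:

from sqlalchemy import create_engine

# the statement cache is an LRU structure; the default size is 500,
# and it may be sized up for applications with many distinct statements
engine = create_engine("sqlite://", query_cache_size=1200)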
A significant portion of API and behavioral changes throughout 1.4 were driven in order to support this new feature.
Declarative is now integrated into the ORM with new features¶
After ten years or so of popularity, the sqlalchemy.ext.declarative package is now integrated into the sqlalchemy.orm namespace, with the exception of the declarative “extension” classes which remain as Declarative extensions.
The new classes added to sqlalchemy.orm include:
- registry - a new class that supersedes the role of the “declarative base” class, serving as a registry of mapped classes which can be referenced via string name within relationship() calls and is agnostic of the style in which any particular class was mapped.
- declarative_base() - this is the same declarative base class that has been in use throughout the span of the declarative system, except it now references a registry object internally and is implemented by the registry.generate_base() method which can be invoked from a registry directly. The declarative_base() function creates this registry automatically so there is no impact on existing code. The sqlalchemy.ext.declarative.declarative_base name is still present, emitting a 2.0 deprecation warning when 2.0 deprecations mode is enabled.
- declared_attr() - the same “declared attr” function call now part of sqlalchemy.orm. The sqlalchemy.ext.declarative.declared_attr name is still present, emitting a 2.0 deprecation warning when 2.0 deprecations mode is enabled.
- Other names moved into sqlalchemy.orm include has_inherited_table(), synonym_for(), DeclarativeMeta, as_declarative().
In addition, the instrument_declarative() function is deprecated, superseded by registry.map_declaratively(). The ConcreteBase, AbstractConcreteBase, and DeferredReflection classes remain as extensions in the Declarative Extensions package.
Mapping styles have now been organized such that they all extend from the registry object, and fall into these categories:
- Declarative Mapping
  - Using declarative_base() Base class w/ metaclass
  - Using registry.mapped() Declarative Decorator
    - Declarative Table
    - Imperative Table (Hybrid)
- Imperative (a.k.a. “classical”) Mapping
  - Using registry.map_imperatively()
remains, howeverit is deprecated to call uponmapper()
directly; the newregistry.map_imperatively()
method now routes the request throughthesqlalchemy.orm.registry()
so that it integrates with other declarative mappingsunambiguously.
The new approach interoperates with 3rd party class instrumentation systems which necessarily must take place on the class before the mapping process does, allowing declarative mapping to work via a decorator instead of a declarative base so that packages like dataclasses and attrs can be used with declarative mappings, in addition to working with classical mappings.
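A minimal sketch of the decorator form (the class and table names here are purely illustrative):

from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import registry

mapper_registry = registry()


# the decorator maps the class using the registry, with no declarative base
@mapper_registry.mapped
class User:
    __tablename__ = "user"

    id = Column(Integer, primary_key=True)
    name = Column(String(50))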
Declarative documentation has now been fully integrated into the ORM mapper configuration documentation and includes examples for all styles of mappings organized into one place. See the section ORM Mapped Class Overview for the start of the newly reorganized documentation.
Python Dataclasses, attrs Supported w/ Declarative, Imperative Mappings¶
Along with the new declarative decorator styles introduced in Declarative is now integrated into the ORM with new features, the Mapper is now explicitly aware of the Python dataclasses module and will recognize attributes that are configured in this way, and proceed to map them without skipping them as was the case previously. In the case of the attrs module, attrs already removes its own attributes from the class so was already compatible with SQLAlchemy classical mappings. With the addition of the registry.mapped() decorator, both attribute systems can now interoperate with Declarative mappings as well.
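A minimal sketch of a mapped dataclass (names illustrative), using the “imperative table” form so that the dataclass fields remain ordinary Python attributes:

from dataclasses import dataclass, field
from typing import Optional

from sqlalchemy import Column, Integer, String, Table
from sqlalchemy.orm import registry

mapper_registry = registry()


@mapper_registry.mapped
@dataclass
class User:
    __table__ = Table(
        "user",
        mapper_registry.metadata,
        Column("id", Integer, primary_key=True),
        Column("name", String(50)),
    )

    # id is generated by the database, so it is excluded from __init__
    id: int = field(init=False)
    name: Optional[str] = None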
Asynchronous IO Support for Core and ORM¶
SQLAlchemy now supports Python asyncio-compatible database drivers using an all-new asyncio front-end interface to Connection for Core usage as well as Session for ORM use, using the AsyncConnection and AsyncSession objects.
Note
The new asyncio feature should be considered alpha level for the initial releases of SQLAlchemy 1.4. This is super new stuff that uses some previously unfamiliar programming techniques.
The initial database API supported is the asyncpg asyncio driver for PostgreSQL.
The internal features of SQLAlchemy are fully integrated by making use of the greenlet library in order to adapt the flow of execution within SQLAlchemy’s internals to propagate asyncio await keywords outwards from the database driver to the end-user API, which features async methods. Using this approach, the asyncpg driver is fully operational within SQLAlchemy’s own test suite and features compatibility with most psycopg2 features. The approach was vetted and improved upon by developers of the greenlet project for which SQLAlchemy is appreciative.
greenlets are good
Don’t confuse the greenlet library with event-based IO libraries that build on top of it such as gevent and eventlet; while the use of these libraries with SQLAlchemy is common, SQLAlchemy’s asyncio integration does not make use of these event based systems in any way. The asyncio API integrates with the user-provided event loop, typically Python’s own asyncio event loop, without the use of additional threads or event systems. The approach involves a single greenlet context switch per await call, and the extension which makes it possible is less than 20 lines of code.
The user facing async API itself is focused around IO-oriented methods such as AsyncEngine.connect() and AsyncConnection.execute(). The new Core constructs strictly support 2.0 style usage only; which means all statements must be invoked given a connection object, in this case AsyncConnection.
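A short sketch of 2.0-style Core usage through the asyncio API; the asyncpg URL here is a placeholder:

import asyncio

from sqlalchemy import text
from sqlalchemy.ext.asyncio import create_async_engine


async def async_main():
    # a placeholder asyncpg connection URL
    engine = create_async_engine("postgresql+asyncpg://user:pass@host/dbname")

    async with engine.connect() as conn:
        result = await conn.execute(text("SELECT 1"))
        print(result.all())

    await engine.dispose()


asyncio.run(async_main())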
Within the ORM, 2.0 style query execution is supported, using select() constructs in conjunction with AsyncSession.execute(); the legacy Query object itself is not supported by the AsyncSession class.
ORM features such as lazy loading of related attributes as well as unexpiry of expired attributes are by definition disallowed in the traditional asyncio programming model, as they indicate IO operations that would run implicitly within the scope of a Python getattr() operation. To overcome this, the traditional asyncio application should make judicious use of eager loading techniques as well as forego the use of features such as expire on commit so that such loads are not needed.
For the asyncio application developer who chooses to break with tradition, the new API provides a strictly optional feature such that applications that wish to make use of such ORM features can opt to organize database-related code into functions which can then be run within greenlets using the AsyncSession.run_sync() method. See the greenlet_orm.py example at Asyncio Integration for a demonstration.
Support for asynchronous cursors is also provided using new methods AsyncConnection.stream() and AsyncSession.stream(), which support a new AsyncResult object that itself provides awaitable versions of common methods like AsyncResult.all() and AsyncResult.fetchmany(). Both Core and ORM are integrated with the feature which corresponds to the use of “server side cursors” in traditional SQLAlchemy.
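A brief sketch of the streaming pattern, assuming an AsyncConnection as in the previous example; AsyncConnection.stream() returns an AsyncResult whose rows may be iterated with async for:

from sqlalchemy import select


async def stream_table(conn, table):
    # a server side cursor is used; rows are fetched as iteration proceeds
    result = await conn.stream(select(table))

    async for row in result:
        print(row)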
Many Core and ORM statement objects now perform much of their construction and validation in the compile phase¶
A major initiative in the 1.4 series is to approach the model of both Core SQL statements as well as the ORM Query to allow for an efficient, cacheable model of statement creation and compilation, where the compilation step would be cached, based on a cache key generated by the created statement object, which itself is newly created for each use. Towards this goal, much of the Python computation which occurs within the construction of statements, particularly that of the ORM Query as well as the select() construct when used to invoke ORM queries, is being moved to occur within the compilation phase of the statement which only occurs after the statement has been invoked, and only if the statement’s compiled form was not yet cached.
From an end-user perspective, this means that some of the error messages which can arise based on arguments passed to the object will no longer be raised immediately, and instead will occur only when the statement is invoked for the first time. These conditions are always structural and not data driven, so there is no risk of such a condition being missed due to a cached statement.
Error conditions which fall under this category include:
- When a CompoundSelect is constructed (e.g. a UNION, EXCEPT, etc.) and the SELECT statements passed do not have the same number of columns, a CompileError is now raised to this effect; previously, an ArgumentError would be raised immediately upon statement construction. A minimal sketch of the new behavior follows this list.
- Various error conditions which may arise when calling upon Query.join() will be evaluated at statement compilation time rather than when the method is first called.
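For illustration, a minimal sketch of the deferred UNION error described in the first item above:

from sqlalchemy import select, union
from sqlalchemy.exc import CompileError
from sqlalchemy.sql import column, table

t = table("t", column("a"), column("b"))

# constructing a UNION with mismatched column counts no longer raises here
stmt = union(select(t.c.a), select(t.c.a, t.c.b))

# the structural error surfaces when the statement is first compiled
try:
    stmt.compile()
except CompileError as err:
    print(err)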
Other things that may change involve the Query object directly:
- Behaviors may be slightly different when calling upon the Query.statement accessor. The Select object returned is now a direct copy of the same state that was present in the Query, without any ORM-specific compilation being performed (which means it’s dramatically faster). However, the Select will not have the same internal state as it had in 1.3, including things like the FROM clauses being explicitly spelled out if they were not explicitly stated in the Query. This means code that relies upon manipulating this Select statement such as calling methods like Select.with_only_columns() may need to accommodate for the FROM clause.
Repaired internal importing conventions such that code linters may work correctly¶
SQLAlchemy has for a long time used a parameter-injecting decorator to help resolve mutually-dependent module imports, like this:
@util.dependency_for("sqlalchemy.sql.dml")
def insert(self, dml, *args, **kw):
Where the above function would be rewritten to no longer have the dml parameter on the outside. This would confuse code-linting tools into seeing a missing parameter to functions. A new approach has been implemented internally such that the function’s signature is no longer modified and the module object is procured inside the function instead.
Support for SQL Regular Expression operators¶
A long awaited feature to add rudimentary support for database regular expression operators, to complement the ColumnOperators.like() and ColumnOperators.match() suites of operations. The new features include ColumnOperators.regexp_match() implementing a regular expression match like function, and ColumnOperators.regexp_replace() implementing a regular expression string replace function.
Supported backends include SQLite, PostgreSQL, MySQL / MariaDB, and Oracle. The SQLite backend only supports “regexp_match” but not “regexp_replace”.
The regular expression syntaxes and flags are not backend agnostic. A future feature will allow multiple regular expression syntaxes to be specified at once to switch between different backends on the fly.
For SQLite, Python’s re.search() function with no additional arguments is established as the implementation.
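A brief sketch of the new operators (the table and patterns are illustrative):

from sqlalchemy import select
from sqlalchemy.sql import column, table

t = table("t", column("data"))

# regular expression match; rendered per-backend, e.g. ~ on PostgreSQL,
# REGEXP on MySQL / MariaDB
stmt = select(t).where(t.c.data.regexp_match("^[a-z]+$"))

# regular expression replacement, rendered via REGEXP_REPLACE where supported
stmt2 = select(t.c.data.regexp_replace("[0-9]", "#"))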
See also
ColumnOperators.regexp_match()
ColumnOperators.regexp_replace()
Regular Expression Support - SQLite implementation notes
SQLAlchemy 2.0 Deprecations Mode¶
One of the primary goals of the 1.4 release is to provide a “transitional” release so that applications may migrate to SQLAlchemy 2.0 gradually. Towards this end, a primary feature in release 1.4 is “2.0 deprecations mode”, which is a series of deprecation warnings that emit against every detectable API pattern which will work differently in version 2.0. The warnings all make use of the RemovedIn20Warning class. As these warnings affect foundational patterns including the select() and Engine constructs, even simple applications can generate a lot of warnings until appropriate API changes are made. The warning mode is therefore turned off by default until the developer enables the environment variable SQLALCHEMY_WARN_20=1.
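A minimal sketch of enabling the mode from within a launcher script; the variable is consulted when SQLAlchemy is first imported, so it must be set beforehand:

import os

# must be set before the first import of sqlalchemy
os.environ["SQLALCHEMY_WARN_20"] = "1"

import warnings

import sqlalchemy

# optionally escalate the warnings to errors to locate each pattern
warnings.simplefilter("error", sqlalchemy.exc.RemovedIn20Warning)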
For a full walkthrough of using 2.0 Deprecations mode, see Migration to 2.0 Step Two - Turn on RemovedIn20Warnings.
API and Behavioral Changes - Core¶
A SELECT statement is no longer implicitly considered to be a FROM clause¶
This change is one of the larger conceptual changes in SQLAlchemy in many years, however it is hoped that the end user impact is relatively small, as the change more closely matches what databases like MySQL and PostgreSQL require in any case.
The most immediate noticeable impact is that a select() can no longer be embedded inside of another select() directly, without explicitly turning the inner select() into a subquery first. This is historically performed by using the SelectBase.alias() method, which remains, however is more explicitly suited by using a new method SelectBase.subquery(); both methods do the same thing. The object returned is now Subquery, which is very similar to the Alias object and shares a common base AliasedReturnsRows.
That is, this will now raise:
stmt1 = select(user.c.id, user.c.name)
stmt2 = select(addresses, stmt1).select_from(addresses.join(stmt1))
Raising:
sqlalchemy.exc.ArgumentError: Column expression or FROM clause expected,
got <... Select object ...>. To create a FROM clause from a
<class 'sqlalchemy.sql.selectable.Select'> object, use the .subquery() method.
The correct calling form is instead (noting also that brackets are no longer required for select()):
sq1 = select(user.c.id, user.c.name).subquery()
stmt2 = select(addresses, sq1).select_from(addresses.join(sq1))
Noting above that the SelectBase.subquery() method is essentially equivalent to using the SelectBase.alias() method.
The rationale for this change is based on the following:
- In order to support the unification of Select with Query, the Select object needs to have Select.join() and Select.outerjoin() methods that actually add JOIN criteria to the existing FROM clause, as is what users have always expected it to do in any case. The previous behavior, having to align with what a FromClause would do, was that it would generate an unnamed subquery and then JOIN to it, which was a completely useless feature that only confused those users unfortunate enough to try this. This change is discussed at select().join() and outerjoin() add JOIN criteria to the current query, rather than creating a subquery.
- The behavior of including a SELECT in the FROM clause of another SELECT without first creating an alias or subquery would be that it creates an unnamed subquery. While standard SQL does support this syntax, in practice it is rejected by most databases. For example, both MySQL and PostgreSQL outright reject the usage of unnamed subqueries:
# MySQL / MariaDB:

MariaDB [(none)]> select * from (select 1);
ERROR 1248 (42000): Every derived table must have its own alias


# PostgreSQL:

test=> select * from (select 1);
ERROR:  subquery in FROM must have an alias
LINE 1: select * from (select 1);
                      ^
HINT:  For example, FROM (SELECT ...) [AS] foo.
A database like SQLite accepts them, however it is still often the case that the names produced from such a subquery are too ambiguous to be useful:
sqlite> CREATE TABLE a(id integer);
sqlite> CREATE TABLE b(id integer);
sqlite> SELECT * FROM a JOIN (SELECT * FROM b) ON a.id=id;
Error: ambiguous column name: id
sqlite> SELECT * FROM a JOIN (SELECT * FROM b) ON a.id=b.id;
Error: no such column: b.id

# use a name
sqlite> SELECT * FROM a JOIN (SELECT * FROM b) AS anon_1 ON a.id=anon_1.id;
As SelectBase objects are no longer FromClause objects, attributes like the .c attribute as well as methods like .select() are now deprecated, as they imply implicit production of a subquery. The .join() and .outerjoin() methods are now repurposed to append JOIN criteria to the existing query in a similar way as that of Query.join(), which is what users have always expected these methods to do in any case.
In place of the .c attribute, a new attribute SelectBase.selected_columns is added. This attribute resolves to a column collection that is what most people hope that .c does (but does not), which is to reference the columns that are in the columns clause of the SELECT statement. A common beginner mistake is code such as the following:
stmt = select(users)
stmt = stmt.where(stmt.c.name == "foo")
The above code appears intuitive and that it would generate “SELECT * FROM users WHERE name='foo'”, however veteran SQLAlchemy users will recognize that it in fact generates a useless subquery resembling “SELECT * FROM (SELECT * FROM users) WHERE name='foo'”.
The new SelectBase.selected_columns attribute however does suit the use case above, as in a case like the above it links directly to the columns present in the users.c collection:
stmt = select(users)
stmt = stmt.where(stmt.selected_columns.name == "foo")
select().join() and outerjoin() add JOIN criteria to the current query, rather than creating a subquery¶
Towards the goal of unifying Query and Select, particularly for 2.0 style use of Select, it was critical that there be a working Select.join() method that behaves like the Query.join() method, adding additional entries to the FROM clause of the existing SELECT and then returning the new Select object for further modification, instead of wrapping the object inside of an unnamed subquery and returning a JOIN from that subquery, a behavior that has always been virtually useless and completely misleading to users.
To allow this to be the case, A SELECT statement is no longer implicitly considered to be a FROM clause was first implemented, which splits off Select from having to be a FromClause; this removed the requirement that Select.join() would need to return a Join object rather than a new version of that Select object that includes a new JOIN in its FROM clause.
From that point on, as the Select.join() and Select.outerjoin() did have an existing behavior, the original plan was that these methods would be deprecated, and the new “useful” version of the methods would be available on an alternate, “future” Select object available as a separate import.
However, after some time working with this particular codebase, it was decided that having two different kinds of Select objects floating around, each with 95% the same behavior except for some subtle difference in how some of the methods behave, was going to be more misleading and inconvenient than simply making a hard change in how these two methods behave, given that the existing behavior of Select.join() and Select.outerjoin() is essentially never used and only causes confusion.
So it was decided, given how very useless the current behavior is, and how extremely useful and important the new behavior would be, to make a hard behavioral change in this one area, rather than waiting another year and having a more awkward API in the interim. SQLAlchemy developers do not take it lightly to make a completely breaking change like this, however this is a very special case and it is extremely unlikely that the previous implementation of these methods was being used; as noted in A SELECT statement is no longer implicitly considered to be a FROM clause, major databases such as MySQL and PostgreSQL don’t allow for unnamed subqueries in any case and from a syntactical point of view it’s nearly impossible for a JOIN from an unnamed subquery to be useful since it’s very difficult to refer to the columns within it unambiguously.
With the new implementation, Select.join() and Select.outerjoin() now behave very similarly to that of Query.join(), adding JOIN criteria to the existing statement by matching to the left entity:
stmt = select(user_table).join(
    addresses_table, user_table.c.id == addresses_table.c.user_id
)
producing:
SELECT user.id, user.name
FROM user JOIN address ON user.id = address.user_id
As is the case for Join, the ON clause is automatically determined if feasible:
stmt = select(user_table).join(addresses_table)
When ORM entities are used in the statement, this is essentially how ORM queries are built up using 2.0 style invocation. ORM entities will assign a “plugin” to the statement internally such that ORM-related compilation rules will take place when the statement is compiled into a SQL string. More directly, the Select.join() method can accommodate ORM relationships, without breaking the hard separation between Core and ORM internals:
stmt = select(User).join(User.addresses)
Another new method Select.join_from() is also added, which allows easier specification of the left and right side of a join at once:
stmt = select(Address.email_address, User.name).join_from(User, Address)
producing:
SELECT address.email_address, user.name
FROM user JOIN address ON user.id = address.user_id
The URL object is now immutable¶
The URL object has been formalized such that it now presents itself as a namedtuple with a fixed number of fields that are immutable. In addition, the dictionary represented by the URL.query attribute is also an immutable mapping. Mutation of the URL object was not a formally supported or documented use case which led to some open-ended use cases that made it very difficult to intercept incorrect usages, most commonly mutation of the URL.query dictionary to include non-string elements. It also led to all the common problems of allowing mutability in a fundamental data object, namely unwanted mutations elsewhere leaking into code that didn’t expect the URL to change. Finally, the namedtuple design is inspired by that of Python’s urllib.parse.urlparse() which returns the parsed object as a named tuple.
The decision to change the API outright is based on a calculus weighing the infeasibility of a deprecation path (which would involve changing the URL.query dictionary to be a special dictionary that emits deprecation warnings when any kind of standard library mutation methods are invoked, in addition that when the dictionary would hold any kind of list of elements, the list would also have to emit deprecation warnings on mutation) against the unlikely use case of projects already mutating URL objects in the first place, as well as that small changes such as that of #5341 were creating backwards-incompatibility in any case. The primary case for mutation of a URL object is that of parsing plugin arguments within the CreateEnginePlugin extension point, itself a fairly recent addition that based on Github code search is in use by two repositories, neither of which are actually mutating the URL object.
The URL object now provides a rich interface for inspecting and generating new URL objects. The existing mechanism to create a URL object, the make_url() function, remains unchanged:
>>> from sqlalchemy.engine import make_url
>>> url = make_url("postgresql+psycopg2://user:pass@host/dbname")
For programmatic construction, code that may have been using the URL constructor or __init__ method directly will receive a deprecation warning if arguments are passed as keyword arguments and not an exact 7-tuple. The keyword-style constructor is now available via the URL.create() method:
>>> from sqlalchemy.engine import URL
>>> url = URL.create("postgresql", "user", "pass", host="host", database="dbname")
>>> str(url)
'postgresql://user:pass@host/dbname'
Fields can be altered typically using the URL.set() method, which returns a new URL object with changes applied:
>>> mysql_url = url.set(drivername="mysql+pymysql")
>>> str(mysql_url)
'mysql+pymysql://user:pass@host/dbname'
To alter the contents of the URL.query dictionary, methods such as URL.update_query_dict() may be used:
>>> url.update_query_dict({"sslcert": "/path/to/crt"})
postgresql://user:***@host/dbname?sslcert=%2Fpath%2Fto%2Fcrt
To upgrade code that is mutating these fields directly, a backwards and forwards compatible approach is to use duck typing, as in the following style:
def set_url_drivername(some_url, some_drivername):
    # check for 1.4
    if hasattr(some_url, "set"):
        return some_url.set(drivername=some_drivername)
    else:
        # SQLAlchemy 1.3 or earlier, mutate in place
        some_url.drivername = some_drivername
        return some_url


def set_ssl_cert(some_url, ssl_cert):
    # check for 1.4
    if hasattr(some_url, "update_query_dict"):
        return some_url.update_query_dict({"sslcert": ssl_cert})
    else:
        # SQLAlchemy 1.3 or earlier, mutate in place
        some_url.query["sslcert"] = ssl_cert
        return some_url
The query string retains its existing format as a dictionary of strings to strings, using sequences of strings to represent multiple parameters. For example:
>>> from sqlalchemy.engine import make_url
>>> url = make_url(
...     "postgresql://user:pass@host/dbname?alt_host=host1&alt_host=host2&sslcert=%2Fpath%2Fto%2Fcrt"
... )
>>> url.query
immutabledict({'alt_host': ('host1', 'host2'), 'sslcert': '/path/to/crt'})
To work with the contents of the URL.query attribute such that all values are normalized into sequences, use the URL.normalized_query attribute:
>>> url.normalized_query
immutabledict({'alt_host': ('host1', 'host2'), 'sslcert': ('/path/to/crt',)})
The query string can be appended to via methods such as URL.update_query_dict(), URL.update_query_pairs(), URL.update_query_string():
>>> url.update_query_dict({"alt_host": "host3"}, append=True)
postgresql://user:***@host/dbname?alt_host=host1&alt_host=host2&alt_host=host3&sslcert=%2Fpath%2Fto%2Fcrt
Changes to CreateEnginePlugin¶
The CreateEnginePlugin is also impacted by this change, as the documentation for custom plugins indicated that the dict.pop() method should be used to remove consumed arguments from the URL object. This should now be achieved using the CreateEnginePlugin.update_url() method. A backwards compatible approach would look like:
from sqlalchemy.engine import CreateEnginePlugin


class MyPlugin(CreateEnginePlugin):
    def __init__(self, url, kwargs):
        # check for 1.4 style
        if hasattr(CreateEnginePlugin, "update_url"):
            self.my_argument_one = url.query["my_argument_one"]
            self.my_argument_two = url.query["my_argument_two"]
        else:
            # legacy
            self.my_argument_one = url.query.pop("my_argument_one")
            self.my_argument_two = url.query.pop("my_argument_two")

        self.my_argument_three = kwargs.pop("my_argument_three", None)

    def update_url(self, url):
        # this method runs in 1.4 only and should be used to consume
        # plugin-specific arguments
        return url.difference_update_query(["my_argument_one", "my_argument_two"])
See the docstring at CreateEnginePlugin for complete details on how this class is used.
select(), case() now accept positional expressions¶
As it may be seen elsewhere in this document, the select() construct will now accept “columns clause” arguments positionally, rather than requiring they be passed as a list:
# new way, supports 2.0
stmt = select(table.c.col1, table.c.col2, ...)
When sending the arguments positionally, no other keyword arguments are permitted. In SQLAlchemy 2.0, the above calling style will be the only calling style supported.
For the duration of 1.4, the previous calling style will still continue to function, which passes the list of columns or other expressions as a list:
# old way, still works in 1.4
stmt = select([table.c.col1, table.c.col2, ...])
The above legacy calling style also accepts the old keyword arguments that have since been removed from most narrative documentation. The existence of these keyword arguments is why the columns clause was passed as a list in the first place:
# very much the old way, but still works in 1.4
stmt = select([table.c.col1, table.c.col2, ...], whereclause=table.c.col1 == 5)
The detection between the two styles is based on whether or not the first positional argument is a list. There are unfortunately still likely some usages that look like the following, where the keyword for the “whereclause” is omitted:
# very much the old way, but still works in 1.4
stmt = select([table.c.col1, table.c.col2, ...], table.c.col1 == 5)
As part of this change, the Select construct also gains the 2.0-style “future” API which includes an updated Select.join() method as well as methods like Select.filter_by() and Select.join_from().
In a related change, the case() construct has also been modified to accept its list of WHEN clauses positionally, with a similar deprecation track for the old calling style:
stmt = select(users_table).where(
    case(
        (users_table.c.name == "wendy", "W"),
        (users_table.c.name == "jack", "J"),
        else_="E",
    )
)
The convention for SQLAlchemy constructs accepting *args vs. a list of values, as is the latter case for a construct like ColumnOperators.in_(), is that positional arguments are used for structural specification, lists are used for data specification.
All IN expressions render parameters for each value in the list on the fly (e.g. expanding parameters)¶
The “expanding IN” feature, first introduced in Late-expanded IN parameter sets allow IN expressions with cached statements, has matured enough such that it is clearly superior to the previous method of rendering IN expressions. As the approach was improved to handle empty lists of values, it is now the only means that Core / ORM will use to render lists of IN parameters.
The previous approach which has been present in SQLAlchemy since its first release was that when a list of values were passed to the ColumnOperators.in_() method, the list would be expanded into a series of individual BindParameter objects at statement construction time. This suffered from the limitation that it was not possible to vary the parameter list at statement execution time based on the parameter dictionary, which meant that string SQL statements could not be cached independently of their parameters, nor could the parameter dictionary be fully used for statements that included IN expressions generally.
In order to service the “baked query” feature described at Baked Queries, a cacheable version of IN was needed, which is what brought about the “expanding IN” feature. In contrast to the existing behavior whereby the parameter list is expanded at statement construction time into individual BindParameter objects, the feature instead uses a single BindParameter that stores the list of values at once; when the statement is executed by the Engine, it is “expanded” on the fly into individual bound parameter positions based on the parameters passed to the call to Connection.execute(), and the existing SQL string which may have been retrieved from a previous execution is modified using a regular expression to suit the current parameter set. This allows for the same Compiled object, which stores the rendered string statement, to be invoked multiple times against different parameter sets that modify the list contents passed to IN expressions, while still maintaining the behavior of individual scalar parameters being passed to the DBAPI. While some DBAPIs do support this functionality directly, it is not generally available; the “expanding IN” feature now supports the behavior consistently for all backends.
As a major focus of 1.4 is to allow for true statement caching in Core and ORM without the awkwardness of the “baked” system, and since the “expanding IN” feature represents a simpler approach to building expressions in any case, it’s now invoked automatically whenever a list of values is passed to an IN expression:
stmt = select(A.id, A.data).where(A.id.in_([1, 2, 3]))
The pre-execution string representation is:
>>> print(stmt)
SELECT a.id, a.data
FROM a
WHERE a.id IN ([POSTCOMPILE_id_1])
To render the values directly, use literal_binds as was the case previously:
>>> print(stmt.compile(compile_kwargs={"literal_binds": True}))
SELECT a.id, a.data
FROM a
WHERE a.id IN (1, 2, 3)
A new flag, “render_postcompile”, is added as a helper to allow the current bound value to be rendered as it would be passed to the database:
>>> print(stmt.compile(compile_kwargs={"render_postcompile": True}))
SELECT a.id, a.data
FROM a
WHERE a.id IN (:id_1_1, :id_1_2, :id_1_3)
Engine logging output shows the ultimate rendered statement as well:
INFO sqlalchemy.engine.base.Engine SELECT a.id, a.data FROM a WHERE a.id IN (?, ?, ?)
INFO sqlalchemy.engine.base.Engine (1, 2, 3)
As part of this change, the behavior of “empty IN” expressions, where the list parameter is empty, is now standardized on use of the IN operator against a so-called “empty set”. As there is no standard SQL syntax for empty sets, a SELECT that returns no rows is used, tailored in specific ways for each backend so that the database treats it as an empty set; this feature was first introduced in version 1.3 and is described at Expanding IN feature now supports empty lists. The create_engine.empty_in_strategy parameter, introduced in version 1.2 as a means for migrating for how this case was treated for the previous IN system, is now deprecated and this flag no longer has an effect; as described in The IN / NOT IN operator’s empty collection behavior is now configurable; default expression simplified, this flag allowed a dialect to switch between the original system of comparing a column against itself, which turned out to be a huge performance issue, and a newer system of comparing “1 != 1” in order to produce a “false” expression. The 1.3 introduced behavior which now takes place in all cases is more correct than both approaches as the IN operator is still used, and does not have the performance issue of the original system.
In addition, the “expanding” parameter system has been generalized so that it also services other dialect-specific use cases where a parameter cannot be accommodated by the DBAPI or backing database; see New “post compile” bound parameters used for LIMIT/OFFSET in Oracle, SQL Server for details.
Built-in FROM linting will warn for any potential cartesian products in a SELECT statement¶
As the Core expression language as well as the ORM are built on an “implicit FROMs” model where a particular FROM clause is automatically added if any part of the query refers to it, a common issue is the case where a SELECT statement, either a top level statement or an embedded subquery, contains FROM elements that are not joined to the rest of the FROM elements in the query, causing what’s referred to as a “cartesian product” in the result set, i.e. every possible combination of rows from each FROM element not otherwise joined. In relational databases, this is nearly always an undesirable outcome as it produces an enormous result set full of duplicated, uncorrelated data.
SQLAlchemy, for all of its great features, is particularly prone to this sort of issue happening as a SELECT statement will have elements added to its FROM clause automatically from any table seen in the other clauses. A typical scenario looks like the following, where two tables are JOINed together, however an additional entry in the WHERE clause that perhaps inadvertently does not line up with these two tables will create an additional FROM entry:
address_alias = aliased(Address)

q = (
    session.query(User)
    .join(address_alias, User.addresses)
    .filter(Address.email_address == "foo")
)
The above query selects from a JOIN of User and address_alias, the latter of which is an alias of the Address entity. However, the Address entity is used within the WHERE clause directly, so the above would result in the SQL:
SELECT users.id AS users_id, users.name AS users_name,
       users.fullname AS users_fullname, users.nickname AS users_nickname
FROM addresses, users
JOIN addresses AS addresses_1 ON users.id = addresses_1.user_id
WHERE addresses.email_address = :email_address_1
In the above SQL, we can see what SQLAlchemy developers term “the dreaded comma”, as we see “FROM addresses, users JOIN addresses” in the FROM clause which is the classic sign of a cartesian product; where a query is making use of JOIN in order to join FROM clauses together, however because one of them is not joined, it uses a comma. The above query will return a full set of rows that join the “user” and “addresses” table together on the “id / user_id” column, and will then apply all those rows into a cartesian product against every row in the “addresses” table directly. That is, if there are ten user rows and 100 rows in addresses, the above query will return its expected result rows, likely to be 100 as all address rows would be selected, multiplied by 100 again, so that the total result size would be 10000 rows.
The “table1, table2 JOIN table3” pattern is one that also occurs quite frequently within the SQLAlchemy ORM due to either subtle mis-application of ORM features particularly those related to joined eager loading or joined table inheritance, as well as a result of SQLAlchemy ORM bugs within those same systems. Similar issues apply to SELECT statements that use “implicit joins”, where the JOIN keyword is not used and instead each FROM element is linked with another one via the WHERE clause.
For some years there has been a recipe on the Wiki that applies a graph algorithm to a select() construct at query execution time and inspects the structure of the query for these un-linked FROM clauses, parsing through the WHERE clause and all JOIN clauses to determine how FROM elements are linked together and ensuring that all the FROM elements are connected in a single graph. This recipe has now been adapted to be part of the SQLCompiler itself where it now optionally emits a warning for a statement if this condition is detected. The warning is enabled using the create_engine.enable_from_linting flag and is enabled by default. The computational overhead of the linter is very low, and additionally it only occurs during statement compilation which means for a cached SQL statement it only occurs once.
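As a minimal sketch (in-memory SQLite URL for illustration only), the linter may be disabled for a given engine if desired:

from sqlalchemy import create_engine

# the FROM linter defaults to on; it may be turned off per engine
engine = create_engine("sqlite://", enable_from_linting=False)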
Using this feature, our ORM query above will emit a warning:
>>> q.all()
SAWarning: SELECT statement has a cartesian product between FROM
element(s) "addresses_1", "users" and FROM element "addresses".
Apply join condition(s) between each element to resolve.
The linter feature accommodates not just for tables linked together through the JOIN clauses but also through the WHERE clause. Above, we can add a WHERE clause to link the new Address entity with the previous address_alias entity and that will remove the warning:
q = (
    session.query(User)
    .join(address_alias, User.addresses)
    .filter(Address.email_address == "foo")
    .filter(Address.id == address_alias.id)
)  # resolve cartesian products, will no longer warn
The cartesian product warning considers any kind of link between two FROM clauses to be a resolution, even if the end result set is still wasteful, as the linter is intended only to detect the common case of a FROM clause that is completely unexpected. If the FROM clause is referred to explicitly elsewhere and linked to the other FROMs, no warning is emitted:
q = (
    session.query(User)
    .join(address_alias, User.addresses)
    .filter(Address.email_address == "foo")
    .filter(Address.id > address_alias.id)
)  # will generate a lot of rows, but no warning
Full cartesian products are also allowed if they are explicitly stated; if we wanted for example the cartesian product of User and Address, we can JOIN on true() so that every row will match with every other; the following query will return all rows and produce no warnings:
from sqlalchemy import true

# intentional cartesian product
q = session.query(User).join(Address, true())
The warning is only generated by default when the statement is compiled by the Connection for execution; calling the ClauseElement.compile() method will not emit a warning unless the linting flag is supplied:
>>> from sqlalchemy.sql import FROM_LINTING
>>> print(q.statement.compile(linting=FROM_LINTING))
SAWarning: SELECT statement has a cartesian product between FROM element(s) "addresses" and FROM element "users". Apply join condition(s) between each element to resolve.
SELECT users.id, users.name, users.fullname, users.nickname
FROM addresses, users JOIN addresses AS addresses_1 ON users.id = addresses_1.user_id
WHERE addresses.email_address = :email_address_1
New Result object¶
A major goal of SQLAlchemy 2.0 is to unify how “results” are handled between the ORM and Core. Towards this goal, version 1.4 introduces new versions of both the ResultProxy and RowProxy objects that have been part of SQLAlchemy since the beginning.
The new objects are documented at Result and Row, and are used not only for Core result sets but for 2.0 style results within the ORM as well.
This result object is fully compatible with ResultProxy and includes many new features, now applied to both Core and ORM results equally, including methods such as:
Result.one() - returns exactly a single row, or raises:
with engine.connect() as conn:
    row = conn.execute(table.select().where(table.c.id == 5)).one()
Result.one_or_none() - same, but also returns None for no rows
Result.all() - returns all rows
Result.partitions() - fetches rows in chunks:
with engine.connect() as conn:
    result = conn.execute(
        table.select().order_by(table.c.id),
        execution_options={"stream_results": True},
    )

    for chunk in result.partitions(500):
        # process up to 500 records
        ...
Result.columns() - allows slicing and reorganizing of rows:
with engine.connect() as conn:
    # requests x, y, z
    result = conn.execute(select(table.c.x, table.c.y, table.c.z))

    # iterate rows as y, x
    for y, x in result.columns("y", "x"):
        print("Y: %s X: %s" % (y, x))
Result.scalars() - returns lists of scalar objects, from the first column by default but can also be selected:
result = session.execute(select(User).order_by(User.id))

for user_obj in result.scalars():
    ...
Result.mappings() - instead of named-tuple rows, returns dictionaries:
with engine.connect() as conn:
    result = conn.execute(select(table.c.x, table.c.y, table.c.z))

    for map_ in result.mappings():
        print("Y: %(y)s X: %(x)s" % map_)
When using Core, the object returned by Connection.execute() is an instance of CursorResult, which continues to feature the same API features as ResultProxy regarding inserted primary keys, defaults, rowcounts, etc. For ORM, a Result subclass will be returned that performs translation of Core rows into ORM rows, and then allows all the same operations to take place.
See also
ORM Query Unified with Core Select - in the 2.0 migration documentation
RowProxy is no longer a “proxy”; is now called Row and behaves like an enhanced named tuple¶
The RowProxy class, which represents individual database result rows in a Core result set, is now called Row and is no longer a “proxy” object; what this means is that when the Row object is returned, the row is a simple tuple that contains the data in its final form, already having been processed by result-row handling functions associated with datatypes (examples include turning a date string from the database into a datetime object, a JSON string into a Python json.loads() result, etc.).
The immediate rationale for this is so that the row can act more like a Python named tuple, rather than a mapping, where the values in the tuple are the subject of the __contains__ operator on the tuple, rather than the keys. With Row acting like a named tuple, it is then suitable for use as a replacement for the ORM’s KeyedTuple object, leading to an eventual API where both the ORM and Core deliver result sets that behave identically. Unification of major patterns within ORM and Core is a major goal of SQLAlchemy 2.0, and release 1.4 aims to have most or all of the underlying architectural patterns in place in order to support this process. The note in The “KeyedTuple” object returned by Query is replaced by Row describes the ORM’s use of the Row class.
For release 1.4, the Row class provides an additional subclass LegacyRow, which is used by Core and provides a backwards-compatible version of RowProxy while emitting deprecation warnings for those API features and behaviors that will be moved. ORM Query now makes use of Row directly as a replacement for KeyedTuple.
The LegacyRow class is a transitional class where the __contains__ method is still testing against the keys, not the values, while emitting a deprecation warning when the operation succeeds. Additionally, all the other mapping-like methods on the previous RowProxy are deprecated, including LegacyRow.keys(), LegacyRow.items(), etc. For mapping-like behaviors from a Row object, including support for these methods as well as a key-oriented __contains__ operator, the API going forward will be to first access a special attribute Row._mapping, which will then provide a complete mapping interface to the row, rather than a tuple interface.
Rationale: To behave more like a named tuple rather than a mapping¶
The difference between a named tuple and a mapping, as far as boolean operators are concerned, can be summarized. Given a “named tuple” in pseudo code as:
row = (id: 5, name: 'some name')
The biggest cross-incompatible difference is the behavior of __contains__:
"id"inrow# True for a mapping, False for a named tuple"some name"inrow# False for a mapping, True for a named tuple
In 1.4, when a LegacyRow is returned by a Core result set, the above "id" in row comparison will continue to succeed, however a deprecation warning will be emitted. To use the “in” operator as a mapping, use the Row._mapping attribute:
"id"inrow._mapping
SQLAlchemy 2.0’s result object will feature a .mappings() modifier so that these mappings can be received directly:
# using sqlalchemy.future package
for row in result.mappings():
    row["id"]
Proxying behavior goes away, was also unnecessary in modern usage¶
The refactor of Row to behave like a tuple requires that all data values be fully available up front. This is an internal behavior change from that of RowProxy, where result-row processing functions would be invoked at the point of accessing an element of the row, instead of when the row was first fetched. This means for example when retrieving a datetime value from SQLite, the data for the row as present in the RowProxy object would previously have looked like:
row_proxy=(1,"2019-12-31 19:56:58.272106")
and then upon access via __getitem__, the datetime.strptime() function would be used on the fly to convert the above string date into a datetime object. With the new architecture, the datetime() object is present in the tuple when it is returned, the datetime.strptime() function having been called just once up front:
row = (1, datetime.datetime(2019, 12, 31, 19, 56, 58, 272106))
The RowProxy and Row objects in SQLAlchemy are where the majority of SQLAlchemy’s C extension code takes place. This code has been highly refactored to provide the new behavior in an efficient manner, and overall performance has been improved, as the design of Row is now considerably simpler.
The rationale behind the previous behavior assumed a usage model where a result row might have dozens or hundreds of columns present, where most of those columns would not be accessed, and for which a majority of those columns would require some result-value processing function. By invoking the processing function only when needed, the goal was that lots of result processing functions would not be necessary, thus increasing performance.
There are many reasons why the above assumptions do not hold:
- The vast majority of row-processing functions called were to Unicode-decode a bytestring into a Python Unicode string under Python 2. This was right as Python Unicode was beginning to see use and before Python 3 existed. Once Python 3 was introduced, within a few years, all Python DBAPIs took on the proper role of supporting the delivery of Python Unicode objects directly, under both Python 2 and Python 3, as an option in the former case and as the only way forward in the latter case. Eventually, in most cases it became the default for Python 2 as well. SQLAlchemy’s Python 2 support still enables explicit string-to-Unicode conversion for some DBAPIs such as cx_Oracle, however it is now performed at the DBAPI level rather than as a standard SQLAlchemy result row processing function.
- The above string conversion, when it is used, was made to be extremely performant via the C extensions, so much so that even in 1.4, SQLAlchemy’s byte-to-Unicode codec hook is plugged into cx_Oracle where it has been observed to be more performant than cx_Oracle’s own hook; this meant that the overhead for converting all strings in a row was not as significant as it originally was in any case.
- Row processing functions are not used in most other cases; the exceptions are SQLite’s datetime support, JSON support for some backends, and some numeric handlers such as string to Decimal. In the case of Decimal, Python 3 also standardized on the highly performant cdecimal implementation, which is not the case in Python 2, which continues to use the much less performant pure Python version.
- Fetching full rows where only a few columns are needed is not common within real-world use cases. In the early days of SQLAlchemy, database code from other languages of the form “row = fetch('SELECT * FROM table')” was common; using SQLAlchemy’s expression language however, code observed in the wild typically makes use of the specific columns needed.
SELECT objects and derived FROM clauses allow for duplicate columns and column labels¶
The select() construct now allows for duplicate column labels as well as duplicate column objects themselves, so that result tuples are organized and ordered identically to the way the columns were selected. The ORM Query already works this way, so this change allows for greater cross-compatibility between the two, which is a key goal of the 2.0 transition:
>>> from sqlalchemy import column, select
>>> c1, c2, c3, c4 = column("c1"), column("c2"), column("c3"), column("c4")
>>> stmt = select(c1, c2, c3.label("c2"), c2, c4)
>>> print(stmt)
SELECT c1, c2, c3 AS c2, c2, c4
To support this change, the ColumnCollection used by SelectBase as well as for derived FROM clauses such as subqueries also supports duplicate columns; this includes the new SelectBase.selected_columns attribute, the deprecated SelectBase.c attribute, as well as the FromClause.c attribute seen on constructs such as Subquery and Alias:
>>> list(stmt.selected_columns)
[<sqlalchemy.sql.elements.ColumnClause at 0x7fa540bcca20; c1>,
 <sqlalchemy.sql.elements.ColumnClause at 0x7fa540bcc9e8; c2>,
 <sqlalchemy.sql.elements.Label object at 0x7fa540b3e2e8>,
 <sqlalchemy.sql.elements.ColumnClause at 0x7fa540bcc9e8; c2>,
 <sqlalchemy.sql.elements.ColumnClause at 0x7fa540897048; c4>]

>>> print(stmt.subquery().select())
SELECT anon_1.c1, anon_1.c2, anon_1.c2, anon_1.c2, anon_1.c4
FROM (SELECT c1, c2, c3 AS c2, c2, c4) AS anon_1
ColumnCollection also allows access by integer index to support when the string “key” is ambiguous:
>>> stmt.selected_columns[2]
<sqlalchemy.sql.elements.Label object at 0x7fa540b3e2e8>
To suit the use of ColumnCollection in objects such as Table and PrimaryKeyConstraint, the old “deduplicating” behavior, which is more critical for these objects, is preserved in a new class DedupeColumnCollection.
The change includes that the familiar warning "Column %r on table %r being replaced by %r, which has the same key. Consider use_labels for select() statements." is removed; the Select.apply_labels() method is still available and is still used by the ORM for all SELECT operations, however it does not imply deduplication of column objects, although it does imply deduplication of implicitly generated labels:
>>> from sqlalchemy import table
>>> user = table("user", column("id"), column("name"))
>>> stmt = select(user.c.id, user.c.name, user.c.id).apply_labels()
>>> print(stmt)
SELECT "user".id AS user_id, "user".name AS user_name, "user".id AS id_1
FROM "user"
Finally, the change makes it easier to create UNION and other CompoundSelect objects, by ensuring that the number and position of columns in a SELECT statement mirrors what was given, in a use case such as:
>>> s1 = select(user, user.c.id)
>>> s2 = select(c1, c2, c3)
>>> from sqlalchemy import union
>>> u = union(s1, s2)
>>> print(u)
SELECT "user".id, "user".name, "user".id
FROM "user" UNION SELECT c1, c2, c3
Improved column labeling for simple column expressions using CAST or similar¶
A user pointed out that the PostgreSQL database has a convenient behavior when using functions like CAST against a named column, in that the result column name is named the same as the inner expression:
test=> SELECT CAST(data AS VARCHAR) FROM foo;

 data
------
 5
(1 row)
This allows one to apply CAST to table columns while not losing the column name (above using the name "data") in the result row. Compare to databases such as MySQL/MariaDB, as well as most others, where the column name is taken from the full SQL expression and is not very portable:
MariaDB [test]> SELECT CAST(data AS CHAR) FROM foo;
+--------------------+
| CAST(data AS CHAR) |
+--------------------+
| 5                  |
+--------------------+
1 row in set (0.003 sec)
In SQLAlchemy Core expressions, we never deal with a raw generated name like the above, as SQLAlchemy applies auto-labeling to expressions like these, which up until now were always so-called “anonymous” expressions:
>>> print(select(cast(foo.c.data, String)))
SELECT CAST(foo.data AS VARCHAR) AS anon_1  # old behavior
FROM foo
These anonymous expressions were necessary as SQLAlchemy’s ResultProxy made heavy use of result column names in order to match up datatypes, such as the String datatype which used to have result-row-processing behavior, to the correct column, so most importantly the names had to be both easy to determine in a database-agnostic manner as well as unique in all cases. In SQLAlchemy 1.0 as part of #918, this reliance on named columns in result rows (specifically the cursor.description element of the PEP-249 cursor) was scaled back to not be necessary for most Core SELECT constructs; in release 1.4, the system overall is becoming more comfortable with SELECT statements that have duplicate column or label names, such as in SELECT objects and derived FROM clauses allow for duplicate columns and column labels. So we now emulate PostgreSQL’s reasonable behavior for simple modifications to a single column, most prominently with CAST:
>>> print(select(cast(foo.c.data, String)))
SELECT CAST(foo.data AS VARCHAR) AS data
FROM foo
For CAST against expressions that don’t have a name, the previous logic is used to generate the usual “anonymous” labels:
>>>print(select(cast("hi there,"+foo.c.data,String)))SELECT CAST(:data_1 + foo.data AS VARCHAR) AS anon_1FROM foo
A cast() against a Label, despite having to omit the label expression as these don’t render inside of a CAST, will nonetheless make use of the given name:
>>>print(select(cast(("hi there,"+foo.c.data).label("hello_data"),String)))SELECT CAST(:data_1 + foo.data AS VARCHAR) AS hello_dataFROM foo
And of course as was always the case, Label can be applied to the expression on the outside to apply an “AS <name>” label directly:
>>>print(select(cast(("hi there,"+foo.c.data),String).label("hello_data")))SELECT CAST(:data_1 + foo.data AS VARCHAR) AS hello_dataFROM foo
New “post compile” bound parameters used for LIMIT/OFFSET in Oracle, SQL Server¶
A major goal of the 1.4 series is to establish that all Core SQL constructs are completely cacheable, meaning that a particular Compiled structure will produce an identical SQL string regardless of any SQL parameters used with it, which notably includes those used to specify the LIMIT and OFFSET values, typically used for pagination and “top N” style results.
While SQLAlchemy has used bound parameters for LIMIT/OFFSET schemes for many years, a few outliers remained where such parameters were not allowed, including a SQL Server “TOP N” statement, such as:
SELECT TOP 5 mytable.id, mytable.data FROM mytable
as well as with Oracle, where the FIRST_ROWS() hint (which SQLAlchemy will use if the optimize_limits=True parameter is passed to create_engine() with an Oracle URL) does not allow them, but also that using bound parameters with ROWNUM comparisons has been reported as producing slower query plans:
SELECT anon_1.id, anon_1.data
FROM (
    SELECT /*+ FIRST_ROWS(5) */
        anon_2.id AS id, anon_2.data AS data, ROWNUM AS ora_rn
    FROM (
        SELECT mytable.id, mytable.data FROM mytable
    ) anon_2
    WHERE ROWNUM <= :param_1
) anon_1
WHERE ora_rn > :param_2
In order to allow for all statements to be unconditionally cacheable at the compilation level, a new form of bound parameter called a “post compile” parameter has been added, which makes use of the same mechanism as that of “expanding IN parameters”. This is a bindparam() that behaves identically to any other bound parameter except that the parameter value will be rendered literally into the SQL string before sending it to the DBAPI cursor.execute() method. The new parameter is used internally by the SQL Server and Oracle dialects, so that the drivers receive the literal rendered value but the rest of SQLAlchemy can still consider this as a bound parameter. The above two statements when stringified using str(statement.compile(dialect=<dialect>)) now look like:
SELECT TOP [POSTCOMPILE_param_1] mytable.id, mytable.data FROM mytable
and:
SELECT anon_1.id, anon_1.data
FROM (
    SELECT /*+ FIRST_ROWS([POSTCOMPILE__ora_frow_1]) */
        anon_2.id AS id, anon_2.data AS data, ROWNUM AS ora_rn
    FROM (
        SELECT mytable.id, mytable.data FROM mytable
    ) anon_2
    WHERE ROWNUM <= [POSTCOMPILE_param_1]
) anon_1
WHERE ora_rn > [POSTCOMPILE_param_2]
The [POSTCOMPILE_<param>] format is also what is seen when an “expanding IN” is used.
When viewing the SQL logging output, the final form of the statement will be seen:
SELECT anon_1.id, anon_1.data
FROM (
    SELECT /*+ FIRST_ROWS(5) */
        anon_2.id AS id, anon_2.data AS data, ROWNUM AS ora_rn
    FROM (
        SELECT mytable.id AS id, mytable.data AS data FROM mytable
    ) anon_2
    WHERE ROWNUM <= 8
) anon_1
WHERE ora_rn > 3
The “post compile parameter” feature is exposed as public API through the bindparam.literal_execute parameter, however is currently not intended for general use. The literal values are rendered using the TypeEngine.literal_processor() of the underlying datatype, which in SQLAlchemy has extremely limited scope, supporting only integers and simple string values.
Connection-level transactions can now be inactive based on subtransaction¶
A Connection now includes the behavior where a Transaction can be made inactive due to a rollback on an inner transaction, however the Transaction will not clear until it is itself rolled back.
This is essentially a new error condition which will disallow statement executions from proceeding on a Connection if an inner “sub” transaction has been rolled back. The behavior works very similarly to that of the ORM Session, where if an outer transaction has been begun, it needs to be rolled back to clear the invalid transaction; this behavior is described in “This Session’s transaction has been rolled back due to a previous exception during flush.” (or similar).
While the Connection has had a less strict behavioral pattern than the Session, this change was made as it helps to identify when a subtransaction has rolled back the DBAPI transaction, however the external code isn’t aware of this and attempts to continue proceeding, which in fact runs operations on a new transaction. The “test harness” pattern described at Joining a Session into an External Transaction (such as for test suites) is the common place for this to occur.
The “subtransaction” feature of Core and ORM is itself deprecated and will no longer be present in version 2.0. As a result, this new error condition is itself temporary as it will no longer apply once subtransactions are removed.
In order to work with the 2.0 style behavior that does not include subtransactions, use the create_engine.future parameter on create_engine().
The error message is described in the errors page atThis connection is on an inactive transaction. Please rollback() fully before proceeding.
Enum and Boolean datatypes no longer default to “create constraint”¶
The Enum.create_constraint and Boolean.create_constraint parameters now default to False, indicating that when a so-called “non-native” version of these two datatypes is created, a CHECK constraint will not be generated by default. These CHECK constraints present schema-management maintenance complexities that should be opted in to, rather than being turned on by default.
To ensure that a CREATE CONSTRAINT is emitted for these types, set these flags to True:
classSpam(Base):__tablename__="spam"id=Column(Integer,primary_key=True)boolean=Column(Boolean(create_constraint=True))enum=Column(Enum("a","b","c",create_constraint=True))
New Features - ORM¶
Raiseload for Columns¶
The “raiseload” feature, which raises InvalidRequestError when an unloaded attribute is accessed, is now available for column-oriented attributes using the defer.raiseload parameter of defer(). This works in the same manner as that of the raiseload() option used by relationship loading:
book = session.query(Book).options(defer(Book.summary, raiseload=True)).first()

# would raise an exception
book.summary
To configure column-level raiseload on a mapping, the deferred.raiseload parameter of deferred() may be used. The undefer() option may then be used at query time to eagerly load the attribute:
classBook(Base):__tablename__="book"book_id=Column(Integer,primary_key=True)title=Column(String(200),nullable=False)summary=deferred(Column(String(2000)),raiseload=True)excerpt=deferred(Column(Text),raiseload=True)book_w_excerpt=session.query(Book).options(undefer(Book.excerpt)).first()
It was originally considered that the existing raiseload() option that works for relationship() attributes be expanded to also support column-oriented attributes. However, this would break the “wildcard” behavior of raiseload(), which is documented as allowing one to prevent all relationships from loading:
session.query(Order).options(joinedload(Order.items), raiseload("*"))
Above, if we had expanded raiseload() to accommodate for columns as well, the wildcard would also prevent columns from loading and thus be a backwards incompatible change; additionally, it’s not clear if raiseload() covered both column expressions and relationships, how one would achieve the effect above of only blocking relationship loads, without new API being added. So to keep things simple, the option for columns remains on defer():
- raiseload() - query option to raise for relationship loads
- defer.raiseload - query option to raise for column expression loads
As part of this change, the behavior of “deferred” in conjunction with attribute expiration has changed. Previously, when an object would be marked as expired, and then unexpired via the access of one of the expired attributes, attributes which were mapped as “deferred” at the mapper level would also load. This has been changed such that an attribute that is deferred in the mapping will never “unexpire”; it only loads when accessed as part of the deferral loader.
An attribute that is not mapped as “deferred”, however was deferred at query time via the defer() option, will be reset when the object or attribute is expired; that is, the deferred option is removed. This is the same behavior as was present previously.
ORM Batch inserts with psycopg2 now batch statements with RETURNING in most cases¶
The change in psycopg2 dialect features “execute_values” with RETURNING for INSERT statements by default adds support for “executemany” + “RETURNING” at the same time in Core, which is now enabled for the psycopg2 dialect by default using the psycopg2 execute_values() extension. The ORM flush process now makes use of this feature such that the retrieval of newly generated primary key values and server defaults can be achieved while not losing the performance benefits of being able to batch INSERT statements together. Additionally, psycopg2’s execute_values() extension itself provides a five-fold performance improvement over psycopg2’s default “executemany” implementation, by rewriting an INSERT statement to include many “VALUES” expressions all in one statement rather than invoking the same statement repeatedly, as psycopg2 lacks the ability to PREPARE the statement ahead of time as would normally be expected for this approach to be performant.
SQLAlchemy includes a performance suite within its examples, where we can compare the times generated for the “bulk_inserts” runner against 1.3 and 1.4, revealing a 3x-5x speedup for most flavors of batch insert:
# 1.3
$ python -m examples.performance bulk_inserts --dburl postgresql://scott:tiger@localhost/test
test_flush_no_pk : (100000 iterations); total time 14.051527 sec
test_bulk_save_return_pks : (100000 iterations); total time 15.002470 sec
test_flush_pk_given : (100000 iterations); total time 7.863680 sec
test_bulk_save : (100000 iterations); total time 6.780378 sec
test_bulk_insert_mappings : (100000 iterations); total time 5.363070 sec
test_core_insert : (100000 iterations); total time 5.362647 sec

# 1.4 with enhancement
$ python -m examples.performance bulk_inserts --dburl postgresql://scott:tiger@localhost/test
test_flush_no_pk : (100000 iterations); total time 3.820807 sec
test_bulk_save_return_pks : (100000 iterations); total time 3.176378 sec
test_flush_pk_given : (100000 iterations); total time 4.037789 sec
test_bulk_save : (100000 iterations); total time 2.604446 sec
test_bulk_insert_mappings : (100000 iterations); total time 1.204897 sec
test_core_insert : (100000 iterations); total time 0.958976 sec
Note that the execute_values() extension modifies the INSERT statement in the psycopg2 layer, after it’s been logged by SQLAlchemy. So with SQL logging, one will see the parameter sets batched together, but the joining of multiple “values” will not be visible on the application side:
2020-06-27 19:08:18,166 INFO sqlalchemy.engine.Engine INSERT INTO a (data) VALUES (%(data)s) RETURNING a.id
2020-06-27 19:08:18,166 INFO sqlalchemy.engine.Engine [generated in 0.00698s] ({'data': 'data 1'}, {'data': 'data 2'}, {'data': 'data 3'}, {'data': 'data 4'}, {'data': 'data 5'}, {'data': 'data 6'}, {'data': 'data 7'}, {'data': 'data 8'} ... displaying 10 of 4999 total bound parameter sets ... {'data': 'data 4998'}, {'data': 'data 4999'})
2020-06-27 19:08:18,254 INFO sqlalchemy.engine.Engine COMMIT
The ultimate INSERT statement can be seen by enabling statement logging on the PostgreSQL side:
2020-06-27 19:08:18.169 EDT [26960] LOG: statement: INSERT INTO a (data) VALUES ('data 1'),('data 2'),('data 3'),('data 4'),('data 5'),('data 6'),('data 7'),('data 8'),('data 9'),('data 10'),('data 11'),('data 12'), ... ('data 999'),('data 1000') RETURNING a.id
2020-06-27 19:08:18.175 EDT [26960] LOG: statement: INSERT INTO a (data) VALUES ('data 1001'),('data 1002'),('data 1003'),('data 1004'),('data 1005'),('data 1006'),('data 1007'),('data 1008'),('data 1009'),('data 1010'),('data 1011'), ...
The feature batches rows into groups of 1000 by default, which can be affected using the executemany_values_page_size argument documented at Psycopg2 Fast Execution Helpers.
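A minimal sketch of adjusting the page size, assuming a local PostgreSQL database URL:

from sqlalchemy import create_engine

# raise the number of rows sent per VALUES batch from the default of 1000
engine = create_engine(
    "postgresql+psycopg2://scott:tiger@localhost/test",
    executemany_values_page_size=5000,
)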
ORM Bulk Update and Delete use RETURNING for “fetch” strategy when available¶
An ORM bulk update or delete that uses the “fetch” strategy:
sess.query(User).filter(User.age > 29).update(
    {"age": User.age - 10}, synchronize_session="fetch"
)
will now use RETURNING if the backend database supports it; this currently includes PostgreSQL and SQL Server (the Oracle dialect does not support RETURNING of multiple rows):
UPDATE users SET age_int=(users.age_int - %(age_int_1)s) WHERE users.age_int > %(age_int_2)s RETURNING users.id
[generated in 0.00060s] {'age_int_1': 10, 'age_int_2': 29}
Col ('id',)
Row (2,)
Row (4,)
For backends that do not support RETURNING of multiple rows, the previous approach of emitting SELECT for the primary keys beforehand is still used:
SELECT users.id FROM users WHERE users.age_int > %(age_int_1)s
[generated in 0.00043s] {'age_int_1': 29}
Col ('id',)
Row (2,)
Row (4,)

UPDATE users SET age_int=(users.age_int - %(age_int_1)s) WHERE users.age_int > %(age_int_2)s
[generated in 0.00102s] {'age_int_1': 10, 'age_int_2': 29}
One of the intricate challenges of this change is to support cases such as the horizontal sharding extension, where a single bulk update or delete may be multiplexed among backends, some of which support RETURNING and some of which don’t. The new 1.4 execution architecture supports this case so that the “fetch” strategy can be left intact with a graceful degrade to using a SELECT, rather than having to add a new “returning” strategy that would not be backend-agnostic.
As part of this change, the “fetch” strategy is also made much more efficient in that it will no longer expire the objects located which match the rows, for Python expressions used in the SET clause which can be evaluated in Python; these are instead assigned directly onto the object in the same way as the “evaluate” strategy. Only for SQL expressions that can’t be evaluated does it fall back to expiring the attributes. The “evaluate” strategy has also been enhanced to fall back to “expire” for a value that cannot be evaluated.
Behavioral Changes - ORM¶
The “KeyedTuple” object returned by Query is replaced by Row¶
As discussed at RowProxy is no longer a “proxy”; is now called Row and behaves like an enhanced named tuple, the Core RowProxy object is now replaced by a class called Row. The base Row object now behaves more fully like a named tuple, and as such it is now used as the basis for tuple-like results returned by the Query object, rather than the previous “KeyedTuple” class.
The rationale is so that by SQLAlchemy 2.0, both Core and ORM SELECT statements will return result rows using the same Row object which behaves like a named tuple. Dictionary-like functionality is available from Row via the Row._mapping attribute. In the interim, Core result sets will make use of a Row subclass LegacyRow which maintains the previous dict/tuple hybrid behavior for backwards compatibility, while the Row class will be used directly for ORM tuple results returned by the Query object.
Effort has been made to get most of the featureset of Row to be available within the ORM, meaning that access by string name as well as entity / column should work:
row = s.query(User, Address).join(User.addresses).first()

row._mapping[User]  # same as row[0]
row._mapping[Address]  # same as row[1]
row._mapping["User"]  # same as row[0]
row._mapping["Address"]  # same as row[1]

u1 = aliased(User)
row = s.query(u1).only_return_tuples(True).first()
row._mapping[u1]  # same as row[0]

row = s.query(User.id, Address.email_address).join(User.addresses).first()

row._mapping[User.id]  # same as row[0]
row._mapping["id"]  # same as row[0]
row._mapping[users.c.id]  # same as row[0]
Session features new “autobegin” behavior¶
Previously, the Session in its default mode of autocommit=False would internally begin a SessionTransaction object immediately on construction, and additionally would create a new one after each call to Session.rollback() or Session.commit().
The new behavior is that this SessionTransaction object is now created on demand only, when methods such as Session.add() or Session.execute() are called. However it is also now possible to call Session.begin() explicitly in order to begin the transaction, even in autocommit=False mode, thus matching the behavior of the future-style Connection.
The behavioral changes this indicates are:
- The Session can now be in the state where no transaction is begun, even in autocommit=False mode. Previously, this state was only available in “autocommit” mode.
- Within this state, the Session.commit() and Session.rollback() methods are no-ops. Code that relies upon these methods to expire all objects should make explicit use of either Session.begin() or Session.expire_all() to suit their use case.
- The SessionEvents.after_transaction_create() event hook is not emitted immediately when the Session is created, or after a Session.rollback() or Session.commit() completes.
- The Session.close() method also does not imply implicit begin of a new SessionTransaction.
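A minimal sketch of the on-demand behavior, assuming a configured sessionmaker named Session and a mapped class SomeObject:

session = Session()

# no transaction has begun yet
assert not session.in_transaction()

session.add(SomeObject())

# adding an object triggered "autobegin"
assert session.in_transaction()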
Rationale¶
The Session object’s default behavior of autocommit=False historically has meant that there is always a SessionTransaction object in play, associated with the Session via the Session.transaction attribute. When the given SessionTransaction was complete, due to a commit, rollback, or close, it was immediately replaced with a new one. The SessionTransaction by itself does not imply the usage of any connection-oriented resources, so this long-standing behavior has a particular elegance to it in that the state of Session.transaction is always predictable as non-None.
However, as part of the initiative in #5056 to greatly reduce reference cycles, this assumption means that calling upon Session.close() results in a Session object that still has reference cycles and is more expensive to clean up, not to mention that there is a small overhead in constructing the SessionTransaction object, which meant that there would be unnecessary overhead created for a Session that for example invoked Session.commit() and then Session.close().
As such, it was decided that Session.close() should leave the internal state of self.transaction, now referred to internally as self._transaction, as None, and that a new SessionTransaction should only be created when needed. For consistency and code coverage, this behavior was also expanded to include all the points at which “autobegin” is expected, not just when Session.close() were called.
In particular, this causes a behavioral change for applications which subscribe to the SessionEvents.after_transaction_create() event hook; previously, this event would be emitted when the Session were first constructed, as well as for most actions that closed the previous transaction and would emit SessionEvents.after_transaction_end(). The new behavior is that SessionEvents.after_transaction_create() is emitted on demand, when the Session has not yet created a new SessionTransaction object and mapped objects are associated with the Session through methods like Session.add() and Session.delete(), when the Session.transaction attribute is called upon, when the Session.flush() method has tasks to complete, etc.
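For applications that subscribe to this hook, a brief sketch of where the event now fires; the listener body here is illustrative only:

from sqlalchemy import event
from sqlalchemy.orm import Session


@event.listens_for(Session, "after_transaction_create")
def log_transaction_create(session, transaction):
    # in 1.4 this fires on demand, e.g. upon the first add() or
    # execute(), rather than when the Session is constructed
    print("transaction created:", transaction)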
In addition, code which relies upon the Session.commit() or Session.rollback() method to unconditionally expire all objects can no longer do so. Code which needs to expire all objects when no change has occurred should be calling Session.expire_all() for this case.
Besides the change in when the SessionEvents.after_transaction_create() event is emitted as well as the no-op nature of Session.commit() or Session.rollback(), the change should have no other user-visible impact on the Session object’s behavior; the Session will continue to have the behavior that it remains usable for new operations after Session.close() is called, and the sequencing of how the Session interacts with the Engine and the database itself should also remain unaffected, since these operations were already operating in an on-demand fashion.
Viewonly relationships don’t synchronize backrefs¶
In #5149 in 1.3.14, SQLAlchemy began emitting a warning when the relationship.backref or relationship.back_populates keywords would be used at the same time as the relationship.viewonly flag on the target relationship. This was because a “viewonly” relationship does not actually persist changes made to it, which could cause some misleading behaviors to occur. However, in #5237, we sought to refine this behavior as there are legitimate use cases to have backrefs set up on viewonly relationships, including that back populates attributes are used in some cases by the relationship lazy loaders to determine that an additional eager load in the other direction is not necessary, as well as that back populates can be used for mapper introspection and that backref() can be a convenient way to set up bi-directional relationships.
The solution then was to make the “mutation” that occurs from a backref an optional thing, using the relationship.sync_backref flag. In 1.4 the value of relationship.sync_backref defaults to False for a relationship target that also sets relationship.viewonly. This indicates that any changes made to a relationship with viewonly will not impact the state of the other side or of the Session in any way:
class User(Base):
    # ...

    addresses = relationship(Address, backref=backref("user", viewonly=True))


class Address(Base):
    # ...


u1 = session.query(User).filter_by(name="x").first()

a1 = Address()
a1.user = u1
Above, the a1 object will not be added to the u1.addresses collection, nor will the a1 object be added to the session. Previously, both of these things would be true. The warning that relationship.sync_backref should be set to False when relationship.viewonly is False is no longer emitted as this is now the default behavior.
cascade_backrefs behavior deprecated for removal in 2.0¶
SQLAlchemy has long had a behavior of cascading objects into the Session based on backref assignment. Given a User below already in a Session, assigning it to the Address.user attribute of an Address object, assuming a bidirectional relationship is set up, would mean that the Address also gets put into the Session at that point:
u1 = User()
session.add(u1)

a1 = Address()
a1.user = u1  # <--- adds "a1" to the Session
The above behavior was an unintended side effect of backref behavior, in that since a1.user implies u1.addresses.append(a1), a1 would get cascaded into the Session. This remains the default behavior throughout 1.4. At some point, a new flag relationship.cascade_backrefs was added to disable the above behavior, along with backref.cascade_backrefs to set this when the relationship is specified by relationship.backref, as it can be surprising and also gets in the way of some operations where the object would be placed in the Session too early and get prematurely flushed.
In 2.0, the default behavior will be that “cascade_backrefs” is False, and additionally there will be no “True” behavior as this is not generally a desirable behavior. When 2.0 deprecation warnings are enabled, a warning will be emitted when a “backref cascade” actually takes place. To get the new behavior, either set relationship.cascade_backrefs and backref.cascade_backrefs to False on any target relationships, as is already supported in 1.3 and earlier, or alternatively make use of the Session.future flag to opt in to 2.0-style mode:
Session = sessionmaker(engine, future=True)

with Session() as session:
    u1 = User()
    session.add(u1)

    a1 = Address()
    a1.user = u1  # <--- will not add "a1" to the Session
Eager loaders emit during unexpire operations¶
A long sought behavior was that when an expired object is accessed, configured eager loaders will run in order to eagerly load relationships on the expired object when the object is refreshed or otherwise unexpired. This behavior has now been added, so that joinedloaders will add inline JOINs as usual, and selectin/subquery loaders will run an “immediateload” operation for a given relationship, when an expired object is unexpired or an object is refreshed:
>>> a1 = session.query(A).options(joinedload(A.bs)).first()
>>> a1.data = "new data"
>>> session.commit()
Above, the A object was loaded with a joinedload() option associated with it in order to eagerly load the bs collection. After the session.commit(), the state of the object is expired. Upon accessing the .data column attribute, the object is refreshed and this will now include the joinedload operation as well:
>>> a1.data
SELECT a.id AS a_id, a.data AS a_data, b_1.id AS b_1_id, b_1.a_id AS b_1_a_id
FROM a LEFT OUTER JOIN b AS b_1 ON a.id = b_1.a_id
WHERE a.id = ?
The behavior applies both to loader strategies applied to the relationship() directly, as well as with options used with Query.options(), provided that the object was originally loaded by that query.
For the “secondary” eager loaders “selectinload” and “subqueryload”, the SQL strategy for these loaders is not necessary in order to eagerly load attributes on a single object; so they will instead invoke the “immediateload” strategy in a refresh scenario, which resembles the query emitted by “lazyload”, emitted as an additional query:
>>> a1 = session.query(A).options(selectinload(A.bs)).first()
>>> a1.data = "new data"
>>> session.commit()
>>> a1.data
SELECT a.id AS a_id, a.data AS a_data
FROM a
WHERE a.id = ?
(1,)
SELECT b.id AS b_id, b.a_id AS b_a_id
FROM b
WHERE ? = b.a_id
(1,)
Note that a loader option does not apply to an object that was introduced into the Session in a different way. That is, if the a1 object were just persisted in this Session, or was loaded with a different query before the eager option had been applied, then the object doesn’t have an eager load option associated with it. This is not a new concept, however users who are looking for the eagerload on refresh behavior may find this to be more noticeable.
Column loaders such as deferred(), with_expression() only take effect when indicated on the outermost, full entity query¶
Note
This change note was not present in earlier versions of this document, however is relevant for all SQLAlchemy 1.4 versions.
A behavior that was never supported in 1.3 and previous versions, yet nonetheless would have a particular effect, was to repurpose column loader options such as defer() and with_expression() in subqueries in order to control which SQL expressions would be in the columns clause of each subquery. A typical example would be to construct UNION queries, such as:
q1 = session.query(User).options(with_expression(User.expr, literal("u1")))
q2 = session.query(User).options(with_expression(User.expr, literal("u2")))
q1.union_all(q2).all()
In version 1.3, the with_expression() option would take effect for each element of the UNION, such as:
SELECT anon_1.anon_2 AS anon_1_anon_2, anon_1.user_account_id AS anon_1_user_account_id,
    anon_1.user_account_name AS anon_1_user_account_name
FROM (
    SELECT ? AS anon_2, user_account.id AS user_account_id,
        user_account.name AS user_account_name
    FROM user_account
    UNION ALL
    SELECT ? AS anon_3, user_account.id AS user_account_id,
        user_account.name AS user_account_name
    FROM user_account
) AS anon_1
('u1', 'u2')
SQLAlchemy 1.4’s notion of loader options has been made more strict, and as such they are applied to the outermost part of the query only, which is the SELECT that is intended to populate the actual ORM entities to be returned; the query above in 1.4 will produce:
SELECT ? AS anon_1, anon_2.user_account_id AS anon_2_user_account_id,
    anon_2.user_account_name AS anon_2_user_account_name
FROM (
    SELECT user_account.id AS user_account_id, user_account.name AS user_account_name
    FROM user_account
    UNION ALL
    SELECT user_account.id AS user_account_id, user_account.name AS user_account_name
    FROM user_account
) AS anon_2
('u1',)
that is, the options for the Query were taken from the first element of the UNION, since all loader options are only to be at the topmost level. The option from the second query was ignored.
Rationale¶
This behavior now more closely matches that of other kinds of loader options, such as relationship loader options like joinedload(), in all SQLAlchemy versions, 1.3 and earlier included, which in a UNION situation were already copied out to the top most level of the query, and only taken from the first element of the UNION, discarding any options on other parts of the query.
This implicit copying and selective ignoring of options, demonstrated above as being fairly arbitrary, is a legacy behavior that’s only part of Query, and is a particular example of where Query and its means of applying Query.union_all() falls short, as it’s ambiguous how to turn a single SELECT into a UNION of itself and another query and how loader options should be applied to that new statement.
SQLAlchemy 1.4’s behavior can be demonstrated as generally superior to that of 1.3 for a more common case of using defer(). The following query:
q1 = session.query(User).options(defer(User.name))
q2 = session.query(User).options(defer(User.name))
q1.union_all(q2).all()
In 1.3 would awkwardly add NULL to the inner queries and then SELECT it:
SELECT anon_1.anon_2 AS anon_1_anon_2, anon_1.user_account_id AS anon_1_user_account_id
FROM (
    SELECT NULL AS anon_2, user_account.id AS user_account_id
    FROM user_account
    UNION ALL
    SELECT NULL AS anon_2, user_account.id AS user_account_id
    FROM user_account
) AS anon_1
If all queries didn’t have the identical options set up, the above scenariowould raise an error due to not being able to form a proper UNION.
Whereas in 1.4, the option is applied only at the top layer, omitting the fetch for User.name, and this complexity is avoided:
SELECT anon_1.user_account_id AS anon_1_user_account_id
FROM (
    SELECT user_account.id AS user_account_id, user_account.name AS user_account_name
    FROM user_account
    UNION ALL
    SELECT user_account.id AS user_account_id, user_account.name AS user_account_name
    FROM user_account
) AS anon_1
Correct Approach¶
Using 2.0-style querying, no warning is emitted at the moment, however the nested with_expression() options are consistently ignored as they don’t apply to an entity being loaded, and are not implicitly copied anywhere. The query below produces no output for the with_expression() calls:
s1 = select(User).options(with_expression(User.expr, literal("u1")))
s2 = select(User).options(with_expression(User.expr, literal("u2")))

stmt = union_all(s1, s2)

session.scalars(select(User).from_statement(stmt)).all()
producing the SQL:
SELECT user_account.id, user_account.name
FROM user_account
UNION ALL
SELECT user_account.id, user_account.name
FROM user_account
To correctly apply with_expression() to the User entity, it should be applied to the outermost level of the query, using an ordinary SQL expression inside the columns clause of each SELECT:
s1=select(User,literal("u1").label("some_literal"))s2=select(User,literal("u2").label("some_literal"))stmt=union_all(s1,s2)session.scalars(select(User).from_statement(stmt).options(with_expression(User.expr,stmt.selected_columns.some_literal))).all()
Which will produce the expected SQL:
SELECT user_account.id, user_account.name, ? AS some_literal
FROM user_account
UNION ALL
SELECT user_account.id, user_account.name, ? AS some_literal
FROM user_account
The User objects themselves will include this expression in their contents underneath User.expr.
Accessing an uninitialized collection attribute on a transient object no longer mutates __dict__¶
It has always been SQLAlchemy’s behavior that accessing mapped attributes on a newly created object returns an implicitly generated value, rather than raising AttributeError, such as None for scalar attributes or [] for a list-holding relationship:
>>> u1 = User()
>>> u1.name
None
>>> u1.addresses
[]
The rationale for the above behavior was originally to make ORM objects easier to work with. Since an ORM object represents an empty row when first created without any state, it is intuitive that its un-accessed attributes would resolve to None (or SQL NULL) for scalars and to empty collections for relationships. In particular, it makes possible an extremely common pattern of being able to mutate the new collection without manually creating and assigning an empty collection first:
>>> u1 = User()
>>> u1.addresses.append(Address())  # no need to assign u1.addresses = []
Up until version 1.0 of SQLAlchemy, the behavior of this initialization system for both scalar attributes as well as collections would be that the None or empty collection would be populated into the object’s state, e.g. __dict__. This meant that the following two operations were equivalent:
>>> u1 = User()
>>> u1.name = None  # explicit assignment

>>> u2 = User()
>>> u2.name  # implicit assignment just by accessing it
None
Where above, both u1 and u2 would have the value None populated in the value of the name attribute. Since this is a SQL NULL, the ORM would skip including these values within an INSERT so that SQL-level defaults take place, if any, else the value defaults to NULL on the database side.
In version 1.0 as part of Changes to attribute events and other operations regarding attributes that have no pre-existing value, this behavior was refined so that the None value was no longer populated into __dict__, only returned. Besides removing the mutating side effect of a getter operation, this change also made it possible to set columns that did have server defaults to the value NULL by actually assigning None, which was now distinguished from just reading it.
The change however did not accommodate for collections, where returning an empty collection that is not assigned meant that this mutable collection would be different each time and also would not be able to correctly accommodate for mutating operations (e.g. append, add, etc.) called upon it. While the behavior continued to generally not get in anyone’s way, an edge case was eventually identified in #4519 where this empty collection could be harmful, which is when the object is merged into a session:
>>> u1 = User(id=1)  # create an empty User to merge with id=1 in the database
>>> merged1 = session.merge(
...     u1
... )  # value of merged1.addresses is unchanged from that of the DB

>>> u2 = User(id=2)  # create an empty User to merge with id=2 in the database
>>> u2.addresses
[]
>>> merged2 = session.merge(u2)  # value of merged2.addresses has been emptied in the DB
Above, the .addresses collection on merged1 will contain all the Address() objects that were already in the database. merged2 will not; because it has an empty list implicitly assigned, the .addresses collection will be erased. This is an example of where this mutating side effect can actually mutate the database itself.
While it was considered that perhaps the attribute system should begin using strict “plain Python” behavior, raising AttributeError in all cases for non-existent attributes on non-persistent objects and requiring that all collections be explicitly assigned, such a change would likely be too extreme for the vast number of applications that have relied upon this behavior for many years, leading to a complex rollout / backwards compatibility problem as well as the likelihood that workarounds to restore the old behavior would become prevalent, thus rendering the whole change ineffective in any case.
The change then is to keep the default producing behavior, but to finally make the non-mutating behavior of scalars a reality for collections as well, via the addition of additional mechanics in the collection system. When accessing the empty attribute, the new collection is created and associated with the state, however is not added to __dict__ until it is actually mutated:
>>> u1 = User()
>>> l1 = u1.addresses  # new list is created, associated with the state
>>> assert u1.addresses is l1  # you get the same list each time you access it
>>> assert (
...     "addresses" not in u1.__dict__
... )  # but it won't go into __dict__ until it's mutated
>>> from sqlalchemy import inspect
>>> inspect(u1).attrs.addresses.history
History(added=None, unchanged=None, deleted=None)
When the list is changed, then it becomes part of the tracked changes to be persisted to the database:
>>> l1.append(Address())
>>> assert "addresses" in u1.__dict__
>>> inspect(u1).attrs.addresses.history
History(added=[<__main__.Address object at 0x7f49b725eda0>], unchanged=[], deleted=[])
This change is expected to have nearly no impact on existing applications in any way, except that it has been observed that some applications may be relying upon the implicit assignment of this collection, such as to assert that the object contains certain values based on its __dict__:
>>> u1 = User()
>>> u1.addresses
[]
# this will now fail, would pass before
>>> assert {k: v for k, v in u1.__dict__.items() if not k.startswith("_")} == {
...     "addresses": []
... }
or to ensure that the collection won’t require a lazy load to proceed, the (admittedly awkward) code below will now also fail:
>>> u1 = User()
>>> u1.addresses
[]
>>> s.add(u1)
>>> s.flush()
>>> s.close()
>>> u1.addresses  # <-- will fail, .addresses is not loaded and object is detached
Applications that rely upon the implicit mutating behavior of collections will need to be changed so that they assign the desired collection explicitly:
>>> u1.addresses = []
The “New instance conflicts with existing identity” error is now a warning¶
SQLAlchemy has always had logic to detect when an object in the Session to be inserted has the same primary key as an object that is already present:
classProduct(Base):__tablename__="product"id=Column(Integer,primary_key=True)session=Session(engine)# add Product with primary key 1session.add(Product(id=1))session.flush()# add another Product with same primary keysession.add(Product(id=1))s.commit()# <-- will raise FlushError
The change is that the FlushError is altered to be only a warning:
sqlalchemy/orm/persistence.py:408: SAWarning: New instance <Product at 0x7f1ff65e0ba8> with identity key (<class '__main__.Product'>, (1,), None) conflicts with persistent instance <Product at 0x7f1ff60a4550>
Subsequent to that, the condition will attempt to insert the row into the database which will emit IntegrityError, which is the same error that would be raised if the primary key identity was not already present in the Session:
sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: product.id
The rationale is to allow code that is using IntegrityError to catch duplicates to function regardless of the existing state of the Session, as is often done using savepoints:
# add another Product with same primary key
try:
    with session.begin_nested():
        session.add(Product(id=1))
except exc.IntegrityError:
    print("row already exists")
The above logic was not fully feasible earlier, as in the case that the Product object with the existing identity were already in the Session, the code would also have to catch FlushError, which additionally is not filtered for the specific condition of integrity issues. With the change, the above block behaves consistently with the exception of the warning also being emitted.
Since the logic in question deals with the primary key, all databases emit an integrity error in the case of primary key conflicts on INSERT. The case where an error would not be raised, that would have earlier, is the extremely unusual scenario of a mapping that defines a primary key on the mapped selectable that is more restrictive than what is actually configured in the database schema, such as when mapping to joins of tables or when defining additional columns as part of a composite primary key that is not actually constrained in the database schema. However, these situations also work more consistently in that the INSERT would theoretically proceed whether or not the existing identity were still in the database. The warning can also be configured to raise an exception using the Python warnings filter.
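A brief sketch of escalating the warning, using the standard library warnings filter:

import warnings

from sqlalchemy import exc

# turn SAWarning (including the identity-conflict warning above)
# into an exception rather than a printed warning
warnings.simplefilter("error", exc.SAWarning)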
Persistence-related cascade operations disallowed with viewonly=True¶
When a relationship() is set as viewonly=True using the relationship.viewonly flag, it indicates this relationship should only be used to load data from the database, and should not be mutated or involved in a persistence operation. In order to ensure this contract works successfully, the relationship can no longer specify relationship.cascade settings that make no sense in terms of “viewonly”.
The primary targets here are the “delete, delete-orphan” cascades, which through 1.3 continued to impact persistence even if viewonly were True, which is a bug; even if viewonly were True, an object would still cascade these two operations onto the related object if the parent were deleted or the object were detached. Rather than modify the cascade operations to check for viewonly, the configuration of both of these together is simply disallowed:
class User(Base):
    # ...

    # this is now an error
    addresses = relationship("Address", viewonly=True, cascade="all, delete-orphan")
The above will raise:
sqlalchemy.exc.ArgumentError: Cascade settings
"delete, delete-orphan, merge, save-update" apply to persistence
operations and should not be combined with a viewonly=True relationship.
Applications that have this issue should be emitting a warning as of SQLAlchemy 1.3.12, and for the above error the solution is to remove the cascade settings for a viewonly relationship.
Stricter behavior when querying inheritance mappings using custom queries¶
This change applies to the scenario where a joined- or single- table inheritance subclass entity is being queried, given a completed SELECT subquery to select from. If the given subquery returns rows that do not correspond to the requested polymorphic identity or identities, an error is raised. Previously, this condition would pass silently under joined table inheritance, returning an invalid subclass, and under single table inheritance, the Query would be adding additional criteria against the subquery to limit the results, which could inappropriately interfere with the intent of the query.
Given the example mapping of Employee, Engineer(Employee), Manager(Employee), in the 1.3 series if we were to emit the following query against a joined inheritance mapping:
s = Session(e)
s.add_all([Engineer(), Manager()])
s.commit()

print(s.query(Manager).select_entity_from(s.query(Employee).subquery()).all())
The subquery selects both the Engineer and the Manager rows, and even though the outer query is against Manager, we get a non-Manager object back:
SELECT anon_1.type AS anon_1_type, anon_1.id AS anon_1_id
FROM (SELECT employee.type AS type, employee.id AS id
FROM employee) AS anon_1
2020-01-29 18:04:13,524 INFO sqlalchemy.engine.base.Engine ()
[<__main__.Engineer object at 0x7f7f5b9a9810>, <__main__.Manager object at 0x7f7f5b9a9750>]
The new behavior is that this condition raises an error:
sqlalchemy.exc.InvalidRequestError: Row with identity key
(<class '__main__.Employee'>, (1,), None) can't be loaded into an object;
the polymorphic discriminator column '%(140205120401296 anon)s.type'
refers to mapped class Engineer->engineer, which is not a sub-mapper of
the requested mapped class Manager->manager
The above error only raises if the primary key columns of that entity are non-NULL. If there’s no primary key for a given entity in a row, no attempt to construct an entity is made.
In the case of single inheritance mapping, the change in behavior is slightly more involved; if Engineer and Manager above are mapped with single table inheritance, in 1.3 the following query would be emitted and only a Manager object is returned:
SELECT anon_1.type AS anon_1_type, anon_1.id AS anon_1_id
FROM (SELECT employee.type AS type, employee.id AS id
FROM employee) AS anon_1
WHERE anon_1.type IN (?)
2020-01-29 18:08:32,975 INFO sqlalchemy.engine.base.Engine ('manager',)
[<__main__.Manager object at 0x7ff1b0200d50>]
The Query added the “single table inheritance” criteria to the subquery, editorializing on the intent that was originally set up by it. This behavior was added in version 1.0 in #3891, and creates a behavioral inconsistency between “joined” and “single” table inheritance, and additionally modifies the intent of the given query, which may intend to return additional rows where the columns that correspond to the inheriting entity are NULL, which is a valid use case. The behavior is now equivalent to that of joined table inheritance, where it is assumed that the subquery returns the correct rows and an error is raised if an unexpected polymorphic identity is encountered:
SELECT anon_1.type AS anon_1_type, anon_1.id AS anon_1_id
FROM (SELECT employee.type AS type, employee.id AS id
FROM employee) AS anon_1
2020-01-29 18:13:10,554 INFO sqlalchemy.engine.base.Engine ()
Traceback (most recent call last):
# ...
sqlalchemy.exc.InvalidRequestError: Row with identity key
(<class '__main__.Employee'>, (1,), None) can't be loaded into an object;
the polymorphic discriminator column '%(140700085268432 anon)s.type'
refers to mapped class Engineer->employee, which is not a sub-mapper of
the requested mapped class Manager->employee
The correct adjustment to the situation as presented above, which also works on 1.3, is to adjust the given subquery so that it correctly filters the rows based on the discriminator column:
print(
    s.query(Manager)
    .select_entity_from(
        s.query(Employee)
        .filter(Employee.discriminator == "manager")
        .subquery()
    )
    .all()
)

SELECT anon_1.type AS anon_1_type, anon_1.id AS anon_1_id
FROM (SELECT employee.type AS type, employee.id AS id
FROM employee
WHERE employee.type = ?) AS anon_1
2020-01-29 18:14:49,770 INFO sqlalchemy.engine.base.Engine ('manager',)
[<__main__.Manager object at 0x7f70e13fca90>]
Dialect Changes¶
pg8000 minimum version is 1.16.6, supports Python 3 only¶
Support for the pg8000 dialect has been dramatically improved, with help from the project’s maintainer.
Due to API changes, the pg8000 dialect now requires version 1.16.6 or greater. The pg8000 series has dropped Python 2 support as of the 1.13 series. Python 2 users who require pg8000 should ensure their requirements are pinned at SQLAlchemy<1.4.
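A sketch of such a pin in a requirements file; the pg8000 upper bound here is an inference from its 1.13 series dropping Python 2, not something stated by the dialect itself:

# requirements.txt for a Python 2 project that needs pg8000 (sketch)
SQLAlchemy<1.4
pg8000<1.13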
psycopg2 version 2.7 or higher is required for the PostgreSQL psycopg2 dialect¶
The psycopg2 dialect relies upon many features of psycopg2 released in the past few years. To simplify the dialect, version 2.7, released in March 2017, is now the minimum version required.
psycopg2 dialect no longer has limitations regarding bound parameter names¶
SQLAlchemy 1.3 was not able to accommodate bound parameter names that included percent signs or parentheses under the psycopg2 dialect. This in turn meant that column names which included these characters were also problematic, as INSERT and other DML statements would generate parameter names that matched those of the columns, which would then cause failures. The workaround was to make use of the Column.key parameter so that an alternate name would be used to generate the parameter, or otherwise the parameter style of the dialect had to be changed at the create_engine() level. As of SQLAlchemy 1.4.0beta3 all naming limitations have been removed and parameters are fully escaped in all scenarios, so these workarounds are no longer necessary.
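A minimal sketch of the now-unnecessary pre-1.4 workaround, assuming an illustrative table with a column whose name contains a percent sign:

from sqlalchemy import Column, MetaData, Numeric, Table

metadata = MetaData()
stats = Table(
    "stats",
    metadata,
    # Column.key supplies an alternate name, "growth", used to generate
    # the bound parameter instead of the problematic "growth %" name;
    # under 1.4 this indirection is no longer needed
    Column("growth %", Numeric, key="growth"),
)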
psycopg2 dialect features “execute_values” with RETURNING for INSERT statements by default¶
The first half of a significant performance enhancement for PostgreSQL when using both Core and ORM, the psycopg2 dialect now uses psycopg2.extras.execute_values() by default for compiled INSERT statements and also implements RETURNING support in this mode. The other half of this change is described at ORM Batch inserts with psycopg2 now batch statements with RETURNING in most cases, which allows the ORM to take advantage of RETURNING with executemany (i.e. batching of INSERT statements) so that ORM bulk inserts with psycopg2 are up to 400% faster, depending on specifics.
This extension method allows many rows to be INSERTed within a single statement, using an extended VALUES clause for the statement. While SQLAlchemy’s insert() construct already supports this syntax via the Insert.values() method, the extension method allows the construction of the VALUES clause to occur dynamically when the statement is executed as an “executemany” execution, which is what occurs when one passes a list of parameter dictionaries to Connection.execute(). It also occurs beyond the cache boundary so that the INSERT statement may be cached before the VALUES are rendered.
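A minimal sketch of such an “executemany” execution; the table definition and connection URL here are illustrative assumptions:

from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine, insert

metadata = MetaData()
# hypothetical table used only for illustration
user_table = Table(
    "user_account",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String(50)),
)

engine = create_engine("postgresql+psycopg2://scott:tiger@localhost/test")
metadata.create_all(engine)

with engine.begin() as conn:
    # passing a list of parameter dictionaries triggers "executemany"
    # execution; the psycopg2 dialect renders it via execute_values(),
    # producing a single INSERT with an extended VALUES clause
    conn.execute(
        insert(user_table),
        [{"name": "spongebob"}, {"name": "sandy"}, {"name": "patrick"}],
    )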
A quick test of the execute_values() approach using the bulk_inserts.py script in the Performance example suite reveals an approximate fivefold performance increase:
$ python -m examples.performance bulk_inserts --test test_core_insert --num 100000 --dburl postgresql://scott:tiger@localhost/test

# 1.3
test_core_insert : A single Core INSERT construct inserting mappings in bulk. (100000 iterations); total time 5.229326 sec

# 1.4
test_core_insert : A single Core INSERT construct inserting mappings in bulk. (100000 iterations); total time 0.944007 sec
Support for the “batch” extension was added in version 1.2 in Support for Batch Mode / Fast Execution Helpers, and enhanced to include support for the execute_values extension in 1.3 in #4623. In 1.4 the execute_values extension is now being turned on by default for INSERT statements; the “batch” extension for UPDATE and DELETE remains off by default.
In addition, the execute_values extension function supports returning the rows that are generated by RETURNING as an aggregated list. The psycopg2 dialect will now retrieve this list if the given insert() construct requests returning via the Insert.returning() method or similar methods intended to return generated defaults; the rows are then installed in the result so that they are retrieved as though they came from the cursor directly. This allows tools like the ORM to use batched inserts in all cases, which is expected to provide a dramatic performance improvement.
The executemany_mode feature of the psycopg2 dialect has been revised with the following changes (a configuration sketch follows the list):
- A new mode "values_only" is added. This mode uses the very performant psycopg2.extras.execute_values() extension method for compiled INSERT statements run with executemany(), but does not use execute_batch() for UPDATE and DELETE statements. This new mode is now the default setting for the psycopg2 dialect.
- The existing "values" mode is now named "values_plus_batch". This mode will use execute_values for INSERT statements and execute_batch for UPDATE and DELETE statements. The mode is not enabled by default because it disables the proper functioning of cursor.rowcount with UPDATE and DELETE statements executed with executemany().
- RETURNING support is enabled for "values_only" and "values" for INSERT statements. The psycopg2 dialect will receive the rows back from psycopg2 using the fetch=True flag and install them into the result set as though they came directly from the cursor (which they ultimately did; however, psycopg2’s extension function has aggregated multiple batches into one list).
- The default “page_size” setting for execute_values has been increased from 100 to 1000. The default remains at 100 for the execute_batch function. These parameters may both be modified as was the case before.
- The use_batch_mode flag that was part of the 1.2 version of the feature is removed; the behavior remains controllable via the executemany_mode flag added in 1.3.
- The Core engine and dialect have been enhanced to support executemany plus returning mode, currently only available with psycopg2, by providing new CursorResult.inserted_primary_key_rows and CursorResult.returned_default_rows accessors.
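A minimal configuration sketch, assuming the 1.4-era keyword names for the psycopg2 dialect’s executemany options (the URL and values shown are placeholders):

from sqlalchemy import create_engine

engine = create_engine(
    "postgresql+psycopg2://scott:tiger@localhost/test",
    # opt in to execute_batch() for UPDATE/DELETE in addition to
    # execute_values() for INSERT (the default mode is "values_only")
    executemany_mode="values_plus_batch",
    executemany_values_page_size=1000,  # page size for execute_values
    executemany_batch_page_size=100,    # page size for execute_batch
)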
Removed “join rewriting” logic from SQLite dialect; updated imports¶
Dropped support for right-nested join rewriting to support old SQLite versions prior to 3.7.16, released in 2013. It is not expected that any modern Python versions rely upon this limitation.
The behavior was first introduced in 0.9 and was part of the larger change of allowing for right nested joins as described at Many JOIN and LEFT OUTER JOIN expressions will no longer be wrapped in (SELECT * FROM ..) AS ANON_1. However the SQLite workaround produced many regressions in the 2013-2014 period due to its complexity. In 2016, the dialect was modified so that the join rewriting logic would only occur for SQLite versions prior to 3.7.16, after bisection was used to identify where SQLite fixed its support for this construct, and no further issues were reported against the behavior (even though some bugs were found internally). It is now anticipated that there are little to no Python builds for Python 2.7 or 3.5 and above (the supported Python versions) which would include a SQLite version prior to 3.7.17, and the behavior is in any case only necessary in more complex ORM joining scenarios. A warning is now emitted if the installed SQLite version is older than 3.7.16.
In related changes, the module imports for SQLite no longer attempt to import the “pysqlite2” driver on Python 3, as this driver does not exist on Python 3; a very old warning for old pysqlite2 versions is also dropped.
Added Sequence support for MariaDB 10.3¶
The MariaDB database as of 10.3 supports sequences. SQLAlchemy’s MySQL dialect now implements support for the Sequence object against this database, meaning “CREATE SEQUENCE” DDL will be emitted for a Sequence that is present in a Table or MetaData collection in the same way as it works for backends such as PostgreSQL and Oracle, when the dialect’s server version check has confirmed the database is MariaDB 10.3 or greater. Additionally, the Sequence will act as a column default and primary key generation object when used in these ways.
Since this change will impact the assumptions both for DDL as well as the behavior of INSERT statements for an application that is currently deployed against MariaDB 10.3 which also happens to make explicit use of the Sequence construct within its table definitions, it is important to note that Sequence supports a flag Sequence.optional which is used to limit the scenarios in which the Sequence takes effect. When “optional” is used on a Sequence that is present in the integer primary key column of a table:
Table("some_table",metadata,Column("id",Integer,Sequence("some_seq",optional=True),primary_key=True),)
The above Sequence is only used for DDL and INSERT statements if the target database does not support any other means of generating integer primary key values for the column. That is, the Oracle database above would use the sequence, however the PostgreSQL and MariaDB 10.3 databases would not. This may be important for an existing application that is upgrading to SQLAlchemy 1.4 which may not have emitted DDL for this Sequence against its backing database, as an INSERT statement will fail if it seeks to use a sequence that was not created.
Added Sequence support distinct from IDENTITY to SQL Server¶
The Sequence construct is now fully functional with Microsoft SQL Server. When applied to a Column, the DDL for the table will no longer include IDENTITY keywords and instead will rely upon “CREATE SEQUENCE” to ensure a sequence is present which will then be used for INSERT statements on the table.
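A minimal sketch of this 1.4 behavior on SQL Server; the table and sequence names are illustrative:

from sqlalchemy import Column, Integer, MetaData, Sequence, Table

metadata = MetaData()
account = Table(
    "account",
    metadata,
    # emits CREATE SEQUENCE account_id_seq, and the CREATE TABLE no
    # longer includes IDENTITY; INSERTs draw values from the sequence
    Column("id", Integer, Sequence("account_id_seq"), primary_key=True),
)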
The Sequence prior to version 1.3 was used to control parameters for the IDENTITY column in SQL Server; this usage emitted deprecation warnings throughout 1.3 and is now removed in 1.4. For control of parameters for an IDENTITY column, the mssql_identity_start and mssql_identity_increment parameters should be used; see the MSSQL dialect documentation for details.
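A sketch of that replacement, assuming the column-level dialect keywords as they existed around the 1.4 MSSQL dialect (the table name and values are placeholders):

from sqlalchemy import Column, Integer, MetaData, Table

metadata = MetaData()
test = Table(
    "test",
    metadata,
    # the IDENTITY seed and increment are now controlled by dialect
    # keywords rather than by a Sequence applied to the column
    Column(
        "id",
        Integer,
        primary_key=True,
        mssql_identity_start=100,
        mssql_identity_increment=10,
    ),
)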