lxml - XML and HTML with Python

» lxml takes all the pain out of XML. «
Stephan Richter

lxml is the most feature-rich and easy-to-use library for processing XML and HTML in the Python language.


The lxml XML toolkit is a Pythonic binding for the C libraries libxml2 and libxslt. It is unique in that it combines the speed and XML feature completeness of these libraries with the simplicity of a native Python API, mostly compatible with, but superior to, the well-known ElementTree API. The latest release works with all CPython versions from 3.6 to 3.12. See the introduction for more information about the background and goals of the lxml project. Some common questions are answered in the FAQ.

The HTML documentation from this web site is part of the normal source download.

lxml.etree follows the ElementTree API as much as possible, building it on top of the native libxml2 tree. If you are new to ElementTree, start with the lxml.etree tutorial for XML processing. See also the ElementTree compatibility overview and the ElementTree performance page comparing lxml to the original ElementTree and cElementTree implementations.
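As a rough illustration of that API compatibility, the sketch below builds, serialises, and re-parses a small tree. The element names (catalog, book, title) are made up for the example, and the fallback import assumes the standard library's ElementTree is an acceptable substitute for this basic usage.

```python
# Prefer lxml.etree; fall back to the standard library's ElementTree,
# which provides the same core API for this basic example.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# Build a small tree with the ElementTree-style API.
# The element names are hypothetical, chosen only for illustration.
root = etree.Element("catalog")
book = etree.SubElement(root, "book", id="b1")
title = etree.SubElement(book, "title")
title.text = "XML Basics"

# Serialise to bytes and parse the result back into a new tree.
data = etree.tostring(root)
parsed = etree.fromstring(data)

# Navigate the parsed tree with iter(), find() and get().
titles = [t.text for t in parsed.iter("title")]
book_id = parsed.find("book").get("id")
```

Code written against this common subset should run on either implementation; the lxml-specific extensions are only available through the lxml import.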

Right after the lxml.etree tutorial for XML processing and the ElementTree documentation, the next place to look is the lxml.etree specific API documentation. It describes how lxml extends the ElementTree API to expose libxml2 and libxslt specific XML functionality, such as XPath, Relax NG, XML Schema, XSLT, and c14n (including c14n 2.0). Python code can be called from XPath expressions and XSLT stylesheets through the use of XPath extension functions. lxml also offers a SAX-compliant API that works with the SAX support in the standard library.
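A minimal sketch of three of these features (XPath, an XPath extension function, and XSLT) might look as follows. The sample document, the namespace URI urn:example:functions, and the function name shout are all assumptions invented for the example, not part of lxml.

```python
from lxml import etree

# Hypothetical sample document for illustration.
doc = etree.fromstring(
    "<catalog>"
    "<book id='b1'><title>XML Basics</title><price>10</price></book>"
    "<book id='b2'><title>Advanced XSLT</title><price>25</price></book>"
    "</catalog>"
)

# XPath: select the titles of all books priced below 20.
cheap_titles = doc.xpath("book[price < 20]/title/text()")

# XPath extension function: register a Python callable under a
# made-up namespace URI. The first argument is the evaluation context.
EXAMPLE_NS = "urn:example:functions"  # hypothetical namespace URI
functions = etree.FunctionNamespace(EXAMPLE_NS)

def shout(context, text):
    """Return the given string in upper case."""
    return text.upper()

functions["shout"] = shout

loud_title = doc.xpath(
    "ex:shout(string(book[1]/title))",
    namespaces={"ex": EXAMPLE_NS},
)

# XSLT: compile a stylesheet and apply it to the document.
stylesheet = etree.fromstring(
    '<xsl:stylesheet version="1.0"'
    ' xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
    '<xsl:output method="text"/>'
    '<xsl:template match="/">'
    '<xsl:for-each select="catalog/book">'
    '<xsl:value-of select="title"/>;'
    '</xsl:for-each>'
    '</xsl:template>'
    '</xsl:stylesheet>'
)
transform = etree.XSLT(stylesheet)
transformed = str(transform(doc))
```

Note that etree.FunctionNamespace registers the function globally for the process; the lxml.etree documentation also describes passing extensions per evaluator when a narrower scope is preferred.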

There is a separate module lxml.objectify that implements a data-binding API on top of lxml.etree. See the objectify and etree FAQ entry for a comparison.
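To give a feel for what data binding means here, the following sketch parses a small, hypothetical order document with lxml.objectify and reads child elements as Python attributes. The element names are assumptions chosen for the example.

```python
from lxml import objectify

# Hypothetical document; element names are illustrative only.
root = objectify.fromstring(
    "<order>"
    "<number>42</number>"
    "<customer>Ada</customer>"
    "<product>pen</product>"
    "<product>ink</product>"
    "</order>"
)

# Child elements are reachable as attributes, and leaf values are
# mapped to Python types based on their text content.
order_number = root.number.pyval      # an int
next_number = root.number + 1         # numeric elements support arithmetic
customer_name = root.customer.text    # plain string content

# Repeated children behave like a sequence of siblings.
product_count = len(root.product)
product_names = [p.text for p in root.product]
```

With plain lxml.etree, the same values would be read through find() or XPath calls and converted from text by hand.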

In addition to the ElementTree API, lxml also features a sophisticated API for custom XML element classes. This is a simple way to write arbitrary XML-driven APIs on top of lxml. lxml.etree also has a C-level API that can be used to efficiently extend lxml.etree in external C modules, including fast custom element class support.
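One possible way to use custom element classes is sketched below: a parser is configured with a lookup object that returns a Python subclass for selected elements. The class names BookElement and BookLookup and the document structure are hypothetical, and this is only one of several lookup schemes that lxml provides.

```python
from lxml import etree

class BookElement(etree.ElementBase):
    """Hypothetical element class adding a convenience property."""

    @property
    def title(self):
        # Return the text of the first <title> child, or None.
        return self.findtext("title")

class BookLookup(etree.CustomElementClassLookup):
    """Hypothetical lookup that maps <book> elements to BookElement."""

    def lookup(self, node_type, document, namespace, name):
        if node_type == "element" and name == "book":
            return BookElement
        return None  # defer to the default element class

# Attach the lookup to a parser; trees built by it use the custom class.
parser = etree.XMLParser()
parser.set_element_class_lookup(BookLookup())

root = etree.fromstring(
    "<catalog><book><title>XML Basics</title></book></catalog>",
    parser,
)
book = root[0]
book_title = book.title
```

Elements that the lookup does not claim, such as the catalog root here, keep the default element class.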


The best way to download lxml is to visit lxml at the Python Package Index (PyPI). It has the source that compiles on various platforms. The source distribution is signed with this key.

The latest version is lxml 5.3.0, released 2024-08-10 (changes for 5.3.0). Older versions are listed below.

Please take a look at the installation instructions!

This complete website (including the generated API documentation) is part of the source distribution, so if you want to download the documentation for offline use, take the source archive and copy the doc/html directory out of the source tree.

The latest installable developer sources are available from GitHub. It's also possible to check out the latest development version of lxml from GitHub directly, using a command like this:

git clone https://github.com/lxml/lxml.git lxml

You can browse the source repository and its history through the web. Please read how to build lxml from source first. The latest CHANGES of the developer version are also accessible. You can check there if a bug you found has been fixed or a feature you want has been implemented in the latest trunk version.

Mailing list

Questions? Suggestions? Code to contribute? We have a mailing list.

You can also search the archive for past questions and discussions.

Bug tracker

lxml uses the Launchpad bug tracker. If you are sure you found a bug in lxml, please file a bug report there. If you are not sure whether some unexpected behaviour of lxml is a bug or not, please check the documentation and ask on the mailing list first. Do not forget to search the archive!


The lxml library is shipped under a BSD license. libxml2 and libxslt themselves are shipped under the MIT license. There should therefore be no obstacle to using lxml in your codebase.

Old Versions

Generated on: 2024-08-10.

