High level Python client for Elasticsearch
elastic/elasticsearch-dsl-py
As of release 8.18.0, the Elasticsearch DSL package is part of the official Elasticsearch Python client, so a separate install is no longer needed. To migrate, follow these steps:

- Uninstall the `elasticsearch-dsl` package
- Make sure you have version 8.18.0 or newer of the `elasticsearch` package installed
- Replace `elasticsearch_dsl` with `elasticsearch.dsl` in your imports
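
The last migration step can be partly automated. The following is a minimal sketch, not an official migration tool: `migrate_imports` is a hypothetical helper that rewrites the package name on lines that start with `import` or `from`, and leaves everything else untouched.

```python
import re

# Match the old package name only as a whole identifier, so names such as
# `my_elasticsearch_dsl_helpers` are left alone.
_OLD_PACKAGE = re.compile(r"(?<![\w.])elasticsearch_dsl(?!\w)")


def migrate_imports(source: str) -> str:
    """Rewrite `elasticsearch_dsl` to `elasticsearch.dsl` on import lines only."""
    migrated = []
    for line in source.splitlines(keepends=True):
        # Only touch lines that begin an import statement.
        if line.lstrip().startswith(("import ", "from ")):
            line = _OLD_PACKAGE.sub("elasticsearch.dsl", line)
        migrated.append(line)
    return "".join(migrated)


old_code = (
    "from elasticsearch_dsl import Search\n"
    "from elasticsearch_dsl.query import Q\n"
    "s = Search()\n"
)
print(migrate_imports(old_code))
```

A plain `import elasticsearch_dsl` statement would become `import elasticsearch.dsl`, which binds a different name, so statements of that form are best reviewed by hand.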
To prevent applications from breaking unexpectedly due to this change, the 8.18.0 release of this package automatically redirects all imports to the corresponding modules of the Python client package.
The following instructions only apply to versions 8.17 and older of this package.
Elasticsearch DSL is a high-level library whose aim is to help with writing and running queries against Elasticsearch. It is built on top of the official low-level client (elasticsearch-py).
It provides a more convenient and idiomatic way to write and manipulate queries. It stays close to the Elasticsearch JSON DSL, mirroring its terminology and structure. It exposes the whole range of the DSL from Python, either directly using defined classes or through queryset-like expressions.
It also provides an optional wrapper for working with documents as Python objects: defining mappings, retrieving and saving documents, and wrapping the document data in user-defined classes.
To use the other Elasticsearch APIs (e.g. cluster health), just use the underlying client.

    pip install elasticsearch-dsl

Please see the examples directory for some complex examples using `elasticsearch-dsl`.
The library is compatible with all Elasticsearch versions since `2.x`, but you have to use a matching major version:

- For Elasticsearch 8.0 and later, use the major version 8 (`8.x.y`) of the library.
- For Elasticsearch 7.0 and later, use the major version 7 (`7.x.y`) of the library.
- For Elasticsearch 6.0 and later, use the major version 6 (`6.x.y`) of the library.
- For Elasticsearch 5.0 and later, use the major version 5 (`5.x.y`) of the library.
- For Elasticsearch 2.0 and later, use the major version 2 (`2.x.y`) of the library.
The recommended way to set your requirements in your setup.py or requirements.txt is:

    # Elasticsearch 8.x
    elasticsearch-dsl>=8.0.0,<9.0.0

    # Elasticsearch 7.x
    elasticsearch-dsl>=7.0.0,<8.0.0

    # Elasticsearch 6.x
    elasticsearch-dsl>=6.0.0,<7.0.0

    # Elasticsearch 5.x
    elasticsearch-dsl>=5.0.0,<6.0.0

    # Elasticsearch 2.x
    elasticsearch-dsl>=2.0.0,<3.0.0
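
The version-matching rule above is mechanical, so it can be expressed in a few lines. This is only an illustrative sketch: `dsl_requirement` is a hypothetical helper, and the set of supported majors is copied from the list in this README.

```python
# Major versions listed in the compatibility notes of this README.
SUPPORTED_MAJORS = {2, 5, 6, 7, 8}


def dsl_requirement(es_version: str) -> str:
    """Return a requirement line pinning elasticsearch-dsl to the matching major."""
    major = int(es_version.split(".")[0])
    if major not in SUPPORTED_MAJORS:
        raise ValueError(
            f"no matching elasticsearch-dsl major for Elasticsearch {es_version}"
        )
    # Allow any release within the same major, exclude the next major.
    return f"elasticsearch-dsl>={major}.0.0,<{major + 1}.0.0"


print(dsl_requirement("7.17.3"))
```

The printed line can be pasted into requirements.txt as-is.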
Development happens on `main`; older branches only get bugfix releases.
Let's have a typical search request written directly as a `dict`:

    from elasticsearch import Elasticsearch

    client = Elasticsearch("https://localhost:9200")

    response = client.search(
        index="my-index",
        body={
            "query": {
                "bool": {
                    "must": [{"match": {"title": "python"}}],
                    "must_not": [{"match": {"description": "beta"}}],
                    "filter": [{"term": {"category": "search"}}]
                }
            },
            "aggs": {
                "per_tag": {
                    "terms": {"field": "tags"},
                    "aggs": {
                        "max_lines": {"max": {"field": "lines"}}
                    }
                }
            }
        }
    )

    for hit in response['hits']['hits']:
        print(hit['_score'], hit['_source']['title'])

    for tag in response['aggregations']['per_tag']['buckets']:
        print(tag['key'], tag['max_lines']['value'])

The problem with this approach is that it is very verbose, prone to syntax mistakes like incorrect nesting, hard to modify (e.g. adding another filter) and definitely not fun to write.
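
To make the "hard to modify" point concrete, here is a rough sketch of what adding one more filter to a raw request body involves when working with plain dictionaries. `add_term_filter` is a hypothetical helper written for this illustration, and it assumes the top-level query is already a `bool` query.

```python
import copy


def add_term_filter(body: dict, field: str, value) -> dict:
    """Return a copy of a raw request body with one more term filter."""
    new_body = copy.deepcopy(body)  # do not mutate the caller's dict
    query = new_body.setdefault("query", {"bool": {}})
    if "bool" not in query:
        # Any other query type would first have to be wrapped in a bool query.
        raise ValueError("expected a top-level bool query")
    query["bool"].setdefault("filter", []).append({"term": {field: value}})
    return new_body


body = {
    "query": {
        "bool": {
            "must": [{"match": {"title": "python"}}],
            "filter": [{"term": {"category": "search"}}],
        }
    }
}

updated = add_term_filter(body, "tags", "dsl")
print(updated["query"]["bool"]["filter"])
```

Even this small helper has to know the exact nesting of the request and handle a missing `filter` list, which is the kind of bookkeeping that grows with every new case.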
Let's rewrite the example using the Python DSL:

    from elasticsearch import Elasticsearch
    from elasticsearch_dsl import Search

    client = Elasticsearch("https://localhost:9200")

    s = Search(using=client, index="my-index") \
        .filter("term", category="search") \
        .query("match", title="python") \
        .exclude("match", description="beta")

    s.aggs.bucket('per_tag', 'terms', field='tags') \
        .metric('max_lines', 'max', field='lines')

    response = s.execute()

    for hit in response:
        print(hit.meta.score, hit.title)

    for tag in response.aggregations.per_tag.buckets:
        print(tag.key, tag.max_lines.value)

As you see, the library took care of:

- creating appropriate `Query` objects by name (e.g. "match")
- composing queries into a compound `bool` query
- putting the `term` query in a filter context of the `bool` query
- providing convenient access to response data
- no curly or square brackets everywhere
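
The "convenient access to response data" point can be illustrated with a toy wrapper that exposes dictionary keys as attributes. This is a simplified sketch of the idea only, not the library's implementation: `AttrView` is a hypothetical class, and the response data below is invented for the example.

```python
class AttrView:
    """Toy read-only wrapper exposing dict keys as attributes."""

    def __init__(self, data):
        self._data = data

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return _wrap(self._data[name])
        except KeyError:
            raise AttributeError(name) from None


def _wrap(value):
    """Wrap dicts (and dicts inside lists) so nested access keeps working."""
    if isinstance(value, dict):
        return AttrView(value)
    if isinstance(value, list):
        return [_wrap(item) for item in value]
    return value


# Invented sample data shaped like the aggregation part of a search response.
raw = {
    "aggregations": {
        "per_tag": {
            "buckets": [
                {"key": "python", "max_lines": {"value": 120.0}},
                {"key": "search", "max_lines": {"value": 45.0}},
            ]
        }
    }
}

response = AttrView(raw)
for tag in response.aggregations.per_tag.buckets:
    print(tag.key, tag.max_lines.value)
```

The loop at the end has the same shape as the one in the DSL example, which is why no bracket-and-quote indexing is needed there.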
Let's have a simple Python class representing an article in a blogging system:

    from datetime import datetime
    from elasticsearch_dsl import Document, Date, Integer, Keyword, Text, connections

    # Define a default Elasticsearch client
    connections.create_connection(hosts="https://localhost:9200")

    class Article(Document):
        title = Text(analyzer='snowball', fields={'raw': Keyword()})
        body = Text(analyzer='snowball')
        tags = Keyword()
        published_from = Date()
        lines = Integer()

        class Index:
            name = 'blog'
            settings = {
                "number_of_shards": 2,
            }

        def save(self, **kwargs):
            self.lines = len(self.body.split())
            return super(Article, self).save(**kwargs)

        def is_published(self):
            return datetime.now() > self.published_from

    # create the mappings in elasticsearch
    Article.init()

    # create and save an article
    article = Article(meta={'id': 42}, title='Hello world!', tags=['test'])
    article.body = ''' looong text '''
    article.published_from = datetime.now()
    article.save()

    article = Article.get(id=42)
    print(article.is_published())

    # Display cluster health
    print(connections.get_connection().cluster.health())

In this example you can see:

- providing a default connection
- defining fields with mapping configuration
- setting index name
- defining custom methods
- overriding the built-in `.save()` method to hook into the persistence life cycle
- retrieving and saving the object into Elasticsearch
- accessing the underlying client for other APIs

You can see more in the persistence chapter of the documentation.
You don't have to port your entire application to get the benefits of the Python DSL. You can start gradually by creating a `Search` object from your existing `dict`, modifying it using the API, and serializing it back to a `dict`:

    body = {...}  # insert complicated query here

    # Convert to Search object
    s = Search.from_dict(body)

    # Add some filters, aggregations, queries, ...
    # (Search methods return a modified copy, so reassign the result)
    s = s.filter("term", tags="python")

    # Convert back to dict to plug back into existing code
    body = s.to_dict()

Activate Virtual Environment (virtualenvs):

    $ virtualenv venv
    $ source venv/bin/activate

To install all of the dependencies necessary for development, run:

    $ pip install -e '.[develop]'

To run all of the tests for `elasticsearch-dsl-py`, run:

    $ python setup.py test

Alternatively, it is possible to use the `run_tests.py` script in `test_elasticsearch_dsl`, which wraps pytest, to run subsets of the test suite. Some examples can be seen below:

    # Run all of the tests in `test_elasticsearch_dsl/test_analysis.py`
    $ ./run_tests.py test_analysis.py

    # Run only the `test_analyzer_serializes_as_name` test.
    $ ./run_tests.py test_analysis.py::test_analyzer_serializes_as_name

`pytest` will skip tests from `test_elasticsearch_dsl/test_integration` unless it can connect to an Elasticsearch instance. By default, the test connection is attempted at `localhost:9200`, based on the defaults specified in the `elasticsearch-py` Connection class. Because running the integration tests will cause destructive changes to the Elasticsearch cluster, only run them when the associated cluster is empty. As such, if the Elasticsearch instance at `localhost:9200` does not meet these requirements, it is possible to specify a different test Elasticsearch server through the `TEST_ES_SERVER` environment variable.

    $ TEST_ES_SERVER=my-test-server:9201 ./run_tests.py
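
For readers writing their own test harness, the lookup described above could be sketched as follows. This is an assumption about how such a lookup might be written, not the code used by `run_tests.py`: `resolve_test_server` is a hypothetical helper, and only the `TEST_ES_SERVER` variable name and the `localhost:9200` default come from this README.

```python
import os

DEFAULT_TEST_SERVER = "localhost:9200"


def resolve_test_server(environ=None) -> str:
    """Return the test server address, preferring TEST_ES_SERVER when set."""
    env = os.environ if environ is None else environ
    # An unset or empty variable falls back to the default address.
    return env.get("TEST_ES_SERVER") or DEFAULT_TEST_SERVER


print(resolve_test_server({"TEST_ES_SERVER": "my-test-server:9201"}))
print(resolve_test_server({}))
```

Passing the environment as a parameter keeps the helper easy to test without modifying the real process environment.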
Documentation is available at https://elasticsearch-dsl.readthedocs.io.
Want to hack on Elasticsearch DSL? Awesome! We have a Contribution Guide.
Copyright 2013 Elasticsearch
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.