Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up

🔍 Elasticsearch Scala Client - Reactive, Non Blocking, Type Safe, HTTP Client

License

NotificationsYou must be signed in to change notification settings

Philippus/elastic4s

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

buildCurrent VersionScala Steward badgeLicense

This is a community project - PRs will be accepted and releases published by the maintainer

Elastic4s is a concise, idiomatic, reactive, type safe Scala client for Elasticsearch. The official Elasticsearch Java client can of course be used in Scala, but due to Java's syntax it is more verbose and it naturally doesn't support classes in the core Scala core library nor Scala idioms such as typeclass support.

Elastic4s's DSL allows you to construct your requests programatically, with syntactic and semantic errors manifested at compile time, and uses standard Scala futures to enable you to easily integrate into an asynchronous workflow. The aim of the DSL is that requests are written in a builder-like way, while staying broadly similar to the Java API or Rest API. Each request is an immutable object, so you can create requests and safely reuse them, or further copy them for derived requests. Because each request is strongly typed your IDE or editor can use the type information to show you what operations are available for any request type.

Elastic4s supports Scala collections so you don't have to do tedious conversions from your Scala domain classes into Java collections. It also allows you to index and read classes directly using typeclasses so you don't have to set fields or json documents manually. These typeclasses are generated using your favourite json library - modules exist for Jackson, Circe, Json4s, PlayJson and Spray Json. The client also uses standard Scala durations to avoid the use of strings or primitives for duration lengths.

Key points

  • Type safe concise DSL
  • Integrates with standard Scala futures or other effects libraries
  • Uses Scala collections library over Java collections
  • ReturnsOption where the java methods would return null
  • Uses ScalaDurations instead of strings/longs for time values
  • Supports typeclasses for indexing, updating, and search backed by Jackson, Circe, Json4s, PlayJson and Spray Json implementations
  • Supports Java and Scala HTTP clients such as Akka-Http
  • Providesreactive-streams implementation
  • Provides a testkit subproject ideal for your tests

Release

Current Elastic4s versions support Scala 2.12 and 2.13. Scala 2.10 support has been dropped starting with 5.0.x and Scala 2.11 has been dropped starting with 7.2.0. For releases that are compatible with earlier versions of Elasticsearch,search maven central.

Note that starting from versions 7.17.25 and 8.12.0 the group id has changed from com.sksamuel.elastic4s to nl.gn0s1s.

Elastic VersionScala 2.12Scala 2.13Scala 3
8.17.x
8.16.x
8.15.x
8.14.x
8.13.x
8.12.x
8.11.x
8.10.x
8.9.x
8.8.x
8.7.x
8.6.x
8.5.x
8.4.x
8.3.x
8.2.x
8.1.x
8.0.x
7.17.x

For releases prior to 7.17.xsearch maven central.

Quick Start

We have created sample projects in both sbt, maven and gradle. Check them out here:https://github.com/philippus/elastic4s/tree/master/samples

To get started you will need to add a dependency:

// major.minor are in sync with the elasticsearch releasesvalelastic4sVersion="x.x.x"libraryDependencies++=Seq(// recommended client for beginners"nl.gn0s1s"%%"elastic4s-client-esjava"% elastic4sVersion,// test kit"nl.gn0s1s"%%"elastic4s-testkit"% elastic4sVersion%"test")

The basic usage is that you create an instance of a client and then invoke theexecute method with the requests youwant to perform. The execute method is asynchronous and will return a standard ScalaFuture[T](or use one of theAlternative executors) where T is the responsetype appropriate for your request type. For example asearch request will return a response of typeSearchResponsewhich contains the results of the search.

To create an instance of the HTTP client, use theElasticClient companion object methods.Requests are created using the elastic4s DSL. For example to create a search request, you would do:

search("index").query("findthistext")

The DSL methods are located in theElasticDsl trait which needs to be imported or extended.

importcom.sksamuel.elastic4s.ElasticDsl._

Creating a Client

The entry point in elastic4s is an instance ofElasticClient.This class is used to execute requests, such asSearchRequest, against an Elasticsearch cluster and returns a response type such asSearchResponse.

ElasticClient takes care of transforming the requests and responses, and handling success and failure, but the actual HTTP functions are delegated to a HTTP library.One such library isJavaClient which uses the http client provided by the offical Java elasticsearch library.

So, to connect to an ElasticSearch cluster, pass an instance ofJavaClient to anElasticClient.JavaClient is configured usingElasticProperties in which you can specify protocol, host, and port in a single string.

valprops=ElasticProperties("http://host1:9200")valclient=ElasticClient(JavaClient(props))

For multiple nodes you can pass a comma-separated list of endpoints in a single string:

valnodes=ElasticProperties("http://host1:9200,host2:9200,host3:9200")valclient=ElasticClient(JavaClient(nodes))

There are several http libraries to choose from, or you can wrap any HTTP library you wish. For further details, and informationon how to specify credentials and other options, seethe full client documentation

Example Application

An example is worth 1000 characters so here is a quick example of how to connect to a node with a client, create anindex and index a one field document. Then we will search for that document using a simple text query.

Note: As of version0.7.x theLocalNode functionality has been removed. It is recommended that you stand upa local ElasticSearch Docker container for development. This is the same strategy used in thetests.

importcom.sksamuel.elastic4s.fields.TextFieldimportcom.sksamuel.elastic4s.http.JavaClientimportcom.sksamuel.elastic4s.requests.common.RefreshPolicyimportcom.sksamuel.elastic4s.requests.searches.SearchResponseobjectArtistIndexextendsApp {// in this example we create a client to a local Docker container at localhost:9200valclient=ElasticClient(JavaClient(ElasticProperties(s"http://${sys.env.getOrElse("ES_HOST","127.0.0.1")}:${sys.env.getOrElse("ES_PORT","9200")}")))// we must import the dslimportcom.sksamuel.elastic4s.ElasticDsl._// Next we create an index in advance ready to receive documents.// await is a helper method to make this operation synchronous instead of async// You would normally avoid doing this in a real program as it will block// the calling thread but is useful when testing  client.execute {    createIndex("artists").mapping(      properties(TextField("name")      )    )  }.await// Next we index a single document which is just the name of an Artist.// The RefreshPolicy.Immediate means that we want this document to flush to the disk immediately.// see the section on Eventual Consistency.  client.execute {    indexInto("artists").fields("name"->"L.S. Lowry").refresh(RefreshPolicy.Immediate)  }.await// now we can search for the document we just indexedvalresp= client.execute {    search("artists").query("lowry")  }.await// resp is a Response[+U] ADT consisting of either a RequestFailure containing the// Elasticsearch error details, or a RequestSuccess[U] that depends on the type of request.// In this case it is a RequestSuccess[SearchResponse]  println("---- Search Results ----")  respmatch {casefailure:RequestFailure=> println("We failed"+ failure.error)caseresults:RequestSuccess[SearchResponse]=> println(results.result.hits.hits.toList)caseresults:RequestSuccess[_]=> println(results.result)  }// Response also supports familiar combinators like map / flatMap / foreach:  resp foreach (search=> println(s"There were${search.totalHits} total hits"))  client.close()}

Alternative Executors

By default, elastic4s uses scalaFutures when returning responses, but any effect type can be supported.

If you wish to use ZIO, Cats-Effect, Monix or Scalaz, then read this page onalternative effects.

Index Refreshing

When you index a document in Elasticsearch, usually it is not immediately available to be searched, as arefresh has to happen to make it visible to the search API.

By default a refresh occurs every second but this can be changed if needed.Note that this only impacts the visibility of newly indexed documents and has nothingto do with data consistency and durability.

This setting can becontrolled when creating an index or when indexed documents.

Create Index

All documents in Elasticsearch are stored in an index. We do not need to tell Elasticsearch in advance what an indexwill look like (eg what fields it will contain) as Elasticsearch will adapt the index dynamically as more documents are added, but we must at least create the index first.

To create an index called "places" that is fully dynamic we can simply use:

client.execute {  createIndex("places")}

We can optionally set the number of shards and/or replicas

client.execute {  createIndex("places").shards(3).replicas(2)}

Sometimes we want to specify the properties of the fields in the index in advance.This allows us to manually set the type of the field (where Elasticsearch might infer something else) or set the analyzer used,or multiple other options

To do this we add mappings:

client.execute {    createIndex("cities").mapping(        properties(            keywordField("id"),            textField("name").boost(4),            textField("content"),            keywordField("country"),            keywordField("continent")        )    )}

Then Elasticsearch is preconfigured with those mappings for those fields.It is still fully dynamic and other fields will be created as needed with default options. Only the fields specified will have their type preset.

Analyzers

Analyzers control how Elasticsearch parses the fields for indexing. For example, you might decide that you wantwhitespace to be important, so that "band of brothers" is indexed as a single "word" rather than the default which isto split on whitespace. There are many advanced options available in analayzers. Elasticsearch also allows us to createcustom analyzers. For more detailssee the documentation on analyzers.

Indexing

To index a document we need to specify the index and type and optionally we can set an id.If we don't include an id then elasticsearch will generate one for us.We must also include at least one field. Fields are specified as standard tuples.

client.execute {  indexInto("cities").id("123").fields("name"->"London","country"->"United Kingdom","continent"->"Europe","status"->"Awesome"  )}

There are many additional options we can set such as routing, version, parent, timestamp and op type.Seeofficial documentation for additional options, all ofwhich exist in the DSL as keywords that reflect their name in the official API.

Indexable Typeclass

Sometimes it is useful to create documents directly from your domain model instead of manually creating maps of fields.To achieve this, elastic4s provides theIndexable typeclass.

If you provide an implicit instance ofIndexable[T] in scope for anyclass T that you wish to index, and then you can invokedoc(t) on theIndexRequest.

For example:

// a simple example of a domain modelcaseclassCharacter(name:String,location:String)// turn instances of characters into jsonimplicitobjectCharacterIndexableextendsIndexable[Character] {overridedefjson(t:Character):String=s""" { "name" : "${t.name}", "location" : "${t.location}" }"""}// now index requests can directly use characters as docsvaljonsnow=Character("jon snow","the wall")client.execute {  indexInto("gameofthrones").doc(jonsnow)}

Some people prefer to write typeclasses manually for the types they need to support. Other people like to just haveit done automagically. For the latter, elastic4s provides extensions for the well known Scala Json libraries thatcan be used to generate Json generically.

To use this, add the import for your chosen library below and bring the implicits into scope. Then you can pass any case classinstance todoc and anIndexable will be derived automatically.

LibraryElastic4s ModuleImport
Jacksonelastic4s-json-jacksonimport ElasticJackson.Implicits._
Json4selastic4s-json-json4simport ElasticJson4s.Implicits._
Circeelastic4s-json-circeimport io.circe.generic.auto._
import com.sksamuel.elastic4s.circe._
PlayJsonelastic4s-json-playimport com.sksamuel.elastic4s.playjson._
Spray Jsonelastic4s-json-sprayimport com.sksamuel.elastic4s.sprayjson._
ZIO 1.0 Jsonelastic4s-json-zio-1import com.sksamuel.elastic4s.ziojson._
ZIO 2.0 Jsonelastic4s-json-zioimport com.sksamuel.elastic4s.ziojson._

Searching

To execute asearch in elastic4s, we need to pass an instance ofSearchRequest to our client.

One way to do this is to invokesearch and pass in the index name. From there, you can callquery and pass in the type of query you want to perform.

For example, to perform a simple text search, where the query is parsed from a single string we can do:

client.execute {  search("cities").query("London")}

For full details on creating queries and other search capabilities such source filtering and aggregations, please readthis.

Multisearch

Multiple search requests can be executed in a single call using themultisearch request type. This is the search equivilent of the bulk request.

HitReader Typeclass

By default Elasticsearch search responses contain an array ofSearchHit instances which contain things like the id,index, type, version, etc as well as the document source as a string or map. Elastic4s provides a means to convert theseback to meaningful domain types quite easily using theHitReader[T] typeclass.

Provide an implementation of this typeclass, as an in scope implicit, for whatever type you wish to marshall search responses into, and then you can callto[T] orsafeTo[T] on the response.The difference betweento andsafeTo is thatto will drop any errors and just return successful conversions, whereas safeTo returnsa sequence ofEither[Throwable, T].

A full example:

caseclassCharacter(name:String,location:String)implicitobjectCharacterHitReaderextendsHitReader[Character] {overridedefread(hit:Hit):Either[Throwable,Character]= {valsource= hit.sourceAsMapRight(Character(source("name").toString, source("location").toString))  }}valresp= client.execute {  search("gameofthrones").query("kings landing")}.await// don't block in real code// .to[Character] will look for an implicit HitReader[Character] in scope// and then convert all the hits into Characters for us.valcharacters:Seq[Character]= resp.result.to[Character]

This is basically the inverse of theIndexable typeclass. And just like Indexable, the json modules provide implementationsout of the box for any types. The imports are the same as for the Indexable typeclasses.

As a bonus feature of the Jackson implementation, if your domain object has fields called_timestamp,_id,_type,_index, or_version then those special fields will be automatically populated as well.

Highlighting

Elasticsearch can annotate results to show which part of the results matched the queries by using highlighting.Just think when you're in google and you see the snippets underneath your results - that's what highlighting does.

We can use this very easily, just add a highlighting definition to your search request, where you set the field or fields to be highlighted. Viz:

search("music").query("kate bush").highlighting (  highlight("body").fragmentSize(20))

All very straightforward. There are many options you can use to tweak the results. In the example above I havesimply set the snippets to be taken from the field called "body" and to have max length 20. You can set the number of fragments to return, seperate queries to generate them and other things. See the elasticsearch page onhighlighting for more info.

Get / Multiget

Aget request allows us to retrieve a document directly by id.

client.execute {  get("bands","coldplay")}

We can fetch multiple documents at once using themultiget request.

Deleting

In elasticsearch we can delete based on an id, or based on a query (which can match multiple documents).

See more aboutdelete.

Updates

We can update existing documents without having to do a full index, by updating a partial set of fields.We canupdate-by-id orupdate-by-query.

For more details see theupdate page.

More like this

If you want to return documents that are "similar" to a current document we can do that very easily with the more like this query.

client.execute {  search("drinks").query {    moreLikeThisQuery("name").likeTexts("coors","beer","molson").minTermFreq(1).minDocFreq(1)  }}

For all the options seehere.

Count

Acount request executes a query and returns a count of the number of matching documents for that query.

Bulk Operations

Elasticsearch is fast. Roundtrips are not. Sometimes we want to wrestle every last inch of performance and a useful wayto do this is to batch up requests. We can do this in elasticsearch via the bulk API. A bulk request wraps index,delete and update requests in a single request.

client.execute {  bulk(    indexInto("bands").fields("name"->"coldplay"),// one index request    deleteById("bands","123"),// a delete by id request    indexInto("bands").fields(// second index request"name"->"elton john","best_album"->"tumbleweed connection"    )  )}

A single HTTP request is now needed for 3 operations. In addition Elasticsearch can now optimize the requests,by combining inserts or using aggressive caching.

For full details see thedocs on bulk operations.

Show Query JSON

It can be useful to see the json output of requests in case you wish to tinker with the request in a REST client or your browser. It can be much easier to tweak a complicated query when you have the instant feedback of the HTTP interface.

Elastic4s makes it easy to get this json where possible. Simply invoke theshow method on the client with a request to get back a json string. Eg:

valjson= client.show {  search("music").query("coldplay")}println(json)

Not all requests have a json body. For exampleget-by-id is modelled purely by http query parameters, there is no json body to output. And some requests aren't supported by the show method - you will get an implicit not found error during compliation if that is the case

Aliases

Anindex alias is a logical name used to reference one or more indices. Most Elasticsearch APIs accept an index alias in place of an index name.

For elastic4s syntax for aliasesclick here.

Explain

Anexplain request computes a score explanation for a query and a specific document. This can give useful feedback whether a document matches or didn’t match a specific query.

For elastic4s syntax for explainclick here.

Validate Query

The validate query request type allows you to check a query is valid before executing it.

See the syntaxhere.

Force Merge

Merging reduces the number of segments in each shard by merging some of them together, and also frees up the space used by deleted documents. Merging normally happens automatically, but sometimes it is useful to trigger a merge manually.

See the syntaxhere.

Cluster APIs

Elasticsearch supports querying the state of the cluster itself, to find out information on nodes, shards, indices, tasks and so on. See the range of cluster APIshere.

Search Iterator

Sometimes you may wish to iterate over all the results in a search, without worrying too much about handling futures, and re-requestingvia a scroll. TheSearchIterator will do this for you, although it will block between requests. A search iterator is just an implementationofscala.collection.Iterator backed by elasticsearch queries.

To create one, use the iterate method on the companion object, passing in the http client, and a search request to execute. Thesearch request must specify a keep alive value (which is used by elasticsearch for scrolling).

implicitvalreader:HitReader[MyType]=  ...valiterator=SearchIterator.iterate[MyType](client, search(index).matchAllQuery.keepAlive("1m").size(50))iterator.foreach(println)

For instance, in the above we are bringing back all documents in the index, 50 results at a time, marshalled intoinstances ofMyType using the implicitHitReader (see the section on HitReaders). If you want just the rawelasticsearchHit object, then useSearchIterator.hits

Note: Whenever the results in a particularbatch have been iterated on, theSearchIterator will then execute another query for the next batch and block waiting on that query.So if you are looking for a pure non blocking solution, consider the reactive streams implementation. However, if you just want aquick and simple way to iterate over some data without bringing back all the results at onceSearchIterator is perfect.

Elastic Reactive Streams

Elastic4s has an implementation of thereactive streams api for both publishing and subscribing that is builtusing Akka. To use this, you need to add a dependency on the elastic4s-streams module.

There are two things you can do with the reactive streams implementation. You can create an elastic subscriber, and have thatstream data from some publisher into elasticsearch. Or you can create an elastic publisher and have documents streamed out to subscribers.

For full details read thereactivestreams documentation

Using Elastic4s in your project

For gradle users, add (replace 2.12 with 2.13 for Scala 2.13):

compile'nl.gn0s1s:elastic4s-core_2.12:x.x.x'

For sbt users add:

libraryDependencies+="nl.gn0s1s"%%"elastic4s-core"%"x.x.x"

For Maven users add (replace 2.12 with 2.13 for Scala 2.13):

<dependency>    <groupId>nl.gn0s1s</groupId>    <artifactId>elastic4s-core_2.12</artifactId>    <version>x.x.x</version></dependency>

Check for the latest released versions onmaven central

Building and Testing

This project is built with sbt. So to build with:

sbt compile

And to test:

sbttest

The project is currentlycross-built against Scala 2.12, 2.13 and 3, when preparing a pull request the above commands should be run with thesbt+ modifier to compile and testagainst all versions. For example:sbt +compile.

For the tests to work you will need to run a local elastic instance on port 39227,with security enabled. One easy way of doing this is to use docker (via docker compose):docker compose up

Used By

  • AOE
  • Barclays Bank
  • BlueRootLabs
  • Canal+
  • Deutsche Bank
  • Goldman Sachs
  • Graphflow
  • HMRC
  • HSBC
  • Hotel Urbano
  • Immobilien Scout
  • Iterable
  • Jusbrasil
  • Lenses
  • Lichess
  • Mapp
  • Rostelecom-Solar
  • Scaladex
  • Shazaam
  • ShopRunner
  • Soundcloud
  • Starmind
  • Twitter
  • Wehkamp Retail Group

Raise a PR to add your company here

Contributions

Contributions to elastic4s are always welcome. Good ways to contribute include:

  • Raising bugs and feature requests
  • Fixing bugs and enhancing the DSL
  • Improving the performance of elastic4s
  • Adding to the documentation

License

This software is licensed under the Apache 2 license, quoted below.Copyright 2013-2016 Stephen SamuelLicensed under the Apache License, Version 2.0 (the "License"); you may notuse this file except in compliance with the License. You may obtain a copy ofthe License at    http://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS, WITHOUTWARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See theLicense for the specific language governing permissions and limitations underthe License.

[8]ページ先頭

©2009-2025 Movatter.jp