Elision token filter

Removes specified elisions from the beginning of tokens. For example, you can use this filter to change l'avion to avion.

When not customized, the filter removes the following French elisions by default:

l', m', t', qu', n', s', j', d', c', jusqu', quoiqu', lorsqu', puisqu'

Customized versions of this filter are included in several of Elasticsearch's built-in language analyzers.

This filter uses Lucene’s ElisionFilter.

The following analyze API request uses the elision filter to remove j' from j’examine près du wharf:

GET _analyze
{
  "tokenizer" : "standard",
  "filter" : ["elision"],
  "text" : "j’examine près du wharf"
}

The filter produces the following tokens:

[ examine, près, du, wharf ]
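
The behavior in this example can be approximated outside Elasticsearch. The following is a minimal Python sketch, not Lucene's implementation: the names strip_elision and analyze are hypothetical, whitespace splitting stands in for the standard tokenizer, and both the straight and the typographic apostrophe are assumed to count as an apostrophe.

```python
# Simplified model of the elision filter; not the Lucene implementation.
DEFAULT_ARTICLES = frozenset({
    "l", "m", "t", "qu", "n", "s", "j", "d", "c",
    "jusqu", "quoiqu", "lorsqu", "puisqu",
})
APOSTROPHES = ("'", "\u2019")  # straight and typographic apostrophes


def strip_elision(token, articles=DEFAULT_ARTICLES):
    """Drop a leading elision and its apostrophe if the prefix is listed."""
    for i, ch in enumerate(token):
        if ch in APOSTROPHES:
            # Only the text before the first apostrophe is considered.
            if token[:i] in articles:
                return token[i + 1:]
            break
    return token


def analyze(text):
    # Assumption: whitespace splitting stands in for the standard tokenizer.
    return [strip_elision(tok) for tok in text.split()]


print(analyze("j\u2019examine près du wharf"))
# prints ['examine', 'près', 'du', 'wharf']
```

A token such as aujourd'hui is left unchanged in this sketch, because aujourd is not in the list of elisions.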

The following create index API request uses the elision filter to configure a new custom analyzer.

PUT /elision_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "whitespace_elision": {
          "tokenizer": "whitespace",
          "filter": [ "elision" ]
        }
      }
    }
  }
}

articles
(Required*, array of string) List of elisions to remove.

To be removed, the elision must be at the beginning of a token and be immediately followed by an apostrophe. Both the elision and apostrophe are removed.

For custom elision filters, either this parameter or articles_path must be specified.

articles_path
(Required*, string) Path to a file that contains a list of elisions to remove.

This path must be absolute or relative to the config location, and the file must be UTF-8 encoded. Each elision in the file must be separated by a line break.

To be removed, the elision must be at the beginning of a token and be immediately followed by an apostrophe. Both the elision and apostrophe are removed.

For custom elision filters, either this parameter or articles must be specified.
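
To illustrate the file layout that articles_path expects, the following Python sketch writes a small UTF-8 file with one elision per line and reads it back. The file name elisions.txt and the helper load_articles are hypothetical, and skipping blank lines and trimming whitespace are assumptions of this sketch rather than documented Elasticsearch behavior.

```python
import tempfile
from pathlib import Path


def load_articles(path):
    """Read one elision per line from a UTF-8 file, skipping blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


# Hypothetical file name; write a small sample and read it back.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "elisions.txt"
    sample.write_text("l\nm\nqu\n", encoding="utf-8")
    articles = load_articles(sample)

print(articles)
# prints ['l', 'm', 'qu']
```

The entries carry no apostrophe, matching the form used by the articles parameter.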

articles_case
(Optional, Boolean) If true, elision matching is case insensitive. If false, elision matching is case sensitive. Defaults to false.

To customize the elision filter, duplicate it to create the basis for a new custom token filter. You can modify the filter using its configurable parameters.

For example, the following request creates a custom case-insensitive elision filter that removes the l', m', t', qu', n', s', and j' elisions:

PUT /elision_case_insensitive_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "default": {
          "tokenizer": "whitespace",
          "filter": [ "elision_case_insensitive" ]
        }
      },
      "filter": {
        "elision_case_insensitive": {
          "type": "elision",
          "articles": [ "l", "m", "t", "qu", "n", "s", "j" ],
          "articles_case": true
        }
      }
    }
  }
}
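
The effect of articles_case on a custom filter like this one can be sketched in Python. This is a simplified model under stated assumptions, not the filter's actual code: strip_elision and its ignore_case argument are hypothetical names, with ignore_case standing in for articles_case, and case folding is approximated with str.lower.

```python
APOSTROPHES = ("'", "\u2019")  # straight and typographic apostrophes


def strip_elision(token, articles, ignore_case=False):
    """Remove a listed elision prefix; ignore_case mirrors articles_case."""
    known = {a.lower() for a in articles} if ignore_case else set(articles)
    for i, ch in enumerate(token):
        if ch in APOSTROPHES:
            # Compare only the text before the first apostrophe.
            prefix = token[:i].lower() if ignore_case else token[:i]
            if prefix in known:
                return token[i + 1:]
            break
    return token


articles = ["l", "m", "t", "qu", "n", "s", "j"]
print(strip_elision("L'avion", articles, ignore_case=True))   # prints avion
print(strip_elision("L'avion", articles, ignore_case=False))  # prints L'avion
```

With case-sensitive matching, the capitalized L' is kept because only the lowercase l is listed.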
