ICU normalization token filter

Normalizes characters using Unicode normalization. It registers itself as the icu_normalizer token filter, which is available to all indices without any further configuration. The type of normalization can be specified with the name parameter, which accepts nfc, nfkc, and nfkc_cf (default).
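To get a feel for how the three forms differ, the following is a minimal Python sketch using only the standard library. It is not the plugin's implementation: the function name icu_style_normalize is hypothetical, and nfkc_cf is only approximated here as NFKC plus case folding (ICU's NFKC_Casefold additionally removes default-ignorable code points, which this sketch does not do).

```python
import unicodedata


def icu_style_normalize(text: str, name: str = "nfkc_cf") -> str:
    """Approximate the effect of the `name` parameter (hypothetical helper)."""
    if name == "nfc":
        # Canonical composition only: compatibility characters are kept.
        return unicodedata.normalize("NFC", text)
    if name == "nfkc":
        # Compatibility decomposition followed by canonical composition.
        return unicodedata.normalize("NFKC", text)
    if name == "nfkc_cf":
        # Assumption: NFKC + case folding + NFKC approximates NFKC_Casefold.
        folded = unicodedata.normalize("NFKC", text).casefold()
        return unicodedata.normalize("NFKC", folded)
    raise ValueError(f"unsupported normalization name: {name!r}")


if __name__ == "__main__":
    sample = "\uff26\uff4f\uff4f \ufb01"  # fullwidth "Foo", space, "fi" ligature
    for form in ("nfc", "nfkc", "nfkc_cf"):
        print(form, repr(icu_style_normalize(sample, form)))
```

In this sketch nfc leaves the fullwidth letters and the ligature untouched, nfkc maps them to their plain ASCII equivalents, and nfkc_cf additionally lowercases the result.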

Which letters are normalized can be controlled by specifying the unicode_set_filter parameter, which accepts a UnicodeSet.
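The idea of restricting normalization to a set of characters can be sketched in Python as follows. This is an illustration under stated assumptions, not the plugin's code: normalize_filtered and its allowed predicate are hypothetical names, and the predicate merely stands in for a real UnicodeSet. Runs of characters inside the set are normalized; everything else passes through unchanged.

```python
import unicodedata
from itertools import groupby
from typing import Callable


def normalize_filtered(
    text: str,
    allowed: Callable[[str], bool],
    form: str = "NFKC",
) -> str:
    """Normalize only the runs of characters for which allowed(ch) is true.

    `allowed` is a hypothetical stand-in for a UnicodeSet membership test.
    """
    parts = []
    # groupby splits the text into maximal runs of in-set / out-of-set chars.
    for in_set, run in groupby(text, key=allowed):
        chunk = "".join(run)
        parts.append(unicodedata.normalize(form, chunk) if in_set else chunk)
    return "".join(parts)


if __name__ == "__main__":
    # Exclude the "fi" ligature (U+FB01) from normalization.
    keep_ligature = lambda ch: ch != "\ufb01"
    print(repr(normalize_filtered("\uff46\uff4f\uff4f \ufb01", keep_ligature)))
```

Here the fullwidth letters are normalized to ASCII while the excluded ligature is left as it was.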

You should probably prefer the Normalization character filter.

Here are two examples, the default usage and a customised token filter:

PUT icu_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "nfkc_cf_normalized": {
            "tokenizer": "icu_tokenizer",
            "filter": [
              "icu_normalizer"
            ]
          },
          "nfc_normalized": {
            "tokenizer": "icu_tokenizer",
            "filter": [
              "nfc_normalizer"
            ]
          }
        },
        "filter": {
          "nfc_normalizer": {
            "type": "icu_normalizer",
            "name": "nfc"
          }
        }
      }
    }
  }
}
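  1. The nfkc_cf_normalized analyzer uses the built-in icu_normalizer filter with the default nfkc_cf normalization.
  2. The nfc_normalized analyzer uses the customized nfc_normalizer token filter, which is set to use nfc normalization.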
