bulk
Documentation¶
Overview¶
Bulk index or delete documents. Perform multiple `index`, `create`, `delete`, and `update` actions in a single request. This reduces overhead and can greatly increase indexing speed.
If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:
* To use the `create` action, you must have the `create_doc`, `create`, `index`, or `write` index privilege. Data streams support only the `create` action.
* To use the `index` action, you must have the `create`, `index`, or `write` index privilege.
* To use the `delete` action, you must have the `delete` or `write` index privilege.
* To use the `update` action, you must have the `index` or `write` index privilege.
* To automatically create a data stream or index with a bulk API request, you must have the `auto_configure`, `create_index`, or `manage` index privilege.
* To make the result of a bulk operation visible to search using the `refresh` parameter, you must have the `maintenance` or `manage` index privilege.
Automatic data stream creation requires a matching index template with data stream enabled.
The actions are specified in the request body using a newline delimited JSON (NDJSON) structure:
```
action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
....
action_and_meta_data\n
optional_source\n
```
The `index` and `create` actions expect a source on the next line and have the same semantics as the `op_type` parameter in the standard index API. A `create` action fails if a document with the same ID already exists in the target. An `index` action adds or replaces a document as necessary.
NOTE: Data streams support only the `create` action. To update or delete a document in a data stream, you must target the backing index containing the document.
An `update` action expects that the partial doc, upsert, and script and its options are specified on the next line.
A `delete` action does not expect a source on the next line and has the same semantics as the standard delete API.
NOTE: The final line of data must end with a newline character (`\n`). Each newline character may be preceded by a carriage return (`\r`). When sending NDJSON data to the `_bulk` endpoint, use a `Content-Type` header of `application/json` or `application/x-ndjson`. Because this format uses literal newline characters (`\n`) as delimiters, make sure that the JSON actions and sources are not pretty printed.
If you provide a target in the request path, it is used for any actions that don't explicitly specify an `_index` argument.
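A minimal sketch of the raw HTTP request, assuming a cluster at `http://localhost:9200` and an index named `books` (both are placeholders). The helper `newBulkRequest` is hypothetical and only builds the request without sending it; it shows the target in the path and the NDJSON content type.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// newBulkRequest builds, but does not send, a POST to the _bulk endpoint.
// When index is non-empty it is placed in the path, where it acts as the
// default target for actions that carry no _index of their own.
func newBulkRequest(baseURL, index, ndjson string) (*http.Request, error) {
	path := "/_bulk"
	if index != "" {
		path = "/" + url.PathEscape(index) + "/_bulk"
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+path, strings.NewReader(ndjson))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/x-ndjson")
	return req, nil
}

func main() {
	// Neither action names an _index, so both fall back to the path target.
	body := "{\"index\":{\"_id\":\"1\"}}\n{\"title\":\"Dune\"}\n{\"delete\":{\"_id\":\"2\"}}\n"
	req, err := newBulkRequest("http://localhost:9200", "books", body)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path, req.Header.Get("Content-Type"))
}
```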
A note on the format: the idea here is to make processing as fast as possible. As some of the actions are redirected to other shards on other nodes, only `action_meta_data` is parsed on the receiving node side.
Client libraries using this protocol should strive to do something similar on the client side, and reduce buffering as much as possible.
There is no "correct" number of actions to perform in a single bulk request. Experiment with different settings to find the optimal size for your particular workload. Note that Elasticsearch limits the maximum size of an HTTP request to 100mb by default, so clients must ensure that no request exceeds this size. It is not possible to index a single document that exceeds the size limit, so you must pre-process any such documents into smaller pieces before sending them to Elasticsearch. For instance, split documents into pages or chapters before indexing them, or store raw binary data in a system outside Elasticsearch and replace the raw data with a link to the external system in the documents that you send to Elasticsearch.
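One way a client might stay under the request size limit is to group pre-serialized entries into batches by byte count. The sketch below is an assumption about how that could look, not something this package provides; `splitBySize` is a hypothetical helper, and each entry is expected to hold an action line plus its optional source line, newlines included.

```go
package main

import "fmt"

// splitBySize groups pre-serialized NDJSON entries into batches whose total
// size stays at or below maxBytes. A single entry larger than maxBytes can
// never be sent, so it is reported as an error instead of being dropped.
func splitBySize(entries [][]byte, maxBytes int) ([][][]byte, error) {
	var batches [][][]byte
	var current [][]byte
	size := 0
	for i, e := range entries {
		if len(e) > maxBytes {
			return nil, fmt.Errorf("entry %d is %d bytes, over the %d byte limit", i, len(e), maxBytes)
		}
		// Start a new batch when adding this entry would exceed the limit.
		if size+len(e) > maxBytes {
			batches = append(batches, current)
			current, size = nil, 0
		}
		current = append(current, e)
		size += len(e)
	}
	if len(current) > 0 {
		batches = append(batches, current)
	}
	return batches, nil
}

func main() {
	entries := [][]byte{[]byte("aaaa"), []byte("bbbb"), []byte("cccc")}
	batches, err := splitBySize(entries, 10)
	if err != nil {
		panic(err)
	}
	for i, b := range batches {
		fmt.Printf("batch %d: %d entries\n", i, len(b))
	}
}
```

The limit passed in would normally sit somewhat below the server's configured maximum, since the right batch size still depends on the workload.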
**Client support for bulk requests**
Some of the officially supported clients provide helpers to assist with bulk requests and reindexing:
* Go: Check out `esutil.BulkIndexer`
* Perl: Check out `Search::Elasticsearch::Client::5_0::Bulk` and `Search::Elasticsearch::Client::5_0::Scroll`
* Python: Check out `elasticsearch.helpers.*`
Index¶
- Variables
- type Bulk
- func (r *Bulk) CreateOp(op types.CreateOperation, doc interface{}) error
- func (r *Bulk) DeleteOp(op types.DeleteOperation) error
- func (r Bulk) Do(providedCtx context.Context) (*Response, error)
- func (r *Bulk) ErrorTrace(errortrace bool) *Bulk
- func (r *Bulk) FilterPath(filterpaths ...string) *Bulk
- func (r *Bulk) Header(key, value string) *Bulk
- func (r *Bulk) HttpRequest(ctx context.Context) (*http.Request, error)
- func (r *Bulk) Human(human bool) *Bulk
- func (r *Bulk) IncludeSourceOnError(includesourceonerror bool) *Bulk
- func (r *Bulk) Index(index string) *Bulk
- func (r *Bulk) IndexOp(op types.IndexOperation, doc interface{}) error
- func (r *Bulk) ListExecutedPipelines(listexecutedpipelines bool) *Bulk
- func (r Bulk) Perform(providedCtx context.Context) (*http.Response, error)
- func (r *Bulk) Pipeline(pipeline string) *Bulk
- func (r *Bulk) Pretty(pretty bool) *Bulk
- func (r *Bulk) Raw(raw io.Reader) *Bulk
- func (r *Bulk) Refresh(refresh refresh.Refresh) *Bulk
- func (r *Bulk) Request(req *Request) *Bulk
- func (r *Bulk) RequireAlias(requirealias bool) *Bulk
- func (r *Bulk) RequireDataStream(requiredatastream bool) *Bulk
- func (r *Bulk) Routing(routing string) *Bulk
- func (r *Bulk) SourceExcludes_(fields ...string) *Bulk
- func (r *Bulk) SourceIncludes_(fields ...string) *Bulk
- func (r *Bulk) Source_(sourceconfigparam string) *Bulk
- func (r *Bulk) Timeout(duration string) *Bulk
- func (r *Bulk) UpdateOp(op types.UpdateOperation, doc interface{}, update *types.UpdateAction) error
- func (r *Bulk) WaitForActiveShards(waitforactiveshards string) *Bulk
- type NewBulk
- type Request
- type Response
Constants¶
This section is empty.
Variables¶
var ErrBuildPath = errors.New("cannot build path, check for missing path parameters")
ErrBuildPath is returned in case of missing parameters within the build of the request.
Functions¶
This section is empty.
Types¶
type Bulk¶
type Bulk struct {
	// contains filtered or unexported fields
}
func New¶
func New(tp elastictransport.Interface) *Bulk
Bulk index or delete documents. The full description of the request format, required privileges, and sizing guidance is given in the Overview.
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html
func (*Bulk) CreateOp¶ added in v8.10.0
func (r *Bulk) CreateOp(op types.CreateOperation, doc interface{}) error
CreateOp is a helper function to add a CreateOperation to the current bulk request. The doc argument can be a []byte, json.RawMessage or a struct.
func (*Bulk) DeleteOp¶ added in v8.10.0
func (r *Bulk) DeleteOp(op types.DeleteOperation) error
DeleteOp is a helper function to add a DeleteOperation to the current bulk request.
func (Bulk) Do¶
Do runs the request through the transport, handles the response, and returns a bulk.Response.
func (*Bulk) ErrorTrace¶ added in v8.14.0
ErrorTrace When set to `true` Elasticsearch will include the full stack trace of errors when they occur.
API name: error_trace
func (*Bulk) FilterPath¶ added in v8.14.0
FilterPath Comma-separated list of filters in dot notation which reduce the response returned by Elasticsearch.
API name: filter_path
func (*Bulk) HttpRequest¶
HttpRequest returns the http.Request object built from the given parameters.
func (*Bulk) Human¶ added in v8.14.0
Human When set to `true` will return statistics in a format suitable for humans. For example `"exists_time": "1h"` for humans and `"exists_time_in_millis": 3600000` for computers. When disabled the human readable values will be omitted. This makes sense for responses being consumed only by machines.
API name: human
func (*Bulk) IncludeSourceOnError¶ added in v8.18.0
IncludeSourceOnError Whether to include the document source in the error message in case of parsing errors.
API name: include_source_on_error
func (*Bulk) Index¶
Index The name of the data stream, index, or index alias to perform bulk actions on.
API Name: index
func (*Bulk) IndexOp¶ added in v8.10.0
func (r *Bulk) IndexOp(op types.IndexOperation, doc interface{}) error
IndexOp is a helper function to add an IndexOperation to the current bulk request. The doc argument can be a []byte, json.RawMessage or a struct.
func (*Bulk) ListExecutedPipelines¶ added in v8.17.0
ListExecutedPipelines If `true`, the response will include the ingest pipelines that were run for each index or create.
API name: list_executed_pipelines
func (Bulk) Perform¶
Perform runs the http.Request through the provided transport and returns an http.Response.
func (*Bulk) Pipeline¶
Pipeline The pipeline identifier to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to `_none` turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.
API name: pipeline
func (*Bulk) Pretty¶ added in v8.14.0
Pretty If set to `true` the returned JSON will be "pretty-formatted". Use this option for debugging only.
API name: pretty
func (*Bulk) Raw¶
Raw takes a JSON payload as input which is then passed to the http.Request. If specified, Raw takes precedence over the Request method.
func (*Bulk) Refresh¶
Refresh If `true`, Elasticsearch refreshes the affected shards to make this operation visible to search. If `wait_for`, wait for a refresh to make this operation visible to search. If `false`, do nothing with refreshes. Valid values: `true`, `false`, `wait_for`.
API name: refresh
func (*Bulk) RequireAlias¶
RequireAlias If `true`, the request's actions must target an index alias.
API name: require_alias
func (*Bulk) RequireDataStream¶ added in v8.17.0
RequireDataStream If `true`, the request's actions must target a data stream (existing or to be created).
API name: require_data_stream
func (*Bulk) Routing¶
Routing A custom value that is used to route operations to a specific shard.
API name: routing
func (*Bulk) SourceExcludes_¶
SourceExcludes_ A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in `_source_includes` query parameter. If the `_source` parameter is `false`, this parameter is ignored.
API name: _source_excludes
func (*Bulk) SourceIncludes_¶
SourceIncludes_ A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the `_source_excludes` query parameter. If the `_source` parameter is `false`, this parameter is ignored.
API name: _source_includes
func (*Bulk) Source_¶
Source_ Indicates whether to return the `_source` field (`true` or `false`) or contains a list of fields to return.
API name: _source
func (*Bulk) Timeout¶
Timeout The period each action waits for the following operations: automatic index creation, dynamic mapping updates, and waiting for active shards. The default is `1m` (one minute), which guarantees Elasticsearch waits for at least the timeout before failing. The actual wait time could be longer, particularly when multiple waits occur.
API name: timeout
func (*Bulk) UpdateOp¶ added in v8.10.0
func (r *Bulk) UpdateOp(op types.UpdateOperation, doc interface{}, update *types.UpdateAction) error
UpdateOp is a helper function to add an UpdateOperation with an UpdateAction to the current bulk request. update is optional; if both doc and update.Doc are provided, update.Doc has precedence.
func (*Bulk) WaitForActiveShards¶
WaitForActiveShards The number of shard copies that must be active before proceeding with the operation. Set to `all` or any positive integer up to the total number of shards in the index (`number_of_replicas+1`). The default is `1`, which waits for each primary shard to be active.
API name: wait_for_active_shards
type NewBulk¶
type NewBulk func() *Bulk
NewBulk type alias for index.
func NewBulkFunc¶
func NewBulkFunc(tp elastictransport.Interface) NewBulk
NewBulkFunc returns a new instance of Bulk with the provided transport. Used in the index of the library, this allows every API to be retrieved in one place.
type Request¶ added in v8.11.0
type Request = []any
Request holds the request body struct for the package bulk
type Response¶
type Response struct {
	// Errors If `true`, one or more of the operations in the bulk request did not complete
	// successfully.
	Errors bool `json:"errors"`
	IngestTook *int64 `json:"ingest_took,omitempty"`
	// Items The result of each operation in the bulk request, in the order they were
	// submitted.
	Items []map[operationtype.OperationType]types.ResponseItem `json:"items"`
	// Took The length of time, in milliseconds, it took to process the bulk request.
	Took int64 `json:"took"`
}
Response holds the response body struct for the package bulk