Elasticsearch Sink

The Elasticsearch sink plugin is available in the Hub.

Plugin version: 1.10.1

This sink takes each structured record from the input source, converts it to a JSON string, and indexes it in Elasticsearch using the Index, Type, and ID Field that you specify. The Elasticsearch server should be running before you create the application.
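The conversion described above can be pictured with a minimal Python sketch. It is an illustration only, not the plugin's code: the plugin writes through the ES-Hadoop client, while this sketch builds the two newline-delimited JSON lines of an Elasticsearch bulk "index" action by hand. The function name to_bulk_action and the sample record are hypothetical, and the _type key applies only to Elasticsearch versions that still support mapping types.

```python
import json


def to_bulk_action(record, index, doc_type, id_field):
    """Build the two NDJSON lines of an Elasticsearch bulk 'index' action.

    Hypothetical helper for illustration; the sink itself uses ES-Hadoop.
    """
    if id_field not in record:
        raise KeyError(f"ID field {id_field!r} is missing from the record")
    # Action line: tells Elasticsearch where the document goes and under which id.
    action = {
        "index": {
            "_index": index,
            "_type": doc_type,
            "_id": str(record[id_field]),
        }
    }
    # Source line: the record itself, serialized as a JSON string.
    return json.dumps(action) + "\n" + json.dumps(record) + "\n"


# Sample record (made up for this sketch).
record = {"id": 1, "first_name": "John", "last_name": "Smith"}
payload = to_bulk_action(record, "megacorp", "employee", "id")
print(payload, end="")
```

The value of the ID Field becomes the document id, so the field must exist in every record; the sketch raises an error when it does not.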

Use this sink whenever you need to write to an Elasticsearch server. For example, you might want to parse a file and load its contents into Elasticsearch, which you can achieve with a stream batch source and Elasticsearch as a sink.

Configuration

| Property | Macro Enabled? | Description |
| --- | --- | --- |
| Reference Name | No | Required. Name used to uniquely identify this sink for lineage, annotating metadata, etc. |
| Elasticsearch Host | Yes | Required. The hostname and port for the Elasticsearch instance. |
| Index | Yes | Required. The name of the index where the data will be stored. If the index does not already exist, it will be created using Elasticsearch’s default properties. |
| Type | Yes | Required. The name of the type where the data will be stored. If the type does not already exist, it will be created. |
| ID Field | Yes | Required. The field that determines the ID of the document. It should match a field name in the structured record of the input. |
| Additional Properties | Yes | Optional. Additional properties to use with the ES-Hadoop client when writing the data, documented at elastic.co. |

Example

This example connects to an Elasticsearch instance running locally and writes the data to the specified index (megacorp) and type (employee). The data is indexed using the id field of each record. On each run, documents that are still present in the source are updated:

| Property | Value |
| --- | --- |
| Reference Name | essink1 |
| Elasticsearch Host | localhost:9200 |
| Index | megacorp |
| Type | employee |
| ID Field | id |
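The re-run behavior in this example can be simulated with a short in-memory sketch. It does not talk to Elasticsearch: a plain dictionary stands in for the megacorp/employee index, and the function run_pipeline and the sample records are hypothetical. It assumes the sink only issues index operations keyed by the ID Field and never deletes documents.

```python
def run_pipeline(index_store, source_records, id_field="id"):
    """Simulate one pipeline run: index every source record under its id.

    index_store is a dict standing in for the Elasticsearch index (assumption).
    """
    for record in source_records:
        # Indexing with an existing id replaces the stored document.
        index_store[str(record[id_field])] = dict(record)
    return index_store


megacorp_employee = {}

# First run: two records arrive from the source.
run_pipeline(megacorp_employee, [
    {"id": 1, "title": "Engineer"},
    {"id": 2, "title": "Analyst"},
])

# Second run: only record 1 is still in the source, with a changed field.
run_pipeline(megacorp_employee, [{"id": 1, "title": "Senior Engineer"}])

print(megacorp_employee)
```

Under that assumption, the document with id 1 is overwritten on the second run, while the document with id 2 stays as it was after the first run because nothing in the source addresses it.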



Created in 2020 by Google Inc.