Elasticsearch Sink
The Elasticsearch sink plugin is available in the Hub.
Plugin version: 1.10.1
Takes each Structured Record from the input, converts it to a JSON string, and indexes it in Elasticsearch using the Index, Type, and ID Field specified by the user. The Elasticsearch server should already be running before you create the application.
Use this sink whenever you need to write to an Elasticsearch server. For example, you might want to parse a file and load its contents into Elasticsearch, which you can achieve by pairing a stream batch source with the Elasticsearch sink.
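The conversion described above can be pictured with a small Python sketch. It is only a conceptual illustration, not the plugin's code: the function name `to_index_request` and the dictionary it returns are hypothetical, and the actual sink writes through the es-Hadoop client rather than building requests like this. The sketch assumes a record can be modeled as a plain dictionary and that a document is addressed by index, type, and ID.

```python
import json


def to_index_request(record, index, doc_type, id_field):
    """Build a hypothetical index request for one record.

    `record` stands in for a Structured Record (modeled here as a dict).
    The ID Field must name a field that exists in the record.
    """
    if id_field not in record:
        raise KeyError(f"ID Field {id_field!r} is not a field of the record")
    doc_id = str(record[id_field])
    return {
        # Document address: index, then type, then the ID taken from the record.
        "path": f"/{index}/{doc_type}/{doc_id}",
        # The record itself is serialized to a JSON string.
        "body": json.dumps(record, sort_keys=True),
    }


request = to_index_request(
    {"id": 1, "first_name": "John"}, "megacorp", "employee", "id"
)
print(request["path"])
print(request["body"])
```

Note that the sketch rejects a record that lacks the ID Field, which mirrors the requirement that the ID Field match a field name in the input record.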
Configuration
Property | Macro Enabled? | Description |
---|---|---|
Reference Name | No | Required. Name used to uniquely identify this sink for lineage, annotating metadata, etc. |
Elasticsearch Host | Yes | Required. The hostname and port for the Elasticsearch instance. |
Index | Yes | Required. The name of the index where the data will be stored; if the index does not already exist, it will be created using Elasticsearch’s default properties. |
Type | Yes | Required. The name of the type where the data will be stored; if it does not already exist, it will be created. |
ID Field | Yes | Required. The field that determines the ID of the document. It should match a field name in the Structured Record of the input. |
Additional Properties | Yes | Optional. Additional properties to use with the es-Hadoop client when writing the data, documented at elastic.co. |
Example
This example connects to an Elasticsearch instance running locally and writes the data to the specified index (megacorp) and type (employee). The data is indexed using the id field in the record. On each run, documents that are still present in the source are updated:
Property | Value |
---|---|
Reference Name | essink1 |
Elasticsearch Host | |
Index | megacorp |
Type | employee |
ID Field | id |
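The update behavior in this example follows from indexing by ID: writing a document whose ID already exists replaces the stored one instead of adding a duplicate. The Python sketch below simulates this with an in-memory dictionary standing in for Elasticsearch. The function `run_once` and the `store` dictionary are hypothetical and exist only for illustration. The sketch also assumes that documents missing from a later run are left untouched.

```python
import json


def run_once(store, records, index, doc_type, id_field):
    """Simulate one pipeline run against an in-memory stand-in for the index."""
    for record in records:
        # Documents are keyed by (index, type, id), so a repeated ID overwrites.
        key = (index, doc_type, str(record[id_field]))
        store[key] = json.dumps(record, sort_keys=True)
    return store


store = {}
# First run: two employees are indexed.
run_once(store, [{"id": 1, "name": "John"}, {"id": 2, "name": "Jane"}],
         "megacorp", "employee", "id")
# Second run: only employee 1 is still in the source, with a changed name.
run_once(store, [{"id": 1, "name": "Johnny"}],
         "megacorp", "employee", "id")

print(len(store))
print(store[("megacorp", "employee", "1")])
```

After the second run the store still holds two documents: the one with ID 1 carries the new name, and the one with ID 2 is unchanged.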
Created in 2020 by Google Inc.