Elasticsearch Batch Source
The Elasticsearch Batch source plugin is available in the Hub.
Plugin version: 1.10.1
This source pulls documents from Elasticsearch that match a user-specified query and converts each document to a Structured Record using the fields and schema specified by the user. The Elasticsearch server should be running before the application is created.
This source is used whenever you need to read data from Elasticsearch. For example, you may want to read in an index and type from Elasticsearch and store the data in an HBase table.
Configuration
Property | Macro Enabled? | Description |
---|---|---|
Reference Name | No | Required. Name used to uniquely identify this source for lineage, annotating metadata, etc. |
Elasticsearch Host | Yes | Required. The hostname and port for the Elasticsearch instance. |
Index | Yes | Required. The name of the index to query. |
Type | Yes | Required. The name of the type where the data is stored. |
Query | Yes | Required. The query to use to import data from the specified index and type; see Elasticsearch for additional query examples. |
Additional Properties | Yes | Optional. Additional properties to use with the es-hadoop client when reading the data, documented at elastic.co. |
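As a rough illustration of the table above, the following sketch checks a set of property values before they are entered into the plugin. The dictionary keys are the display names from the table. The `validate_source_config` helper, the `host:port` check, and the rule that values starting with `${` are macros and skip that check are all assumptions made for this example; none of them are part of the plugin.

```python
# Hypothetical pre-flight check; not part of the plugin itself.
REQUIRED = ["Reference Name", "Elasticsearch Host", "Index", "Type", "Query"]
OPTIONAL = ["Additional Properties"]


def validate_source_config(config):
    """Return a list of problems found in a dict of plugin property values."""
    problems = []

    # Every property marked "Required" in the table must have a non-blank value.
    for name in REQUIRED:
        if not str(config.get(name, "")).strip():
            problems.append(f"missing required property: {name}")

    # Flag keys that do not appear in the table at all.
    for name in config:
        if name not in REQUIRED and name not in OPTIONAL:
            problems.append(f"unknown property: {name}")

    # The table asks for a hostname and port; a host:port form is assumed here.
    # Values that look like macros (assumed to start with "${") are skipped.
    host = str(config.get("Elasticsearch Host", "")).strip()
    if host and not host.startswith("${"):
        hostname, sep, port = host.rpartition(":")
        if not sep or not hostname or not port.isdigit():
            problems.append("Elasticsearch Host should look like host:port")

    return problems
```

An empty list means the values pass these basic checks. It does not confirm that the Elasticsearch instance is reachable or that the index and type exist.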
Example
This example connects to an Elasticsearch instance running locally and reads the records in the specified index (megacorp) and type (employee) that match the query, which in this case selects all records. All data from the index is read on each run:
Property | Value |
---|---|
Reference Name | |
Elasticsearch Host | |
Index | megacorp |
Type | employee |
Query | |
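To preview which documents a select-all query would return for this index and type, one option is to build an Elasticsearch URI-search URL and open it outside the pipeline. The sketch below only constructs the URL. The `uri_search_url` helper is hypothetical, `localhost:9200` is an assumed host value (9200 is the default Elasticsearch HTTP port), and `q=*` is assumed as the select-all query. This is not how the plugin itself reads data.

```python
from urllib.parse import quote, urlencode


def uri_search_url(host, index, doc_type, q="*"):
    """Build an Elasticsearch URI-search URL for the given index and type."""
    # Percent-encode each path segment so unusual index or type names stay valid.
    path = "/".join(quote(part, safe="") for part in (index, doc_type))
    # urlencode percent-encodes the query string; "*" becomes "%2A".
    return f"http://{host}/{path}/_search?{urlencode({'q': q})}"


# localhost:9200 is an assumption; megacorp and employee come from the example.
url = uri_search_url("localhost:9200", "megacorp", "employee")
print(url)
```

The `/index/type/_search` path form applies to Elasticsearch versions that still support mapping types, which matches the Type property used by this source.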
Created in 2020 by Google Inc.