Info |
---|
The Google Sheets batch source is available in the Hub. |
Plugin version: 1.4.12
Reads spreadsheets from a specified Google Drive directory via the Google Sheets API.
...
Property | Macro Enabled? | Version Introduced | Description |
---|---|---|---|
Reference Name | No | | Required. Name used to uniquely identify this source for lineage, annotating metadata, etc. |
Directory Identifier | No | | Required. Identifier of the source folder. This comes after
Then the Directory Identifier would be |
File Identifier | Yes | | Identifier of the spreadsheet file. This comes after
Then the File Identifier would be |
Filter | No | | Optional. Filter that can be applied to the files in the selected directory. Filters follow the Google Drive filters syntax. |
Sheets To Pull | Yes | | Required. Filter that specifies the set of sheets to process. For the ‘numbers’ or ‘titles’ selections, the user can enter specific values in the Sheets Identifiers field. Default is all. |
Modification Date Range | No | | Optional. Filter that narrows the set of files by modification date range. The user can select either a predefined range or a custom one. For the Custom selection, the date range is specified via Start Date and End Date. Default is lifetime. |
Start Date | No | | Start date for the custom modification date range. Shown only when Custom is selected for the Modification Date Range field. Uses RFC 3339 format; the default timezone is UTC, e.g., 2012-06-04T12:00:00-08:00. |
End Date | No | | End date for the custom modification date range. Shown only when Custom is selected for the Modification Date Range field. Uses RFC 3339 format; the default timezone is UTC, e.g., 2012-06-04T12:00:00-08:00. |
Sheets Identifiers | Yes | | Optional. Set of sheet numbers or titles to process. Shown only when ‘titles’ or ‘numbers’ is selected for the Sheets To Pull field. |
Authentication Type | No | | Required. Type of authentication used to access the Google API. OAuth2 and Service Account types are available. Make sure that:
OAuth2 client credentials can be generated on the Google Cloud Credentials page. For more information about OAuth2, see the Google Drive API documentation. Default is OAuth2. |
OAuth2: Client ID | Yes | | Optional. OAuth2 Client ID used to identify the application. |
OAuth2: Client Secret | Yes | | Optional. OAuth2 Client Secret used to access the authorization server. |
OAuth2: Refresh Token | Yes | | Optional. OAuth2 refresh token for acquiring new access tokens. For more information, see the Google Drive API. |
OAuth2: Access Token | | | Short-lived access token used for connecting. |
Service Account Type | Yes | 6.3.0/1.3.0 | Optional. Select one of the following options:
Make sure that the Google Drive folder is shared with the specified service account email. |
Service Account File Path | Yes | | Optional. Path on the local file system of the service account key used for authorization. Can be set to 'auto-detect' when running on a Dataproc cluster which needs to be created with the following scopes: When running on other clusters, the file must be present on every node in the cluster. Default is |
Service Account JSON | Yes | 1.4.0 | Optional. Contents of the service account JSON file. Service Account JSON can be generated on the Google Cloud Service Account page. |
Extract Metadata | Yes | | Required. Field to enable metadata extraction. Metadata extraction is useful when the user wants to specify a header or a footer for a sheet. The rows in headers and footers are not available as data records. Instead, they are available in every record as a field called Default is No. |
Metadata Field Name | Yes | | Required. Name of the record with metadata content. It is needed to distinguish the metadata record from a possible column with the same name. Default is No. |
First Header Row Index | Yes | | Optional. Row number of the first row to be treated as header. |
Last Header Row Index | Yes | | Optional. Row number of the last row to be treated as header. |
First Footer Row Index | Yes | | Optional. Row number of the first row to be treated as footer. |
Last Footer Row Index | Yes | | Optional. Row number of the last row to be treated as footer. |
Metadata Cells | Yes | | Optional. Set of cells for key-value pairs to extract as metadata from the specified metadata sections. Only shown if Extract Metadata is set to true. The cell numbers should be within the header or footer. |
Text Formatting | Yes | | Required. Output format for numeric sheet cells. With ‘Formatted values’, the value keeps the formatting of the source cell, e.g. ‘1.23$’, ‘123%’. With ‘Values only’, only the numeric value is returned. Default is Values only. |
Skip Empty Data | Yes | | Required. Field that allows skipping of empty structure records. Default is No. |
Add spreadsheet/sheet name fields | Yes | | Optional. Toggle that defines whether the source extends the output schema with spreadsheet and sheet names. Default is No. |
Spreadsheet field name | Yes | | Optional. Schema field name for spreadsheet name. |
Sheet field name | Yes | | Optional. Schema field name for sheet name. |
Column Names Selection | Yes | | Required. Source for column names. The user can specify where the plugin should get schema field names from. The following values are available: No column names (default sheet column names are used: ‘A’, ‘B’, etc.), Treat first row as column names (the plugin uses the first row to define the schema and field names), Custom row as column names (same as the previous option, but for a custom row index). Default is Treat first row as column names. |
Custom Row Index For Column Names | Yes | | Optional. Number of the row to be treated as a header. Only shown when the ‘Column Names Selection’ field is set to ‘Custom row as column names’. Default is 1. |
Last Data Column Index | Yes | | Optional. Last column the plugin will read as data. Default is 26. |
Last Data Row Index | Yes | | Optional. Last row the plugin will read as data. Default is 1000. |
Read Buffer Size | Yes | | Optional. Number of rows the source reads with a single API request. Default is 100. |
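
Several of the properties above take values that are easy to get slightly wrong by hand. The following is a minimal Python sketch of helpers for preparing them; it is not part of the plugin. All function names are hypothetical. The URL shapes (`.../drive/folders/<id>` for folders and `.../spreadsheets/d/<id>/...` for spreadsheets) and the example identifiers are assumptions, not taken from this page. The request estimate assumes one API request per buffer of rows, which is inferred from the Read Buffer Size description.

```python
import re
from datetime import datetime, timedelta, timezone
from math import ceil


def drive_id_from_url(url: str) -> str:
    """Return the identifier that follows '/folders/' or '/d/' in a URL.

    Assumption: folder links look like .../drive/folders/<id> and
    spreadsheet links look like .../spreadsheets/d/<id>/...
    """
    match = re.search(r"/(?:folders|d)/([A-Za-z0-9_-]+)", url)
    if match is None:
        raise ValueError(f"no identifier found in {url!r}")
    return match.group(1)


def rfc3339(dt: datetime) -> str:
    """Format a datetime as RFC 3339, treating naive values as UTC."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.isoformat(timespec="seconds")


def column_letter(index: int) -> str:
    """Convert a 1-based column index to its sheet letter (1 -> 'A')."""
    if index < 1:
        raise ValueError("column index must be >= 1")
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def requests_per_sheet(last_row: int, buffer_size: int) -> int:
    """Estimate API requests per sheet, assuming one request per buffer."""
    return ceil(last_row / buffer_size)


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    print(drive_id_from_url("https://drive.google.com/drive/folders/FOLDER_ID_123"))
    print(rfc3339(datetime(2012, 6, 4, 12, 0, 0, tzinfo=timezone(timedelta(hours=-8)))))
    print(column_letter(26))              # last column read with the default of 26
    print(requests_per_sheet(1000, 100))  # defaults for last row and buffer size
```

With the defaults listed in the table, the last column read is `Z`, and under the stated assumption a sheet is read in about 10 requests.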
...