Introduction
The Google Drive plugins help users move entire files from a source to a destination. Along the way, users can also run transformations on unstructured data such as images, audio, and video.
User Stories
- As a pipeline developer, I want to move all files from a Google Drive directory to a different destination.
- As a pipeline developer, I want to move all files from a Google Drive directory that satisfy a filter to a different destination.
- As a pipeline developer, I want to pull all images from a Google Drive directory, so that I can process them using image recognition APIs.
- As a pipeline developer, I want to pull all audio and video files from a Google Drive directory, so that I can process them to extract metadata, generate transcripts, or apply other enrichments.
- As a pipeline developer, I want to move all files from an FTP source into Google Drive.
Plugin Type
- Batch Source
- Batch Sink
- Real-time Source
- Real-time Sink
- Action
- Post-Run Action
- Aggregate
- Join
- Spark Model
- Spark Compute
Configurables
This section defines properties that are configurable for this plugin.
Source
Option level | User Facing Name | Type | Description | Optional | Constraints | Default value |
---|---|---|---|---|---|---|
Basic | App Id | String | OAuth2 app ID | No | | |
Basic | Access Token | String | OAuth2 access token | No | | |
Basic | Directory identifier | String | ID of the directory, which is the last part of the folder URL, such as https://drive.google.com/drive/folders/0B2kqcwp2ycGZanhSR3JmREw5VTV | No | | |
Basic | Filter | String | A filter applied to the files in the selected directory. Filters follow the Google Drive Filter Syntax. | Yes | | |
Basic | Modification date range | String | In addition to the filter specified above, pull only files that were modified within this date range. | Yes | | |
Basic | File properties | Multi-select | Properties to retrieve for each file in the directory. Allowed names are listed in Google Drive API: Files. | Yes | | |
Basic | File types to pull | Multi-select | Types of files to pull from the specified directory. | Yes | | binary |
Advanced | Maximum body partition size per split | Number | Maximum size of each body partition, in bytes. The default value 0 means unlimited. | Yes | | 0 |
Exporting | Google Documents export format | Select | MIME type for Google Documents. Allowed values are listed in Downloading Google Documents. | Yes | | text/plain |
Exporting | Google Spreadsheets export format | Select | MIME type for Google Spreadsheets. | Yes | | text/csv |
Exporting | Google Drawings export format | Select | MIME type for Google Drawings. | Yes | | image/svg+xml |
Exporting | Google Presentations export format | Select | MIME type for Google Presentations. | Yes | | text/plain |
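The following is a minimal sketch, not the plugin's implementation, of how the Directory identifier, Filter, Modification date range, and Maximum body partition size properties might be interpreted. It assumes the date range arrives as two RFC 3339 timestamps and that the filter uses the Drive API v3 search syntax (`in parents`, `modifiedTime`). The function and parameter names (`build_query`, `split_ranges`, `modified_after`, `modified_before`) are hypothetical.

```python
from typing import List, Optional, Tuple


def build_query(directory_id: str,
                user_filter: Optional[str] = None,
                modified_after: Optional[str] = None,
                modified_before: Optional[str] = None) -> str:
    """Combine the source properties into one Drive search (`q`) string."""
    # Restrict the search to direct children of the configured directory.
    clauses = [f"'{directory_id}' in parents"]
    if user_filter:
        # Parenthesize so an `or` in the user filter cannot bypass the folder clause.
        clauses.append(f"({user_filter})")
    # Assumption: the date range is supplied as two RFC 3339 timestamps.
    if modified_after:
        clauses.append(f"modifiedTime >= '{modified_after}'")
    if modified_before:
        clauses.append(f"modifiedTime <= '{modified_before}'")
    return " and ".join(clauses)


def split_ranges(file_size: int, max_partition_size: int = 0) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) byte ranges for one file body.

    A max_partition_size of 0 (the default) means unlimited, so the
    whole body becomes a single partition.
    """
    if file_size <= 0:
        return []
    if max_partition_size <= 0:
        return [(0, file_size - 1)]
    return [(start, min(start + max_partition_size, file_size) - 1)
            for start in range(0, file_size, max_partition_size)]
```

For example, `split_ranges(10, 4)` yields three partitions covering bytes 0 to 3, 4 to 7, and 8 to 9, while `split_ranges(10)` keeps the body in one partition.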
Sink
User Facing Name | Type | Description | Optional | Constraints |
---|---|---|---|---|
App Id | String | OAuth2 app ID | No | |
Access Token | String | OAuth2 access token | No | |
Directory identifier | String | ID of the directory, which is the last part of the folder URL, such as https://drive.google.com/drive/folders/0B2kqcwp2ycGZanhSR3JmREw5VTV | No | |
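Because the Directory identifier is described as the last part of the folder URL, a user who only has the URL can derive it mechanically. The sketch below shows one way to do that with the Python standard library. The helper name `directory_id_from_url` is hypothetical and is not part of the plugin.

```python
from urllib.parse import urlparse


def directory_id_from_url(url: str) -> str:
    """Return the last non-empty path segment of a Drive folder URL."""
    # urlparse separates the path from any query string or fragment.
    path = urlparse(url).path
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError(f"no path segments in URL: {url!r}")
    return segments[-1]
```

Applied to the example URL in the table, this returns `0B2kqcwp2ycGZanhSR3JmREw5VTV`. A trailing slash or a query string on the URL does not change the result.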
Design / Implementation Tips
- Tip #1
- Tip #2
Design
Approach(s)
Properties
Security
Limitation(s)
Future Work
- Some future work – HYDRATOR-99999
- Another future work – HYDRATOR-99999
Test Case(s)
- Test case #1
- Test case #2
Sample Pipeline
Please attach one or more sample pipeline(s) and associated data.
Pipeline #1
Pipeline #2
Checklist
- User stories documented
- User stories reviewed
- Design documented
- Design reviewed
- Feature merged
- Examples and guides
- Integration tests
- Documentation for feature
- Short video demonstrating the feature