Introduction

Cloud Vision plugins will allow users to use pre-trained Vision API models to detect emotion, understand text, and more. They will be useful in enriching data with additional attributes such as labels, faces, etc.

NOTE: Using these plugins incurs additional charges for the Cloud Vision API.

Use case(s)

As a user, I want to extract various features from my images and documents using the Cloud Vision API, so that I can add ML-driven enrichments to my Data Fusion pipelines that process unstructured data.
As a user, I want easy, UI-driven ways of manipulating and understanding the output of the Cloud Vision API, so that I do not need to write any code for parsing it.

User Story(s)

Plugin Type

Batch Source
Batch Sink
Real-time Source
Real-time Sink
Transform
Action
Post-Run Action
Aggregate
Join
Spark Model
Spark Compute

Configurables

File Path Batch Source

This source will read a directory, and instead of emitting records from files in the directory, it will emit all the file names as records. It should work for object stores as well.

Section	User Facing Name	Type	Description	Optional	Default
Basic	Service Account File path	text	Path to service account file on	Yes	None
	Project ID	text	GCP Project ID	Yes	auto-detect
	Path	textbox	The path to the directory where the files whose paths are to be emitted are located	No
	Recursive	toggle	Whether the plugin should recursively traverse the directory for subdirectories	Yes	False
	Last Modified After	date-time picker	A way to filter files to be returned based on their last modified timestamp	Yes	1/1/1970 (epoch)
Advanced	Split by	radio button	Determines splitting mechanisms. Choose amongst default (uses the default splitting mechanism of file input format), batch size (by number of files in a batch), directory (by each sub directory)	Yes	default
Advanced	Batch size	number	Specifies the number of files to process in a single batch. Only required when Split By is set to batch size.
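
The listing behaviour described by the Path, Recursive, Last Modified After and Batch size rows can be sketched roughly as follows. This is only an illustration against a local filesystem, not the plugin implementation: the function names `list_file_paths` and `split_by_batch_size` are hypothetical, object stores are not covered, and "Last Modified After" is assumed to mean strictly later than the given timestamp.

```python
import os
from datetime import datetime, timezone
from typing import Iterator, List, Optional, Sequence


def list_file_paths(root: str, recursive: bool = False,
                    modified_after: Optional[datetime] = None) -> Iterator[str]:
    """Yield file paths under root instead of file contents (hypothetical helper)."""
    cutoff = modified_after.timestamp() if modified_after is not None else None
    for dirpath, dirnames, filenames in os.walk(root):
        if recursive:
            dirnames.sort()   # deterministic traversal order
        else:
            dirnames.clear()  # prevents os.walk from descending into subdirectories
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # ASSUMPTION: keep only files modified strictly after the cutoff.
            if cutoff is None or os.path.getmtime(path) > cutoff:
                yield path


def split_by_batch_size(paths: Sequence[str], batch_size: int) -> List[List[str]]:
    """Group paths into batches of at most batch_size files (Split by = batch size)."""
    if batch_size < 1:
        raise ValueError("batch size must be at least 1")
    return [list(paths[i:i + batch_size]) for i in range(0, len(paths), batch_size)]


if __name__ == "__main__":
    # Each emitted record carries the path in a 'path' field.
    for file_path in list_file_paths(".", recursive=True):
        print({"path": file_path})
```

Emitting the path under a `path` field matches the default Path Field name expected by the extractor plugins described later.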

Image Extractor transform

The image extractor transform can be used in conjunction with the file path batch source to extract enrichments from each image based on selected features.

It should send all errors to the error port.

Section	User Facing Name	Type	Description	Optional	Default
Basic	Access Token	Password	Authentication token for Cloud Vision API.	No
	Path Field	Input Field Selector	Field in the input schema containing the path to the image. Defaults to 'path'	Yes	path
	Features	function-dropdown-with-alias	The features to extract from images, specified as a combination of feature type, max results, and model. Select from Faces, Text, Handwriting, Crop Hints, Image properties, Labels, Landmarks, Logos, Multiple Objects, Explicit Content, Web Detection, Product Search, Object Localization	No
Advanced	Language Hints	multi-select	Optional hints to provide to Cloud Vision API in case it has trouble detecting the language of the text in the images. Only shown when the Text feature is selected. Select from supported languages	Yes	None/Empty
	Aspect Ratios	multi-select	Aspect ratios as a decimal number, representing the ratio of the width to the height of the image. For example, if the desired aspect ratio is 4/3, the corresponding float value should be 1.33333. Only shown when Crop Hints is selected as a feature. If not specified, the best possible crop is returned. The number of provided aspect ratios is limited to a maximum of 16; any aspect ratios provided after the 16th are ignored.	Yes	None
	Include Geo Results	toggle	Whether to include results derived from the geo information in the image. Only shown when Web Detection is selected as a feature	Yes	false
	Bounding Polygon	JSON	The bounding polygon for the image detection. Only shown when Product Search is selected as a feature. Should be a JSON in this format	Yes	None
	Product Set	text	The resource name of a `ProductSet` to be searched for similar images. Only shown when Product Search is selected as a feature. Format is: `projects/PROJECT_ID/locations/LOC_ID/productSets/PRODUCT_SET_ID`.	Yes	None
	Product Categories	select	The list of product categories to search in. Select one of "homegoods", "apparel", or "toys". Only shown when Product Search is selected as a feature.	Yes	None
	Filter	text	The filtering expression. This can be used to restrict search results based on Product labels. We currently support an AND of OR of key-value expressions, where each expression within an OR must have the same key. An '=' should be used to connect the key and value. Only shown when Product Search is selected as a feature. For example, "(color = red OR color = blue) AND brand = Google" is acceptable, but "(color = red OR brand = Google)" is not acceptable. "color: red" is not acceptable because it uses a ':' instead of an '='.	Yes	None
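
As a rough illustration of how the configurables above might translate into a Cloud Vision `images:annotate` request body, consider the sketch below. The function name `build_annotate_request` and the mapping from user-facing feature names to API feature types are assumptions made for this example; the JSON field names follow the public REST API as best understood here and should be checked against the API reference. No request is sent.

```python
from typing import Any, Dict, Optional, Sequence, Tuple

# ASSUMPTION: mapping of user-facing feature names to Vision API feature types.
FEATURE_TYPES = {
    "Faces": "FACE_DETECTION",
    "Text": "TEXT_DETECTION",
    "Handwriting": "DOCUMENT_TEXT_DETECTION",
    "Crop Hints": "CROP_HINTS",
    "Image properties": "IMAGE_PROPERTIES",
    "Labels": "LABEL_DETECTION",
    "Landmarks": "LANDMARK_DETECTION",
    "Logos": "LOGO_DETECTION",
    "Explicit Content": "SAFE_SEARCH_DETECTION",
    "Web Detection": "WEB_DETECTION",
    "Product Search": "PRODUCT_SEARCH",
    "Object Localization": "OBJECT_LOCALIZATION",
}
MAX_ASPECT_RATIOS = 16  # ratios after the 16th are ignored, so drop them up front


def build_annotate_request(image_uri: str,
                           features: Sequence[Tuple[str, int]],
                           language_hints: Optional[Sequence[str]] = None,
                           aspect_ratios: Optional[Sequence[float]] = None,
                           include_geo_results: bool = False) -> Dict[str, Any]:
    """Build the JSON body for one image (hypothetical helper)."""
    context: Dict[str, Any] = {}
    if language_hints:
        context["languageHints"] = list(language_hints)
    if aspect_ratios:
        context["cropHintsParams"] = {
            "aspectRatios": list(aspect_ratios)[:MAX_ASPECT_RATIOS]
        }
    if include_geo_results:
        context["webDetectionParams"] = {"includeGeoResults": True}

    request: Dict[str, Any] = {
        "image": {"source": {"imageUri": image_uri}},
        "features": [
            {"type": FEATURE_TYPES[name], "maxResults": max_results}
            for name, max_results in features
        ],
    }
    if context:
        request["imageContext"] = context
    return {"requests": [request]}


if __name__ == "__main__":
    # A 4/3 aspect ratio is passed as the float 4 / 3 (about 1.33333).
    body = build_annotate_request("gs://example-bucket/photo.png",
                                  [("Crop Hints", 1), ("Labels", 10)],
                                  aspect_ratios=[4 / 3])
    print(body)
```

Truncating the aspect ratio list on the client side keeps the plugin's behaviour aligned with the documented limit rather than relying on the service to discard the extras.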

Online Document Text Extractor transform

This transform plugin can detect and transcribe text from small (up to 5 pages) PDF and TIFF files stored in Cloud Storage, synchronously (online).

Section	User Facing Name	Type	Description	Optional	Default
Basic	Access Token	Password	Authentication token for Cloud Vision API.	No
	Path Field	Input field selector	Field in the input schema containing the path to the image. Defaults to 'path'	Yes	path
	Content Field	Input field selector	Field in the input schema containing the file content, represented as a stream of bytes. Defaults to 'content'.	Yes	content
	Mime Type	select	Choose one of "application/pdf", "image/tiff", or "image/gif"	Yes	application/pdf
	Features	function-dropdown-with-alias	The features to extract from documents, specified as a combination of feature type, max results, and model. Select from Faces, Text, Handwriting, Crop Hints, Image properties, Labels, Landmarks, Logos, Multiple Objects, Explicit Content, Web Detection, Product Search, Object Localization	No
	Pages	multi-select	The pages in the file to perform image annotation. Select from 1-5.	Yes	1,2,3,4,5
Advanced	Language Hints	multi-select	Optional hints to provide to Cloud Vision API in case it has trouble detecting the language of the text in the images. Only shown when the Text feature is selected. Select from supported languages	Yes	None/Empty
	Aspect Ratios	multi-select	Aspect ratios as a decimal number, representing the ratio of the width to the height of the image. For example, if the desired aspect ratio is 4/3, the corresponding float value should be 1.33333. Only shown when Crop Hints is selected as a feature. If not specified, the best possible crop is returned. The number of provided aspect ratios is limited to a maximum of 16; any aspect ratios provided after the 16th are ignored.	Yes	None
	Include Geo Results	toggle	Whether to include results derived from the geo information in the image. Only shown when Web Detection is selected as a feature	Yes	false
	Bounding Polygon	JSON	The bounding polygon for the image detection. Only shown when Product Search is selected as a feature. Should be a JSON in this format	Yes	None
	Product Set	text	The resource name of a `ProductSet` to be searched for similar images. Only shown when Product Search is selected as a feature. Format is: `projects/PROJECT_ID/locations/LOC_ID/productSets/PRODUCT_SET_ID`.	Yes	None
	Product Categories	select	The list of product categories to search in. Select one of "homegoods", "apparel", or "toys". Only shown when Product Search is selected as a feature.	Yes	None
	Filter	text	The filtering expression. This can be used to restrict search results based on Product labels. We currently support an AND of OR of key-value expressions, where each expression within an OR must have the same key. An '=' should be used to connect the key and value. Only shown when Product Search is selected as a feature. For example, "(color = red OR color = blue) AND brand = Google" is acceptable, but "(color = red OR brand = Google)" is not acceptable. "color: red" is not acceptable because it uses a ':' instead of an '='.	Yes	None
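
The Path Field, Content Field, Mime Type and Pages settings could map onto a synchronous `files:annotate` request body along the lines of the sketch below. The helper name `build_online_file_request` is hypothetical, the choice of `DOCUMENT_TEXT_DETECTION` as the default feature is an assumption, and the field names reflect the public REST API as understood here; nothing is sent over the network.

```python
import base64
from typing import Any, Dict, Optional, Sequence

SUPPORTED_MIME_TYPES = ("application/pdf", "image/tiff", "image/gif")
MAX_ONLINE_PAGES = 5  # the online transform handles at most 5 pages


def build_online_file_request(mime_type: str = "application/pdf",
                              pages: Sequence[int] = (1, 2, 3, 4, 5),
                              gcs_uri: Optional[str] = None,
                              content: Optional[bytes] = None,
                              feature_type: str = "DOCUMENT_TEXT_DETECTION"
                              ) -> Dict[str, Any]:
    """Build a request body from either a GCS path or inline bytes (hypothetical helper)."""
    if mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"unsupported mime type: {mime_type}")
    if not 1 <= len(pages) <= MAX_ONLINE_PAGES:
        raise ValueError("between 1 and 5 pages must be selected")
    if (gcs_uri is None) == (content is None):
        raise ValueError("provide exactly one of gcs_uri or content")

    input_config: Dict[str, Any] = {"mimeType": mime_type}
    if gcs_uri is not None:
        # Value read from the Path Field of the input record.
        input_config["gcsSource"] = {"uri": gcs_uri}
    else:
        # Value read from the Content Field; JSON bodies carry bytes as base64.
        input_config["content"] = base64.b64encode(content).decode("ascii")

    return {"requests": [{
        "inputConfig": input_config,
        "features": [{"type": feature_type}],
        "pages": list(pages),
    }]}


if __name__ == "__main__":
    print(build_online_file_request(gcs_uri="gs://example-bucket/doc.pdf",
                                    pages=[1, 2]))
```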

Offline Document Text Extractor action

This action plugin can detect and transcribe text from large (up to 2,000 pages) PDF and TIFF files stored in Cloud Storage, asynchronously.

Section	User Facing Name	Type	Description	Optional	Default
Basic	Access Token	Password	Authentication token for Cloud Vision API.	No
	Source Path	Text	Path to the location of the directory on GCS where the input files are stored	Yes	path
	Destination Path	Text	Path to the location of the directory on GCS where output files should be stored	Yes	content
	Mime Type	select	Choose one of "application/pdf", "image/tiff", or "image/gif"	Yes	application/pdf
	Features	function-dropdown-with-alias	The features to extract from documents, specified as a combination of feature type, max results, and model. Select from Faces, Text, Handwriting, Crop Hints, Image properties, Labels, Landmarks, Logos, Multiple Objects, Explicit Content, Web Detection, Product Search, Object Localization	No
	Pages	multi-select	The pages in the file to perform image annotation. Select from 1-5.	Yes	1,2,3,4,5
	Batch size	number	The max number of responses to put into each output JSON file on Google Cloud Storage. The valid range is [1, 100]. If not specified, the default value is 20.	Yes	20
Advanced	Language Hints	multi-select	Optional hints to provide to Cloud Vision API in case it has trouble detecting the language of the text in the images. Only shown when the Text feature is selected. Select from supported languages	Yes	None/Empty
	Aspect Ratios	multi-select	Aspect ratios as a decimal number, representing the ratio of the width to the height of the image. For example, if the desired aspect ratio is 4/3, the corresponding float value should be 1.33333. Only shown when Crop Hints is selected as a feature. If not specified, the best possible crop is returned. The number of provided aspect ratios is limited to a maximum of 16; any aspect ratios provided after the 16th are ignored.	Yes	None
	Include Geo Results	toggle	Whether to include results derived from the geo information in the image. Only shown when Web Detection is selected as a feature	Yes	false
	Bounding Polygon	JSON	The bounding polygon for the image detection. Only shown when Product Search is selected as a feature. Should be a JSON in this format	Yes	None
	Product Set	text	The resource name of a `ProductSet` to be searched for similar images. Only shown when Product Search is selected as a feature. Format is: `projects/PROJECT_ID/locations/LOC_ID/productSets/PRODUCT_SET_ID`.	Yes	None
	Product Categories	select	The list of product categories to search in. Select one of "homegoods", "apparel", or "toys". Only shown when Product Search is selected as a feature.	Yes	None
	Filter	text	The filtering expression. This can be used to restrict search results based on Product labels. We currently support an AND of OR of key-value expressions, where each expression within an OR must have the same key. An '=' should be used to connect the key and value. Only shown when Product Search is selected as a feature. For example, "(color = red OR color = blue) AND brand = Google" is acceptable, but "(color = red OR brand = Google)" is not acceptable. "color: red" is not acceptable because it uses a ':' instead of an '='.	Yes	None
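
For the asynchronous case, the Source Path, Destination Path and Batch size settings might be turned into a `files:asyncBatchAnnotate` request body roughly as sketched here. The helper name `build_async_file_request` is hypothetical, and the sketch assumes the action expands the source directory into individual file URIs and creates one request entry per file; field names follow the public REST API as understood here.

```python
from typing import Any, Dict, Sequence

MIN_BATCH_SIZE, MAX_BATCH_SIZE = 1, 100  # valid range for responses per output file


def build_async_file_request(source_uris: Sequence[str],
                             destination_uri: str,
                             mime_type: str = "application/pdf",
                             feature_types: Sequence[str] = ("DOCUMENT_TEXT_DETECTION",),
                             batch_size: int = 20) -> Dict[str, Any]:
    """Build an async request body with one entry per input file (hypothetical helper)."""
    if not MIN_BATCH_SIZE <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError("batch size must be in the range [1, 100]")
    requests = []
    for uri in source_uris:
        requests.append({
            "inputConfig": {"gcsSource": {"uri": uri}, "mimeType": mime_type},
            "features": [{"type": t} for t in feature_types],
            # Output JSON files are written under the destination prefix,
            # each holding at most batch_size responses.
            "outputConfig": {
                "gcsDestination": {"uri": destination_uri},
                "batchSize": batch_size,
            },
        })
    return {"requests": requests}


if __name__ == "__main__":
    print(build_async_file_request(["gs://example-bucket/in/report.pdf"],
                                   "gs://example-bucket/out/"))
```

Validating the batch size before submitting the job lets the action fail fast at configuration time instead of waiting for the service to reject the request.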

Offline Image Extractor action

This action plugin can asynchronously extract features from images stored on GCS, and store the extracted output on GCS.

Section	User Facing Name	Type	Description	Optional	Default
Basic	Access Token	Password	Authentication token for Cloud Vision API.	No
	Source Path	Text	Path to the location of the directory on GCS where the input files are stored	Yes	path
	Destination Path	Text	Path to the location of the directory on GCS where output files should be stored	Yes	content
	Features	function-dropdown-with-alias	The features to extract from images, specified as a combination of feature type, max results, and model. Select from Faces, Text, Handwriting, Crop Hints, Image properties, Labels, Landmarks, Logos, Multiple Objects, Explicit Content, Web Detection, Product Search, Object Localization	No
Advanced	Language Hints	multi-select	Optional hints to provide to Cloud Vision API in case it has trouble detecting the language of the text in the images. Only shown when the Text feature is selected. Select from supported languages	Yes	None/Empty
	Aspect Ratios	multi-select	Aspect ratios as a decimal number, representing the ratio of the width to the height of the image. For example, if the desired aspect ratio is 4/3, the corresponding float value should be 1.33333. Only shown when Crop Hints is selected as a feature. If not specified, the best possible crop is returned. The number of provided aspect ratios is limited to a maximum of 16; any aspect ratios provided after the 16th are ignored.	Yes	None
	Include Geo Results	toggle	Whether to include results derived from the geo information in the image. Only shown when Web Detection is selected as a feature	Yes	false
	Bounding Polygon	JSON	The bounding polygon for the image detection. Only shown when Product Search is selected as a feature. Should be a JSON in this format	Yes	None
	Product Set	text	The resource name of a `ProductSet` to be searched for similar images. Only shown when Product Search is selected as a feature. Format is: `projects/PROJECT_ID/locations/LOC_ID/productSets/PRODUCT_SET_ID`.	Yes	None
	Product Categories	select	The list of product categories to search in. Select one of "homegoods", "apparel", or "toys". Only shown when Product Search is selected as a feature.	Yes	None
	Filter	text	The filtering expression. This can be used to restrict search results based on Product labels. We currently support an AND of OR of key-value expressions, where each expression within an OR must have the same key. An '=' should be used to connect the key and value. Only shown when Product Search is selected as a feature. For example, "(color = red OR color = blue) AND brand = Google" is acceptable, but "(color = red OR brand = Google)" is not acceptable. "color: red" is not acceptable because it uses a ':' instead of an '='.	Yes	None
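
The Filter rules stated in the tables above (an AND of OR groups, one key per OR group, '=' between key and value) lend themselves to a client-side check before the request is sent. The sketch below encodes only those stated rules; the function name `is_valid_filter` is hypothetical, and the grammar actually accepted by the service may be more permissive than this.

```python
import re

# One "key = value" term; the value is a bare token or a double-quoted string.
_TERM = re.compile(r'^\s*([A-Za-z_][\w-]*)\s*=\s*("[^"]*"|[^\s()=:"]+)\s*$')


def is_valid_filter(expression: str) -> bool:
    """Check a Product Search filter against the rules in the Filter row."""
    for group in re.split(r"\s+AND\s+", expression.strip()):
        group = group.strip()
        if group.startswith("(") and group.endswith(")"):
            group = group[1:-1]          # unwrap a parenthesised OR group
        elif "(" in group or ")" in group:
            return False                 # unbalanced or misplaced parenthesis
        keys = set()
        for term in re.split(r"\s+OR\s+", group):
            match = _TERM.match(term)
            if match is None:
                return False             # e.g. "color: red" uses ':' not '='
            keys.add(match.group(1))
        if len(keys) != 1:
            return False                 # every term in an OR must share one key
    return True


if __name__ == "__main__":
    for example in ("(color = red OR color = blue) AND brand = Google",
                    "(color = red OR brand = Google)",
                    "color: red"):
        print(example, "->", is_valid_filter(example))
```

Surfacing an invalid filter during pipeline validation gives the user a clearer message than an API error raised mid-run.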

Design / Implementation Tips

Tip #1
Tip #2

Design

Approach(s)

Properties

Security

Limitation(s)

Future Work

Some future work – HYDRATOR-99999
Another future work – HYDRATOR-99999

Test Case(s)

Test case #1
Test case #2

Sample Pipeline

Please attach one or more sample pipeline(s) and associated data.

Pipeline #1

Pipeline #2

Table of Contents

Checklist

User stories documented
User stories reviewed
Design documented
Design reviewed
Feature merged
Examples and guides
Integration tests
Documentation for feature
Short video demonstrating the feature
