Pipeline Microservices

Use the CDAP Pipeline Microservices to manage data pipelines and connections.

All methods or endpoints described in this API have a base URL (typically http://<host>:11015 or https://<host>:10443) that precedes the resource identifier, as described in the Microservices Conventions. These methods return a status code, as listed in the Microservices Status Codes.

Create or Update a Draft Pipeline

To create or update a pipeline in Draft mode in a namespace, submit an HTTP PUT request:

    PUT /v3/namespaces/system/apps/pipeline/services/studio/methods/v1/contexts/<context>/drafts/<draft>

| Parameter | Description |
| --- | --- |
| context | The namespace for the Draft pipelines. |
| draft | The Draft ID of the pipeline. The Draft ID can be any alphanumeric string. It must be unique in the namespace. |

The request body is a JSON object specifying the name, description, artifact, and configuration for the pipeline. Two artifacts are supported for pipelines: cdap-data-pipeline and cdap-data-streams.

For example:

    {
      "name": "Draft1",
      "description": "This is an example draft",
      "artifact": {
        "name": "cdap-data-pipeline",
        "version": "6.3.0-SNAPSHOT",
        "scope": "SYSTEM"
      },
      "config": {<pipeline-config>}
    }
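As a rough illustration of the request above, the following Python sketch builds (but does not send) the PUT request with the standard library. The host `localhost`, the namespace `default`, the Draft ID `draft1`, the `Content-Type` header, and the helper name `build_draft_put` are all assumptions made for the example; a secured instance would likely need authentication headers as well.

```python
import json
import urllib.request

# Assumption: a local instance reachable on the default HTTP port.
BASE_URL = "http://localhost:11015"
STUDIO = "/v3/namespaces/system/apps/pipeline/services/studio/methods/v1"


def build_draft_put(context, draft_id, body):
    """Build (but do not send) the PUT request that creates or updates a draft."""
    url = f"{BASE_URL}{STUDIO}/contexts/{context}/drafts/{draft_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="PUT",
    )


draft = {
    "name": "Draft1",
    "description": "This is an example draft",
    "artifact": {
        "name": "cdap-data-pipeline",
        "version": "6.3.0-SNAPSHOT",
        "scope": "SYSTEM",
    },
    # Placeholder: a real request carries the pipeline configuration here.
    "config": {},
}

request = build_draft_put("default", "draft1", draft)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before anything reaches the server.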

Delete a Draft Pipeline

To delete a pipeline in Draft mode in a namespace, submit an HTTP DELETE request:

    DELETE /v3/namespaces/system/apps/pipeline/services/studio/methods/v1/contexts/<context>/drafts/<draft>

| Parameter | Description |
| --- | --- |
| context | The namespace for the Draft pipeline. |
| draft | The Draft ID of the pipeline. |

Details of a Draft Pipeline

To list the details of a pipeline in Draft mode in a namespace, submit an HTTP GET request:

    GET /v3/namespaces/system/apps/pipeline/services/studio/methods/v1/contexts/<context>/drafts/<draft>

| Parameter | Description |
| --- | --- |
| context | The namespace for the Draft pipelines. |
| draft | The Draft ID of the pipeline. |

The information is returned in the body of the response. It includes the Draft ID, create time, update time, name, description, and revision number of the pipeline in Draft mode; the artifact that the pipeline uses; and the details of the pipeline configuration (pipeline JSON). For example:

    {
      "id": "d2a0cd35-89c2-40cb-87ac-91b77dc33730",
      "createdTimeMillis": 1610478516914,
      "updatedTimeMillis": 1610478516914,
      "configHash": -1944636578,
      "previousHash": "",
      "name": "POS_Sales_per_Region_v2",
      "description": "Data Pipeline Application",
      "revision": 0,
      "artifact": {
        "name": "cdap-data-pipeline",
        "version": "6.3.0-SNAPSHOT",
        "scope": "SYSTEM"
      },
      "config": {<pipeline-config>}
    }

List Draft Pipelines

To list all of the pipelines in Draft mode in a namespace, submit an HTTP GET request:

    GET /v3/namespaces/system/apps/pipeline/services/studio/methods/v1/contexts/<context>/drafts

| Path Parameter | Description |
| --- | --- |
| context | The namespace for the Draft pipelines. |

| Query Parameter | Description |
| --- | --- |
| sortBy | Names of the fields to sort by. These field names must match the Draft pipeline JSON. |
| filter | Takes any text and does prefix matching on the draft name field. |
| includeConfig | Includes the pipeline JSON for each draft in the response. The default value is false. |

The information is returned in the body of the response. It includes the Draft ID, create time, update time, name, description, and revision number of each pipeline in Draft mode, and the artifact that each pipeline uses. For example:

    [
      {
        "id": "eb0f17d9-ea25-44c9-9143-96e973372ecc",
        "createdTimeMillis": 1610481853854,
        "updatedTimeMillis": 1610481869299,
        "configHash": 0,
        "previousHash": "",
        "name": "POS_Sales_per_Region_v2",
        "description": "Pipeline for POS Sales for each Region",
        "revision": 0,
        "artifact": {
          "name": "cdap-data-pipeline",
          "version": "6.3.0-SNAPSHOT",
          "scope": "SYSTEM"
        }
      },
      {
        "id": "03ae53de-a2ec-4c4e-84bc-46d8519baffb",
        "createdTimeMillis": 1610482668794,
        "updatedTimeMillis": 1610482673341,
        "configHash": 0,
        "previousHash": "",
        "name": "SalesQ2",
        "description": "Pipeline for Total Q2 Sales",
        "revision": 0,
        "artifact": {
          "name": "cdap-data-pipeline",
          "version": "6.3.0-SNAPSHOT",
          "scope": "SYSTEM"
        }
      }
    ]
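The query parameters described above can be combined into a request URL along the following lines. This is a minimal Python sketch: the helper name `drafts_list_url`, the default base URL, and sending `includeConfig` as the literal string `true` are assumptions for illustration.

```python
from urllib.parse import urlencode

DRAFTS_PATH = (
    "/v3/namespaces/system/apps/pipeline/services/studio/methods/v1"
    "/contexts/{context}/drafts"
)


def drafts_list_url(context, sort_by=None, name_filter=None,
                    include_config=False,
                    base_url="http://localhost:11015"):  # assumed base URL
    """Return the GET URL for listing drafts, adding only the parameters given."""
    params = {}
    if sort_by:
        params["sortBy"] = sort_by          # must match a field in the draft JSON
    if name_filter:
        params["filter"] = name_filter      # prefix match on the draft name
    if include_config:
        params["includeConfig"] = "true"    # assumed boolean encoding
    url = base_url + DRAFTS_PATH.format(context=context)
    query = urlencode(params)
    return f"{url}?{query}" if query else url


print(drafts_list_url("default", sort_by="name", name_filter="POS"))
```

`urlencode` takes care of escaping, so filter text containing spaces or other reserved characters is encoded safely.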

Create a Connection

To create a connection in a namespace, submit an HTTP PUT request:

    PUT /v3/namespaces/system/apps/pipeline/services/studio/methods/v1/contexts/<context>/connections/<connection-id>

| Parameter | Description |
| --- | --- |
| context | The namespace for the connection. |
| connection-id | The Connection ID of the connection. The Connection ID can be any alphanumeric string. It must be unique in the namespace. |

The request body is a JSON object specifying the name, description, category, and plugin for the connection, including the plugin's properties and artifact. These are the supported artifacts for connections: google-cloud, database-plugin, mysql-plugin, oracle-plugin, postgresql-plugin, sqlserver-plugin.

For example:

    {
      "name": "test-mysql",
      "description": "",
      "category": "Database",
      "plugin": {
        "category": "Database",
        "name": "Mysql",
        "type": "connector",
        "properties": {
          "host": "localhost",
          "port": "3306",
          "jdbcPluginName": "mysql",
          "user": "joe",
          "password": "hello"
        },
        "artifact": {
          "scope": "SYSTEM",
          "name": "mysql-plugin",
          "version": "1.6.0-SNAPSHOT"
        }
      }
    }

List Connections

To list all of the connections in a namespace, submit an HTTP GET request:

    GET /v3/namespaces/system/apps/pipeline/services/studio/methods/v1/contexts/<context>/connections

| Path Parameter | Description |
| --- | --- |
| context | The namespace for the connections. |

The information is returned in the body of the response. It includes the Connection ID, create time, update time, name, and description of each connection; its plugin category, plugin type, and plugin properties; and the artifact that each connection uses. For example:

    [
      {
        "name": "sales-connection",
        "connectionId": "sales_connection",
        "connectionType": "GCS",
        "description": "Connection to access sales data in GCS",
        "preConfigured": false,
        "isDefault": false,
        "createdTimeMillis": 1627676547098,
        "updatedTimeMillis": 1627676547098,
        "plugin": {
          "category": "Google Cloud Platform",
          "name": "GCS",
          "type": "connector",
          "properties": {
            "project": "auto-detect",
            "serviceAccountType": "filePath",
            "serviceFilePath": "auto-detect"
          },
          "artifact": {
            "scope": "SYSTEM",
            "name": "google-cloud",
            "version": "0.18.0-SNAPSHOT"
          }
        }
      },
      {
        "name": "customers-connection",
        "connectionId": "customers_connection",
        "connectionType": "Mysql",
        "description": "Connection to access customer data in MySQL.",
        "preConfigured": false,
        "isDefault": false,
        "createdTimeMillis": 1627677148671,
        "updatedTimeMillis": 1627677148671,
        "plugin": {
          "category": "Database",
          "name": "Mysql",
          "type": "connector",
          "properties": {
            "host": "localhost",
            "port": "3306",
            "jdbcPluginName": "mysql",
            "user": "joe",
            "password": "hello"
          },
          "artifact": {
            "scope": "SYSTEM",
            "name": "mysql-plugin",
            "version": "1.6.0-SNAPSHOT"
          }
        }
      }
    ]

Details of a Connection

To list the details of a connection in a namespace, submit an HTTP GET request:

    GET /v3/namespaces/system/apps/pipeline/services/studio/methods/v1/contexts/<context>/connections/<connection-id>

| Parameter | Description |
| --- | --- |
| context | The namespace for the connection. |
| connection-id | The Connection ID of the connection. |

The information is returned in the body of the response. It includes the Connection ID, create time, update time, name, and description of the connection; its plugin category, plugin type, and plugin properties; and the artifact that the connection uses. For example:

    [
      {
        "name": "sales-connection",
        "connectionId": "sales_connection",
        "connectionType": "GCS",
        "description": "Connection to access sales data in GCS",
        "preConfigured": false,
        "isDefault": false,
        "createdTimeMillis": 1627676547098,
        "updatedTimeMillis": 1627676547098,
        "plugin": {
          "category": "Google Cloud Platform",
          "name": "GCS",
          "type": "connector",
          "properties": {
            "project": "auto-detect",
            "serviceAccountType": "filePath",
            "serviceFilePath": "auto-detect"
          },
          "artifact": {
            "scope": "SYSTEM",
            "name": "google-cloud",
            "version": "0.18.0-SNAPSHOT"
          }
        }
      }
    ]

Delete a Connection

To delete a connection in a namespace, submit an HTTP DELETE request:

    DELETE /v3/namespaces/system/apps/pipeline/services/studio/methods/v1/contexts/<context>/connections/<connection-id>

| Parameter | Description |
| --- | --- |
| context | The namespace for the connection. |
| connection-id | The Connection ID of the connection. |

Test a Connection

To test a connection in a namespace, submit an HTTP POST request:

    POST /v3/namespaces/system/apps/pipeline/services/studio/methods/v1/contexts/<context>/connections/test

| Parameter | Description |
| --- | --- |
| context | The namespace for the connection. |

The request body is a JSON object specifying the name, description, category, and plugin for the connection, including the plugin's properties and artifact. These are the supported artifacts for connections: google-cloud, database-plugin, mysql-plugin, oracle-plugin, postgresql-plugin, sqlserver-plugin.

For example:

    {
      "name": "test-mysql",
      "description": "",
      "category": "Database",
      "plugin": {
        "category": "Database",
        "name": "Mysql",
        "type": "connector",
        "properties": {
          "host": "localhost",
          "port": "3306",
          "jdbcPluginName": "mysql",
          "user": "joe",
          "password": "hello"
        },
        "artifact": {
          "scope": "SYSTEM",
          "name": "mysql-plugin",
          "version": "1.6.0-SNAPSHOT"
        }
      }
    }
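Because the test endpoint accepts the same body as the create endpoint, one possible workflow is to test a connection first and create it only if the test succeeds. The Python sketch below illustrates that idea under several assumptions: the base URL, the `Content-Type` header, the helper names, and the rule that an HTTP 200 status means success are all illustrative and not taken from this reference. The sender is passed in as a parameter so the flow can be tried without a running server.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://localhost:11015"  # assumption: local instance, default port
CONNECTIONS_PATH = (
    "/v3/namespaces/system/apps/pipeline/services/studio/methods/v1"
    "/contexts/{context}/connections"
)


def json_request(method, url, body):
    """Build a request carrying a JSON body."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method=method,
    )


def send_with_urllib(request):
    """Send the request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def create_if_test_passes(context, connection_id, body, send=send_with_urllib):
    """POST the body to .../connections/test, then PUT it only on success."""
    root = BASE_URL + CONNECTIONS_PATH.format(context=context)
    if send(json_request("POST", f"{root}/test", body)) != 200:
        return False  # assumption: any status other than 200 is a failed test
    return send(json_request("PUT", f"{root}/{connection_id}", body)) == 200


body = {
    "name": "test-mysql",
    "description": "",
    "category": "Database",
    "plugin": {
        "category": "Database",
        "name": "Mysql",
        "type": "connector",
        "properties": {"host": "localhost", "port": "3306",
                       "jdbcPluginName": "mysql", "user": "joe",
                       "password": "hello"},
        "artifact": {"scope": "SYSTEM", "name": "mysql-plugin",
                     "version": "1.6.0-SNAPSHOT"},
    },
}

# Dry run with a stand-in sender, so nothing goes over the network.
seen = []


def fake_send(request):
    seen.append((request.get_method(), request.full_url))
    return 200


created = create_if_test_passes("default", "test_mysql", body, send=fake_send)
print(created, [method for method, _ in seen])
```

Calling `create_if_test_passes` without the `send` argument would use `urllib` to contact the server directly.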

Browse a Connection

To browse a connection in a namespace, submit an HTTP POST request:

    POST /v3/namespaces/system/apps/pipeline/services/studio/methods/v1/contexts/<context>/connections/<connection-id>/browse

| Parameter | Description |
| --- | --- |
| context | The namespace for the connection. |
| connection-id | The Connection ID of the connection. |

The request body is a JSON object specifying the path to browse and any additional properties.

For example:

    {
      "path": "/database/schema/table",
      "properties" : {}
    }

If you are browsing a database that doesn’t have a schema, such as MySQL, enter the database name for the path. For example, if the MySQL database name is mydb, enter:

    {
      "path": "/mydb",
      "properties" : {}
    }

For information about setting the path for each type of connection, see the Connection Reference.
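The two path forms shown above can be assembled from their components with a small helper. This Python sketch is illustrative only: the function names `connection_path` and `browse_body` are hypothetical, and the valid components for other connection types depend on the Connection Reference.

```python
import json


def connection_path(*parts):
    """Join components such as database, schema, and table into a path.

    Empty components are skipped, so a database without a schema
    (MySQL, for instance) yields just "/<database>".
    """
    cleaned = [part.strip("/") for part in parts if part]
    return "/" + "/".join(cleaned)


def browse_body(*parts, properties=None):
    """Build the JSON body for a browse request."""
    return {"path": connection_path(*parts), "properties": properties or {}}


print(json.dumps(browse_body("mydb")))
print(json.dumps(browse_body("database", "schema", "table")))
```

The first call produces the MySQL-style body with the path `/mydb`, and the second produces the three-level path `/database/schema/table`.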

Get Sample Results for a Connection

To get sample results for a connection in a namespace, submit an HTTP POST request:

    POST /v3/namespaces/system/apps/pipeline/services/studio/methods/v1/contexts/<context>/connections/<connection-id>/sample

| Parameter | Description |
| --- | --- |
| context | The namespace for the connection. |
| connection-id | The Connection ID of the connection. |

The request body is a JSON object specifying the maximum number of records to return (limit), the path to sample, and any additional properties.

For example:

    {
      "limit": 1000,
      "path": "/database/schema/table",
      "properties" : {}
    }

For information about setting the path for each type of connection, see the Connection Reference.

Get JSON Specification for a Connection

To get the JSON specification for a connection in a namespace, submit an HTTP POST request:

    POST /v3/namespaces/system/apps/pipeline/services/studio/methods/v1/contexts/<context>/connections/<connection-id>/specification

| Parameter | Description |
| --- | --- |
| context | The namespace for the connection. |
| connection-id | The Connection ID of the connection. |

The request body is a JSON object specifying the path for which to generate the specification and any additional properties.

For example:

    {
      "path": "/database/schema/table",
      "properties" : {}
    }

For information about setting the path for each type of connection, see the Connection Reference.

Call GetSchema from REST API

This topic describes the process for getting the output schema for plugins using the REST API.

The GetSchema button sends a POST request to this endpoint:

    POST /v3/namespaces/system/apps/pipeline/services/studio/methods/v1/contexts/{namespace}/validations/stage

This endpoint expects a payload containing the stage to validate and the inputSchemas for that stage. You can copy the stage object directly from the pipeline JSON. In this example, inputSchemas is left empty because the stage is a source. Example payload:

    {
      "stage": {
        "name": "BigQuery",
        "plugin": {
          "name": "BigQueryTable",
          "type": "batchsource",
          "label": "BigQuery",
          "artifact": {
            "name": "google-cloud",
            "version": "0.14.6",
            "scope": "SYSTEM"
          },
          "properties": {
            "project": "auto-detect",
            "serviceFilePath": "auto-detect",
            "datasetProject": "meseifan-test",
            "dataset": "GCPQuickStart",
            "table": "ADT_Out",
            "referenceName": "test"
          }
        },
        "outputSchema": [
          {
            "name": "etlSchemaBody",
            "schema": ""
          }
        ]
      },
      "inputSchemas": []
    }

The response is a fully populated stage object. One of its fields is "outputSchema", which contains the schema of the input you are attempting to use. Example response:

    {
      "spec": {
        "name": "BigQuery",
        "plugin": {
          "type": "batchsource",
          "name": "BigQueryTable",
          "properties": {
            "serviceFilePath": "auto-detect",
            "project": "auto-detect",
            "datasetProject": "meseifan-test",
            "dataset": "GCPQuickStart",
            "table": "ADT_Out",
            "referenceName": "test"
          },
          "artifact": {
            "name": "google-cloud",
            "version": {
              "version": "0.14.6",
              "major": 0,
              "minor": 14,
              "fix": 6
            },
            "scope": "SYSTEM"
          }
        },
        "outputSchema": {
          "type": "record",
          "name": "output",
          "fields": [
            { "name": "SRC_NTW_ID_", "type": [ "string", "null" ] },
            { "name": "NTW_CUST_NUM_", "type": [ "long", "null" ] },
            { "name": "NTW_CS_NO", "type": [ "string", "null" ] },
            { "name": "SRC_NTW_ID_masked", "type": [ "string", "null" ] },
            { "name": "NTW_CS_NO_masked", "type": [ "string", "null" ] }
          ]
        },
        "inputSchemas": {},
        "outputPorts": {},
        "portSchemas": {},
        "stageLoggingEnabled": true,
        "processTimingEnabled": true,
        "maxPreviewRecords": 100,
        "inputStages": [],
        "failures": []
      }
    }
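Once a response like the one above has been received, the field names and types can be pulled out of "outputSchema". The following Python sketch assumes the response has already been parsed into a dictionary and that the schema has the record shape shown in the example; the helper name `output_fields` is hypothetical, and handling a string-encoded schema is a defensive assumption rather than documented behavior.

```python
import json


def output_fields(response):
    """Return (name, type, nullable) tuples from a validations/stage response."""
    schema = response["spec"]["outputSchema"]
    if isinstance(schema, str):
        # Assumption: tolerate a schema delivered as a JSON-encoded string.
        schema = json.loads(schema)
    rows = []
    for field in schema.get("fields", []):
        field_type = field["type"]
        nullable = False
        if isinstance(field_type, list):
            # A union such as ["string", "null"] marks a nullable field.
            nullable = "null" in field_type
            non_null = [t for t in field_type if t != "null"]
            field_type = non_null[0] if len(non_null) == 1 else non_null
        rows.append((field["name"], field_type, nullable))
    return rows


# Trimmed copy of the example response, keeping two of its fields.
sample = {
    "spec": {
        "name": "BigQuery",
        "outputSchema": {
            "type": "record",
            "name": "output",
            "fields": [
                {"name": "SRC_NTW_ID_", "type": ["string", "null"]},
                {"name": "NTW_CUST_NUM_", "type": ["long", "null"]},
            ],
        },
        "failures": [],
    }
}

for name, field_type, nullable in output_fields(sample):
    print(f"{name}: {field_type}{' (nullable)' if nullable else ''}")
```

It may also be worth checking the "failures" list in the response before relying on the schema.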