The Salesforce Multi Objects batch source plugin is available in the Hub.

Plugin version: 1.5.0

This source reads multiple SObjects from Salesforce. The data to read is specified by a list of SObjects and by incremental or range date filters. The source outputs one record for each row in the SObjects it reads, and each record contains an additional field that holds the name of the SObject the record came from. In addition, for each SObject that is read, the plugin sets a pipeline argument whose key is multisink.[SObjectName] and whose value is the schema of the SObject.

Configuration

Property

Macro Enabled?

Version Introduced

Description

Reference Name

No

Required. Used to uniquely identify this source for lineage, annotating metadata, etc.

Use Connection (yes/no toggle)

No

1.5.0

Optional. Use an existing connection.

Browse Connections

Yes

1.5.0

Optional. Select an existing connection to use or add a new connection.

Username

Yes

Required. Salesforce username.

Password

Yes

Required. Salesforce password. To enable API access, the Salesforce API requires a "security token" to be appended to the value in the Password field.

The Salesforce API can also be used to verify that the query and token work as expected.

If you do not include the security token, the following error occurs when you click Get Schema: "Connection to salesforce with plugin configurations failed".

Security Token

Yes

Optional. Salesforce security token. If the password does not contain the security token, the plugin appends the token before authenticating with Salesforce.

Consumer Key

Yes

Required. Application Consumer Key. This is also known as the OAuth client ID. A Salesforce connected application must be created to get a consumer key.

Consumer Secret

Yes

Required. Application Consumer Secret. This is also known as the OAuth client secret. A Salesforce connected application must be created to get a consumer secret.

Login Url

Yes

Required. Salesforce OAuth2 login URL.

Default is https://login.salesforce.com/services/oauth2/token
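The connection properties above map naturally onto the form fields of Salesforce's OAuth 2.0 username-password flow. The following is a minimal sketch of such a token request, not the plugin's actual code: the function name `build_token_request` is hypothetical, and treating the token as "contained" when the password already ends with it is an assumption. The request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

DEFAULT_LOGIN_URL = "https://login.salesforce.com/services/oauth2/token"


def build_token_request(username, password, consumer_key, consumer_secret,
                        security_token="", login_url=DEFAULT_LOGIN_URL):
    """Build (but do not send) a POST request for an OAuth2 access token."""
    # Assumption: the token is appended unless the password already ends with it.
    if security_token and not password.endswith(security_token):
        password += security_token
    form = {
        "grant_type": "password",          # username-password flow
        "client_id": consumer_key,         # Consumer Key
        "client_secret": consumer_secret,  # Consumer Secret
        "username": username,
        "password": password,
    }
    return Request(login_url, data=urlencode(form).encode("utf-8"), method="POST")
```

Passing the resulting request to `urllib.request.urlopen` would perform the login; a failure at this step usually points to a missing or stale security token.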

Connection Timeout

Yes

1.4.4

Optional. Maximum time in milliseconds to wait for connection initialization before timing out.

Default is 30000 milliseconds.

Proxy URL

Yes

1.4.5

Optional. Proxy URL. Must contain a protocol, address and port.
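A quick way to check that a proxy URL has all three required parts before running the pipeline is sketched below. The helper name `is_valid_proxy_url` is hypothetical, and the plugin's own validation may be stricter or differ in detail.

```python
from urllib.parse import urlsplit


def is_valid_proxy_url(url: str) -> bool:
    """Return True if the URL has a protocol, an address, and a port."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError for a non-numeric or out-of-range port
    except ValueError:
        return False
    return bool(parts.scheme) and bool(parts.hostname) and port is not None
```

For example, `http://proxy.example.com:8080` passes, while `http://proxy.example.com` (no port) and `proxy.example.com:8080` (no protocol) do not.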

Whitelist

Yes

Optional. List of SObjects to read from. By default, all SObjects are whitelisted. For each whitelisted SObject, a SOQL query of the following form is generated: select <FIELD_1, FIELD_2, ..., FIELD_N> from ${sObjectName}.

Blacklist

Yes

Optional. List of SObjects to avoid reading from. By default, no SObjects are blacklisted.
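The interplay of the Whitelist and Blacklist properties, and the query generated per SObject, can be illustrated with a short sketch. The function names are hypothetical, and letting the blacklist win when an SObject appears in both lists is an assumption, not documented plugin behavior.

```python
def select_sobjects(available, whitelist=None, blacklist=None):
    """Pick the SObjects to read, preserving the order of `available`."""
    # An empty whitelist means "all SObjects", as described above.
    allowed = set(whitelist) if whitelist else set(available)
    # Assumption: an SObject present in both lists is excluded.
    blocked = set(blacklist or [])
    return [name for name in available if name in allowed and name not in blocked]


def build_soql(sobject_name, field_names):
    """Build a query of the form: select FIELD_1, ..., FIELD_N from sObjectName."""
    return "SELECT {} FROM {}".format(", ".join(field_names), sobject_name)
```

With `available = ["Account", "Contact", "Lead"]` and a blacklist of `["Lead"]`, the source would read Account and Contact, issuing one query per SObject.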

Last Modified After

Yes

Optional. Filter data to only include records where the system field LastModifiedDate is greater than or equal to the specified date. The date must be provided in the Salesforce date format. If no value is provided, no lower bound for LastModifiedDate is applied.

See below for Salesforce date format examples.

Last Modified Before

Yes

Optional. Filter data to only include records where the system field LastModifiedDate is less than the specified date. The date must be provided in the Salesforce date format. Specifying this along with Last Modified After allows reading data modified within a specific time window. If no value is provided, no upper bound for LastModifiedDate is applied.

See below for Salesforce date format examples.
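The two date bounds translate into a SOQL filter with an inclusive lower bound and an exclusive upper bound. The sketch below shows one way such a clause could be assembled; the function name `last_modified_clause` is hypothetical and the plugin's generated query may be formatted differently. SOQL datetime literals are written without quotes.

```python
def last_modified_clause(after=None, before=None):
    """Build a WHERE clause for the Last Modified After/Before properties."""
    conditions = []
    if after:
        # Lower bound is inclusive (greater than or equal to).
        conditions.append(f"LastModifiedDate >= {after}")
    if before:
        # Upper bound is exclusive (strictly less than).
        conditions.append(f"LastModifiedDate < {before}")
    # No bounds configured means no filter at all.
    return " WHERE " + " AND ".join(conditions) if conditions else ""
```

Supplying both values yields a half-open time window, so consecutive runs with adjoining windows do not read the same record twice.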

Duration

Yes

Optional. Filter data to only include records that were last modified within a time window of the specified size. For example, if the duration is ‘6 hours’ and the pipeline runs at 9am, it will read data that was last updated from 3am (inclusive) to 9am (exclusive). The duration is specified using numbers and time units:

  • Seconds

  • Minutes

  • Hours

  • Days

  • Months

  • Years

Several units can be specified, but each unit can only be used once. For example, 2 days, 1 hours, 30 minutes. The duration is ignored if a value is already specified for Last Modified After or Last Modified Before.
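The duration syntax can be made concrete with a small parser. This is an illustrative sketch only: the function name `parse_duration` is hypothetical, and accepting unit names case-insensitively in their plural form is an assumption based on the examples above.

```python
import re

UNITS = ("seconds", "minutes", "hours", "days", "months", "years")


def parse_duration(text):
    """Parse a string such as '2 days, 1 hours, 30 minutes' into {unit: amount}."""
    result = {}
    for part in text.split(","):
        match = re.fullmatch(r"\s*(\d+)\s+([A-Za-z]+)\s*", part)
        if not match:
            raise ValueError(f"Cannot parse duration part: {part!r}")
        amount, unit = int(match.group(1)), match.group(2).lower()
        if unit not in UNITS:
            raise ValueError(f"Unknown time unit: {unit!r}")
        # Each unit can only be used once.
        if unit in result:
            raise ValueError(f"Time unit used more than once: {unit!r}")
        result[unit] = amount
    return result
```

For instance, `parse_duration("2 days, 1 hours, 30 minutes")` gives a mapping with days, hours, and minutes, while `"1 hours, 2 hours"` is rejected.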

Offset

Yes

Optional. Filter data to only read records where the system field LastModifiedDate is less than the logical start time of the pipeline minus the given offset. For example, if duration is ‘6 hours’ and the offset is ‘1 hours’, and the pipeline runs at 9am, data last modified between 2am (inclusive) and 8am (exclusive) will be read. The offset is specified using numbers and time units:

  • Seconds

  • Minutes

  • Hours

  • Days

  • Months

  • Years

Several units can be specified, but each unit can only be used once. For example, 2 days, 1 hours, 30 minutes. The offset is ignored if a value is already specified for Last Modified After or Last Modified Before.
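The way Duration and Offset combine into a read window can be checked with simple date arithmetic. The sketch below uses a hypothetical helper `read_window` and covers only fixed-length units (seconds through days); months and years would need calendar-aware arithmetic.

```python
from datetime import datetime, timedelta


def read_window(logical_start, duration, offset=timedelta(0)):
    """Return (start, end): start is inclusive, end is exclusive."""
    # The window ends at the logical start time minus the offset...
    end = logical_start - offset
    # ...and reaches back by the configured duration.
    start = end - duration
    return start, end


# Pipeline runs at 9am with a duration of 6 hours and an offset of 1 hour.
run_time = datetime(2020, 1, 1, 9, 0)
start, end = read_window(run_time, timedelta(hours=6), timedelta(hours=1))
print(start.time(), end.time())  # 02:00:00 08:00:00
```

Without an offset, the same call returns the 3am to 9am window from the Duration example.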

SOQL Operation Type

No

Optional. Specify the query operation to run on the table. If query is selected, only current records will be returned. If queryAll is selected, all current and deleted records will be returned.

Default operation is query.

SObject Name Field

No

Optional. The name of the field that holds the SObject name. Must not be the name of any SObject column that will be read. Defaults to tablename.
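The effect of the SObject Name Field property on each output record can be pictured as follows. This is a sketch with a hypothetical helper `tag_record` operating on plain dictionaries, not the plugin's record type.

```python
def tag_record(record, sobject_name, name_field="tablename"):
    """Return a copy of the record with the SObject name added."""
    # The field must not clash with a column that is read from the SObject.
    if name_field in record:
        raise ValueError(f"Field {name_field!r} already exists in the record")
    tagged = dict(record)
    tagged[name_field] = sobject_name
    return tagged
```

Calling `tag_record({"id": 0, "name": "Samuel"}, "Account")` adds `"tablename": "Account"` to the record and leaves the input unchanged.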

Salesforce Date Format Examples

| Format | Format Syntax | Example |
| --- | --- | --- |
| Date, time, and time zone offset | YYYY-MM-DDThh:mm:ss+hh:mm | 1999-01-01T23:01:01+01:00 |
| | YYYY-MM-DDThh:mm:ss-hh:mm | 1999-01-01T23:01:01-08:00 |
| | YYYY-MM-DDThh:mm:ssZ | 1999-01-01T23:01:01Z |
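To check a date value before entering it in Last Modified After or Last Modified Before, all three formats can be parsed with the Python standard library (3.7 or later). The helper name `parse_salesforce_datetime` is hypothetical.

```python
from datetime import datetime


def parse_salesforce_datetime(value):
    """Parse a date, time, and time zone offset in the formats shown above."""
    # %z accepts "+hh:mm", "-hh:mm", and "Z" on Python 3.7 and later.
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S%z")
```

A malformed value raises `ValueError`, which makes this a convenient pre-flight check for macro-supplied dates.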

Example

There are two SObjects of interest in Salesforce. The first SObject is named ‘Account’ and contains:

| id | name | email |
| --- | --- | --- |
| 0 | Samuel | sjax@example.net |
| 1 | Alice | a@example.net |

The second is named ‘Activity’ and contains:

| id | item | action |
| --- | --- | --- |
| 0 | shirt123 | view |
| 0 | carxyz | view |
| 0 | shirt123 | buy |
| 0 | coffee | view |
| 1 | cola | buy |

To read data from these two SObjects, both must be listed in the Whitelist configuration property.

The output of the source will be the following records:

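| id | name | email | tablename |
| --- | --- | --- | --- |
| 0 | Samuel | sjax@example.net | Account |
| 1 | Alice | a@example.net | Account |

| id | item | action | tablename |
| --- | --- | --- | --- |
| 0 | shirt123 | view | Activity |
| 0 | carxyz | view | Activity |
| 0 | shirt123 | buy | Activity |
| 0 | coffee | view | Activity |
| 1 | cola | buy | Activity |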

The plugin will emit two pipeline arguments to provide the multisink plugin with the schema of the output records:

multisink.Account =
  {
    "type": "record",
    "name": "output",
    "fields": [
      { "name": "Id", "type": "long" } ,
      { "name": "Name", "type": "string" },
      { "name": "Email", "type": [ "string", "null" ] }
    ]
  }
multisink.Activity =
  {
    "type": "record",
    "name": "output",
    "fields": [
      { "name": "Id", "type": "long" } ,
      { "name": "Item", "type": "string" },
      { "name": "Action", "type": "string" }
    ]
  }
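A downstream consumer could recover the per-SObject schemas from these arguments along the following lines. This is a sketch under assumptions: the arguments are modeled as a plain dictionary of strings, and the helper name `schemas_from_arguments` is hypothetical.

```python
import json


def schemas_from_arguments(arguments, prefix="multisink."):
    """Map each SObject name to its parsed schema from the pipeline arguments."""
    return {
        key[len(prefix):]: json.loads(value)
        for key, value in arguments.items()
        if key.startswith(prefix)  # ignore unrelated pipeline arguments
    }
```

For the example above, the result would have the keys `Account` and `Activity`, each holding a record schema with a `fields` list.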

Data Type Mapping

| Salesforce Data Type | CDAP Schema Data Type |
| --- | --- |
| _bool | bool |
| _int | int |
| _long | long |
| _double, currency, percent | double |
| date | date |
| datetime | timestamp (microseconds) |
| time | time (microseconds) |
| picklist | string |
| multipicklist | string |
| combobox | string |
| reference | string |
| base64 | string |
| textarea | string |
| phone | string |
| id | string |
| url | string |
| email | string |
| encryptedstring | string |
| datacategorygroupreference | string |
| location | string |
| address | string |
| anyType | string |
| json | string |
| complexvalue | string |
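The mapping above can be expressed as a simple lookup, which may help when predicting the output schema for a given SObject. This is a sketch: the names `SALESFORCE_TO_CDAP` and `cdap_type` are hypothetical, and falling back to string for types not listed in the table is an assumption.

```python
# Non-string mappings taken from the table above.
SALESFORCE_TO_CDAP = {
    "_bool": "bool",
    "_int": "int",
    "_long": "long",
    "_double": "double",
    "currency": "double",
    "percent": "double",
    "date": "date",
    "datetime": "timestamp (microseconds)",
    "time": "time (microseconds)",
}


def cdap_type(salesforce_type):
    """Look up the CDAP schema type for a Salesforce data type."""
    # Assumption: every other type, listed or not, is read as a string.
    return SALESFORCE_TO_CDAP.get(salesforce_type, "string")
```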
