Datasets parameters (cdap-site.xml and cdap-default.xml)

Parameter Name

Default Value

Version Introduced

Description

data.local.storage

${local.data.dir}/ldb

 

Database directory for LevelDB, used for data fabric in CDAP Local Sandbox.

data.local.storage.compression.enabled

true

 

Whether compression is enabled for data fabric when in CDAP Local Sandbox.

data.local.storage.blocksize

1024

 

Block size in bytes for data fabric when in CDAP Local Sandbox.

data.local.storage.cachesize

104857600

 

Cache size in bytes for data fabric when in CDAP Local Sandbox.
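
The four data.local.storage settings above could be overridden together in cdap-site.xml. The fragment below is a minimal sketch, assuming the standard property/name/value layout of cdap-site.xml; the directory path and the block and cache sizes are illustrative values, not recommendations.

```xml
<!-- cdap-site.xml: illustrative overrides for the Local Sandbox LevelDB store -->
<configuration>
  <property>
    <name>data.local.storage</name>
    <!-- Hypothetical path; the default is ${local.data.dir}/ldb -->
    <value>/var/cdap/sandbox/ldb</value>
  </property>
  <property>
    <name>data.local.storage.compression.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>data.local.storage.blocksize</name>
    <!-- Illustrative: 4096 bytes instead of the default 1024 -->
    <value>4096</value>
  </property>
  <property>
    <name>data.local.storage.cachesize</name>
    <!-- Illustrative: 209715200 bytes (200 MiB) instead of the default 104857600 -->
    <value>209715200</value>
  </property>
</configuration>
```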

data.event.topic

dataevent

 

Topic name for publishing data events to the messaging system.

data.storage.implementation

nosql

 

The database implementation CDAP will use.

data.storage.extensions.dir

/opt/cdap/master/ext/storageproviders

 

 

data.storage.sql.jdbc.driver.external

true

 

Indicates whether the JDBC driver has to be loaded from an external directory. If true, then the JDBC driver directory has to be specified using data.storage.sql.jdbc.driver.directory. If false, then the JDBC driver is present in the CDAP classpath. This config can only be used when the storage implementation is postgresql.

data.storage.sql.jdbc.driver.directory

/opt/cdap/master/ext/jdbc

 

The base directory for storing JDBC driver jars. The sub-directory whose name matches the value of the "data.storage.implementation" setting is searched for the JDBC driver and its dependency jars, which are used to connect to the configured SQL instance. The JDBC driver class to load must be specified using "data.storage.sql.jdbc.driver.name". This config can only be used when the storage implementation is postgresql.

data.storage.sql.jdbc.driver.name

 

 

The JDBC driver class name used to connect to the SQL instance. The JDBC URL, username, password, and connection properties can be set using cdap-security.xml.

data.storage.sql.jdbc.connection.url

 

 

The JDBC URL used to connect to the SQL instance. Do not include sensitive information in the JDBC URL; the username and password can be specified in cdap-security.xml. Non-sensitive connection properties can be specified by adding a property whose name is the prefix "data.storage.sql.jdbc.property." followed by the SQL property name.

data.storage.sql.scan.size.rows

100

6.6.0

The number of rows fetched for database reads from PostgreSQL.

data.storage.sql.jdbc.connection.pool.size

800

 

The maximum number of connections in the SQL connection pool.
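
The data.storage.* settings above work together when PostgreSQL is the storage implementation. The fragment below is a sketch under stated assumptions: the host name, port, and database name in the JDBC URL are hypothetical, org.postgresql.Driver is the driver class shipped with the PostgreSQL JDBC driver, and connectTimeout is used only as an example of a non-sensitive driver property passed through the "data.storage.sql.jdbc.property." prefix. The username and password are deliberately absent, since they belong in cdap-security.xml.

```xml
<!-- cdap-site.xml: illustrative PostgreSQL storage configuration -->
<configuration>
  <property>
    <name>data.storage.implementation</name>
    <value>postgresql</value>
  </property>
  <property>
    <name>data.storage.sql.jdbc.driver.external</name>
    <value>true</value>
  </property>
  <property>
    <name>data.storage.sql.jdbc.driver.directory</name>
    <!-- Driver jars are looked up in the sub-directory named after
         data.storage.implementation, here /opt/cdap/master/ext/jdbc/postgresql -->
    <value>/opt/cdap/master/ext/jdbc</value>
  </property>
  <property>
    <name>data.storage.sql.jdbc.driver.name</name>
    <value>org.postgresql.Driver</value>
  </property>
  <property>
    <name>data.storage.sql.jdbc.connection.url</name>
    <!-- Hypothetical host and database; no credentials in the URL -->
    <value>jdbc:postgresql://db.example.internal:5432/cdap</value>
  </property>
  <property>
    <!-- Assumed example of a non-sensitive driver property -->
    <name>data.storage.sql.jdbc.property.connectTimeout</name>
    <value>10</value>
  </property>
  <property>
    <name>data.storage.sql.scan.size.rows</name>
    <value>100</value>
  </property>
  <property>
    <name>data.storage.sql.jdbc.connection.pool.size</name>
    <!-- Illustrative: smaller than the default of 800 -->
    <value>200</value>
  </property>
</configuration>
```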

data.tx.enabled

true

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Determines if the transaction service is enabled.

data.tx.bind.address

0.0.0.0

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Transaction service bind address.

data.tx.bind.port

0

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Transaction service bind port; if 0, binds to a random port.

data.tx.changeset.count.limit

2147483647

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Hard limit for the number of entries in a transaction's change set; if exceeded, the transaction fails. By default, this is unlimited (that is, Integer.MAX_VALUE).

data.tx.changeset.count.warn.threshold

50000

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Soft limit for the number of entries in a transaction's change set; if exceeded, a warning is logged.

data.tx.changeset.size.limit

9223372036854775807

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Hard limit for the aggregate size in bytes of a transaction's change set; if exceeded, the transaction fails. By default, this is unlimited (that is, Long.MAX_VALUE).

data.tx.changeset.size.warn.threshold

5000000

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Soft limit for the aggregate size in bytes of a transaction's change set; if exceeded, a warning is logged.
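
The four change set settings above form two pairs: a hard limit that fails the transaction and a soft threshold that only logs a warning. The fragment below is a sketch of replacing the unlimited defaults with explicit hard limits; the limit values are illustrative assumptions, and the warning thresholds are left at their defaults.

```xml
<!-- cdap-site.xml: illustrative change set limits (deprecated parameters) -->
<configuration>
  <property>
    <name>data.tx.changeset.count.limit</name>
    <!-- Illustrative hard limit: fail transactions with more than 1,000,000 entries -->
    <value>1000000</value>
  </property>
  <property>
    <name>data.tx.changeset.count.warn.threshold</name>
    <!-- Default soft limit: log a warning above 50,000 entries -->
    <value>50000</value>
  </property>
  <property>
    <name>data.tx.changeset.size.limit</name>
    <!-- Illustrative hard limit: 104857600 bytes (100 MiB) -->
    <value>104857600</value>
  </property>
  <property>
    <name>data.tx.changeset.size.warn.threshold</name>
    <!-- Default soft limit: log a warning above 5,000,000 bytes -->
    <value>5000000</value>
  </property>
</configuration>
```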

data.tx.client.count

50

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

The number of pooled instances of the transaction client; increase this to increase transaction concurrency.

data.tx.client.provider

pool

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Provider strategy for transaction clients; valid values are "pool" and "thread-local".

data.tx.discovery.service.name

transaction

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Name in discovery service for the transaction service.

data.tx.grace.period

86400

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Time in seconds used to pad transaction maximum lifetime while pruning.

data.tx.hdfs.user

${hdfs.user}

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

User name for accessing HDFS (if not running in secure HDFS).

data.tx.janitor.enable

true

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Determines if the TransactionDataJanitor coprocessor is enabled on tables; normally should be true.

data.tx.max.instances

${master.service.max.instances}

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Maximum number of transaction service instances. Increasing the number of transaction service instances improves availability only, not scalability.

data.tx.max.timeout

600

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

The limit for the allowed transaction timeout, in seconds. Attempts to start a transaction with a longer timeout will fail.

data.tx.memory.mb

${master.service.memory.mb}

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Memory in megabytes for each transaction service instance.

data.tx.num.cores

${master.service.num.cores}

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Number of virtual cores for the transaction service.

data.tx.num.instances

1

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Requested number of transaction service instances.

data.tx.prune.enable

false

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Enable invalid transaction list pruning.

data.tx.prune.plugins

data.tx.pruning.plugin

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

List of transaction pruning plugins, used for CDAP HBase tables that rely on transaction functionality to skip or clean invalid data.

data.tx.prune.state.table

${dataset.table.prefix}_system:tephra.state

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Table used to store intermediate state when invalid transaction list pruning is enabled.

data.tx.pruning.plugin.class

io.cdap.data2.txprune.DefaultHBaseTransactionPruningPlugin

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Class name for the default transaction pruning plugin.

data.tx.retain.client.id

committed

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Whether and how long to retain the client id of a transaction. Valid values are: "off" to disable retention of the client id; "active" to retain the client id until a transaction is committed; or "committed" to retain the client id as long as its change set participates in conflict detection. Retaining the client id slightly increases the memory footprint of the transaction service. Client ids are never retained past a restart or fail-over of the transaction manager.

data.tx.server.io.threads

2

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Number of IO threads for the transaction service.

data.tx.server.threads

25

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Number of threads for the transaction service.

data.tx.snapshot.codecs

org.apache.tephra.snapshot.SnapshotCodecV3, org.apache.tephra.snapshot.SnapshotCodecV4

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Specifies the class names of all supported transaction state codecs.

data.tx.snapshot.dir

${hdfs.namespace}/tx.snapshot

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Directory in HDFS used to store snapshots and logs of transaction state.

data.tx.snapshot.interval

60

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Frequency of transaction snapshots in seconds.

data.tx.snapshot.local.dir

${local.data.dir}/tx.snapshot

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Directory on the local filesystem used to store snapshots and logs of transaction state when in CDAP Local Sandbox.

data.tx.snapshot.retain

10

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Number of transaction snapshot files to retain as backups.

data.tx.thrift.max.read.buffer

${thrift.max.read.buffer}

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Maximum read buffer size in bytes used by the transaction service; the value should be set to something greater than the maximum frame sent on the RPC channel.

data.tx.timeout

30

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Timeout value in seconds for a transaction; if the transaction is not finished in that time, it is marked invalid.
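
The default transaction timeout (data.tx.timeout) is bounded by the maximum allowed timeout (data.tx.max.timeout), since attempts to start a transaction with a timeout above the maximum fail. The fragment below is a sketch of raising the default while staying under the maximum; the value 60 is an illustrative assumption.

```xml
<!-- cdap-site.xml: illustrative transaction timeout settings (deprecated parameters) -->
<configuration>
  <property>
    <name>data.tx.timeout</name>
    <!-- Illustrative: 60 seconds instead of the default 30 -->
    <value>60</value>
  </property>
  <property>
    <name>data.tx.max.timeout</name>
    <!-- Default upper bound in seconds; data.tx.timeout stays below it -->
    <value>600</value>
  </property>
</configuration>
```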

dataset.data.dir

data

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Base directory for user data on the filesystem.

dataset.executor.bind.address

0.0.0.0

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Dataset executor HTTP service bind address.

dataset.executor.bind.port

0

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Dataset executor bind port; if 0, binds to a random port.

dataset.executor.container.instances

1

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Number of dataset executor instances.

dataset.executor.container.memory.mb

${master.service.memory.mb}

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Memory in megabytes for each dataset executor instance.

dataset.executor.container.num.cores

1

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Number of virtual cores for each dataset executor instance.

dataset.executor.max.instances

${master.service.max.instances}

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Maximum number of dataset executor instances.

dataset.extensions.dir

/opt/cdap/ext/lib

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Directory where all dataset extensions are stored.

dataset.service.bind.port

0

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Dataset service bind port; if 0, binds to a random port.

dataset.service.boss.threads

1

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Number of Netty service boss threads for the dataset service.

dataset.service.connection.backlog

20000

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Maximum connection backlog of the dataset service.

dataset.service.exec.threads

30

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Number of Netty service executor threads for the dataset service.

dataset.service.output.dir

/datasets

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Directory where all dataset module archives are stored.

dataset.service.worker.threads

10

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Number of Netty service worker threads for the dataset service.

dataset.table.prefix

${root.namespace}

 

Note: This parameter is deprecated and will be removed in CDAP 7.0.0.

Prefix for dataset table name.

Created in 2020 by Google Inc.