Using Secure Keys

In CDAP, you can use secure keys to store sensitive information in a secure and encrypted manner. You might use secure keys for a passphrase, cryptographic key, access token, or any other data that needs to be stored securely.

You can use secure keys in the CDAP Sandbox, in-memory CDAP, and Distributed CDAP. The basic steps for creating secure keys are the same for all three versions of CDAP. The main difference is that CDAP Sandbox and in-memory CDAP use the Sun JCEKS implementation for storing secure keys, whereas Distributed CDAP uses Hadoop KMS (Key Management Server)-backed secure storage. For more information, see Secure Storage.

The secure keys framework is pluggable, and you can also build your own secure key implementation to store the data in encrypted storage of your choice. For example, Cloud Data Fusion, which is the managed version of CDAP on GCP, uses Cloud KMS to store secure keys.

Creating Secure Keys

Secure keys are stored in the namespace where you create them and are unique to that namespace. You cannot share or copy secure keys across namespaces.

After you create a secure key, you can use the secure key in a pipeline or a compute profile.

Creating a Secure Key (CDAP Sandbox and in-memory CDAP)

To create a secure key, complete the following steps:

  1. To configure CDAP to use secure storage, edit the cdap-site.xml and cdap-security.xml files.
    For more information, see the “File-backed Secure Storage” section.

  2. To create a secure key, use the Secure Storage HTTP RESTful API. For more information, see the “Add a Secure Key” section.

  3. After you create the secure key, you can use it in pipelines or a compute profile.

Creating a Secure Key (Distributed CDAP)

  1. Configure CDAP to use Hadoop KMS. For more information on integration with Hadoop KMS, see Apache Hadoop Key Management Server (KMS).

  2. To configure CDAP to use secure storage, edit the cdap-site.xml and cdap-security.xml files.
    For more information, see the “Hadoop Key Management Server-backed Secure Storage” section.

  3. To create a secure key, use the Secure Storage HTTP RESTful API. For more information, see the “Add a Secure Key” section.

  4. After you create the secure key, you can use it in pipelines or a compute profile.

Using Secure Keys

You can use secure keys in plugins and compute profiles. For pipelines, you add the secure key as a macro in any plugin that requires authentication. Likewise, you can add it to a namespace compute profile or system compute profile. For example, you might create a compute profile for a Dataproc cluster and use a secure key for the Service Account Key in the compute profile.

Example: Using a Secure Key in a Pipeline

You’re using CDAP and want to create a pipeline that reads from a source database. You also want to create a secure key for your database password. It’s easy to do this in CDAP. The following steps walk you through this example.

To add a secure key to a pipeline, complete the following steps:

  1. In CDAP, open the HTTP interface. Click System Admin > Configuration > Make HTTP Calls.

  2. Issue the following PUT command:
    namespaces/<namespace-id>/securekeys/<secure-key-name>
    with a JSON-formatted body that contains the description of the key, the data (password) to be stored under the key, and a map of descriptive properties associated with the key (these can help you identify the key in the future):

    { "description": "Example Secure Key", "data": "<secure-contents>", "properties": { "<property-key>": "<property-value>" } }

    For example, the following PUT command creates a secure key called mykey with a password of test123.
    PUT namespaces/default/securekeys/mykey

    JSON body:
    {
      "description": "Example Secure Key",
      "data": "test123",
      "properties": {
        "<property-key>": "<property-value>"
      }
    }

    The Status Code: 200 means you successfully created the secure key to store the encrypted password for the database. CDAP saves the secure key in the secure store located in the store folder under your CDAP Sandbox installation directory.
    Now that you’ve created a secure key, you can add it as a macro in the Password field of a Database plugin.

  3. In the Pipeline Studio, create a pipeline with a Database Batch Source plugin.

  4. Click the Database plugin Properties button.

  5. In the Password field, click the macro button and add a secure macro. The secure macro has the format ${secure(<secure-key-name>)}. In this example, add ${secure(mykey)}.

  6. Build the rest of the pipeline and run it.
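Step 2 above can also be scripted instead of issued from the Make HTTP Calls page. The following is a minimal sketch, not an official client: the function name build_secure_key_request is hypothetical, and the base URL assumes a local CDAP Sandbox whose router listens on port 11015 and serves the REST API under /v3. Adjust both for your deployment. The sketch only builds the request, so you can inspect it before sending it.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: local CDAP Sandbox, router on port 11015, REST API under /v3.
BASE_URL = "http://localhost:11015/v3"


def build_secure_key_request(namespace, key_name, data, description="",
                             properties=None, base_url=BASE_URL):
    """Build (but do not send) the PUT request that creates a secure key."""
    body = {
        "description": description,
        "data": data,
        "properties": properties or {},
    }
    # Percent-encode the path segments so unusual names cannot break the URL.
    url = "{}/namespaces/{}/securekeys/{}".format(
        base_url, quote(namespace, safe=""), quote(key_name, safe="")
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    request = build_secure_key_request(
        "default", "mykey", "test123", description="Example Secure Key"
    )
    print(request.get_method(), request.full_url)
    # To send it against a running CDAP instance:
    # urllib.request.urlopen(request)
```

Keeping the request construction separate from the network call makes it easy to review the URL and body first, and keeps the password out of any interactive HTTP tool history.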

Example: Adding a Secure Key to a Compute Profile for a Dataproc Cluster

You’re using CDAP and want to use Dataproc as your provisioner. You also want to create a secure key for your Service Account in your compute profile for the Dataproc cluster.

Note: When you add a secure key to a system compute profile, CDAP applies the key when you run a pipeline. The key must exist in the namespace where you are running the pipeline. If the key doesn’t exist in the namespace, the pipeline will fail.

Step 1: Download and Convert the Service Account JSON to a String

Before you add a secure key to the compute profile, download the Service Account JSON and convert it to a valid JSON string: a single line with all quotes escaped. You need the service account information in string format to populate the "data": "<secure-contents>" property when you use the HTTP PUT command to create the key.

To download and convert the Service Account JSON to a string, complete the following steps:

  1. In the Cloud Console, go to the Service Accounts page.

  2. Under Select a recent project, select the project you want to create a secure key for.

  3. Select the service account email and under Actions, select Create Key.

  4. To download the service account key, select JSON.

    The service account JSON is downloaded to your local drive.

  5. Convert the JSON file to a valid JSON string: a single line with all quotes escaped.
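The conversion in step 5 can be done with a few lines of Python rather than by hand. The following is a minimal sketch using only the standard json module. The function name to_escaped_json_string is hypothetical, and the sample content is a placeholder, not a real key.

```python
import json


def to_escaped_json_string(service_account_json_text):
    """Return the key file contents as one escaped JSON string literal."""
    # Parsing first fails fast if the downloaded file is not valid JSON.
    parsed = json.loads(service_account_json_text)
    # Re-serialize on a single line, without pretty-printing whitespace.
    single_line = json.dumps(parsed, separators=(",", ":"))
    # Serializing the string itself escapes every quote and backslash and
    # adds the surrounding double quotes.
    return json.dumps(single_line)


if __name__ == "__main__":
    # Placeholder content standing in for the downloaded file's text.
    sample = """{
      "type": "service_account",
      "project_id": "XXXXXXXXXXXXXX"
    }"""
    print(to_escaped_json_string(sample))
```

The returned value already includes its surrounding double quotes, so it can be pasted directly after "data": in the request body.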

Step 2: Create a Secure Key

  1. In CDAP, open the HTTP interface. Click System Admin > Configuration > Make HTTP Calls.

  2. Issue the following PUT command:
    namespaces/<namespace-id>/securekeys/<secure-key-name>
    with a JSON-formatted body that contains the description of the key, the data (the service account JSON that you converted to a string) to be stored under the key, and a map of descriptive properties associated with the key (these can help you identify the key in the future):

    { "description": "Example Secure Key", "data": "<secure-contents>", "properties": { "<property-key>": "<property-value>" } }

    For example, the following PUT command creates a secure key called mykey with the service account data:

    PUT namespaces/default/securekeys/mykey

    JSON body:
    {
      "description": "Example Secure Key",
      "data": "{ \"type\": \"service_account\", \"project_id\": \"XXXXXXXXXXXXXX\", \"private_key_id\": \"XXXXXXXXXXXXX\", \"private_key\": \"-----BEGIN PRIVATE KEY-----\\nXXXXXXXXXXXXXXXXXX\\n-----END PRIVATE KEY-----\\n\", \"client_email\": \"1234567889-compute@developer.gserviceaccount.com\", \"client_id\": \"265283765238456238456234524857234\", \"auth_uri\": \"https://accounts.google.com/o/oauth2/auth\", \"token_uri\": \"https://oauth2.googleapis.com/token\", \"auth_provider_x509_cert_url\": \"https://www.googleapis.com/oauth2/v1/certs\", \"client_x509_cert_url\": \"https://www.googleapis.com/robot/v1/metadata/x509/1234567889-compute%40developer.gserviceaccount.com\"}",
      "properties": {
        "<property-key>": "<property-value>"
      }
    }

The Status Code: 200 means you successfully created the secure key to store the encrypted service account information in the compute profile. CDAP saves the secure key in the secure store located in the store folder under your CDAP Sandbox installation directory.

Now that you’ve created a secure key, you can add it as a macro in the Service Account field in a compute profile.
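Because the service account JSON is nested inside another JSON document, escaping mistakes are easy to make. Before you send the request in Step 2, it may help to check the body locally. The following is a minimal sketch: the function name check_secure_key_body is hypothetical, and the list of expected fields is taken from the example body in this section. Treating those fields as required is an assumption of the sketch, not a documented CDAP rule.

```python
import json

# Fields that appear in the example body; requiring them is an assumption.
EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_secure_key_body(body_text):
    """Return a list of problems found in a request body (empty if none)."""
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError as err:
        return ["body is not valid JSON: {}".format(err)]
    if not isinstance(body, dict):
        return ["body must be a JSON object"]

    data = body.get("data")
    if not isinstance(data, str):
        return ['"data" must be a string holding the escaped JSON']

    # The string stored under "data" must itself parse as JSON.
    try:
        account = json.loads(data)
    except json.JSONDecodeError as err:
        return ['"data" does not contain valid JSON: {}'.format(err)]
    if not isinstance(account, dict):
        return ['"data" must contain a JSON object']

    return ["missing field: {}".format(name)
            for name in EXPECTED_FIELDS if name not in account]


if __name__ == "__main__":
    # "data" given as an object instead of an escaped string.
    print(check_secure_key_body('{"data": {"type": "service_account"}}'))
```

An empty list means the body parsed at both levels and the nested document contains the expected fields.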

Step 3: Add the Secure Key to the Compute Profile

You can add the secure key to the compute profile in the following places in CDAP:

  • Default Namespace Compute Profile

  • Custom Namespace Compute Profile

  • System Compute Profile

Adding a Secure Key to the Default Namespace or Custom Namespace Compute Profile

To add a secure key in the default namespace or custom namespace compute profile, complete the following steps:

  1. In the Pipeline Studio, click the hamburger menu and select the down arrow next to default namespace.

  2. To edit the default namespace, click the default area.

    To edit a custom namespace, click in the custom namespace area. For example, if you have a namespace called “test”, click in the test namespace area.

  3. In the Namespace configuration page, under Compute Profiles, click Create New.

  4. To add a secure key to a new Dataproc cluster, select Dataproc.

  5. On the Create a profile for Dataproc page, go to the Service Account Key field and click the shield icon.

  6. Select the secure key. In this example, it’s mykey.

    CDAP adds the secure key as a macro: ${secure(mykey)}

  7. To save the secure key in the compute profile, click Create.

Adding a Secure Key to the System Compute Profile

When you add a secure key to the system compute profile, CDAP applies the key when you run a pipeline. The key must exist in the namespace where you are running the pipeline. If the key doesn’t exist in the namespace, the pipeline will fail.
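Because a missing key only shows up as a failed pipeline run, it can be useful to check ahead of time which secure keys a profile references. The following is a minimal sketch: the helper names are hypothetical, the regular expression matches the ${secure(<secure-key-name>)} format used in this document, and the list of keys that exist in the namespace is assumed to be obtained separately.

```python
import re

# Matches the ${secure(<secure-key-name>)} macro format.
SECURE_MACRO = re.compile(r"\$\{secure\(([^)]+)\)\}")


def referenced_secure_keys(text):
    """Return the sorted, de-duplicated key names used in secure macros."""
    return sorted({name.strip() for name in SECURE_MACRO.findall(text)})


def missing_secure_keys(profile_text, keys_in_namespace):
    """Return referenced key names that are absent from the namespace."""
    available = set(keys_in_namespace)
    return [name for name in referenced_secure_keys(profile_text)
            if name not in available]


if __name__ == "__main__":
    profile_field = "${secure(mykey)}"
    # The namespace here is assumed to contain only a key named "otherkey".
    print(missing_secure_keys(profile_field, ["otherkey"]))
```

Running the check against each namespace that uses the system compute profile shows where the key still needs to be created.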

To add a secure key to the system compute profile, complete the following steps:

  1. Click System Admin > Configuration > System Compute Profiles.

  2. Click Create New Profile.

  3. To add a secure key to a new Dataproc cluster, select Dataproc.

  4. To add the secure key, in the Service Account Key field, click the shield icon and click Specify a different secure key.

  5. In the box that appears, type the name of the key. In this example, type mykey and press Enter.

    CDAP adds the secure key as a macro: ${secure(mykey)}

  6. To save the secure key in the compute profile, click Create.

Created in 2020 by Google Inc.