Namespaces

Overview

A namespace is a logical grouping of applications, data, and their metadata in CDAP. Conceptually, namespaces can be thought of as a partitioning of a CDAP instance. Any application or data (referred to here as an “entity”) can exist independently in multiple namespaces at the same time. The data and metadata of an entity are stored independently of any other instance of the same entity in a different namespace.

The primary motivation for namespaces in CDAP is to achieve application and data isolation. This is an initial step towards introducing multi-tenancy into CDAP. Use cases that benefit from namespaces include partitioning a single Hadoop cluster into multiple namespaces:

  • to support different computing environments, such as development, QA, and staging;

  • to support multiple customers; and

  • to support multiple sub-organizations within an organization.

Namespace Components

A namespace has a namespace identifier (the namespace 'name') and a description.

Namespace IDs are restricted to letters (a-z, A-Z), digits (0-9), and underscores (_). There is no limit on the length of a namespace ID or on the number of namespaces.
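The character rule above can be expressed as a short check. This is a minimal sketch, not CDAP's own validation code: the function name is hypothetical, and it assumes that an empty string is not a valid ID.

```python
import re

# Assumption: a valid ID is non-empty and uses only letters, digits,
# and underscores. No upper bound on length is applied.
_NAMESPACE_ID = re.compile(r"[A-Za-z0-9_]+")


def is_valid_namespace_id(namespace_id: str) -> bool:
    """Return True if the whole string consists of allowed characters."""
    return _NAMESPACE_ID.fullmatch(namespace_id) is not None
```

Under these assumptions, an ID such as dev_1 passes, while my-ns is rejected because of the hyphen.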

The namespace IDs cdap, default, and system are reserved and cannot be deleted. The default namespace, however, can be used by anyone, except on a secure cluster where authorization is required.

Independent and Non-hierarchical

Namespaces are flat, with no hierarchy inside them. (Namespaces are not allowed inside another namespace.)

As part of the independence of namespaces, inter-namespace operations are controlled: for example, an application in one namespace can access datasets in a different namespace, provided that the appropriate permissions have been granted.

Identifying Entities in a Namespace

The ID of an entity in a namespace is composed of a combination of the namespace ID plus the entity ID, since an entity cannot exist independently of a namespace.
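One way to picture this composition is as a pair of values that are compared together. The sketch below is illustrative only: the EntityId class, the sample names, and the dot separator used for display are all hypothetical and are not CDAP's actual ID format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityId:
    """A fully qualified ID: the namespace ID plus the entity ID."""
    namespace: str
    entity: str

    def __str__(self) -> str:
        # The "." separator is an assumption made for display purposes.
        return f"{self.namespace}.{self.entity}"


# The same entity name in two namespaces yields two distinct IDs.
dev_purchases = EntityId("dev", "purchases")
qa_purchases = EntityId("qa", "purchases")
```

Because both fields take part in equality and hashing, the two IDs above never collide, which mirrors how the same entity can exist independently in several namespaces.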

Using Namespaces

The best practice for using namespaces is to create the namespaces you need and use them for all operations. Otherwise, CDAP will use the default namespace for any operations undertaken.

Once a namespace has been created, you can edit its description and configuration preferences by using either the Microservices or the Command Line Interface.

CDAP includes the default namespace out-of-the-box. It is guaranteed to always be present, and is recommended for:

  1. Proof-of-concept or sandbox applications for trying out CDAP; and

  2. Testing your apps before deploying them in development, QA, or production environments.

It is the namespace used when no other namespace is specified. However, for most use cases beyond the proof-of-concept stage, we recommend that you create appropriate namespaces and operate CDAP within them.

Namespaces can be deleted. When a namespace is deleted, all components (applications, datasets, MapReduce programs, Spark programs, metrics, etc.) are first deleted, and then the namespace itself is removed. To delete a namespace, see the Namespace Microservices documentation. In the case of the default namespace, the name is retained, as the default namespace is always available in CDAP.

As this is an unrecoverable operation, extreme caution must be used when deleting namespaces. A namespace can be deleted only if all of its programs have been stopped and the cdap-site.xml parameter enable.unrecoverable.reset has been enabled.
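The deletion rules described above can be summarized in a small in-memory model. This is a sketch of the documented behavior only, not CDAP code: the Namespace class, the delete_namespace function, and the exception types are hypothetical.

```python
from dataclasses import dataclass, field

DEFAULT_NAMESPACE = "default"


@dataclass
class Namespace:
    name: str
    components: set = field(default_factory=set)        # apps, datasets, ...
    running_programs: set = field(default_factory=set)  # programs not yet stopped


def delete_namespace(namespaces: dict, name: str,
                     unrecoverable_reset_enabled: bool) -> None:
    """Model of the documented deletion order and preconditions."""
    namespace = namespaces[name]
    # Precondition 1: enable.unrecoverable.reset must be enabled.
    if not unrecoverable_reset_enabled:
        raise PermissionError("enable.unrecoverable.reset is not enabled")
    # Precondition 2: every program in the namespace must be stopped.
    if namespace.running_programs:
        raise RuntimeError("stop all programs before deleting the namespace")
    # All components are deleted first ...
    namespace.components.clear()
    # ... then the namespace itself, except that the default name is retained.
    if name != DEFAULT_NAMESPACE:
        del namespaces[name]
```

In this model, deleting the default namespace empties it but leaves the name in place, while deleting any other namespace removes it entirely.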

Custom Mapping of Storage Providers

When creating a namespace, you can also configure its underlying storage provider. For example, a custom HBase namespace or HDFS directory can be specified for the data in a particular namespace. See Namespace Configurations for how to specify the properties of a namespace.

When an underlying storage provider is configured, CDAP does not manage the lifecycle of these storage entities. They must exist before the CDAP namespace is created, and they are not removed when the CDAP namespace is deleted. However, their contents (the data) are deleted when a CDAP namespace is deleted through CDAP.
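As an illustration of a custom mapping, the sketch below assembles a JSON properties document for a namespace. The property keys hbase.namespace and root.directory, the description and config fields, and the function name are assumptions made for this example; check them against the Namespace Configurations reference before relying on them.

```python
import json
from typing import Optional


def namespace_properties(description: str,
                         hbase_namespace: Optional[str] = None,
                         root_directory: Optional[str] = None) -> str:
    """Build a JSON document describing a namespace (hypothetical layout)."""
    config = {}
    if hbase_namespace is not None:
        # Assumed key: the pre-existing HBase namespace to use.
        config["hbase.namespace"] = hbase_namespace
    if root_directory is not None:
        # Assumed key: the pre-existing HDFS directory to use.
        config["root.directory"] = root_directory
    return json.dumps({"description": description, "config": config}, indent=2)
```

The HBase namespace and HDFS directory named in such a document would need to exist beforehand, since CDAP does not create or remove them.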

Namespace Examples

The CDAP Command Line Interface is namespace-aware. You set the namespace you are currently using, and the command prompt displays it as a visual reminder.

Created in 2020 by Google Inc.