Adaptable Search Tips

Checklist

  • User Stories Documented
  • User Stories Reviewed
  • Design Reviewed
  • APIs reviewed
  • Release priorities assigned
  • Test cases reviewed
  • Blog post

Introduction 

Since the release of CDAP 6.0.0, several new metadata search features have been introduced: required, numeric, and date search fields now exist. As such, the search tips presented to the user must be updated to reflect the current state of metadata search. However, some search features are only available in the Elasticsearch implementation of metadata search, and are unavailable in the NoSQL implementation. For this reason, it is necessary to adapt the search features presented depending on the metadata storage implementation that CDAP is using. 

Use-Case 

  • A CDAP user interacting with CDAP through the UI would like to understand and make use of new search features.

Design

A new REST API endpoint will be introduced, returning a JSON object communicating the available search features depending on the available metadata storage implementation. The UI will then make a GET request to that endpoint and use the information returned to generate the appropriate message describing the use of the given search features. The general task is thus twofold: an endpoint will be created, and the endpoint will be used in the UI.

Implementation

Creating the endpoint

Detecting features

A fundamental part of making the search feature presentation adaptable is determining which search features are available, so that CDAP can adapt what it presents accordingly.

Approach #1

An intuitive approach is to identify the metadata storage implementation being used. MetadataHttpHandler.java, the class that will contain the REST API endpoint in question, does not directly store a variable that communicates this; the question is therefore how to find the shortest path from MetadataHttpHandler to that information.

MetadataHttpHandler contains a MetadataAdmin, which contains a MetadataStorage interface member. Typically, this member is an AuditMetadataStorage object, which itself holds a MetadataStorage member. This last MetadataStorage member is either a DatasetMetadataStorage object, corresponding to the NoSQL implementation, or an ElasticsearchMetadataStorage object, corresponding to the Elasticsearch implementation. With this in mind, the following relationship chain can be imagined: MetadataHttpHandler -> MetadataAdmin -> MetadataStorage -> MetadataStorage.

Currently, neither MetadataAdmin nor MetadataStorage provides a method for retrieving the MetadataStorage object it contains. In the case of MetadataAdmin, only one implementer exists (DefaultMetadataAdmin), which would make adding a getter for its MetadataStorage straightforward. However, adding such a getter to the MetadataStorage interface may be less advisable, as it is atypical for MetadataStorage objects to contain their own MetadataStorage objects; indeed, AuditMetadataStorage is the only implementer that does this, and the getter would only be meaningful for that class. The getter would probably then return an Optional<MetadataStorage> object. Alternatively, MetadataStorage objects that do not themselves contain a MetadataStorage object could return themselves. This way, calling the getter twice should return a valid MetadataStorage object in all cases.

The final task concerns evaluating the MetadataStorage object returned. How do we verify that a MetadataStorage object is an ElasticsearchMetadataStorage rather than a DatasetMetadataStorage? We could use the instanceof operator, or check whether its string representation contains "elasticsearch", for instance.
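The unwrapping and instanceof check described above can be sketched as follows. This is only an illustration under stated assumptions: the types below are simplified stand-ins that reuse the CDAP class names, the getDelegate() getter is hypothetical (it does not exist in CDAP today), and the returned strings simply mirror the implementation names used elsewhere in this document.

```java
// Simplified stand-ins for the CDAP types named above; the real classes
// have many more methods and different constructors.
interface MetadataStorage {
  // Hypothetical getter: storages that wrap nothing return themselves.
  default MetadataStorage getDelegate() {
    return this;
  }
}

class DatasetMetadataStorage implements MetadataStorage { }

class ElasticsearchMetadataStorage implements MetadataStorage { }

class AuditMetadataStorage implements MetadataStorage {
  private final MetadataStorage delegate;

  AuditMetadataStorage(MetadataStorage delegate) {
    this.delegate = delegate;
  }

  @Override
  public MetadataStorage getDelegate() {
    return delegate;
  }
}

class StorageDetectionSketch {
  // Unwraps one level of wrapping, then classifies the result with instanceof.
  static String implementationOf(MetadataStorage storage) {
    MetadataStorage inner = storage.getDelegate();
    if (inner instanceof ElasticsearchMetadataStorage) {
      return "elasticsearch";
    }
    if (inner instanceof DatasetMetadataStorage) {
      return "noSQL";
    }
    return "unknown";
  }

  public static void main(String[] args) {
    MetadataStorage wrapped = new AuditMetadataStorage(new ElasticsearchMetadataStorage());
    System.out.println(implementationOf(wrapped));                      // elasticsearch
    System.out.println(implementationOf(new DatasetMetadataStorage())); // noSQL
  }
}
```

Because unwrapped storages return themselves, the same call works whether or not an AuditMetadataStorage sits in front of the actual implementation.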

Approach #2

Rather than make an assumption about the available search features by looking at the implementation, we might determine the available search features directly, running a suite of miniature tests that we would trust to detect a search feature. These would be distinct from the more comprehensive tests in, for instance, ElasticsearchMetadataStorageTest.java.

Performance may be a consideration here; would it be excessively slow to run these tests every time we want to see the available features? This may be improved by storing the available search information for the duration of the CDAP session.
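One way to limit the cost of such probes is to run them once and reuse the result. The sketch below is a minimal illustration, not a proposed CDAP class: CachedFeatureProbes, register, and availableFeatures are hypothetical names, each probe is modeled as a BooleanSupplier, and a probe that throws is assumed to mean the feature is unavailable.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical helper: runs each registered probe at most once and caches the results.
class CachedFeatureProbes {
  private final Map<String, BooleanSupplier> probes = new LinkedHashMap<>();
  private Map<String, Boolean> cached; // null until the probes have run

  synchronized void register(String feature, BooleanSupplier probe) {
    probes.put(feature, probe);
  }

  synchronized Map<String, Boolean> availableFeatures() {
    if (cached == null) {
      Map<String, Boolean> results = new LinkedHashMap<>();
      for (Map.Entry<String, BooleanSupplier> entry : probes.entrySet()) {
        boolean available;
        try {
          available = entry.getValue().getAsBoolean();
        } catch (RuntimeException e) {
          // Assumption: a failing probe means the feature is not supported.
          available = false;
        }
        results.put(entry.getKey(), available);
      }
      cached = Collections.unmodifiableMap(results);
    }
    return cached;
  }

  public static void main(String[] args) {
    int[] runs = {0};
    CachedFeatureProbes featureProbes = new CachedFeatureProbes();
    featureProbes.register("numericFields", () -> {
      runs[0]++;
      return true;
    });
    featureProbes.availableFeatures();
    featureProbes.availableFeatures();
    System.out.println("probe runs: " + runs[0]); // probe runs: 1
  }
}
```

Probes registered after the first call to availableFeatures() are ignored in this sketch; a real implementation would need to decide when, if ever, the cache is invalidated.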

Sending a JSON object

The JSON object returned by the REST API endpoint may require little information in order to serve its purpose. There may be advantages to providing more rather than less information, however, and the approaches below explore the tradeoffs.

Approach #1

Assuming Approach #1 to determining features is preferred, the endpoint need only pass on the metadata storage implementation as a single key:value pair. From this information, the UI can assume that a hard-coded set of features is present and display the corresponding help text. The UI could, for instance, display a particular HTML file for Elasticsearch, or a different HTML file for the NoSQL implementation. One disadvantage of this approach is that a call returning only the metadata storage implementation in use is helpful only to callers that already know which search features each implementation supports.
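The hard-coded mapping that this approach pushes onto the consumer might look like the sketch below. It is written in Java for consistency with the other examples, although the real lookup would live in the UI code. The class and method names are hypothetical, the feature name "dateFields" is an assumed counterpart to the "requiredFields" and "numericFields" names used in the response example, and the split of features between implementations is an assumption based on the introduction.

```java
import java.util.List;
import java.util.Map;

// Hypothetical lookup table from storage implementation to assumed search features.
class ImplementationFeatureTable {
  private static final Map<String, List<String>> FEATURES = Map.of(
      "elasticsearch", List.of("requiredFields", "numericFields", "dateFields"),
      "noSQL", List.of("requiredFields"));

  // Returns the assumed features, or an empty list for an unrecognized implementation.
  static List<String> featuresFor(String implementation) {
    return FEATURES.getOrDefault(implementation, List.of());
  }

  public static void main(String[] args) {
    System.out.println(featuresFor("elasticsearch"));
    System.out.println(featuresFor("noSQL"));
  }
}
```

Every consumer of the endpoint would need to maintain a table like this one and keep it in step with the backend, which is the disadvantage noted above.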

Approach #2

Regardless of how we detect search features, it may be preferable to pass a more complex JSON object from which the UI would extract more specific information. This object could hold the name of the search feature and its availability. The UI could then iterate through these fields and display the corresponding help text for each.
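As a rough illustration of this richer payload, the sketch below serializes a map of feature names to availability flags into the array-of-objects shape. FeatureListJson and toJson are hypothetical names, and the string building assumes feature names are plain identifiers that need no JSON escaping; a real handler would more likely delegate to a JSON library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical serializer for the feature/availability list.
class FeatureListJson {
  // Assumes feature names contain no characters that require JSON escaping.
  static String toJson(Map<String, Boolean> features) {
    StringBuilder json = new StringBuilder("[");
    boolean first = true;
    for (Map.Entry<String, Boolean> entry : features.entrySet()) {
      if (!first) {
        json.append(",");
      }
      first = false;
      json.append("{\"feature\":\"").append(entry.getKey())
          .append("\",\"available\":").append(entry.getValue()).append("}");
    }
    return json.append("]").toString();
  }

  public static void main(String[] args) {
    Map<String, Boolean> features = new LinkedHashMap<>();
    features.put("requiredFields", true);
    features.put("numericFields", false);
    System.out.println(toJson(features));
  }
}
```

The UI can iterate over such an array without knowing anything about the storage implementation behind it.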

Using the endpoint

Further details of this section are pending review.

The UI will make a GET request to the new REST API endpoint and display the returned JSON object in a human-readable, user-friendly format.
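The request/response round trip can be tried out with the sketch below. It is not the CDAP handler or the UI code: the JDK's built-in HttpServer stands in for the real endpoint, Java's HttpClient stands in for the UI (which would issue the same GET from the browser), and FeaturesEndpointSketch and its methods are hypothetical names. Only the path is taken from the API proposed in this document. It requires Java 11 or later.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in: serves a fixed JSON body at the proposed path and fetches it.
class FeaturesEndpointSketch {
  static final String PATH = "/v3/metadata/search/features";

  // Starts a local server on an ephemeral port that answers GET with the given JSON.
  static HttpServer start(String jsonBody) throws IOException {
    HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
    server.createContext(PATH, exchange -> {
      if (!"GET".equals(exchange.getRequestMethod())) {
        exchange.sendResponseHeaders(405, -1); // -1: no response body
        exchange.close();
        return;
      }
      byte[] body = jsonBody.getBytes(StandardCharsets.UTF_8);
      exchange.getResponseHeaders().set("Content-Type", "application/json");
      exchange.sendResponseHeaders(200, body.length);
      try (OutputStream out = exchange.getResponseBody()) {
        out.write(body);
      }
    });
    server.start();
    return server;
  }

  // Plays the role of the UI: issues the GET and returns "<status> <body>".
  static String roundTrip(String jsonBody) {
    try {
      HttpServer server = start(jsonBody);
      try {
        int port = server.getAddress().getPort();
        HttpRequest request =
            HttpRequest.newBuilder(URI.create("http://127.0.0.1:" + port + PATH)).GET().build();
        HttpResponse<String> response =
            HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() + " " + response.body();
      } finally {
        server.stop(0);
      }
    } catch (IOException | InterruptedException e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("{\"implementation\":\"elasticsearch\"}"));
  }
}
```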

API changes

New REST APIs

Path: /v3/metadata/search/features

Method: GET

Description: Returns the metadata search features available, OR returns the metadata storage implementation being used.

Response Code: 200 - On success; 404 - Metadata storage implementation not found

Response:

[
	{
		"feature":"requiredFields",
		"available":true
	},
	{
		"feature":"numericFields",
		"available":true | false
	},
	...
]

OR

{
	"implementation":"elasticsearch" | "noSQL"
}

UI Impact or Changes

Further details of this section are pending review.

The existing Search Tips section will be extended to include information regarding required search fields, numeric search fields (if applicable), and date search fields (if applicable).

Related Work

Created in 2020 by Google Inc.