Upgrade Metadata
Checklist
- User Stories Documented
- User Stories Reviewed
- Design Reviewed
- APIs reviewed
- Release priorities assigned
- Test cases reviewed
- Blog post
Introduction
Recently, changes to metadata indexing have been introduced in order to support date and numeric metadata search. Because of this, date and numeric search will not work on users’ existing entities created with the old indexing pattern. Therefore, there needs to be a way to update the metadata of outdated entities.
Goals
Detect outdated metadata and update its indexing. Ensure that no concurrency issues arise when a user and the upgrade method update the same metadata at the same time.
User Stories
- As a pipeline developer, I have many existing entities with numeric metadata values, and I would like to use the new Data Fusion feature to run numeric search queries on these entities.
- As a pipeline developer, I have previously defined metadata properties with a date syntax that Data Fusion now supports, and I would like to run a date search over those properties.
Design
Update the indexing of outdated metadata entities. Outdated entities will be detected by their metadata version number (metadataVersion < 2 will be considered outdated). Since indexing information is stored in the Property class, new MetadataDocument instances will have to be made, with Property objects created by the new Property constructor. Existing metadata information must be read and replaced in one transaction, taking into account the conflicts that can occur if a user and the upgrade method attempt to update the same metadata at the same time.
Approach
Change the value METADATA_VERSION in VersionInfo from 1 to 2.
Metadata entities to upgrade will have METADATA_VERSION < 2.
This value can be changed again in the future for more upgrades.
Gather all outdated metadata entities into a list and pass them into a method [name TBD] to upgrade them.
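The detection step above can be sketched as follows. This is only an illustration under stated assumptions: StoredMetadata and OutdatedMetadataFinder are hypothetical stand-ins, not existing classes, and the real implementation would read the version from wherever the metadata documents store it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a stored metadata entry (not an existing class).
final class StoredMetadata {
  final String entityId;
  final int metadataVersion;

  StoredMetadata(String entityId, int metadataVersion) {
    this.entityId = entityId;
    this.metadataVersion = metadataVersion;
  }
}

// Hypothetical helper that collects the entries needing an upgrade.
final class OutdatedMetadataFinder {
  // Mirrors the proposed new value of METADATA_VERSION in VersionInfo.
  static final int METADATA_VERSION = 2;

  // Returns, in input order, the entries whose version is below the current one.
  static List<StoredMetadata> findOutdated(List<StoredMetadata> all) {
    List<StoredMetadata> outdated = new ArrayList<>();
    for (StoredMetadata m : all) {
      if (m.metadataVersion < METADATA_VERSION) {
        outdated.add(m);
      }
    }
    return outdated;
  }
}
```

Comparing against the constant rather than a literal 2 means that a future bump of METADATA_VERSION picks up the next round of outdated entities without changing the filter.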
Approach #1
Utilize batch(List<? extends MetadataMutation> mutations, MutationOptions options) in ElasticsearchMetadataStorage.
MetadataMutation is an abstract class and Update is a class that extends it. An Update object has type = UPDATE and takes in a MetadataEntity and Metadata objects that will be updated.
Get MetadataEntity and Metadata pairs from MetadataRecords in order to construct MetadataMutation / Update objects to pass into the batch() method.
In its implementation, batch() creates new MetadataDocument objects along with new Property objects. The newly implemented Property constructor will store information necessary for the new indexing format. Updating this information should update the indexing.
batch() attempts to rewrite metadata until there are no conflicts (concurrency issues) so it is safe to use.
Something to think about:
batch() checks whether the input contains duplicate entities and, if it does, does not perform the update. Since we control the input and know that it will not contain duplicates, perhaps we can separate those two parts of the code.
Currently:
batch(): checks for duplicates, does batch
Call batch() on outdated entities
Alternative:
batch(): checks for duplicates, doBatchMethod()
Call doBatchMethod() on outdated entities
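A minimal sketch of the alternative split, assuming simplified stand-in types. Mutation, ConflictException, BatchSketch, the injected writer and the maxRetries bound are all hypothetical and are not taken from ElasticsearchMetadataStorage. Rejecting duplicates with an exception is also an assumption, since the text above only says that the update is not performed.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical, simplified mutation: only the id of the entity it targets.
final class Mutation {
  final String entityId;

  Mutation(String entityId) {
    this.entityId = entityId;
  }
}

// Hypothetical signal that a write lost a race with a concurrent update.
final class ConflictException extends RuntimeException {
}

final class BatchSketch {
  private final Consumer<List<Mutation>> writer; // hypothetical storage write
  private final int maxRetries;                  // assumed bound on retries

  BatchSketch(Consumer<List<Mutation>> writer, int maxRetries) {
    this.writer = writer;
    this.maxRetries = maxRetries;
  }

  // General entry point: reject duplicate entities, then delegate.
  void batch(List<Mutation> mutations) {
    Set<String> seen = new HashSet<>();
    for (Mutation m : mutations) {
      if (!seen.add(m.entityId)) {
        throw new IllegalArgumentException("Duplicate entity: " + m.entityId);
      }
    }
    doBatchMethod(mutations);
  }

  // Upgrade entry point: the caller guarantees there are no duplicates.
  // Retries the write while it conflicts, up to maxRetries extra attempts.
  void doBatchMethod(List<Mutation> mutations) {
    for (int attempt = 0; ; attempt++) {
      try {
        writer.accept(mutations);
        return;
      } catch (ConflictException e) {
        if (attempt >= maxRetries) {
          throw e;
        }
      }
    }
  }
}
```

With this shape, the upgrade code would call doBatchMethod() directly on the list of outdated entities, while user-facing callers keep going through batch() and its duplicate check.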
Created in 2020 by Google Inc.