Lookup in Transforms
Requirements
- Operations
- Perform single and batch reads on one or more datasets from a script transform
- Perform single and batch reads on one or more files from a script transform
- Supported tables for lookup
- KeyValueTable dataset
- ObjectMappedTable dataset
- CSV files treated as a list of key-value pairs
- Optional caching with time-based expiration
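To make the caching requirement concrete, here is a minimal sketch of a wrapper that adds time-based expiration to any object exposing `lookup(key)`. The names `cachingLookup`, `expiryMillis`, and `now` are hypothetical and not part of the design. The requirements do not state a unit for the expiry, so milliseconds are assumed here.

```javascript
// Hypothetical helper, not part of the proposed API.
// Wraps a delegate that exposes lookup(key) with a time-based cache.
// `now` is injectable so expiry can be exercised without waiting.
function cachingLookup(delegate, expiryMillis, now = Date.now) {
  const cache = new Map(); // key -> { value, loadedAt }
  return {
    lookup(key) {
      const entry = cache.get(key);
      if (entry !== undefined && now() - entry.loadedAt < expiryMillis) {
        return entry.value; // entry is still fresh
      }
      // Miss or expired entry: read from the backing table and remember it.
      const value = delegate.lookup(key);
      cache.set(key, { value: value, loadedAt: now() });
      return value;
    },
  };
}
```

Entries are only refreshed on the next read after they expire. A real implementation would probably also bound the cache size, which this sketch leaves out.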
Design
Lookup interface
interface Lookup<T> {
  T lookup(String key);
  Map<String, T> lookup(String... keys);
  Map<String, T> lookup(Set<String> keys);
}
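As an illustration of how the three Java overloads might look from the script side, the following sketch folds them into a single JavaScript function backed by an in-memory map. `inMemoryLookup` is a hypothetical stand-in, not the KeyValueTable or ObjectMappedTable implementation, and returning `null` for a missing key is an assumption the design does not specify.

```javascript
// Hypothetical in-memory stand-in for a Lookup<T>.
function inMemoryLookup(entries) {
  const table = new Map(Object.entries(entries));
  // Assumption: a missing key yields null.
  const one = (key) => (table.has(key) ? table.get(key) : null);
  return {
    lookup(...args) {
      const first = args[0];
      // One string argument: single read, like T lookup(String key).
      if (args.length === 1 && typeof first === "string") {
        return one(first);
      }
      // Several strings, or one Set/array of keys: batch read,
      // like lookup(String... keys) and lookup(Set<String> keys).
      const keys = args.length === 1 ? Array.from(first) : args;
      const result = {};
      for (const key of keys) {
        result[key] = one(key);
      }
      return result;
    },
  };
}

const purchases = inMemoryLookup({ alice: "3", bob: "7" });
const singleValue = purchases.lookup("alice");
const batchValues = purchases.lookup(new Set(["alice", "bob"]));
```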
- Implement Lookup in KeyValueTable and ObjectMappedTable
- KeyValueTable implements Lookup<String>
- ObjectMappedTable implements Lookup<StructuredRecord>
- DatasetConfigurer changes
- Add method: void useDataset(String datasetName);
- ScriptTransform changes
- Add a configuration property that declares the lookup tables to use, along with the properties for each table (e.g. dataset properties):
"tables": [
  {
    "name": "purchases",
    "type": "dataset",
    "properties": {
      "dataset": "purchases",
      "properties": {.. dataset properties ..},
      "enableCache": "true",
      "cacheExpiry": 1234
    }
  },
  {
    "name": "ip2geo",
    "type": "file",
    "properties": {"file": "/data/ip2geo.csv"}
  }
]
- configure(): verify tables (datasets and files) exist by calling DatasetConfigurer.useDataset()
- transform(): execute lookup methods in a transaction, provide Lookup instance to script
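Before configure() checks that the tables exist, the "tables" property itself could be checked for structural problems. The sketch below is one possible shape for that check. `validateTables` is a hypothetical helper, and the rules it enforces (a unique name, a type of "dataset" or "file", and the matching `dataset` or `file` property) are inferred from the example configuration rather than stated by the design. It does not call DatasetConfigurer.useDataset() or touch the file system.

```javascript
// Hypothetical structural check for the "tables" configuration property.
// Returns a list of problems instead of throwing, so all issues surface at once.
function validateTables(tables) {
  const errors = [];
  const seen = new Set();
  tables.forEach((table, i) => {
    const label = table.name || "tables[" + i + "]";
    if (!table.name) {
      errors.push(label + ": missing name");
    } else if (seen.has(table.name)) {
      errors.push(label + ": duplicate name");
    } else {
      seen.add(table.name);
    }
    const props = table.properties || {};
    if (table.type === "dataset") {
      if (!props.dataset) errors.push(label + ": missing properties.dataset");
    } else if (table.type === "file") {
      if (!props.file) errors.push(label + ": missing properties.file");
    } else {
      errors.push(label + ": unknown type " + JSON.stringify(table.type));
    }
  });
  return errors;
}

const problems = validateTables([
  { name: "purchases", type: "dataset", properties: { dataset: "purchases" } },
  { name: "ip2geo", type: "file", properties: { file: "/data/ip2geo.csv" } },
]);
```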
Options for lookup usage:
var result = context.getLookup("purchases").lookup(user);
Options for batch lookup usage:
var result = context.getLookup("purchases").lookup(["alice", "bob"]);
// do something with result["alice"]
// do something with result["bob"]
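The same calls can be tried outside a pipeline against a stub. The sketch below builds a file-style table from CSV text treated as key-value pairs and exposes it through a stand-in `context`. `csvLookup`, the stub `context`, and the sample rows are all invented for illustration. The parser assumes one `key,value` pair per line with no quoting, and a missing key is assumed to yield `null`.

```javascript
// Hypothetical: parse "key,value" lines into a lookup table.
// Assumes no quoting and no commas inside keys.
function csvLookup(text) {
  const table = new Map();
  for (const line of text.split(/\r?\n/)) {
    const comma = line.indexOf(",");
    if (comma < 0) continue; // assumption: blank or malformed lines are skipped
    table.set(line.slice(0, comma).trim(), line.slice(comma + 1).trim());
  }
  const one = (key) => (table.has(key) ? table.get(key) : null);
  return {
    // A string performs a single read; an array performs a batch read.
    lookup(keys) {
      if (!Array.isArray(keys)) {
        return one(keys);
      }
      const result = {};
      for (const key of keys) {
        result[key] = one(key);
      }
      return result;
    },
  };
}

// Stub standing in for the context the transform would hand to the script.
const context = {
  tables: { ip2geo: csvLookup("10.0.0.1,US\n10.0.0.2,DE\n") }, // invented rows
  getLookup(name) {
    return this.tables[name];
  },
};

var single = context.getLookup("ip2geo").lookup("10.0.0.1");
var batch = context.getLookup("ip2geo").lookup(["10.0.0.1", "10.0.0.2"]);
```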
Created in 2020 by Google Inc.