...
New APIs (in MapReduceContext, used in beforeSubmit):
```java
// specify a Dataset and arguments, to be used as an output of the MapReduce job:
context.addOutput(String datasetName);
context.addOutput(String datasetName, Map<String, String> arguments);
```
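As a minimal, self-contained sketch of the bookkeeping this API implies (the class below is an illustrative stub, not the real `MapReduceContext`; all names are assumptions): the context simply records each dataset name together with its (possibly empty) arguments, for use when the job is submitted.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Illustrative stub only -- not the real CDAP MapReduceContext.
// It records the datasets registered as outputs, with optional arguments.
public class OutputRegistrySketch {
    private final Map<String, Map<String, String>> outputs = new LinkedHashMap<>();

    // addOutput(datasetName): register an output with no extra arguments
    public void addOutput(String datasetName) {
        addOutput(datasetName, Collections.<String, String>emptyMap());
    }

    // addOutput(datasetName, arguments): register an output with dataset arguments
    public void addOutput(String datasetName, Map<String, String> arguments) {
        outputs.put(datasetName, arguments);
    }

    public Set<String> getOutputNames() {
        return outputs.keySet();
    }

    public Map<String, String> getArguments(String datasetName) {
        return outputs.get(datasetName);
    }
}
```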
New APIs - note that these will be custom mapper, reducer, and context classes that override the Hadoop classes, providing the additional functionality of writing to multiple outputs:
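To sketch what such a context override could look like (a self-contained stand-in, not the actual Hadoop/CDAP classes; a real implementation would delegate to something like Hadoop's `MultipleOutputs` rather than buffering in memory): the existing two-argument `write` routes to a default output, while the new three-argument overload routes the record to the named output.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in for a custom context supporting multiple named
// outputs. Records are buffered in memory here purely for demonstration.
public class MultiOutputContextSketch {
    private final String defaultOutput;
    private final Map<String, List<Object[]>> written = new HashMap<>();

    public MultiOutputContextSketch(String defaultOutput) {
        this.defaultOutput = defaultOutput;
    }

    // write(key, value): route to the default output (single-output behavior)
    public void write(Object key, Object value) {
        write(defaultOutput, key, value);
    }

    // write(outputName, key, value): route to the named output
    public void write(String outputName, Object key, Object value) {
        written.computeIfAbsent(outputName, n -> new ArrayList<>())
               .add(new Object[] { key, value });
    }

    public int recordCount(String outputName) {
        return written.getOrDefault(outputName, Collections.emptyList()).size();
    }
}
```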
...
New APIs (in BatchSinkContext, used in prepareRun of the BatchSink):
```java
// specify a Dataset and arguments, to be used as an output of the Adapter job:
context.addOutput(String datasetName);
context.addOutput(String datasetName, Map<String, String> arguments);
```
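As a sketch of where a sink would make this call (the interface below is a stub standing in for the real `BatchSinkContext`; the sink and dataset names are hypothetical): the sink registers its backing dataset as a job output from `prepareRun`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Stub standing in for the real BatchSinkContext interface.
interface BatchSinkContextStub {
    void addOutput(String datasetName);
    void addOutput(String datasetName, Map<String, String> arguments);
}

// Hypothetical sink: registers its backing dataset as a job output in prepareRun.
class TableSinkSketch {
    private final String datasetName;

    TableSinkSketch(String datasetName) {
        this.datasetName = datasetName;
    }

    void prepareRun(BatchSinkContextStub context) {
        context.addOutput(datasetName);
    }
}

// Minimal recording implementation so the sketch can be exercised.
class RecordingContext implements BatchSinkContextStub {
    final List<String> registered = new ArrayList<>();
    public void addOutput(String datasetName) { registered.add(datasetName); }
    public void addOutput(String datasetName, Map<String, String> arguments) {
        registered.add(datasetName);
    }
}
```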
Example Usage:
```java
public void beforeSubmit(MapReduceContext context) throws Exception {
  context.addOutput("cleanCounts");
  context.addOutput("invalidCounts");
  // ...
}

public static class Counter extends AbstractReducer<Text, IntWritable, byte[], Long> {
  @Override
  public void reduce(Text key, Iterable<IntWritable> values, Context context) {
    // do computation and output to the desired dataset
    if ( ... ) {
      // write to the default output
      context.write(key.getBytes(), val);
    } else {
      // write to the named "invalidCounts" output
      context.write("invalidCounts", key.getBytes(), val);
    }
  }
}
```
...