...
Note: This would also satisfy user story 5. A unique can be implemented as an Aggregation plugin: group by the fields you want to deduplicate on, ignore the Iterable<> passed to aggregate, and emit just the group key.
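The note above can be sketched in Java as follows. This is only an illustration under stated assumptions: the `Aggregation` interface, its `groupBy`/`aggregate` signatures, the `Unique` class, and the in-memory `run` driver are all hypothetical stand-ins, not the actual plugin API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UniqueSketch {

  /** Hypothetical stand-in for an aggregation plugin: pick a group key, then aggregate each group. */
  interface Aggregation<K, V, O> {
    K groupBy(V input);
    O aggregate(K groupKey, Iterable<V> groupValues);
  }

  /** Unique: group on the chosen fields, ignore the grouped values, emit the group key. */
  static class Unique implements Aggregation<List<Object>, Map<String, Object>, List<Object>> {
    private final List<String> fields;

    Unique(List<String> fields) {
      this.fields = fields;
    }

    @Override
    public List<Object> groupBy(Map<String, Object> record) {
      // The group key is the list of values of the fields to deduplicate on.
      List<Object> key = new ArrayList<>();
      for (String field : fields) {
        key.add(record.get(field));
      }
      return key;
    }

    @Override
    public List<Object> aggregate(List<Object> groupKey, Iterable<Map<String, Object>> groupValues) {
      // The Iterable is deliberately ignored: one output per distinct key.
      return groupKey;
    }
  }

  /** Tiny in-memory driver standing in for the engine's group-by step (assumption). */
  static <K, V, O> List<O> run(Aggregation<K, V, O> agg, List<V> inputs) {
    Map<K, List<V>> groups = new LinkedHashMap<>();
    for (V input : inputs) {
      groups.computeIfAbsent(agg.groupBy(input), k -> new ArrayList<>()).add(input);
    }
    List<O> outputs = new ArrayList<>();
    for (Map.Entry<K, List<V>> group : groups.entrySet()) {
      outputs.add(agg.aggregate(group.getKey(), group.getValue()));
    }
    return outputs;
  }
}
```

Running `run(new Unique(List.of("name")), records)` over records that repeat a `name` value yields one key per distinct `name`, in first-seen order.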
Story 2: Control Flow (Multiple Sources)
Option 1: Introduce different types of connections: one for data flow, one for control flow.
...
Option 2: Make connections into certain plugin types imply control flow rather than data flow. For example, introduce a "condition" plugin type; connections into a condition imply control flow rather than data flow. Similarly, connections into an "action" plugin type would imply control flow.
```json
{
  "stages": [
    {
      "name": "customersTable",
      "plugin": { "name": "Database", "type": "batchsource", ... }
    },
    {
      "name": "customersFiles",
      "plugin": { "name": "TPFSParquet", "type": "batchsink", ... }
    },
    {
      "name": "afterDump",
      "plugin": { "name": "AlwaysRun", "type": "condition" }
    },
    {
      "name": "purchasesTable",
      "plugin": { "name": "Database", "type": "batchsource" }
    },
    {
      "name": "purchasesFiles",
      "plugin": { "name": "TPFSParquet", "type": "batchsink", ... }
    }
  ],
  "connections": [
    { "from": "customersTable", "to": "customersFiles" },
    { "from": "customersFiles", "to": "afterDump" },
    { "from": "afterDump", "to": "purchasesTable" },
    { "from": "purchasesTable", "to": "purchasesFiles" }
  ]
}
```
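The rule in Option 2 can be sketched as a small classifier over the config above. This is a minimal illustration, not the pipeline engine's implementation: the `ConnectionKinds` class, its `Connection` type, and the `CONTROL_TYPES` set are hypothetical names, and the stage-name-to-plugin-type map is assumed to have been extracted from the `stages` list beforehand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ConnectionKinds {

  // Assumption: plugin types whose incoming connections imply control flow.
  static final Set<String> CONTROL_TYPES = Set.of("condition", "action");

  /** Hypothetical in-memory form of one entry in "connections". */
  static final class Connection {
    final String from;
    final String to;

    Connection(String from, String to) {
      this.from = from;
      this.to = to;
    }
  }

  /** A connection is control flow when its target stage has a control plugin type. */
  static boolean isControlFlow(Connection connection, Map<String, String> stageTypes) {
    String targetType = stageTypes.get(connection.to);
    return targetType != null && CONTROL_TYPES.contains(targetType);
  }

  /** Returns the connections that the rule classifies as control flow, in input order. */
  static List<Connection> controlConnections(List<Connection> connections,
                                             Map<String, String> stageTypes) {
    List<Connection> result = new ArrayList<>();
    for (Connection connection : connections) {
      if (isControlFlow(connection, stageTypes)) {
        result.add(connection);
      }
    }
    return result;
  }
}
```

Applied to the example, only `customersFiles` to `afterDump` is classified as control flow, because `afterDump` has type `condition`. The option as written covers only connections into a condition or action; how the connection out of `afterDump` into `purchasesTable` should be treated is not stated, so this sketch leaves it as data flow.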
...