...
The data flows of a pipeline can be either batch or realtime, and a variety of processing paradigms (MapReduce or Spark) can be used.
...
Realtime pipelines poll sources periodically to fetch the data, perform optional transformations, and then write the output to one or more realtime sinks.
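As a rough illustration of the poll, transform, and write cycle described above, here is a minimal Python sketch. It does not use the CDAP API; `run_poll_cycle`, `poll_source`, `uppercase_msg`, and the in-memory sinks are hypothetical names chosen for this example only.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


def run_poll_cycle(
    poll: Callable[[], List[Record]],
    transforms: List[Callable[[Record], Record]],
    sinks: List[Callable[[List[Record]], None]],
) -> int:
    """Run one poll cycle: fetch records, transform them, write to every sink."""
    records = poll()
    # Transformations are optional; an empty list passes records through unchanged.
    for transform in transforms:
        records = [transform(r) for r in records]
    # The same output is written to each configured sink.
    for write in sinks:
        write(records)
    return len(records)


# Hypothetical in-memory source that is drained on each poll.
pending: List[Record] = [{"id": 1, "msg": "a"}, {"id": 2, "msg": "b"}]


def poll_source() -> List[Record]:
    batch = list(pending)
    pending.clear()
    return batch


def uppercase_msg(record: Record) -> Record:
    return {**record, "msg": str(record["msg"]).upper()}


sink_a: List[Record] = []
sink_b: List[Record] = []

written = run_poll_cycle(poll_source, [uppercase_msg], [sink_a.extend, sink_b.extend])
```

A real pipeline would repeat this cycle on a schedule; the sketch runs it once so the data movement is easy to follow.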
Note: CDAP supports at-least-once output of data into sinks in realtime pipelines, but it doesn't guarantee exactly-once delivery. If your use case requires exactly-once results, plan for occasional duplicate records in sinks, for example by deduplicating downstream or by making writes idempotent.
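One common way to tolerate at-least-once delivery is to make sink writes idempotent, so that a redelivered record overwrites its earlier copy instead of creating a duplicate. The Python sketch below illustrates the idea only; it is not CDAP code, and `IdempotentSink` and the `id` key field are assumptions made for this example.

```python
from typing import Dict, List

Record = Dict[str, object]


class IdempotentSink:
    """Hypothetical sink that stores records keyed by a unique field."""

    def __init__(self, key_field: str = "id") -> None:
        self.key_field = key_field
        self.rows: Dict[object, Record] = {}

    def write(self, records: List[Record]) -> None:
        # Writing the same key twice replaces the row rather than appending,
        # so redelivered records do not produce duplicates.
        for record in records:
            self.rows[record[self.key_field]] = record


sink = IdempotentSink()
batch: List[Record] = [{"id": 1, "msg": "a"}, {"id": 2, "msg": "b"}]

sink.write(batch)
sink.write(batch)  # simulated redelivery of the same batch after a retry
```

This approach assumes every record carries a stable unique key. When no such key exists, duplicates have to be detected another way, for example by content hash in a downstream step.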
...