...

A schedule must have a unique name within its application (the same name can be used in different applications), and additionally consists of:

  • The workflow to be executed, along with properties that translate into runtime arguments for the workflow run.

  • A Trigger, which initiates the execution of the program by creating a Job for the workflow.

  • A set of Run Constraints, which can delay or prevent the execution of the workflow.

  • A timeout for the run constraints to be fulfilled. When this timeout is exceeded, the workflow will not execute.

...

  • The buildSchedule method returns a builder to create schedules with various kinds of triggers and run constraints.

  • The only program type that can be scheduled is a workflow.

  • Replacing abortIfNotMet() with waitUntilMet() would have the effect that the workflow execution is delayed until no other concurrent run of the same workflow is executing.

  • This schedule does not specify properties for the workflow execution.

  • This schedule does not configure a timeout, so the default timeout of one day is used.

...

Code Block
schedule(
  buildSchedule("Workflow1AndWorkflow2CompletedSchedule", ProgramType.WORKFLOW, "TriggeredWorkflow")
    .triggerOn(getTriggerFactory().and(getTriggerFactory().onProgramStatus(ProgramType.WORKFLOW, "Workflow1",
                                                                           ProgramStatus.COMPLETED),
                                       getTriggerFactory().onProgramStatus(ProgramType.WORKFLOW, "Workflow2",
                                                                           ProgramStatus.COMPLETED))));

This schedule uses an and trigger, which is satisfied only when both Workflow1 and Workflow2 have completed.
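The behavior of such a trigger can be illustrated with a small standalone model. This is only a sketch: the class `AndTriggerSketch` and its methods are hypothetical names introduced here, and it does not reflect how CDAP implements composite triggers internally.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of an "and" trigger over program-completion events; not CDAP code. */
class AndTriggerSketch {
  private final Set<String> required;                     // programs that must complete
  private final Set<String> completed = new HashSet<>();  // programs seen as completed so far

  AndTriggerSketch(String... programNames) {
    this.required = new HashSet<>(Arrays.asList(programNames));
  }

  /** Records a COMPLETED event and reports whether the trigger is now satisfied. */
  boolean onProgramCompleted(String programName) {
    if (required.contains(programName)) {
      completed.add(programName);
    }
    return isSatisfied();
  }

  /** Satisfied only once every required program has completed. */
  boolean isSatisfied() {
    return completed.containsAll(required);
  }
}
```

In this model, a completion event for Workflow1 alone leaves the trigger unsatisfied; it becomes satisfied only after Workflow2 has also completed.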

...

Execution of workflows is initiated when the trigger of a schedule fires. This creates a Job for the workflow in the scheduler's Job Queue. This job will not necessarily execute immediately. Instead, each job goes through a life cycle:

  • When a job is initially created, it is in state pending trigger. Most triggers are fulfilled immediately when the job is created, but some triggers may require additional input. For example, a partition trigger can specify a minimum number of new partitions to be present in a dataset. When one or more partitions are added to the dataset, this creates an event that leads to the creation of a job. If the number of partitions is not yet sufficient, more partition events are required before the trigger is fulfilled. Until then, the job remains in pending trigger state.

  • When the job’s trigger is fulfilled, the job’s state changes to pending constraints. If the job has no constraints, then it will not remain in this state. However, if it has constraints, it remains pending constraints until all constraints are fulfilled. The scheduling system continuously checks whether the job’s constraints are fulfilled. During this check, if any constraint is not fulfilled and it was added with abortIfNotMet(), then the job is aborted and removed from the job queue.

  • When all of a job’s constraints are fulfilled, the job’s state changes to pending launch. At this time, the scheduling system will prepare the execution of the workflow, and once it is started successfully, the job is complete and removed from the job queue. Note that the workflow itself can still fail during its execution, but if the scheduler successfully submitted the workflow for execution, then the job is considered complete from the scheduler’s point of view. If starting the workflow fails, however, the job remains pending launch and the system will retry execution.

  • If a job does not reach pending launch state before its configured timeout, it is aborted and removed from the job queue.

  • If a schedule is deleted, modified or disabled, then all jobs for that schedule are aborted and removed from the job queue, regardless of their state. However, due to timing and concurrency, a job that is pending launch may still execute around the same time that the schedule was modified.

  • At any given time, there is only one job in state pending trigger or pending constraints for a given schedule. That means that if the schedule’s trigger fires again, it does not create a new job in the job queue. Only after the job transitions into pending launch state can the schedule's trigger create a new job.
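The life cycle above can be summarized as a small state machine. The following is a minimal sketch under stated assumptions: `JobLifecycleSketch`, its `State` enum, and the `step` method are hypothetical names used for illustration and are not part of the CDAP API. It also assumes that the timeout is checked before the trigger and constraints, which the text does not specify.

```java
import java.util.List;

/** Hypothetical model of the job life cycle described above; not CDAP code. */
class JobLifecycleSketch {

  enum State { PENDING_TRIGGER, PENDING_CONSTRAINTS, PENDING_LAUNCH, COMPLETED, ABORTED }

  /** A run constraint reduced to two flags: is it met, and does it abort the job when unmet? */
  static final class Constraint {
    final boolean met;
    final boolean abortIfNotMet;

    Constraint(boolean met, boolean abortIfNotMet) {
      this.met = met;
      this.abortIfNotMet = abortIfNotMet;
    }
  }

  /** Advances a job by one scheduler check and returns its new state. */
  static State step(State state, boolean triggerFulfilled, List<Constraint> constraints,
                    boolean launchSucceeded, boolean timedOut) {
    switch (state) {
      case PENDING_TRIGGER: {
        if (timedOut) return State.ABORTED;   // assumption: timeout is checked first
        return triggerFulfilled ? State.PENDING_CONSTRAINTS : State.PENDING_TRIGGER;
      }
      case PENDING_CONSTRAINTS: {
        if (timedOut) return State.ABORTED;   // did not reach pending launch in time
        boolean allMet = true;
        for (Constraint c : constraints) {
          if (!c.met) {
            if (c.abortIfNotMet) return State.ABORTED;  // abortIfNotMet(): drop the job
            allMet = false;                             // waitUntilMet(): keep waiting
          }
        }
        return allMet ? State.PENDING_LAUNCH : State.PENDING_CONSTRAINTS;
      }
      case PENDING_LAUNCH:
        // A failed launch is retried, so the job stays in pending launch.
        return launchSucceeded ? State.COMPLETED : State.PENDING_LAUNCH;
      default:
        return state;  // COMPLETED and ABORTED are terminal
    }
  }
}
```

A job with no constraints passes through pending constraints on the next check, because the loop finds nothing unmet. The timeout flag has no effect once the job is pending launch, matching the rule that the timeout applies only until that state is reached.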

...

A run constraint can either delay or prevent the execution of a schedule’s workflow, based on a condition represented by the constraint. Each type of run constraint has its own default for whether the execution is delayed or aborted. This behavior can be configured explicitly by specifying either .waitUntilMet() or .abortIfNotMet() when adding the constraint to the schedule builder. The following constraints are available:

  • withConcurrency(int n): Fulfilled if fewer than n runs of the same workflow are currently executing. This is useful to limit the frequency and resource utilization of a single workflow. By default, this aborts the job if not fulfilled.

  • withDelay(long n, TimeUnit unit): Fulfilled at least n time units after the job is created. This is useful to delay the execution of a workflow after its trigger fires. For example, if it is known that after some new data arrives, more new data may arrive within a short time, the workflow should wait for that.

  • withTimeWindow(String startTime, String endTime): Fulfilled only in the time window between the given start and end time. Both times are given in “HH:mm” form, and an optional timezone can be given to interpret these times. By default, this delays the execution of the job, but it can be configured to abort the job if the trigger fires outside the time window. This is useful to limit the execution of certain workflows to times when the load on the cluster is low.

  • withDurationSinceLastRun(long n, TimeUnit unit): Fulfilled only after n time units since the start of the last successful run of the same workflow. This is useful to limit the frequency of execution of the workflow. By default, this aborts the execution if not met.
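The conditions behind these constraints can be expressed as simple predicates. The sketch below is a standalone illustration, not CDAP code: `RunConstraintSketch` and all of its methods are hypothetical. It assumes a time window that starts and ends on the same day, with an inclusive start and exclusive end; neither detail is stated in the text.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalTime;

/** Hypothetical predicates mirroring the run constraints above; not CDAP code. */
class RunConstraintSketch {

  enum Decision { PROCEED, WAIT, ABORT }

  /** withConcurrency(n): fulfilled if fewer than n runs are currently executing. */
  static boolean concurrencyMet(int runningCount, int n) {
    return runningCount < n;
  }

  /** withDelay(n, unit): fulfilled once at least the given delay has passed since job creation. */
  static boolean delayMet(Instant jobCreated, Instant now, Duration delay) {
    return !now.isBefore(jobCreated.plus(delay));
  }

  /** withTimeWindow(start, end): times in "HH:mm" form; assumes start is before end on the same day. */
  static boolean timeWindowMet(LocalTime now, String startTime, String endTime) {
    LocalTime start = LocalTime.parse(startTime);
    LocalTime end = LocalTime.parse(endTime);
    return !now.isBefore(start) && now.isBefore(end);
  }

  /** Maps a constraint check to the effect of waitUntilMet() versus abortIfNotMet(). */
  static Decision decide(boolean met, boolean abortIfNotMet) {
    if (met) {
      return Decision.PROCEED;
    }
    return abortIfNotMet ? Decision.ABORT : Decision.WAIT;
  }
}
```

For example, `decide(concurrencyMet(1, 1), true)` yields `ABORT` in this model, which corresponds to withConcurrency(1) with its default abort behavior while one run is already executing.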

...

  • Create: This happens either as part of application deployment or through the Lifecycle Microservices. A newly created schedule is initially disabled and will not execute any jobs.

  • Disable: Disabling a schedule will delete all pending jobs for the schedule from the job queue, and prevent new jobs from being created. This action will not suspend or abort any current execution of the workflow.

  • Enable: This action will put the schedule back into an active state after a Disable action. Note that disabling the schedule aborted all of its pending jobs. Therefore, new triggers have to create new jobs for this schedule before its workflow is executed again.

  • Delete: This first disables the schedule and then permanently deletes it.

  • Update: This is equivalent to deleting the current schedule and creating a new one. It happens either when an application that contains a schedule is redeployed in CDAP, or through the Lifecycle Microservices.
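The effect of these operations on the job queue can be illustrated with a small model. This is a sketch only: `ScheduleLifecycleSketch` is a hypothetical class, it tracks a single pending job per schedule (a job in pending trigger or pending constraints state), and it leaves out jobs that are already pending launch.

```java
/** Hypothetical model of the schedule operations listed above; not CDAP code. */
class ScheduleLifecycleSketch {
  private boolean enabled = false;     // Create: a new schedule starts out disabled
  private boolean pendingJob = false;  // at most one pending job per schedule

  /** Enable: the schedule becomes active; no jobs exist until a trigger fires again. */
  void enable() {
    enabled = true;
  }

  /** Disable: removes the pending job and prevents new jobs from being created. */
  void disable() {
    enabled = false;
    pendingJob = false;
  }

  /** Returns true if this trigger event created a new job in the queue. */
  boolean onTrigger() {
    if (!enabled || pendingJob) {
      return false;
    }
    pendingJob = true;
    return true;
  }

  boolean hasPendingJob() {
    return pendingJob;
  }
}
```

In this model, a trigger that fires right after creation produces no job, and a job that was pending before a Disable action is not restored by a later Enable.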

...