Job

About

A job is an execution of a pipeline triggered by new data detected in an input repository.

When a commit is made to the input repository of a pipeline, jobs are created for all downstream pipelines in a directed acyclic graph (DAG), but they do not run until the prior pipelines they depend on produce their output. Each job runs the user’s code against the current commit in a repository at a specified branch and then submits the results to the output repository of the pipeline as a single output commit.

Each job has a unique alphanumeric identifier (ID) that users can reference in the <pipeline>@<jobID> format. Jobs have the following states:

Sate Description
CREATED An input commit exists, but the job has not been started by a worker yet.
STARTING The worker has allocated resources for the job (that is, the job counts towards parallelism), but it is still waiting on the inputs to be ready.
UNRUNNABLE The job could not be run, because one or more of its inputs is the result of a failed or unrunnable job. As a simple example, say that pipelines Y and Z both depend on the output from pipeline X. If pipeline X fails, both pipeline Y and Z will pass from STARTING to UNRUNNABLE to signify that they had to be cancelled because of upstream failures.
RUNNING The worker is processing datums.
EGRESS The worker has completed all the datums and is uploading the output to the egress endpoint.
FINISHING After all of the datum processing and egress (if any) is done, the job transitions to a finishing state where all of the post-processing tasks such as compaction are performed.
FAILURE The worker encountered too many errors when processing a datum.
KILLED The job timed out, or a user called StopJob
SUCCESS None of the bad stuff happened.