Concepts
The token model
Orchestra borrows its execution model from Petri nets and BPMN. A process is a graph; a token is a marker that lives on one node of that graph and represents a point of control currently in progress. The engine does one thing, repeatedly: take a token, execute the node it sits on, and decide what happens to the token next.
| Concept | Stored as | Role |
|---|---|---|
| Tenant | Config entity | An isolated realm; runtime rows partition per tenant. |
| Workflow | Config entity | The template: nodes, flows and a start node. |
| Process instance | Content entity | One running execution of a workflow. |
| Token | Content entity | A marker on a node; the unit of execution. |
| Variable | Content entity | A named value carried by an instance. |
The lifecycle
Starting a workflow creates an instance and places one token on the start node. From there the engine advances tokens:
- Advance: the node's work is done. The token is consumed and a successor is produced on each outgoing flow the node's routing keeps (see Routing below). Several successors run as concurrent branches.
- Park: the node is waiting (a human task, a timer). The token parks on the node and the engine moves on. A parked token resumes only when it is signalled, which moves it past the node along the outgoing flows.
An instance completes once none of its tokens is still active or parked.
```mermaid
flowchart LR
    start([Start]) --> work[Script] --> hold{{Wait}} --> finish([End])
```
In the flow above, a token advances through Start and Script, then parks on Wait. When the wait is over the token is signalled, advances to End, and the instance completes.
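The lifecycle can be modelled in a few lines. The following is a minimal, language-neutral sketch in Python, not Orchestra's PHP API: the `Instance` class, its fields and its method names are all hypothetical, and tokens are reduced to node names. It only illustrates the advance, park and signal steps and the completion rule described above.

```python
from collections import deque
from dataclasses import dataclass, field


# Hypothetical model of the lifecycle; none of these names come from Orchestra.
@dataclass
class Instance:
    flows: dict                                   # node -> list of successor nodes
    parking: set                                  # nodes that park instead of advancing
    queue: deque = field(default_factory=deque)   # active tokens (as node names)
    parked: list = field(default_factory=list)    # parked tokens (as node names)
    trace: list = field(default_factory=list)     # nodes executed, in order

    def start(self, node):
        # Starting places one token on the start node.
        self.queue.append(node)

    def run(self):
        """Process active tokens until each one is parked or consumed."""
        while self.queue:
            node = self.queue.popleft()
            self.trace.append(node)
            if node in self.parking:
                self.parked.append(node)          # Park: wait for a signal
            else:
                self._advance(node)               # Advance: work is done

    def signal(self, node):
        """Resume a parked token: move it past its node, then keep running."""
        self.parked.remove(node)
        self._advance(node)
        self.run()

    def _advance(self, node):
        # The token is consumed; one successor is produced per outgoing flow.
        self.queue.extend(self.flows.get(node, []))

    @property
    def completed(self):
        # Complete once no token is active or parked.
        return not self.queue and not self.parked


inst = Instance(
    flows={"start": ["work"], "work": ["hold"], "hold": ["finish"]},
    parking={"hold"},
)
inst.start("start")
inst.run()
print(inst.parked, inst.completed)    # ['hold'] False
inst.signal("hold")
print(inst.trace, inst.completed)     # ['start', 'work', 'hold', 'finish'] True
```

The sketch ignores routing, persistence and concurrency; it exists only to make the token bookkeeping concrete.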
Asynchronous advancement
Advancement is asynchronous. Each active token is placed on the
orchestra_advance queue and processed by a queue worker on cron, so a parked
token simply leaves the queue and rejoins it when signalled. Because each step
is queued independently, parallel branches advance concurrently.
Task types
What a node does is defined by its TaskType plugin. The kernel ships five primitives:
- start: the entry point; advances immediately.
- end: a branch exit; advances and, with no successors, lets the instance complete.
- passthrough: an automated step; advances immediately (a no-op pass-through in the kernel, a hook for real automated work).
- wait: parks the token until signalled; the domain-agnostic primitive that the human-task submodule and other integrations build on.
- subprocess: runs another process as a child and resumes with its result. If the child does not complete (it fails or is cancelled) and the node sets a status variable, the parent resumes with __subprocess_failed__ or __subprocess_cancelled__ so its outgoing flows can route the error rather than wait forever. A failed child can also be retried: with a retry count the engine re-launches a fresh child on cron up to that many times (after an optional retry_backoff) before giving up with the failed status.
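The subprocess retry rule can be sketched as a small loop. This is a Python model under stated assumptions, not the engine's code: `run_subprocess` and `launch_child` are hypothetical names, the child is reduced to a callable returning a status string, a cancelled child is assumed not to be retried (the text only mentions retrying a failed one), and the cron scheduling and `retry_backoff` delay are left out.

```python
def run_subprocess(launch_child, retry=0):
    """Launch a child; re-launch a failed one up to `retry` more times.

    Returns (status, attempts). `launch_child` is a hypothetical callable
    returning 'completed', 'failed' or 'cancelled'.
    """
    attempts = 0
    while True:
        attempts += 1
        status = launch_child()
        if status == "completed":
            return "completed", attempts
        if status == "cancelled":
            # ASSUMPTION: cancellation is final and never retried.
            return "__subprocess_cancelled__", attempts
        if attempts > retry:
            # The initial run plus `retry` re-launches are used up.
            return "__subprocess_failed__", attempts


outcomes = iter(["failed", "failed", "completed"])
print(run_subprocess(lambda: next(outcomes), retry=2))   # ('completed', 3)
print(run_subprocess(lambda: "failed", retry=1))         # ('__subprocess_failed__', 2)
```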
Timeouts
A parked task waits for a signal that may never come, so the engine bounds the
wait. A parking node opts in with a node-level timeout map (duration,
action, settings, and an anchor that sets when the window starts: park,
instance or node), and a site-wide default_timeout is a
safety net for every parked task. On expiry a cron sweep runs the node's
TimeoutAction plugin, by default resume, which records a timeout result
the outgoing flows route on. The action is pluggable, so a timeout can instead
notify, release a claim, spawn a parallel branch, or cancel; and a node can
stage several into a timers escalation. See Timers.
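To make the deadline arithmetic concrete, here is a hedged Python sketch of a timeout sweep. It is an illustration, not Orchestra's implementation: `ParkedTask`, `deadline` and `sweep` are hypothetical names, durations and timestamps are assumed to be plain seconds, the anchor is assumed to default to park, and the node anchor is omitted because this page does not define its exact start point.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParkedTask:
    name: str
    parked_at: float                  # seconds; when the token parked
    instance_started_at: float        # seconds; when the instance started
    timeout: Optional[dict] = None    # node-level map, if the node opted in


def deadline(task, default_timeout=None):
    """Return the expiry time, or None if nothing bounds the wait."""
    cfg = task.timeout
    if cfg is None and default_timeout is not None:
        cfg = {"duration": default_timeout}       # site-wide safety net
    if cfg is None:
        return None
    anchor = cfg.get("anchor", "park")            # ASSUMPTION: "park" when unset
    starts = {"park": task.parked_at, "instance": task.instance_started_at}
    return starts[anchor] + cfg["duration"]


def sweep(tasks, now, default_timeout=None):
    """Return (task name, action) for every parked task whose window expired."""
    expired = []
    for task in tasks:
        due = deadline(task, default_timeout)
        if due is not None and now >= due:
            action = (task.timeout or {}).get("action", "resume")
            expired.append((task.name, action))
    return expired


tasks = [
    ParkedTask("review", 100, 0, {"duration": 60, "action": "notify"}),
    ParkedTask("sign", 100, 0, {"duration": 500, "anchor": "instance"}),
    ParkedTask("idle", 100, 0),                   # relies on the site default
]
print(sweep(tasks, now=170, default_timeout=1000))   # [('review', 'notify')]
```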
Routing: conditions, splits and joins
Branching is not a special node type. It falls out of three small, pluggable pieces that apply to every node.
Flow conditions
A flow may carry a condition (a FlowCondition plugin). When a node
finishes, the engine keeps only the outgoing flows whose condition holds (a
flow with no condition always holds): the live flows. The kernel ships:
- comparison: a process variable against a value (==, !=, >, >=, <, <=, empty, not_empty).
- all / any: composites holding child conditions, so they nest into a boolean tree of any depth.
- count: how many entries of a list variable equal a value, compared against a threshold (>=, <, ...), the basis of a quorum.
A condition's variable may be a dotted path into a structured value:
decision.result reads the result key of the array held in decision. So a
structured result (e.g. a human decision recorded as {result, comment})
routes on decision.result while decision.comment rides along. A plain name
reads a scalar exactly as before.
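The condition types and the dotted-path lookup can be sketched as a small recursive evaluator. This is a Python illustration only: the dictionary keys (`type`, `variable`, `operator`, `value`, `threshold`, `conditions`) and the function names are hypothetical and do not reflect the plugins' actual configuration schema.

```python
import operator

OPS = {"==": operator.eq, "!=": operator.ne, ">": operator.gt,
       ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def resolve(variables, path):
    """Read a plain name, or a dotted path such as 'decision.result'."""
    value = variables
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def holds(cond, variables):
    """Evaluate a (hypothetically shaped) condition tree against variables."""
    kind = cond["type"]
    if kind == "comparison":
        actual = resolve(variables, cond["variable"])
        op = cond["operator"]
        if op == "empty":
            return actual in (None, "", [], {})
        if op == "not_empty":
            return actual not in (None, "", [], {})
        return OPS[op](actual, cond["value"])
    if kind == "all":
        return all(holds(c, variables) for c in cond["conditions"])
    if kind == "any":
        return any(holds(c, variables) for c in cond["conditions"])
    if kind == "count":
        entries = resolve(variables, cond["variable"]) or []
        matches = sum(1 for entry in entries if entry == cond["value"])
        return OPS[cond["operator"]](matches, cond["threshold"])
    raise ValueError(f"unknown condition type: {kind}")


variables = {
    "decision": {"result": "approved", "comment": "looks fine"},
    "amount": 120,
    "votes": ["approved", "rejected", "approved"],
}
approved = {"type": "comparison", "variable": "decision.result",
            "operator": "==", "value": "approved"}
large = {"type": "comparison", "variable": "amount", "operator": ">", "value": 100}
quorum = {"type": "count", "variable": "votes", "value": "approved",
          "operator": ">=", "threshold": 2}

print(holds(approved, variables))                                       # True
print(holds({"type": "all", "conditions": [approved, large]}, variables))  # True
print(holds(quorum, variables))                                         # True
```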
A custom condition (the weather, a role, an API call) is just another plugin.
Split: which live flows to take
A node's split (a Split plugin) chooses, among the live flows, which the
token follows: all (the default) takes every one, first takes only the
first. Conditions and split compose: the conditions say which branches are
eligible, the split says how many. That one rule expresses the classic
shapes: mutually exclusive conditions are an exclusive choice, no conditions a
parallel fork, overlapping ones an inclusive split. And it works off any
node, so a task can branch on its own.
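How conditions and the split compose can be shown with a short Python sketch. It assumes a hypothetical flow shape (a `to` target and an optional `when` callable standing in for a FlowCondition plugin); `take` is an illustrative name, not an engine function.

```python
def take(flows, variables, split="all"):
    """Return the targets a token follows: live flows, narrowed by the split."""
    # Conditions decide which flows are live; no condition always holds.
    live = [f for f in flows if f.get("when") is None or f["when"](variables)]
    if split == "all":
        chosen = live
    elif split == "first":
        chosen = live[:1]
    else:
        raise ValueError(f"unknown split: {split}")
    return [f["to"] for f in chosen]


# Mutually exclusive conditions: an exclusive choice.
exclusive = [
    {"to": "provision", "when": lambda v: v["approved"]},
    {"to": "notify", "when": lambda v: not v["approved"]},
]
# No conditions: a parallel fork.
parallel = [{"to": "a"}, {"to": "b"}]
# Overlapping conditions: an inclusive split.
inclusive = [
    {"to": "email", "when": lambda v: v["amount"] > 0},
    {"to": "audit", "when": lambda v: v["amount"] > 1000},
]

print(take(exclusive, {"approved": True}))           # ['provision']
print(take(parallel, {}))                            # ['a', 'b']
print(take(inclusive, {"amount": 5000}))             # ['email', 'audit']
print(take(inclusive, {"amount": 5000}, "first"))    # ['email']
```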
Join: when to fire
A node's join (a Join plugin) is the incoming side: when a token arrives,
it decides whether to fire now or wait for more. The decision runs atomically
under a per-node lock, so two branches completing at the same instant cannot
both believe they are last (firing twice) or both believe they are not (leaving
it stuck). The kernel ships:
- immediate (the default): fire on each arrival (a merge).
- wait_all: the AND-join: fire once a token has arrived from every incoming branch.
- threshold: fire as soon as N of the M branches arrive, then cancel the rest (an early-firing discriminator).
- quorum: fire as soon as a vote is settled, either enough branches approve or approval becomes unreachable, then cancel the rest.
- timeout: fire when every branch arrives or a deadline passes, cancelling the branches still running.
- matching: the OR-join: fire once every incoming flow whose condition holds has arrived. Because the same conditions drove the upstream split, the join waits for exactly the branches that were activated, without tracking tokens across the graph.
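The firing rules for three of these joins reduce to simple arithmetic. The Python sketch below illustrates that arithmetic only; the function names, the `"approved"` vote value and the return values are assumptions, and it models neither the per-node lock nor the cancellation of straggler branches.

```python
def wait_all_fires(arrived, incoming):
    """AND-join: fire once every incoming branch has arrived."""
    return set(incoming) <= set(arrived)


def threshold_fires(arrived, n):
    """Fire as soon as n branches have arrived."""
    return len(arrived) >= n


def quorum_outcome(votes, total, needed, approve="approved"):
    """Return 'approved' or 'rejected' once settled, else None (keep waiting).

    votes:  votes cast so far
    total:  number of branches feeding the join
    needed: approvals required
    """
    approvals = sum(1 for vote in votes if vote == approve)
    outstanding = total - len(votes)
    if approvals >= needed:
        return "approved"                  # enough branches approve
    if approvals + outstanding < needed:
        return "rejected"                  # approval has become unreachable
    return None


print(wait_all_fires(["r1", "r3"], ["r1", "r2", "r3"]))        # False
print(threshold_fires(["r1", "r3"], 2))                        # True
print(quorum_outcome(["approved", "approved"], 3, 2))          # approved
print(quorum_outcome(["rejected", "rejected"], 3, 2))          # rejected
print(quorum_outcome(["approved", "rejected"], 3, 2))          # None
```

In the last call one approval is in and one vote is outstanding, so the result still depends on the third branch and the join keeps waiting.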
Variable scope and the join merge
A variable is instance-wide by default, shared across the whole process. A variable may instead be token-local: set on one token, it is visible only to that token and the tokens descended from it (each token records its parent, so a value set on a branch flows down to that branch's successors). Token-local is the right scope for a per-branch decision, so parallel branches running the same step do not overwrite each other.
When a join fires, its optional merge policy collects one variable from each
joined branch into a list. With the count condition this turns parallel
branch decisions into a quorum, with no special node type:
```mermaid
flowchart LR
    start([Start]) --> r1[Review 1]
    start --> r2[Review 2]
    start --> r3[Review 3]
    r1 --> tally{{"wait_all<br/>merge vote → votes"}}
    r2 --> tally
    r3 --> tally
    tally -->|approved ≥ 2| approve[Approved]
    tally -->|else| reject[Rejected]
```
Each reviewer writes its vote token-locally; the join waits for all three and
merges their votes into a votes list; a count condition routes on how many
are approved. After the join, the continuing token is placed under the
branches' common ancestor, so per-branch locals do not leak past the join while
variables set before the split still resolve. This is the example_quorum
workflow in orchestra_examples.
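Token-local scope and the join merge can be modelled with a parent chain. The Python sketch below is an illustration under assumptions, not the engine's storage model: `Token` and `lookup` are hypothetical names, and storing the merged list on the continuing token is a guess made only to keep the example short.

```python
class Token:
    """Hypothetical token: a parent link plus its token-local variables."""

    def __init__(self, parent=None):
        self.parent = parent
        self.locals = {}


def lookup(token, name, instance_vars):
    """Resolve a variable: nearest token-local up the lineage, else instance-wide."""
    while token is not None:
        if name in token.locals:
            return token.locals[name]
        token = token.parent
    return instance_vars.get(name)


instance_vars = {"title": "Budget request"}      # instance-wide
root = Token()
root.locals["requester"] = "alice"               # set before the split

# Three parallel review branches, each writing its vote token-locally.
branches = [Token(parent=root) for _ in range(3)]
for branch, vote in zip(branches, ["approved", "rejected", "approved"]):
    branch.locals["vote"] = vote

# The join merges one variable from each joined branch into a list.
votes = [lookup(branch, "vote", instance_vars) for branch in branches]

# The continuing token sits under the branches' common ancestor.
after_join = Token(parent=root)
after_join.locals["votes"] = votes               # ASSUMPTION: where the list lives

print(lookup(after_join, "votes", instance_vars))       # ['approved', 'rejected', 'approved']
print(lookup(after_join, "vote", instance_vars))        # None: branch locals do not leak
print(lookup(after_join, "requester", instance_vars))   # alice: pre-split value resolves
```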
Correlation: finding a process by its business key
An external event often has to reach the process waiting for it: a payment
return, an inbound webhook, a message keyed by a business reference. An instance
carries an optional first-class correlation key (a business key, like
Camunda's businessKey or Zeebe's correlation key), set at start():
```php
$engine->start('order_fulfillment', $vars, NULL, (string) $order->id());
// ... later, when the payment provider calls back:
$instances = $engine->findInstancesByCorrelationKey((string) $order_id);
```
The key is an indexed scalar column on the instance, scoped to the acting tenant, so the lookup is a plain index-served equality, portable across databases. It need not be unique, so several instances may share one.
The key is a string, even when your reference is really an integer: resolve
with it (a varchar = string comparison is portable), but do not join an
integer column against it (integer = varchar has no portable cast and skips
the index). For a persistent link or a Views relationship, store the
instance id (the integer primary key) on your entity and join on that: the
correlation key is the handle that resolves the instance from an external
event, the instance id is the foreign key you join on.
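The lookup semantics (string key, tenant scope, non-unique) can be illustrated with an in-memory stand-in for the indexed column. This Python sketch is a model only: `InstanceIndex` and its methods are hypothetical, and the cast to `str` at the boundary mirrors the `(string)` cast in the PHP calls above.

```python
from collections import defaultdict


class InstanceIndex:
    """Hypothetical stand-in for the indexed, tenant-scoped key column."""

    def __init__(self):
        self._by_key = defaultdict(list)    # (tenant, key) -> [instance ids]
        self._next_id = 1

    def start(self, tenant, correlation_key=None):
        instance_id = self._next_id
        self._next_id += 1
        if correlation_key is not None:
            # The key is always stored as a string, even for integer references.
            self._by_key[(tenant, str(correlation_key))].append(instance_id)
        return instance_id

    def find(self, tenant, correlation_key):
        # Plain equality on (tenant, string key); several instances may match.
        return list(self._by_key.get((tenant, str(correlation_key)), []))


index = InstanceIndex()
index.start("acme", 42)        # integer reference, stored as "42"
index.start("acme", "42")      # a second instance sharing the key
index.start("other", "42")     # same key, different tenant

print(index.find("acme", "42"))    # [1, 2]
print(index.find("other", 42))     # [3]
print(index.find("acme", "43"))    # []
```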
Loops and re-entry
A loop is not a special construct; it is just a flow pointing back to an earlier node (a "send back for rework" arc, say). Two things make loops work without bookkeeping:
- Re-entering a parking node regenerates its work. When a token returns to a user node (or any parking node), it parks again and a fresh task is created: the reviewer gets a new task each pass.
- A join re-entered by a loop re-arms on its own. A join's only state is the set of tokens currently parked waiting at its node, and firing consumes them; so the next pass through the join starts from a clean slate, with no stale arrival to mis-fire it.
One caveat: a wait_all (AND) join inside a loop fires only when every
incoming branch arrives, so a loop that re-feeds just one of its branches will
wait forever; arrange the loop to re-enter the fork, not a single branch. (The
early-firing threshold, quorum and timeout joins, which discard late
"straggler" branches, scope that teardown per iteration with fork cohorts, so a
loop re-entering the fork starts a fresh cohort. See
Joins and splits.)
Task assignment
A user task is pooled by default: anyone with the permission sees it in the
inbox and may claim it. To target a task, give it an assignment. The plugin
type is Audience (Plugin/Audience, #[Audience]): an audience resolves who
a node reaches. An audience that can also staff a task implements
AssignmentInterface (extending AudienceInterface) and answers two more
questions in opaque string tokens: who a task is for (its candidates, minted
when the task is created) and which audiences a given viewer belongs to (their
viewerTokens, computed at the inbox). The inbox shows a task when those two
sets intersect; an empty candidate set means the task stays pooled. The resolved
candidates live on a multi-value field on the task, so assignment is matched as
a plain query, not recomputed per request.
Three staffing audiences ship: users (tokens like user:5), roles
(role:editor), and variable. The variable plugin resolves its audience at task-creation
time from a process variable ("the user named in approver"), so who
handles a task is decided at runtime by an upstream step rather than baked into
the workflow. The variable may hold a user ID or username, or a list of
either, and it is read against the parked token's lineage (a branch-local value
is seen); it mints user: tokens, so the named user finds the task in the same
inbox query. A node either carries the flat modeler fields assignee_users /
assignee_roles, or a structured assignments list whose entries union, so a
task can be offered to a role pool and a named user at once. Because tokens
are opaque strings, a new audience (a group, an org unit, an expression) is just
a new plugin minting its own token namespace, with no schema or storage change.
The three ballots of example_quorum show each form: the flat field, the
structured plugin, and a union of both.
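The matching rule (candidate tokens intersected with viewer tokens, with an empty candidate set meaning pooled) can be sketched in Python. The entry shape (`plugin`, `users`, `roles`, `variable`) and the function names are hypothetical; the sketch also assumes the variable already holds user IDs, so it skips username resolution, and it ignores the permission check that gates pooled tasks.

```python
def candidates(assignments, variables):
    """Mint the opaque candidate tokens for a task; entries union together."""
    tokens = set()
    for entry in assignments:
        if entry["plugin"] == "users":
            tokens |= {f"user:{uid}" for uid in entry["users"]}
        elif entry["plugin"] == "roles":
            tokens |= {f"role:{role}" for role in entry["roles"]}
        elif entry["plugin"] == "variable":
            value = variables.get(entry["variable"])
            values = value if isinstance(value, list) else [value]
            # ASSUMPTION: values are user IDs; the variable plugin mints user: tokens.
            tokens |= {f"user:{uid}" for uid in values if uid is not None}
    return tokens


def viewer_tokens(uid, roles):
    """Tokens describing which audiences a viewer belongs to."""
    return {f"user:{uid}"} | {f"role:{role}" for role in roles}


def visible(task_candidates, viewer):
    """A task shows when the sets intersect; no candidates means pooled."""
    return not task_candidates or bool(task_candidates & viewer)


task = candidates(
    [{"plugin": "roles", "roles": ["editor"]},
     {"plugin": "variable", "variable": "approver"}],
    {"approver": 5},
)
print(sorted(task))                                   # ['role:editor', 'user:5']
print(visible(task, viewer_tokens(5, [])))            # True: the named user
print(visible(task, viewer_tokens(7, ["editor"])))    # True: the role pool
print(visible(task, viewer_tokens(7, ["author"])))    # False
print(visible(set(), viewer_tokens(7, [])))           # True: pooled task
```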
Each assignment also carries a notify flag and resolves its audience to
concrete accounts (recipients(), the notification counterpart of
candidates()). This is a neutral intent the inbox never acts on itself: the
optional orchestra_mail submodule reads it to email the
audience, and any other notifier (an ECA model) can read it too. Named
audiences (users, variable) notify by default; a roles pool opts in, so
pooling to a broad role never mails its members by surprise.
Gateways
A gateway is a routing node defined as a (join, split) pair, with no task
of its own. The named gateways are presets over those two knobs:
| Gateway | Join | Split | Behavior |
|---|---|---|---|
| parallel | wait_all | all | Fork every branch; join waits for all. |
| exclusive | immediate | first | Take the first matching branch; merge. |
| inclusive | matching | all | Take every matching branch; join waits for those. |
An exclusive choice (approve or reject) needs only conditions on the flows:
```mermaid
flowchart LR
    start([Start]) --> g{Approved?}
    g -->|approved == true| approve[Provision]
    g -->|else| reject[Notify]
    approve --> done([End])
    reject --> done
```
A parallel fork and AND-join run both branches and wait for both:
```mermaid
flowchart LR
    start([Start]) --> fork{{Fork}}
    fork --> a[Branch A]
    fork --> b[Branch B]
    a --> join{{Join}}
    b --> join
    join --> done([End])
```
Because join and split are independent and pluggable, the same machinery covers
more than the named gateways: a task node can itself be a join point, a
synchronous quorum is wait_all + a merge + a count condition (no special
node), and the early-firing threshold, quorum and timeout joins are each
just a Join plugin over the same machinery.