Joins and splits¶
Every node routes tokens in two directions, and each direction is a pluggable policy:
- a split governs the node's outgoing flows: how a firing node fans out;
- a join governs the node's incoming flows: when a node with several incoming branches fires.
A plain task node defaults to an immediate join and an all split, so it just flows straight through. You only set a join or split when a node fans out or converges. A gateway is a node that presets a join/split pair (see below); you can also set them directly on any node for finer control.
Splits (outgoing)¶
A split chooses which outgoing flows a firing node takes, among those whose condition holds (an unconditioned flow always holds).
| Split | Takes | Pattern |
|---|---|---|
| All (all, default) | every outgoing flow whose condition holds | parallel fork (unconditioned) or inclusive fan-out (conditioned) |
| First (first) | only the first outgoing flow whose condition holds, tried top to bottom | exclusive choice |
So a first split whose flows carry conditions gives the exclusive choice,
while all serves as either the parallel fork or the inclusive fan-out,
depending on whether its flows are conditioned.
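The two split policies can be modelled in a few lines. This is only an illustrative sketch, not the engine's code: the `take` and `holds` helpers, the flow ids, and the `amount` variable are hypothetical, and a flow is reduced to an id plus an optional predicate.

```python
from typing import Callable, Optional

# A flow is (id, condition); a condition of None means "unconditioned".
Flow = tuple[str, Optional[Callable[[dict], bool]]]

def holds(flow: Flow, variables: dict) -> bool:
    """An unconditioned flow always holds; otherwise evaluate its condition."""
    _, condition = flow
    return condition is None or condition(variables)

def take(split: str, flows: list[Flow], variables: dict) -> list[str]:
    """Return the ids of the outgoing flows a firing node takes."""
    eligible = [flow[0] for flow in flows if holds(flow, variables)]
    if split == "all":
        return eligible        # every flow whose condition holds
    if split == "first":
        return eligible[:1]    # only the first one, tried top to bottom
    raise ValueError(f"unknown split: {split}")

flows = [
    ("f_big", lambda v: v["amount"] > 1000),
    ("f_any", None),
    ("f_small", lambda v: v["amount"] <= 1000),
]
print(take("all", flows, {"amount": 50}))    # ['f_any', 'f_small']
print(take("first", flows, {"amount": 50}))  # ['f_any']
```

Note how flow order matters only for first: an unconditioned flow placed above a conditioned one would always win, so a catch-all flow belongs last.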
Joins (incoming)¶
A join decides when a node with several incoming branches continues.
| Join | Fires when | Pair with |
|---|---|---|
| Immediate (immediate, default) | each branch arrives, no synchronization | a node with a single incoming flow |
| Wait for all (wait_all) | a token has arrived on every incoming arc | a parallel split (all branches always run) |
| Wait for matching (matching) | a token has arrived on every incoming arc whose condition holds | an inclusive / conditional split (only some branches run) |
| Threshold (threshold) | a token has arrived on N of the incoming arcs (early fire) | a parallel split where a partial result is enough (N of M) |
| Quorum (quorum) | the decision is settled: enough branches approve, or approval becomes unreachable | a vote where either outcome can be called early |
| Timeout (timeout) | every branch arrives, or a deadline passes | a wait-for-all that must not hang on a slow branch |
Both wait_all and matching count each incoming arc once, however many
tokens it delivered, so a branch that produced two tokens does not fire the join
early.
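A minimal way to picture "each arc counts once" is to track arrivals as a set of arc ids rather than a token count. The function below is a hypothetical model of that rule, not the engine's implementation.

```python
def wait_all_ready(incoming_arcs: set[str], arrivals: list[str]) -> bool:
    """True once every incoming arc has delivered at least one token."""
    arrived = set(arrivals)          # repeated tokens from one arc collapse here
    return incoming_arcs <= arrived  # subset test: every arc is covered

arcs = {"a", "b", "c"}
print(wait_all_ready(arcs, ["a", "a", "b"]))       # False: "c" is still missing
print(wait_all_ready(arcs, ["a", "a", "b", "c"]))  # True
```

Three tokens have arrived in the first call, yet the join does not fire, because two of them came from the same arc.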
wait_all vs matching¶
This is the distinction that matters most:
- wait_all is structural: it waits for every incoming arc, regardless of conditions. Use it after a parallel split, where every branch is guaranteed to run, so "all arcs" is the active set.
- matching is state-aware: when a token arrives it re-evaluates each incoming arc's condition against the current variables and waits for only the arcs whose conditions hold (the branches the upstream split actually activated). Use it after an inclusive or conditional split.
The practical consequence: matching never deadlocks on a branch that was
never started, because that branch's condition does not hold so it is not
waited for. wait_all after a conditional split would hang forever on the
branches that did not run.
Rule of thumb
The moment the set of branches that can reach a join varies at runtime, use
matching. Use wait_all only when every incoming branch is guaranteed to
run.
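The difference shows up as soon as one branch is skipped. The sketch below is a toy model under stated assumptions: arcs are a dict from a hypothetical arc id to a predicate over the variables, and `wait_all_fires` / `matching_fires` are illustrative names, not engine functions.

```python
def wait_all_fires(arcs: dict, arrived: list) -> bool:
    """Structural: needs a token on every incoming arc."""
    return set(arcs) <= set(arrived)

def matching_fires(arcs: dict, arrived: list, variables: dict) -> bool:
    """State-aware: needs a token only on arcs whose condition holds."""
    expected = {arc for arc, cond in arcs.items() if cond(variables)}
    return expected <= set(arrived)

arcs = {
    "email": lambda v: v["notify_email"],
    "sms": lambda v: v["notify_sms"],
}
# The upstream split started only the email branch.
variables = {"notify_email": True, "notify_sms": False}
arrived = ["email"]

print(wait_all_fires(arcs, arrived))             # False: waits for sms forever
print(matching_fires(arcs, arrived, variables))  # True: sms is not expected
```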
How a matching join decides¶
A matching join is local and does no graph analysis. Each time a token arrives it:
- re-evaluates the conditions on its own incoming arcs against that token's view of the variables;
- takes the arcs whose conditions hold as the set of branches it must wait for;
- fires once a token has arrived on every arc in that set (a set of one fires immediately).
It never inspects the split, the sibling branches, or the rest of the graph; there is no token-tracking and no topology analysis. And it recomputes the expected set on every arrival, the first one included. That last point is the crux: if the expected set comes out too small at the first arrival, the join sees the arrived arc(s) already cover it and fires, dropping branches it should have waited for. A later re-evaluation cannot undo an early fire.
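The per-arrival procedure, and the early fire it can produce, can be sketched as a small class. This is an illustrative model only: `MatchingJoin`, `ARCS`, and the `run_a` / `run_b` variables are hypothetical, and the variables passed to `arrive` stand in for the arriving token's view.

```python
class MatchingJoin:
    def __init__(self, arcs):
        self.arcs = arcs        # arc id -> condition(variables) -> bool
        self.arrived = set()
        self.fired = False

    def arrive(self, arc, variables):
        """Record a token on `arc`; return True if this arrival fires the join."""
        if self.fired:
            return False        # an early fire cannot be undone
        self.arrived.add(arc)
        # Recomputed on every arrival, the first one included.
        expected = {a for a, cond in self.arcs.items() if cond(variables)}
        self.fired = expected <= self.arrived
        return self.fired

ARCS = {"a": lambda v: v["run_a"], "b": lambda v: v["run_b"]}

# Decision settled before the fork: the join waits for both branches.
settled = {"run_a": True, "run_b": True}
join = MatchingJoin(ARCS)
print(join.arrive("a", settled))  # False
print(join.arrive("b", settled))  # True

# run_b is written too late: the first arrival sees an expected set of one.
late = {"run_a": True, "run_b": False}
join = MatchingJoin(ARCS)
print(join.arrive("a", late))     # True (fires early, branch b is dropped)
late["run_b"] = True
print(join.arrive("b", late))     # False (the join has already fired)
```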
Wiring a matching join correctly¶
A matching join is correct only when all three of these hold. Miss any one and it either deadlocks or fires early.
1. The condition is on both arcs. The same condition must sit on the split's outgoing arc and on the join's incoming arc for that branch. The split arc gates whether the branch starts; the join arc is what the join re-evaluates to decide whether to wait for it. Leave a join arc unconditioned and it always holds, so matching waits for every branch, silently degrading to wait_all and deadlocking on the branches that never ran.
2. The deciding variable is visible from every branch. The join evaluates each arc from the view of whichever token happens to arrive, and a token sees only instance-wide variables plus the token-scoped ones in its own lineage (itself and its ancestors), never a sibling branch's locals (the engine skips a variable local to a token outside the current lineage). So the deciding variable must be instance-scoped, or token-scoped on a token at or above the fork (a common ancestor of every branch). A variable each branch sets independently is invisible to the others and cannot drive the decision.
3. The decision is settled by the first arrival, and stays stable. Because the expected set is computed at the first arrival, the conditions must already evaluate to their final, complete value by the time the first token reaches the join, and must not change before the join resolves. The exact boundary is "by the first arrival at the join", but setting the variable before the fork is the safe, sufficient rule: after a fork the branches advance asynchronously and which one arrives first is nondeterministic, so no post-fork write is guaranteed to beat the first arrival.
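The visibility rule in the second requirement can be pictured as a walk up the token's lineage. The sketch below is a hypothetical model: the `Token` class and `resolve` function are not the engine's API, and the lookup order (token-scoped values before instance-wide ones) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Token:
    name: str
    parent: Optional["Token"] = None
    local: dict = field(default_factory=dict)  # token-scoped variables

def resolve(token: Token, variable: str, instance_vars: dict):
    """Search the token and its ancestors, then fall back to instance scope."""
    node = token
    while node is not None:
        if variable in node.local:
            return node.local[variable]
        node = node.parent
    return instance_vars.get(variable)  # None if not visible anywhere

root = Token("root", local={"channels": "both"})  # set at/above the fork
email = Token("email", parent=root)
sms = Token("sms", parent=root)
sms.local["did_sms"] = True                       # branch-local flag

print(resolve(email, "channels", {}))  # both  (common ancestor: visible)
print(resolve(sms, "did_sms", {}))     # True  (own lineage)
print(resolve(email, "did_sms", {}))   # None  (sibling's local: invisible)
```

A join condition that reads `did_sms` would therefore evaluate differently depending on which branch's token happens to arrive, which is why it cannot drive the decision.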
Three ways a matching join breaks
Mirror every deciding condition onto the join's incoming arcs, reading a
variable that is instance-scoped (or token-scoped at/above the fork) and
fixed before the fork. Miss the mirroring and the join deadlocks
(unconditioned arcs degrade it to wait_all); miss the scope or the timing
and it fires early (the deciding state was not settled at the first arrival,
so the expected set came out too small). The inclusive gateway pairs the arcs
for you, but the scope and timing rules still apply.
Why 'let each branch announce itself' fails
A tempting alternative is to drop the shared condition and have each branch
set its own did_x flag that the join arc tests. It breaks twice over.
If the flag is branch-local, the join cannot see another branch's flag
(requirement 2). And even if it is instance-scoped, it loses the race
(requirement 3): the first branch to
finish reaches the join before the others have set their flags, so the
expected set is computed as that one branch and the join fires immediately,
dropping the rest. The set of branches must be decided at the split, from
state fixed beforehand, not reconstructed from what the branches do.
Threshold (N of M): firing early¶
Where wait_all waits for every branch, a threshold join fires the moment
the configured number have arrived. Set it explicitly on the convergence node,
with a count:

    n_decision:
      type: script
      join:
        plugin: threshold
        settings:
          count: 2   # fire once two of the incoming branches arrive
Use it for a partial result, "proceed once two of three reviewers approve,"
without waiting on the slowest branch. Like the other multi-branch joins it can
merge a variable from each arrived
branch, so a tally (say, the approvals) is available to the outgoing flows,
here over the branches that arrived in time. A count at or above the number of
incoming arcs simply behaves like wait_all.
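The firing rule, including the fallback to wait_all behavior for an oversized count, can be sketched as follows. `threshold_fires` is a hypothetical helper written for this illustration; it assumes arcs are counted once each, as the joins table describes.

```python
def threshold_fires(incoming_arcs, arrived_arcs, count):
    """Fire once `count` distinct incoming arcs have delivered a token."""
    incoming = set(incoming_arcs)
    # A count at or above the number of arcs degenerates to "wait for all".
    needed = min(count, len(incoming))
    return len(set(arrived_arcs) & incoming) >= needed

reviewers = ["r1", "r2", "r3"]
print(threshold_fires(reviewers, ["r1"], 2))              # False
print(threshold_fires(reviewers, ["r1", "r3"], 2))        # True
print(threshold_fires(reviewers, ["r1", "r3"], 5))        # False
print(threshold_fires(reviewers, ["r1", "r2", "r3"], 5))  # True
```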
What happens to the branches still running¶
When a threshold join fires early, the M-N branches still in flight are torn down: the engine cancels their tokens (and, through the inbox, their open tasks), so a reviewer is not left holding a task for a decision already made. This is the classic discriminator: once the join has fired, a late branch must neither carry on nor fire the join a second time.
The teardown is driven by the fork cohort each token records: the branches of one split share a cohort, and firing early closes it. A still-live branch of a closed cohort self-cancels at its next step rather than advancing or re-arriving at the join, so even a branch that was mid-flight at the instant of the fire stops on its own. Cohorts are per-split and per-iteration, so a loop that re-enters the split starts a fresh cohort.
Use one fork per threshold join
A threshold join tears down its fork's branches, so the supported shape is a single split fanning out to the join (the N-of-M pattern). Branches of that fork are expected to converge on the join, not diverge elsewhere.
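The cohort mechanism can be pictured with a shared object that the early fire closes and that every branch checks before its next step. This is a toy sketch: the `Cohort` and `Token` classes and the `step` method are hypothetical names, not the engine's data model.

```python
from dataclasses import dataclass

@dataclass
class Cohort:
    closed: bool = False  # set when the join fires early

@dataclass
class Token:
    branch: str
    cohort: Cohort
    cancelled: bool = False

    def step(self) -> str:
        """Advance one step, or self-cancel if the cohort has been closed."""
        if self.cohort.closed:
            self.cancelled = True
            return "cancelled"
        return "advanced"

# One split, three branches, one shared cohort.
cohort = Cohort()
a, b, c = (Token(name, cohort) for name in ("a", "b", "c"))
print(c.step())       # advanced (still in flight)

cohort.closed = True  # the join fired after a and b arrived
print(c.step())       # cancelled (stops at its next step)

# A loop that re-enters the split gets a fresh cohort.
retry = Token("c", Cohort())
print(retry.step())   # advanced
```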
Quorum: deciding either way early¶
A threshold join counts arrivals; a quorum join counts votes. It reads each arrived branch's collected vote and fires the moment the decision is settled, in either direction:
- enough branches approve (the quorum is reached), or
- so few can still approve that the quorum is out of reach (enough have come back negative).
So a "two of three approvals" review passes the instant the second approval arrives, and fails the instant the second rejection arrives, without waiting on the last reviewer. As with the threshold join, the branches still running are then torn down.

    n_decide:
      type: script
      join:
        plugin: quorum
        settings:
          count: 2                 # approvals needed to pass
          approve_value: approved  # the vote value that counts as an approval
          collect: vote            # the per-branch vote variable
          into: votes              # the tally the outgoing flows route on
          scope: token
      split: { plugin: first }
The join only fires; routing to the approved or rejected path is the usual
merge plus a count condition on the
outgoing flows over the collected votes, which classifies both fire reasons
correctly (a quorum-reached fire has enough approvals; an unreachable fire does
not). Votes must be cast token-locally (scope: token, and each task writing
its result token-scoped) so each branch carries its own.
Quorum vs the synchronous count
Routing on a count condition after a wait_all join also decides pass or
fail, but only once every branch has voted. The quorum join is the
early-decision form: it stops as soon as the outcome cannot change.
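The "settled in either direction" test comes down to simple arithmetic over the votes cast so far. The function below is a sketch of that arithmetic, not the plugin's code; `quorum_state` and its return labels are hypothetical, while `count` and `approve_value` mirror the settings shown above.

```python
def quorum_state(votes, total, count, approve_value="approved"):
    """Classify a vote in progress as 'reached', 'unreachable', or 'pending'."""
    approvals = sum(1 for vote in votes if vote == approve_value)
    outstanding = total - len(votes)      # branches that have not voted yet
    if approvals >= count:
        return "reached"                  # enough branches approve
    if approvals + outstanding < count:
        return "unreachable"              # even if all the rest approve
    return "pending"

# Two of three approvals needed.
print(quorum_state(["approved"], total=3, count=2))              # pending
print(quorum_state(["approved", "approved"], total=3, count=2))  # reached
print(quorum_state(["rejected", "rejected"], total=3, count=2))  # unreachable
print(quorum_state(["approved", "rejected"], total=3, count=2))  # pending
```

The join fires on either "reached" or "unreachable"; a wait_all join followed by a count condition would only ever evaluate the final row of votes.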
Timeout: giving up on a slow branch¶
A timeout join waits for every branch like wait_all, but gives up once a
deadline passes: it fires with whatever has arrived and tears the branches
still running down. Use it when a branch might never finish, "collect every
approval, but give up after seven days."

    n_gather:
      type: script
      join:
        plugin: timeout
        settings:
          timeout: P7D   # seconds or an ISO-8601 duration
The deadline is armed on the first branch to wait. The cron timeout sweep wakes the join once the deadline passes (an arrival after the deadline fires it too), so the join needs no clock of its own beyond the sweep Orchestra already runs for parked tasks. Before the deadline it is an ordinary AND-join.
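The arming and firing behavior can be modelled with an explicit clock argument. This is an illustrative sketch under assumptions: `TimeoutJoin` and its methods are hypothetical, the timeout is passed as a `timedelta` rather than parsed from a setting, and `should_fire` stands in for both an arrival check and the periodic sweep.

```python
from datetime import datetime, timedelta

class TimeoutJoin:
    def __init__(self, arcs, timeout: timedelta):
        self.arcs = set(arcs)
        self.timeout = timeout
        self.arrived = set()
        self.deadline = None

    def arrive(self, arc, now: datetime) -> bool:
        if self.deadline is None:
            self.deadline = now + self.timeout  # armed by the first branch to wait
        self.arrived.add(arc)
        return self.should_fire(now)

    def should_fire(self, now: datetime) -> bool:
        """Checked on every arrival and by the periodic sweep."""
        if self.arcs <= self.arrived:
            return True                         # ordinary AND-join behavior
        return self.deadline is not None and now >= self.deadline

t0 = datetime(2024, 1, 1)
join = TimeoutJoin({"a", "b", "c"}, timedelta(days=7))
print(join.arrive("a", t0))                          # False
print(join.should_fire(t0 + timedelta(days=6)))      # False
print(join.should_fire(t0 + timedelta(days=7)))      # True (deadline passed)
```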
The split/join pairings (and gateways)¶
Most useful combinations correspond to a BPMN gateway, which is just a node that presets the pair:
| Pattern | Split | Join | Gateway |
|---|---|---|---|
| Parallel (AND) | all | wait_all | parallel |
| Inclusive (OR) | all | matching | inclusive |
| Exclusive (XOR) | first | immediate | exclusive |
Using a gateway node is the readable way to express these; setting the join and
split directly on a plain task node gives the same behavior with finer control
(for example a wait_all join and a merge policy on an ordinary script node,
as the quorum tally does).
Merge policy (collecting branch results)¶
A join that waits for several branches (wait_all or matching) can gather a
variable from each joined branch into a list on the continuing token, the
quorum-tally pattern. The merge policy is the join plugin's own settings:
| Setting | Meaning |
|---|---|
| collect | the variable read from each joined branch (empty merges nothing) |
| into | the variable the collected values are written to, as a list |
| scope | instance (shared with the whole process) or token (local to the continuing branch) |
When the join fires, it resolves collect through each joined branch's own
lineage and writes the gathered values as a list to into. A count flow
condition downstream can then decide on the list, e.g. "at least two of the
collected votes are approved".

    n_tally:
      type: script
      join:
        plugin: wait_all
        settings:
          collect: vote   # each reviewer wrote their vote here (token-scoped)
          into: votes     # gathered into this list on the continuing token
          scope: token
      split:
        plugin: first     # then route exclusively on the count
With matching the list holds the results of only the branches that ran,
which is what makes it the right join for an applicability-based tally.
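The effect of the three settings can be sketched with plain dictionaries. This is a simplified model, not the engine's merge code: `apply_merge` is a hypothetical helper, each joined branch is represented by a dict of the variables already resolved through its lineage, and skipping a branch that never set the variable is an assumption of the sketch.

```python
def apply_merge(settings, joined_branches, instance_vars, token_vars):
    """Gather `collect` from each joined branch into a list stored at `into`."""
    collect = settings.get("collect")
    if not collect:
        return  # an empty collect merges nothing
    values = [b[collect] for b in joined_branches if collect in b]
    # scope picks where the list lands: process-wide or on the continuing token.
    target = instance_vars if settings.get("scope") == "instance" else token_vars
    target[settings["into"]] = values

settings = {"collect": "vote", "into": "votes", "scope": "token"}
branches = [{"vote": "approved"}, {"vote": "rejected"}, {"vote": "approved"}]
instance_vars, token_vars = {}, {}

apply_merge(settings, branches, instance_vars, token_vars)
print(token_vars)     # {'votes': ['approved', 'rejected', 'approved']}
print(instance_vars)  # {}

# A downstream count condition, e.g. "at least two of the votes are approved":
print(token_vars["votes"].count("approved") >= 2)  # True
```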
The collected variable is the branch-local one
collect is read from each branch's own lineage, so it is normally a
token-scoped (branch-local) variable: the quorum's vote is set
result_scope: token so the reviewers' votes do not overwrite each other. A
token-scoped variable is visible only to its own branch (its token lineage),
never to a sibling branch. That is exactly why it works for collecting
results but cannot drive a matching join's decision: the join cannot see
another branch's local value. One variable is gathered after the branches
join; the deciding condition is a separate, instance-scoped variable read
before they do (see the warning above).
When to use matching: concrete examples¶
Each of these has an inclusive/conditional split that activates only some
branches; matching converges exactly those (and wait_all would deadlock).
A worked example: the same condition on both arcs¶
A decision before the fork records what to do (here two instance-scoped booleans), so the choice is settled before any branch runs. The inclusive split and the matching join then carry the same condition on each branch:

    flowchart LR
      d["Choose channels<br/>sets notify_email, notify_sms<br/>(instance-scoped, before the fork)"]
      d --> s{{"split: all"}}
      s -->|notify_email| e["Send email"]
      s -->|notify_sms| m["Send SMS"]
      e -->|notify_email| j{{"join: matching"}}
      m -->|notify_sms| j
      j --> done["Log delivery"]

    flows:
      # split -> branches
      - id: f_email
        from: g_split
        to: n_email
        condition: { plugin: comparison, settings: { variable: notify_email, operator: '==', value: true } }
      - id: f_sms
        from: g_split
        to: n_sms
        condition: { plugin: comparison, settings: { variable: notify_sms, operator: '==', value: true } }
      # branches -> join: the SAME conditions, mirrored
      - id: f_email_join
        from: n_email
        to: g_join
        condition: { plugin: comparison, settings: { variable: notify_email, operator: '==', value: true } }
      - id: f_sms_join
        from: n_sms
        to: g_join
        condition: { plugin: comparison, settings: { variable: notify_sms, operator: '==', value: true } }
- Pick email only: the split takes f_email, so only n_email runs; the join evaluates both incoming arcs against the same instance variables, finds only f_email_join holds, and waits for exactly that branch.
- Pick both: all four arcs hold, so the join waits for both branches.
- Because notify_email / notify_sms were fixed before the fork, the join computes the correct set on the first arrival, never too early.
The shipped example_inclusive expresses the same idea with a single channel
choice and an any condition; two booleans are shown here for readability.
The bullets below show the split side only
For brevity each example names its condition once, on the split. Every one
still needs that same condition mirrored onto the join's incoming arc, on
a variable that meets the three requirements above (or use an inclusive
gateway). Omit the mirror and the join degrades to wait_all and deadlocks
on the branches that never ran.
- Multi-channel notifications: the user picks email / SMS / push; only the chosen channels send, and the join waits for the selected ones. (Shipped as example_inclusive.)
- Applicability-based review: Legal reviews only if there is a contract, Finance only if the amount exceeds a threshold, Security only if it touches PII. The join waits for just the applicable reviews; a merge collects each applicable reviewer's verdict so a count can require no rejections.
- Optional data enrichment: geocode if an address is present, credit-check if an amount is set, KYC if the customer is new. The join waits for whichever enrichments were triggered.
- Multi-target publishing: website always, plus social / newsletter / partner feed when flagged. The join waits for the selected destinations' confirmations.
- Order fulfillment sub-tasks: gift-wrap if requested, customs paperwork if international, age verification if restricted. The join waits for the sub-tasks that apply before shipping.
- Conditional integrations: sync to CRM if customer-facing, ERP if it affects inventory, analytics always. The join waits for the integrations that fired.
Configuring joins and splits¶
- In the Complete modeler each routing-capable node shows a Join and/or Split select. The chosen plugin's description appears under the select, and any settings it has (a merging join's collect / into / scope) appear in a collapsible section below.
- In config a node carries join and split as {plugin, settings} maps; a gateway node presets them and is not edited directly. The merge policy lives in the join's settings.
- Examples to read: example_parallel (all + wait_all), example_inclusive (all + matching), and example_quorum (wait_all + merge + a count condition).