Joins and splits

Every node routes tokens in two directions, and each direction is a pluggable policy:

  • a split governs the node's outgoing flows: how a firing node fans out;
  • a join governs the node's incoming flows: when a node with several incoming branches fires.

A plain task node defaults to an immediate join and an all split, so it just flows straight through. You only set a join or split when a node fans out or converges. A gateway is a node that presets a join/split pair (see below); you can also set them directly on any node for finer control.

Splits (outgoing)

A split chooses which outgoing flows a firing node takes, among those whose condition holds (an unconditioned flow always holds).

| Split | Takes | Pattern |
| --- | --- | --- |
| All (all, default) | every outgoing flow whose condition holds | parallel fork (unconditioned) or inclusive fan-out (conditioned) |
| First (first) | only the first outgoing flow whose condition holds, tried top to bottom | exclusive choice |

So a first split whose flows carry conditions is the exclusive choice, and a single all split covers both the parallel fork and the inclusive fan-out, depending on whether its flows are conditioned.
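The two split policies can be sketched in a few lines of Python. This is an illustrative model only, not Orchestra's implementation: the `Flow` tuple, the function names, and the `n_audit` node are hypothetical, and returning an empty list when no flow holds is an assumption the text above does not cover.

```python
from typing import Callable, Optional

Variables = dict[str, object]
# A flow is (target node id, condition); None means unconditioned, which always holds.
Flow = tuple[str, Optional[Callable[[Variables], bool]]]


def holds(flow: Flow, variables: Variables) -> bool:
    _, condition = flow
    return condition is None or condition(variables)


def split_all(flows: list[Flow], variables: Variables) -> list[str]:
    """Take every outgoing flow whose condition holds."""
    return [flow[0] for flow in flows if holds(flow, variables)]


def split_first(flows: list[Flow], variables: Variables) -> list[str]:
    """Take only the first outgoing flow whose condition holds, top to bottom."""
    for flow in flows:
        if holds(flow, variables):
            return [flow[0]]
    return []  # assumption: nothing is taken when no flow holds


flows: list[Flow] = [
    ("n_email", lambda v: v["notify_email"] is True),
    ("n_sms", lambda v: v["notify_sms"] is True),
    ("n_audit", None),  # hypothetical unconditioned flow
]
variables = {"notify_email": False, "notify_sms": True}

print(split_all(flows, variables))    # ['n_sms', 'n_audit']
print(split_first(flows, variables))  # ['n_sms']
```

With the same flows, all behaves as an inclusive fan-out and first as an exclusive choice; only the policy differs.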

Joins (incoming)

A join decides when a node with several incoming branches continues.

| Join | Fires when | Pair with |
| --- | --- | --- |
| Immediate (immediate, default) | each branch arrives, no synchronization | a node with a single incoming flow |
| Wait for all (wait_all) | a token has arrived on every incoming arc | a parallel split (all branches always run) |
| Wait for matching (matching) | a token has arrived on every incoming arc whose condition holds | an inclusive / conditional split (only some branches run) |
| Threshold (threshold) | a token has arrived on N of the incoming arcs (early fire) | a parallel split where a partial result is enough (N of M) |
| Quorum (quorum) | the decision is settled, enough branches approve or approval becomes unreachable | a vote where either outcome can be called early |
| Timeout (timeout) | every branch arrives, or a deadline passes | a wait-for-all that must not hang on a slow branch |

Both wait_all and matching count each incoming arc once, however many tokens it delivered, so a branch that produced two tokens does not fire the join early.
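The once-per-arc counting can be pictured as a set comparison. The sketch below is a simplified model with hypothetical names (`wait_all_ready`, the `f_*` arc ids), not the engine's code.

```python
def wait_all_ready(incoming_arcs: set[str], arrivals: list[str]) -> bool:
    """True once at least one token has arrived on every incoming arc.

    `arrivals` holds the arc id of each arrived token, duplicates included.
    """
    return incoming_arcs <= set(arrivals)


arcs = {"f_a", "f_b", "f_c"}

# Three tokens have arrived, but two came over f_a, so f_c is still missing.
print(wait_all_ready(arcs, ["f_a", "f_a", "f_b"]))  # False
print(wait_all_ready(arcs, ["f_a", "f_b", "f_c"]))  # True
```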

wait_all vs matching

This is the distinction that matters most:

  • wait_all is structural: it waits for every incoming arc, regardless of conditions. Use it after a parallel split, where every branch is guaranteed to run, so "all arcs" is the active set.
  • matching is state-aware: when a token arrives it re-evaluates each incoming arc's condition against the current variables and waits for only the arcs whose conditions hold (the branches the upstream split actually activated). Use it after an inclusive or conditional split.

The practical consequence: matching never deadlocks on a branch that was never started, because that branch's condition does not hold so it is not waited for. wait_all after a conditional split would hang forever on the branches that did not run.

Rule of thumb

The moment the set of branches that can reach a join varies at runtime, use matching. Use wait_all only when every incoming branch is guaranteed to run.

How a matching join decides

A matching join is local and does no graph analysis. Each time a token arrives it:

  1. re-evaluates the conditions on its own incoming arcs against that token's view of the variables;
  2. takes the arcs whose conditions hold as the set of branches it must wait for;
  3. fires once a token has arrived on every arc in that set (a set of one fires immediately).

It never inspects the split, the sibling branches, or the rest of the graph; there is no token-tracking and no topology analysis. And it recomputes the expected set on every arrival, the first one included. That last point is the crux: if the expected set comes out too small at the first arrival, the join sees the arrived arc(s) already cover it and fires, dropping branches it should have waited for. A later re-evaluation cannot undo an early fire.
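The three steps can be modelled as one small function. This is a sketch under stated assumptions rather than the engine's code: `matching_join_fires`, its parameters, and the representation of conditions as Python callables are all hypothetical.

```python
from typing import Callable, Optional

Variables = dict[str, object]
Condition = Optional[Callable[[Variables], bool]]


def matching_join_fires(
    incoming: dict[str, Condition],  # arc id -> condition (None = unconditioned)
    arrived: set[str],               # arcs that have delivered a token so far
    view: Variables,                 # variables as seen by the arriving token
) -> bool:
    # Steps 1 and 2: re-evaluate every incoming arc; the arcs that hold are the expected set.
    expected = {arc for arc, cond in incoming.items() if cond is None or cond(view)}
    # Step 3: fire once every expected arc has delivered a token.
    return expected <= arrived


incoming = {
    "f_email_join": lambda v: v["notify_email"] is True,
    "f_sms_join": lambda v: v["notify_sms"] is True,
}
both = {"notify_email": True, "notify_sms": True}
email_only = {"notify_email": True, "notify_sms": False}

print(matching_join_fires(incoming, {"f_email_join"}, both))                # False
print(matching_join_fires(incoming, {"f_email_join", "f_sms_join"}, both))  # True
print(matching_join_fires(incoming, {"f_email_join"}, email_only))          # True
```

The function looks only at the join's own arcs and the arriving token's view, which is why the result depends entirely on what the variables say at that moment.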

Wiring a matching join correctly

A matching join is correct only when all three of these hold. Miss any one and it either deadlocks or fires early.

  1. The condition is on both arcs. The same condition must sit on the split's outgoing arc and on the join's incoming arc for that branch. The split arc gates whether the branch starts; the join arc is what the join re-evaluates to decide whether to wait for it. Leave a join arc unconditioned and it always holds, so matching waits for every branch, silently degrading to wait_all and deadlocking on the branches that never ran.

  2. The deciding variable is visible from every branch. The join evaluates each arc from the view of whichever token happens to arrive, and a token sees only instance-wide variables plus the token-scoped ones in its own lineage (itself and its ancestors), never a sibling branch's locals (the engine skips a variable local to a token outside the current lineage). So the deciding variable must be instance-scoped, or token-scoped on a token at or above the fork (a common ancestor of every branch). A variable each branch sets independently is invisible to the others and cannot drive the decision.

  3. The decision is settled by the first arrival, and stays stable. Because the expected set is first computed at the first arrival, and an early fire cannot be undone, the conditions must already evaluate to their final, complete value by the time the first token reaches the join, and must not change before the join resolves. The exact boundary is "by the first arrival at the join", but "set the variable before the fork" is the safe, sufficient rule: after a fork the branches advance asynchronously and which one arrives first is nondeterministic, so no post-fork write is guaranteed to beat the first arrival.
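Requirement 2 is easier to see with a toy model of variable visibility. The sketch below is illustrative only: the `Token` class, the `visible` helper, and the variable names are hypothetical, and letting token-scoped values shadow instance-wide ones is an assumption, not documented behavior.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Token:
    name: str
    parent: Optional["Token"] = None
    local: dict[str, object] = field(default_factory=dict)  # token-scoped variables


def visible(token: Token, instance_vars: dict[str, object]) -> dict[str, object]:
    """Variables a token can see: instance-wide ones plus those in its own lineage."""
    lineage = []
    node: Optional[Token] = token
    while node is not None:
        lineage.append(node)
        node = node.parent
    view = dict(instance_vars)
    # Assumption: token-scoped values shadow instance-wide ones, nearest token last.
    for ancestor in reversed(lineage):
        view.update(ancestor.local)
    return view


root = Token("root", local={"channels": "both"})  # set at or above the fork
email = Token("email", parent=root, local={"did_email": True})
sms = Token("sms", parent=root, local={"did_sms": True})

view = visible(email, {"order_id": 42})
print(sorted(view))       # ['channels', 'did_email', 'order_id']
print("did_sms" in view)  # False: a sibling branch's local is never visible
```

A condition on `channels` evaluates the same way from either branch; a condition on `did_sms` does not, because the email token cannot see it.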

Three ways a matching join breaks

Mirror every deciding condition onto the join's incoming arcs, reading a variable that is instance-scoped (or token-scoped at/above the fork) and fixed before the fork. Miss the mirroring and the join deadlocks (unconditioned arcs degrade it to wait_all); miss the scope or the timing and it fires early (the deciding state was not settled at the first arrival, so the expected set came out too small). The inclusive gateway pairs the arcs for you, but the scope and timing rules still apply.

Why 'let each branch announce itself' fails

A tempting alternative is to drop the shared condition and have each branch set its own did_x flag that the join arc tests. It fails on two counts. If the flag is branch-local, the join cannot see another branch's flag (requirement 2). If the flag is instance-scoped, it still loses the race (requirement 3): the first branch to finish reaches the join before the others have set their flags, so the expected set is computed as that one branch and the join fires immediately, dropping the rest. The set of branches must be decided at the split, from state fixed beforehand, not reconstructed from what the branches do.
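The race can be reproduced in a toy model. In this sketch (hypothetical names throughout, with instance-scoped flags stored in a plain dict), the email branch finishes first while the SMS branch is still running:

```python
def expected_arcs(conditions, view):
    """Arcs whose condition holds against the given view of the variables."""
    return {arc for arc, cond in conditions.items() if cond(view)}


# Each join arc tests a flag that its own branch sets when it finishes.
conditions = {
    "f_email_join": lambda v: v.get("did_email", False),
    "f_sms_join": lambda v: v.get("did_sms", False),
}
instance_vars: dict[str, bool] = {}
arrived: set[str] = set()

# The email branch finishes first: it sets its flag and reaches the join.
instance_vars["did_email"] = True
arrived.add("f_email_join")

expected = expected_arcs(conditions, instance_vars)
fired_early = expected <= arrived

print(expected)     # {'f_email_join'}
print(fired_early)  # True: the join fires without waiting for the SMS branch
```

The SMS flag does not exist yet at the first arrival, so the SMS arc is left out of the expected set and the join fires on the email branch alone.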

Threshold (N of M): firing early

Where wait_all waits for every branch, a threshold join fires the moment the configured number have arrived. Set it explicitly on the convergence node, with a count:

n_decision:
  type: script
  join:
    plugin: threshold
    settings:
      count: 2          # fire once two of the incoming branches arrive

Use it for a partial result, "proceed once two of three reviewers have responded," without waiting on the slowest branch. The threshold counts arrivals, not votes. Like the other multi-branch joins it can merge a variable from each arrived branch, so a tally (say, the approvals) is available to the outgoing flows, here over the branches that arrived in time. A count at or above the number of incoming arcs simply behaves like wait_all.
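As a rough model of the firing rule (a sketch with hypothetical names, assuming the threshold counts distinct arcs as the join table describes):

```python
def threshold_ready(incoming_arcs: set[str], arrivals: list[str], count: int) -> bool:
    """True once tokens have arrived on `count` distinct incoming arcs."""
    arrived = incoming_arcs & set(arrivals)
    # A count at or above the number of arcs is capped, so it behaves like wait_all.
    needed = min(count, len(incoming_arcs))
    return len(arrived) >= needed


arcs = {"r1", "r2", "r3"}

print(threshold_ready(arcs, ["r1", "r1"], count=2))        # False: one arc, two tokens
print(threshold_ready(arcs, ["r1", "r3"], count=2))        # True: two of three arcs
print(threshold_ready(arcs, ["r1", "r2"], count=5))        # False: capped at all three
print(threshold_ready(arcs, ["r1", "r2", "r3"], count=5))  # True
```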

What happens to the branches still running

When a threshold join fires early, the M-N branches still in flight are torn down: the engine cancels their tokens (and, through the inbox, their open tasks), so a reviewer is not left holding a task for a decision already made. This is the classic discriminator: once the join has fired, a late branch must neither carry on nor fire the join a second time.

The teardown is driven by the fork cohort each token records: the branches of one split share a cohort, and firing early closes it. A still-live branch of a closed cohort self-cancels at its next step rather than advancing or re-arriving at the join, so even a branch that was mid-flight at the instant of the fire stops on its own. Cohorts are per-split and per-iteration, so a loop that re-enters the split starts a fresh cohort.
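The cohort mechanism can be pictured with a small sketch. Everything here is hypothetical (the `Branch` class, the `split#1` id format, the `step` function); it only illustrates the idea that a branch checks its cohort before advancing.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    name: str
    cohort: str  # id shared by the branches of one split, per iteration
    cancelled: bool = False


closed_cohorts: set[str] = set()


def step(branch: Branch) -> str:
    """Advance a branch one step, unless its cohort was closed by an early fire."""
    if branch.cohort in closed_cohorts:
        branch.cancelled = True
        return "cancelled"
    return "advanced"


late = Branch("reviewer_c", cohort="split#1")
next_iteration = Branch("reviewer_c", cohort="split#2")  # the loop re-entered the split

closed_cohorts.add("split#1")  # the threshold join fired early

print(step(late))            # cancelled
print(step(next_iteration))  # advanced
```

Because the check happens at the branch's next step, no central sweep over in-flight branches is needed in this model, and a fresh iteration is unaffected by the closed cohort.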

Use one fork per threshold join

A threshold join tears down its fork's branches, so the supported shape is a single split fanning out to the join (the N-of-M pattern). Branches of that fork are expected to converge on the join, not diverge elsewhere.

Quorum: deciding either way early

A threshold join counts arrivals; a quorum join counts votes. It reads each arrived branch's collected vote and fires the moment the decision is settled, in either direction:

  • enough branches approve (the quorum is reached), or
  • so few can still approve that the quorum is out of reach (enough have come back negative).

So a "two of three approvals" review passes the instant the second approval arrives, and fails the instant the second rejection arrives, without waiting on the last reviewer. As with the threshold join, the branches still running are then torn down.

n_decide:
  type: script
  join:
    plugin: quorum
    settings:
      count: 2            # approvals needed to pass
      approve_value: approved   # the vote value that counts as an approval
      collect: vote       # the per-branch vote variable
      into: votes         # the tally the outgoing flows route on
      scope: token
  split: { plugin: first }

The join only fires; routing to the approved or rejected path is the usual merge plus a count condition on the outgoing flows over the collected votes, which classifies both fire reasons correctly (a quorum-reached fire has enough approvals; an unreachable fire does not). Votes must be cast token-locally (scope: token, and each task writing its result token-scoped) so each branch carries its own.
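The two fire reasons reduce to simple arithmetic over the votes received so far. The function below is a sketch, not the plugin's code: `quorum_outcome` and its return values are hypothetical, while `count` and `approve_value` mirror the settings shown above.

```python
from typing import Optional


def quorum_outcome(
    votes: list[str],  # votes collected from the branches that have arrived
    total: int,        # number of incoming branches
    count: int,        # approvals needed to pass
    approve_value: str = "approved",
) -> Optional[str]:
    """Return 'reached', 'unreachable', or None while the decision is still open."""
    approvals = sum(1 for vote in votes if vote == approve_value)
    remaining = total - len(votes)
    if approvals >= count:
        return "reached"
    if approvals + remaining < count:
        return "unreachable"  # even if every remaining branch approves
    return None


print(quorum_outcome(["approved"], total=3, count=2))              # None
print(quorum_outcome(["approved", "approved"], total=3, count=2))  # reached
print(quorum_outcome(["rejected", "rejected"], total=3, count=2))  # unreachable
print(quorum_outcome(["approved", "rejected"], total=3, count=2))  # None
```

Either non-None result corresponds to the join firing; the outgoing flows then classify the outcome from the collected votes.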

Quorum vs the synchronous count

Routing on a count condition after a wait_all join also decides pass or fail, but only once every branch has voted. The quorum join is the early-decision form: it stops as soon as the outcome cannot change.

Timeout: giving up on a slow branch

A timeout join waits for every branch like wait_all, but gives up once a deadline passes: it fires with whatever has arrived and tears down the branches still running. Use it when a branch might never finish, for example "collect every approval, but give up after seven days."

n_gather:
  type: script
  join:
    plugin: timeout
    settings:
      timeout: P7D        # seconds or an ISO-8601 duration

The deadline is armed on the first branch to wait. The cron timeout sweep wakes the join when it passes (a later arrival fires it too), so the join needs no clock of its own beyond the sweep Orchestra already runs for parked tasks. Before the deadline it is an ordinary AND-join.
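The firing rule can be approximated as "all arcs arrived, or the deadline passed". The sketch below uses hypothetical names and passes the current time in explicitly; in the engine the wake-up comes from the sweep or a later arrival, as described above.

```python
from datetime import datetime, timedelta


def timeout_join_fires(
    incoming_arcs: set[str],
    arrived: set[str],
    armed_at: datetime,  # when the first branch started waiting
    timeout: timedelta,
    now: datetime,
) -> bool:
    """Fire when every arc has arrived, or once the deadline has passed."""
    if incoming_arcs <= arrived:
        return True  # before the deadline this is an ordinary AND-join
    return now >= armed_at + timeout


arcs = {"a", "b", "c"}
armed = datetime(2024, 1, 1)
week = timedelta(days=7)

print(timeout_join_fires(arcs, {"a", "b"}, armed, week, armed + timedelta(days=1)))  # False
print(timeout_join_fires(arcs, arcs, armed, week, armed + timedelta(days=1)))        # True
print(timeout_join_fires(arcs, {"a", "b"}, armed, week, armed + timedelta(days=8)))  # True
```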

The split/join pairings (and gateways)

Most useful combinations correspond to a BPMN gateway, which is just a node that presets the pair:

| Pattern | Split | Join | Gateway |
| --- | --- | --- | --- |
| Parallel (AND) | all | wait_all | parallel |
| Inclusive (OR) | all | matching | inclusive |
| Exclusive (XOR) | first | immediate | exclusive |

Using a gateway node is the readable way to express these; setting the join and split directly on a plain task node gives the same behavior with finer control (for example a wait_all join and a merge policy on an ordinary script node, as the quorum tally does).

Merge policy (collecting branch results)

A join that waits for several branches (such as wait_all or matching) can gather a variable from each joined branch into a list on the continuing token, the quorum-tally pattern. The merge policy is configured through the join plugin's own settings:

| Setting | Meaning |
| --- | --- |
| collect | the variable read from each joined branch (empty merges nothing) |
| into | the variable the collected values are written to, as a list |
| scope | instance (shared with the whole process) or token (local to the continuing branch) |

When the join fires, it resolves collect through each joined branch's own lineage and writes the gathered values as a list to into. A count flow condition downstream can then decide on the list, e.g. "at least two of the collected votes are approved".

n_tally:
  type: script
  join:
    plugin: wait_all
    settings:
      collect: vote      # each reviewer wrote their vote here (token-scoped)
      into: votes        # gathered into this list on the continuing token
      scope: token
  split:
    plugin: first        # then route exclusively on the count

With matching the list holds the results of only the branches that ran, which is what makes it the right join for an applicability-based tally.
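The merge step and the downstream count can be modelled as follows. This is a sketch with hypothetical helpers (`merge_on_fire`, `at_least`): each dict stands for one joined branch's resolved variables, the scope setting is not modelled, and skipping a branch that never set the variable is an assumption.

```python
def merge_on_fire(joined_branches: list[dict], settings: dict, target: dict) -> None:
    """Gather settings['collect'] from each joined branch into a list at settings['into']."""
    collect = settings.get("collect", "")
    if not collect:  # an empty collect merges nothing
        return
    # Assumption: a branch that never set the variable contributes nothing.
    target[settings["into"]] = [b[collect] for b in joined_branches if collect in b]


def at_least(values: list, wanted: object, minimum: int) -> bool:
    """A count-style condition: at least `minimum` values equal `wanted`."""
    return sum(1 for value in values if value == wanted) >= minimum


branches = [{"vote": "approved"}, {"vote": "rejected"}, {"vote": "approved"}]
continuing_token: dict = {}
merge_on_fire(branches, {"collect": "vote", "into": "votes", "scope": "token"}, continuing_token)

print(continuing_token["votes"])                           # ['approved', 'rejected', 'approved']
print(at_least(continuing_token["votes"], "approved", 2))  # True
```

With a matching join, `branches` would contain only the branches that actually ran, so the list and the count cover exactly the applicable reviewers.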

The collected variable is the branch-local one

collect is read from each branch's own lineage, so it is normally a token-scoped (branch-local) variable: in the quorum example the vote is written with result_scope: token so that the reviewers' votes do not overwrite each other. A token-scoped variable is visible only to its own branch (its token lineage), never to a sibling branch. That is exactly why it works for collecting results but cannot drive a matching join's decision: the join cannot see another branch's local value. One variable is gathered after the branches join; the deciding condition is a separate, instance-scoped variable read before they do (see the warning above).

When to use matching: concrete examples

Each of these has an inclusive/conditional split that activates only some branches; matching converges exactly those (and wait_all would deadlock).

A worked example: the same condition on both arcs

A decision before the fork records what to do (here two instance-scoped booleans), so the choice is settled before any branch runs. The inclusive split and the matching join then carry the same condition on each branch:

flowchart LR
  d["Choose channels<br/>sets notify_email, notify_sms<br/>(instance-scoped, before the fork)"]
  d --> s{{"split: all"}}
  s -->|notify_email| e["Send email"]
  s -->|notify_sms| m["Send SMS"]
  e -->|notify_email| j{{"join: matching"}}
  m -->|notify_sms| j
  j --> done["Log delivery"]

flows:
  # split -> branches
  - id: f_email
    from: g_split
    to: n_email
    condition: { plugin: comparison, settings: { variable: notify_email, operator: '==', value: true } }
  - id: f_sms
    from: g_split
    to: n_sms
    condition: { plugin: comparison, settings: { variable: notify_sms, operator: '==', value: true } }
  # branches -> join: the SAME conditions, mirrored
  - id: f_email_join
    from: n_email
    to: g_join
    condition: { plugin: comparison, settings: { variable: notify_email, operator: '==', value: true } }
  - id: f_sms_join
    from: n_sms
    to: g_join
    condition: { plugin: comparison, settings: { variable: notify_sms, operator: '==', value: true } }

  • Pick email only: the split takes f_email, so only n_email runs; the join evaluates both incoming arcs against the same instance variables, finds only f_email_join holds, and waits for exactly that branch.
  • Pick both: all four arcs hold, so the join waits for both branches.
  • Because notify_email / notify_sms were fixed before the fork, the join computes the correct set on the first arrival, never too early.
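Mirroring is easy to get wrong by hand, so a small check over the flow list can help. The sketch below is not part of Orchestra: `unmirrored_branches` and `is_true` are hypothetical helpers, the flows are plain dicts shaped like the YAML above, and the check assumes each branch is a single node between the split and the join.

```python
def is_true(variable: str) -> dict:
    """Build the comparison condition used in the example flows."""
    return {"plugin": "comparison",
            "settings": {"variable": variable, "operator": "==", "value": True}}


flows = [
    {"id": "f_email", "from": "g_split", "to": "n_email", "condition": is_true("notify_email")},
    {"id": "f_sms", "from": "g_split", "to": "n_sms", "condition": is_true("notify_sms")},
    {"id": "f_email_join", "from": "n_email", "to": "g_join", "condition": is_true("notify_email")},
    {"id": "f_sms_join", "from": "n_sms", "to": "g_join", "condition": is_true("notify_sms")},
]


def unmirrored_branches(flows: list[dict], split: str, join: str) -> list[str]:
    """Branch nodes whose join arc does not carry the same condition as their split arc."""
    split_conditions = {f["to"]: f.get("condition") for f in flows if f["from"] == split}
    join_conditions = {f["from"]: f.get("condition") for f in flows if f["to"] == join}
    return sorted(
        branch
        for branch, condition in split_conditions.items()
        if branch in join_conditions and join_conditions[branch] != condition
    )


print(unmirrored_branches(flows, "g_split", "g_join"))  # []
```

An empty result means every branch has the same condition on both arcs; it says nothing about scope or timing, which still have to be checked against the requirements above.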

The shipped example_inclusive expresses the same idea with a single channel choice and an any condition; two booleans are shown here for readability.

The bullets below show the split side only

For brevity each example names its condition once, on the split. Every one still needs that same condition mirrored onto the join's incoming arc, on a variable that meets the three requirements above (or use an inclusive gateway). Omit the mirror and the join degrades to wait_all and deadlocks on the branches that never ran.

  • Multi-channel notifications: the user picks email / SMS / push; only the chosen channels send, and the join waits for the selected ones. (Shipped as example_inclusive.)
  • Applicability-based review: Legal reviews only if there is a contract, Finance only if the amount exceeds a threshold, Security only if it touches PII. The join waits for just the applicable reviews; a merge collects each applicable reviewer's verdict so a count can require no rejections.
  • Optional data enrichment: geocode if an address is present, credit-check if an amount is set, KYC if the customer is new. The join waits for whichever enrichments were triggered.
  • Multi-target publishing: website always, plus social / newsletter / partner feed when flagged. The join waits for the selected destinations' confirmations.
  • Order fulfillment sub-tasks: gift-wrap if requested, customs paperwork if international, age verification if restricted. The join waits for the sub-tasks that apply before shipping.
  • Conditional integrations: sync to CRM if customer-facing, ERP if it affects inventory, analytics always. The join waits for the integrations that fired.

Configuring joins and splits

  • In the Complete modeler each routing-capable node shows a Join and/or Split select. The chosen plugin's description appears under the select, and any settings it has (a merging join's collect / into / scope) appear in a collapsible section below.
  • In config a node carries join and split as {plugin, settings} maps; a gateway node presets them and is not edited directly. The merge policy lives in the join's settings.
  • Examples to read: example_parallel (all + wait_all), example_inclusive (all + matching), and example_quorum (wait_all + merge + a count condition).