Execution mode

A process advances one node at a time. How those advances are scheduled is the execution mode: synchronous (the default) or queued.

Synchronous (the default)

In synchronous mode, after a top-level operation (start, signal, or a timer fork) commits, the engine advances that process's own tokens inline, in the same request, so it runs to its next stable state immediately. Completing a human task advances the process at once; starting a process runs it straight to its first wait or to completion, with no cron in the loop.

The advance is scoped to the one process: a synchronous run only ever advances its own tokens (and the children of its synchronous subprocesses), never another process that happens to have work waiting. So turning on synchronous execution for one workflow never makes an unrelated request pay to advance someone else's queued process.

This is the default because it is what most sites want: a process advances as soon as it can, with no cron in the loop and no "why has nothing happened yet?" surprise. It is the right fit for interactive and low-latency processes and for development. The cost is a longer request for a heavy process (many parallel branches or subprocesses advance before the request returns), which is why high-volume sites may prefer queued.

Each inline advance still runs in its own transaction with the same retry and dead-letter handling as cron; synchronous mode changes only when advancement happens, not how. The inline run is bounded by a per-request cap: a process that needs more advances than the cap allows (in practice only a misconfigured never-parking loop) is finished by the next cron run, and a warning is logged.
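The per-request cap can be pictured as a bounded loop. The following is a minimal sketch, not Orchestra's implementation: `drain_inline`, the `advance` callback, the logger name, and the cap values are all hypothetical names chosen for illustration, and the real cap is not stated here.

```python
import logging
from collections import deque
from typing import Callable, Deque, List, Tuple

log = logging.getLogger("orchestra_sketch")  # hypothetical logger name


def drain_inline(ready: Deque[str],
                 advance: Callable[[str], List[str]],
                 cap: int) -> Tuple[int, List[str]]:
    """Advance ready tokens inline until none remain or the cap is reached.

    Returns (advances performed, tokens left over for the next cron run).
    """
    performed = 0
    while ready and performed < cap:
        token = ready.popleft()
        # advance() returns the successor tokens that are ready to run;
        # a token that parks or terminates returns an empty list.
        ready.extend(advance(token))
        performed += 1
    if ready:
        # Cap reached with work remaining: leave it for cron and warn.
        log.warning("inline cap of %d reached; %d token(s) left for cron",
                    cap, len(ready))
    return performed, list(ready)


# A three-step linear process reaches a stable state well inside the cap.
linear = {"a": ["b"], "b": ["c"], "c": []}
linear_result = drain_inline(deque(["a"]), lambda t: linear[t], cap=10)

# A never-parking loop always yields more work, so the cap stops it.
loop_result = drain_inline(deque(["x"]), lambda t: ["x"], cap=5)
```

In this model the linear process finishes in three advances with nothing left over, while the loop stops after five advances and leaves one token for cron.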

Queued

Each active token is placed on the orchestra_advance queue and processed by the queue worker on cron. Starting or resuming a process advances it one node and enqueues its successors; the next cron run advances those, and so on, until every token reaches a stable state (parked on a human task, waiting at a join, or terminal).
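As a rough illustration of why a queued process needs several cron runs, the sketch below models each cron run as processing only the items that were queued when the run began. That one-layer-per-run behaviour is a simplifying assumption drawn from the description above, and `cron_run`, `runs_until_stable`, and the node names are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List


def cron_run(queue: Deque[str], successors: Dict[str, List[str]]) -> None:
    """Advance every item that was queued when this run began."""
    for _ in range(len(queue)):
        token = queue.popleft()
        # Advancing a token enqueues its successors for a later run.
        queue.extend(successors.get(token, []))


def runs_until_stable(start: str, successors: Dict[str, List[str]]) -> int:
    """Count the cron runs needed after the starting request."""
    # The starting request advances one node and enqueues its successors.
    queue: Deque[str] = deque(successors.get(start, []))
    runs = 0
    while queue:
        cron_run(queue, successors)
        runs += 1
    return runs


# "start" forks into "a" and "b"; "b" is followed by "c".
graph = {"start": ["a", "b"], "a": [], "b": ["c"], "c": []}
needed = runs_until_stable("start", graph)
```

Under these assumptions the example needs two cron runs after the starting request: one for `a` and `b`, and one for `c`.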

Queued execution smooths load (a spike of work is spread across cron runs) and keeps the request that starts or resumes a process short, so it suits high-volume sites and heavy processes. It does mean cron must run regularly: on a site whose cron is not running, queued processes simply sit in the queue. Orchestra surfaces this on the status report, where a warning points here if work is waiting and cron has not run for hours.

Configuration

Site-wide. Orchestra settings has an Execution mode select (queued or synchronous). It is the default for every workflow.
Per workflow. Each workflow's Execution tab can inherit the site default or force queued or synchronous, so an interactive workflow can run synchronously on an otherwise queued site, or vice versa.

The mode is resolved once, when the process starts: the workflow's override wins, falling back to the site-wide setting, which itself defaults to synchronous. That resolved choice is then stored on the instance, so a running process keeps the mode it started with even if the configuration changes later. Because the mode is stored per instance, a single running process can also be switched (queued to synchronous and back) without affecting any other run.
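The resolution order can be sketched as a small function. This is an illustrative model only: `resolve_execution_mode`, `ProcessInstance`, and the use of `None` for "inherit" or "not configured" are assumptions, not Orchestra's API.

```python
from typing import Optional

SYNCHRONOUS = "synchronous"
QUEUED = "queued"
VALID_MODES = {SYNCHRONOUS, QUEUED}


def resolve_execution_mode(workflow_override: Optional[str],
                           site_default: Optional[str]) -> str:
    """Return the mode a new process instance would start with.

    None means "inherit" for the workflow and "not configured" for the site.
    """
    for candidate in (workflow_override, site_default):
        if candidate is not None:
            if candidate not in VALID_MODES:
                raise ValueError(f"unknown execution mode: {candidate!r}")
            return candidate
    # Nothing configured anywhere: the built-in default applies.
    return SYNCHRONOUS


class ProcessInstance:
    """Hypothetical instance record that stores the mode resolved at start."""

    def __init__(self, workflow_override: Optional[str],
                 site_default: Optional[str]) -> None:
        self.execution_mode = resolve_execution_mode(workflow_override,
                                                     site_default)


# The instance keeps "queued" even if the site default changes afterwards.
site_setting = QUEUED
instance = ProcessInstance(workflow_override=None, site_default=site_setting)
site_setting = SYNCHRONOUS
```

Storing the resolved value on the instance, rather than re-reading configuration on every advance, is what keeps a running process stable when settings change.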

Cron always remains the safety net: every advance is also enqueued, so anything a synchronous request does not finish (the request dies, the per-request cap is hit, or the workflow is queued) is advanced on the next cron run, and timeouts still fire on the cron sweep. Running cron regularly therefore stays good practice even on a mostly-synchronous site.

What synchronous mode does and does not cover

Synchronous mode changes when a run is advanced, not the rest of the engine's behavior. A few consequences are worth knowing.

Failures and timeouts still resolve on cron. A node that throws is left enqueued and retried (with its backoff), then dead-lettered into an incident, by cron exactly as in queued mode; a synchronous start does not surface the incident in the request. Task timeouts and escalation timers likewise fire on the cron sweep, not inline. So a synchronous site should still run cron.
Bounded per request. One run's inline advancement, including any synchronous subprocesses it spawns, is capped per request; a run that needs more (in practice only a never-parking loop) finishes on the next cron, with a logged warning. Starting many processes in one request (a bulk import, or an event that fires on every node save) advances each inline, so a high-volume or bulk path is a good place to force queued.
Subprocesses run within the parent's advance. A synchronous subprocess is advanced inside the parent step's transaction, so the child and parent commit together. This is consistent for workflow state, but a child's external side effects (an API call, for example) are not rolled back if that transaction later fails. One advance holds its lock for the whole synchronous subtree, and that lock auto-expires after five minutes, so a single synchronous advance (including any subprocess subtree it drains) must finish within that window. Force queued execution for work that can run longer; otherwise the lock can expire mid-advance and cron may re-enter the same work.
Side effects run in the advancing request's transaction. A node's action runs, and orchestra_mail sends, inside the advance transaction, just as on cron; synchronous mode does not change that, it only moves the work into the request. A slow mail transport therefore lengthens the request, and a notification may go out that a later rollback cannot recall (a narrow window that exists in queued mode too). Use a queued mail backend if sending should not happen in the request.
Redundant queue items rely on cron to clear. Every token is enqueued even when it is advanced inline (that is the safety net), and the inline drain does not remove the matching queue item. The cron queue worker prunes those now-redundant items as it runs: it finds the token already advanced and drops the item, which cannot mis-advance anything (the same active-status check that makes two cron workers safe applies). This never affects how a process advances. But the items do accumulate without cron: a site that never runs cron grows the queue table, one stale item per advance, and a site that runs cron only rarely makes each pass spend part of its time budget clearing the backlog before it reaches real work, while the backlog inflates the queue size the status report reads. So run cron periodically even on a synchronous site.
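The pruning of redundant queue items described in the last point can be modelled as a status check before each advance. This is a sketch under stated assumptions: `QueueItem`, `process_queue`, and the status strings `"active"` and `"advanced"` are hypothetical, and the real worker's data model is not shown here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class QueueItem:
    token_id: int


def process_queue(items: List[QueueItem],
                  token_status: Dict[int, str],
                  advance: Callable[[int], None]) -> Dict[str, int]:
    """Advance active tokens and drop items whose token is no longer active."""
    counts = {"advanced": 0, "pruned": 0}
    for item in items:
        if token_status.get(item.token_id) != "active":
            # Already advanced (inline or by another worker): drop the item.
            counts["pruned"] += 1
            continue
        advance(item.token_id)
        token_status[item.token_id] = "advanced"
        counts["advanced"] += 1
    items.clear()  # every item is either processed or pruned
    return counts


# Token 2 was already advanced inline; token 1 has a duplicate queue item.
statuses = {1: "active", 2: "advanced"}
advanced_log: List[int] = []
backlog = [QueueItem(1), QueueItem(2), QueueItem(1)]
result = process_queue(backlog, statuses, advanced_log.append)
```

Because the check runs before every advance, a stale or duplicate item can only be dropped, never applied twice, which is why the backlog costs time rather than correctness.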

High concurrency

Each node advance takes a short lock to run exactly once (see the engine for why). With the default database lock backend that lock is a row in the semaphore table, which becomes a write hot spot when many requests advance processes at the same instant (for example a popular booking session opening). For that kind of load, point Drupal's lock service at a non-database backend (the redis module provides one) in your site's services.yml; the locking is then off the database and the ceiling rises to ordinary database throughput. This is a deployment choice, not a code change, and only matters at high concurrency: a few hundred simultaneous advances are fine on the default backend.

One more concurrency note: when two parallel branches of the same process write the same instance-wide variable at the same time, the last write wins (the engine does not lock individual variables). This is the usual shared-state caveat of parallel branches; give a variable a single writer, or scope it to the token, when a deterministic value matters.
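The last-write-wins behaviour is easy to see in a small model. The sketch below is illustrative only: `run_branches` and the variable layout are hypothetical, and the order in which real parallel branches complete is not something a workflow can rely on.

```python
from typing import Dict, List, Tuple


def run_branches(order: List[str]) -> Tuple[Dict[str, str], Dict[str, Dict[str, str]]]:
    """Apply each branch's write in the given completion order."""
    instance_vars: Dict[str, str] = {}           # shared by all branches
    token_vars: Dict[str, Dict[str, str]] = {}   # one scope per branch token
    for branch in order:
        value = f"written by {branch}"
        instance_vars["result"] = value          # overwrites any earlier write
        token_vars[branch] = {"result": value}   # isolated per token
    return instance_vars, token_vars


a_then_b = run_branches(["a", "b"])
b_then_a = run_branches(["b", "a"])
```

The shared variable ends up as "written by b" in the first run and "written by a" in the second, while the token-scoped values are identical in both, which is the reason to give a shared variable a single writer or to scope it to the token.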