Surfaces & Transports

A surface is the transport a participant communicates over — a browser widget, a phone call, a WhatsApp thread, a video meeting. Surfaces are orthogonal to how a run starts and to how a voice conversation feels. This page explains the model so you can pick the right combination.

The four axes

Channel configuration is described by four independent axes. Keeping them separate is what makes the model predictable:

| Axis | Where it lives |
| --- | --- |
| 1. How a run starts | The workflow's triggers[]: inbound_message, webhook, cron, or a widget open. Inbound phone calls arrive as a provider webhook. |
| 2. Transport (family) | Segment 1 of the participant's surface: widget, phone, messaging, inbox, dashboard, or the reserved meeting. |
| 3. Mode / platform | Segment 2 of the surface, as in widget/voice, widget/form, messaging/telegram, phone/outbound. |
| 4. Voice style & provider | The participant's config: style, provider, hold_timeout. |

Family = transport, not modality: Voice is not a family of its own. It is a mode of a transport: widget/voice for the browser and the phone family for telephony. How the voice conversation behaves is set by config.style.
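
To make axes 2 and 3 concrete, here is a minimal sketch of splitting a surface string into its two segments. The `Surface` type and `parse_surface` helper are hypothetical names used for illustration only; they are not part of the product's API.

```python
from typing import NamedTuple, Optional


class Surface(NamedTuple):
    family: str           # segment 1: the transport family
    mode: Optional[str]   # segment 2: mode or platform, None for a bare family


def parse_surface(surface: str) -> Surface:
    """Split a surface such as 'widget/voice' at the first '/'."""
    family, _, mode = surface.partition("/")
    if not family:
        raise ValueError(f"surface has no family segment: {surface!r}")
    # A bare family like 'phone' has no second segment.
    return Surface(family, mode or None)


voice = parse_surface("widget/voice")    # family 'widget', mode 'voice'
inbound_call = parse_surface("phone")    # family 'phone', no mode
```

Note that voice shows up only as the second segment here, which matches the point that voice is a mode of a transport rather than a family.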

Transport families

| Surface | Description |
| --- | --- |
| widget | Browser chat widget: the default, rich-UI surface. |
| widget/form | Browser widget rendered as a single form. |
| widget/voice | Browser voice. Supports all three voice styles via config.style. |
| inbox · dashboard | Internal operator surfaces for team members reviewing or actioning runs. |
| phone · phone/outbound | Telephony over the voice provider (vobiz). Inbound answers a call; outbound dials a number. Continuous and live styles only. |
| messaging/{sms,whatsapp,telegram,slack,discord,teams,email} | Text transports. The reply drives workflow progression; the run suspends between messages and resumes on inbound. |
| meeting/{google_meet,teams,zoom} | Reserved: video meeting transports. Defined but not yet routable. Room-as-single-participant semantics; continuous and live styles only. |

Platform ≠ family: A platform can appear under more than one family. messaging/teams (a Teams DM/channel) and meeting/teams (a Teams video meeting) are different transports that happen to share a vendor.
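
The family table can be sketched as a lookup that classifies a surface string as routable, reserved, or unknown. The `FAMILIES` mapping is transcribed from the table; the `classify` helper, its return labels, and the assumption that messaging and meeting always require a platform segment are illustrative only.

```python
# Second segments allowed per family, transcribed from the table above.
# None means the bare family name is itself a valid surface (assumption
# for messaging and meeting: a platform segment is always required).
FAMILIES = {
    "widget": {None, "form", "voice"},
    "inbox": {None},
    "dashboard": {None},
    "phone": {None, "outbound"},
    "messaging": {"sms", "whatsapp", "telegram", "slack",
                  "discord", "teams", "email"},
    "meeting": {"google_meet", "teams", "zoom"},
}
RESERVED_FAMILIES = {"meeting"}  # defined but not yet routable


def classify(surface: str) -> str:
    """Return 'routable', 'reserved', or 'unknown' for a surface string."""
    family, _, mode = surface.partition("/")
    modes = FAMILIES.get(family)
    if modes is None or (mode or None) not in modes:
        return "unknown"
    return "reserved" if family in RESERVED_FAMILIES else "routable"


# Same vendor, two different transports:
teams_chat = classify("messaging/teams")
teams_call = classify("meeting/teams")
```

Because the lookup is keyed by family first, `messaging/teams` and `meeting/teams` resolve independently even though they share a platform name.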

Voice styles

Any voice-capable transport picks a style in its config. The style — not the surface name — determines how turn-taking works:

| style | Behaviour |
| --- | --- |
| push_to_talk | Button-gated capture. Requires rich UI, so it is available on widget/voice only. |
| continuous | Voice-activity-detected, half-duplex turn-taking. Works on widget/voice, phone, and meeting. |
| live | Full-duplex streaming (Gemini Live). Provider is gemini. Works on widget/voice, phone, and meeting. |

```json
{
  "name": "caller",
  "bind": "run_initiator",
  "surface": "widget/voice",
  "config": { "style": "live", "provider": "gemini", "voice": "Kore" }
}
```

Connections vs. surfaces

Connections and surfaces are the two ways a workflow touches the outside world, and they are easy to confuse. The distinction is about direction and whether the run waits:

Surface — inbound, advances the run

How a participant talks to a running workflow: a widget, a phone call, a WhatsApp thread. When a participant replies on a surface, that reply drives the workflow forward. The run pauses at a human-in-the-loop step and waits for input.

Connection — outbound, never blocks

How a workflow calls an external service: an API request, an MCP tool call, sending an email. The workflow fires and continues — it gets a result and moves to the next node without waiting for a human.

The litmus test: does a reply advance the run? If a participant's response is what moves the workflow to its next step, that is a surface. If a node calls out, takes the result, and proceeds immediately, that is a connection (a tool).

The same vendor can be both. A messaging/telegram surface is where a participant chats with the run and their replies advance it; a telegram connection is a bot token a Notify node uses to fire a one-way alert and move on. One waits for a human; the other does not.

Triggers vs. surfaces

A trigger decides when a run begins; a surface decides how a participant talks to it during the run. They are independent. An inbound phone call (delivered as a webhook from the telephony provider) starts a run whose caller participant is bound to the phone surface — but the same workflow could just as easily be started by a cron trigger that places a phone/outbound call.

Hold policy & async

The engine is always asynchronous at its core. Whether a surface can hold a synchronous leg open while it waits for the next step is a capability, not a workflow setting. Text transports gracefully suspend and resume on the next inbound message; phone and live legs end after config.hold_timeout seconds; email is high-latency and never holds a leg.
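
The hold rules in the paragraph above can be sketched as a small lookup. This is only an illustration of the prose: `hold_behaviour` and its string labels are hypothetical, and any case the text does not cover (for example a plain widget) is reported as unspecified rather than guessed.

```python
from typing import Optional, Tuple


def hold_behaviour(surface: str, config: dict) -> Tuple[str, Optional[float]]:
    """Map a surface and participant config to (label, timeout_seconds)."""
    family, _, _mode = surface.partition("/")
    # Email is checked first: it is a messaging platform but never holds a leg.
    if surface == "messaging/email":
        return ("never_holds", None)
    # Other text transports suspend and resume on the next inbound message.
    if family == "messaging":
        return ("suspend_until_inbound", None)
    # Phone legs and live-style legs end after config.hold_timeout seconds.
    if family == "phone" or config.get("style") == "live":
        return ("ends_after_timeout", config.get("hold_timeout"))
    # Not described in the text above; left open on purpose.
    return ("unspecified", None)


whatsapp = hold_behaviour("messaging/whatsapp", {})
call = hold_behaviour("phone", {"style": "continuous", "hold_timeout": 30})
```

The function takes the surface as input and never consults the workflow, which mirrors the point that holding is a capability of the surface rather than a workflow setting.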

Per-node overrides

A participant declares a default surface. A WorkflowCall node can override the surface for a single sub-workflow invocation via participant_surfaces. A bare mode (form) is interpreted relative to the participant's family; a full surface (messaging/sms) replaces it outright.
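
A minimal sketch of the override rule follows. The `resolve_override` helper is a hypothetical name, and treating a bare family name such as phone as a full surface is an assumption, since the text only gives form and messaging/sms as examples.

```python
# Family names taken from the transport table on this page.
KNOWN_FAMILIES = {"widget", "phone", "messaging", "inbox", "dashboard", "meeting"}


def resolve_override(default_surface: str, override: str) -> str:
    """Apply a per-node surface override to a participant's default surface."""
    # A full surface (contains '/', or is a bare family name by assumption)
    # replaces the default outright.
    if "/" in override or override in KNOWN_FAMILIES:
        return override
    # A bare mode is interpreted relative to the participant's family.
    family = default_surface.partition("/")[0]
    return f"{family}/{override}"


as_form = resolve_override("widget", "form")            # stays in the widget family
as_sms = resolve_override("widget", "messaging/sms")    # replaces the surface
```

Resolving bare modes against the family, rather than against the full default surface, means a participant declared as widget/voice can still be switched to widget/form for one sub-workflow call.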

Support matrix

| Transport | Styles supported |
| --- | --- |
| widget/voice | push_to_talk · continuous · live |
| phone | continuous · live |
| meeting/* (reserved) | continuous · live |
| messaging/* | text only (no voice style) |
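
The matrix can also be read as a validation rule. The sketch below is illustrative: `check_voice_config` is a hypothetical helper, and flagging a missing style on a voice-capable transport is an assumption based on the statement that such transports pick a style in their config.

```python
# Styles per transport, transcribed from the support matrix above.
SUPPORTED_STYLES = {
    "widget/voice": {"push_to_talk", "continuous", "live"},
    "phone": {"continuous", "live"},
    "meeting": {"continuous", "live"},
}


def check_voice_config(surface: str, config: dict) -> list:
    """Return a list of problems; an empty list means the pairing is allowed."""
    family = surface.partition("/")[0]
    # Match the exact surface first (widget/voice), then fall back to the family.
    allowed = SUPPORTED_STYLES.get(surface) or SUPPORTED_STYLES.get(family)
    style = config.get("style")
    problems = []
    if allowed is None:
        # Not voice-capable (for example messaging/* or widget/form).
        if style is not None:
            problems.append(f"{surface} is text only; style {style!r} does not apply")
        return problems
    if style not in allowed:
        problems.append(f"style {style!r} is not supported on {surface}")
    if style == "live" and config.get("provider") != "gemini":
        problems.append("live style expects provider 'gemini'")
    return problems


ok = check_voice_config(
    "widget/voice", {"style": "live", "provider": "gemini", "voice": "Kore"}
)
bad = check_voice_config("phone", {"style": "push_to_talk"})
```

The first call uses the participant config shown earlier on this page and yields no problems; the second is flagged because push_to_talk needs the rich UI that only widget/voice provides.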

For outbound calls to external services — the other half of how a workflow touches the world — see Connections. For the design rationale and migration details see Workflows & DSL.