Concepts
The mental model — sessions, flows, nodes, and reports. Read this once; build confidently.
FaceSign is built around four primitives. You create sessions, which run flows, which are graphs of nodes, which produce reports. Understand these four and you can build anything.
Sessions
A session represents a single verification attempt by an end user. You create a session on your backend by specifying a flow and optional configuration. FaceSign returns a client secret — a short-lived, single-use token — that you hand to your frontend. The end user opens the hosted URL and completes the verification. When finished, FaceSign produces a structured report with the results.
Sessions are the core primitive of the FaceSign API. Every verification, every conversation, every liveness check happens inside a session.
Session lifecycle
Every session progresses through a linear status lifecycle:
created — Your backend has called POST /sessions and received a session ID and client secret. The user has not opened the verification URL yet.
inProgress — The user has opened the hosted URL and is completing the flow. Six AI models run in parallel throughout: liveness detection, coercion analysis, environmental scanning, behavioral profiling, predictive risk scoring, and adaptive conversation.
complete or incomplete — The session has ended. complete means the user finished the flow and results are available in the session report. incomplete means the session ended without the user reaching an END node (abandoned, expired, or terminated).
created ──> inProgress ──> complete
                      └──> incomplete

Sessions are one-time use. Each client secret can only open one session. If a user needs to verify again, create a new session.
Client secrets
When you create a session, FaceSign returns a client secret alongside the session object. The client secret includes:
- A token (prefixed cs...) that authenticates the user for this session
- A hosted URL that embeds the token as a query parameter
- An expiration timestamp after which the URL is no longer valid
The most common integration pattern is to redirect the user to the hosted URL or embed it as an <iframe> in your application. The client secret is safe to expose to the frontend — it grants access only to this specific session and cannot be reused.
Never expose your API key (sk_test_ or sk_prod_) to the client side. The client secret is the only token your frontend should see.
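To make the handoff concrete, here is a minimal backend sketch. It assumes an Express server and the client.session.create call used in the flow examples later on this page; the SDK import path and the clientSecret.url / clientSecret.expiresAt response fields are illustrative assumptions, not confirmed API shapes.

import express from 'express'
import { FaceSign } from 'facesign' // hypothetical import path; match your SDK

const client = new FaceSign(process.env.FACESIGN_API_KEY) // sk_ key stays server-side
const app = express()

app.post('/api/verify', express.json(), async (req, res) => {
  const flow = [/* a flow definition like the complete examples below */]
  const session = await client.session.create({ flow })
  // Hand only the client secret to the browser; clientSecret.url and
  // clientSecret.expiresAt are assumed field names.
  res.json({ url: session.clientSecret.url, expiresAt: session.clientSecret.expiresAt })
})

app.listen(3000)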
Session configuration
When you create a session, you can configure:
- Flow — The verification steps the user completes (required)
- Client reference ID — Your internal identifier for this verification (e.g., your user ID)
- Metadata — Custom key-value pairs for tracking and filtering
- Provided data — Pre-filled user data (name, email, phone) available to nodes during the flow
- Avatar — Which AI avatar guides the conversation
- Languages — Supported language codes and the default language
- Zone — Data residency zone (us or eu)
- Customization — UI branding options for the permissions page and controls
- Video AI analysis — Whether to enable post-session video analysis
All configuration is set at session creation time and cannot be changed after the session starts.
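As an illustration, a fully configured create call might look like the sketch below. The flow, avatarId, langs, defaultLang, and videoAIAnalysisEnabled parameters appear elsewhere on this page; the remaining field names are assumptions inferred from the list above, so verify them against the API reference.

// Sketch only: clientReferenceId, metadata, providedData, and zone are
// assumed parameter names derived from the configuration list above.
const session = await client.session.create({
  flow,                                   // required: the verification flow (see Flows)
  clientReferenceId: 'user_8423',         // your internal identifier
  metadata: { plan: 'pro', source: 'signup' },
  providedData: { name: 'Ada Lovelace', email: 'ada@example.com' },
  avatarId: 'avatar_default',             // which AI avatar guides the conversation
  langs: ['en', 'de'],
  defaultLang: 'en',
  zone: 'eu',                             // data residency: us or eu
  videoAIAnalysisEnabled: true            // opt in to post-session video analysis
})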
Timestamps
All timestamps in the session object and webhook payloads are in Unix milliseconds (not seconds). This applies to createdAt, startedAt, finishedAt, and all nested timestamp fields.
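Because the values are already in milliseconds, they can be passed straight to JavaScript's Date constructor — no ×1000 conversion:

// Unix milliseconds feed directly into Date. Multiplying by 1000, as you
// would for Unix seconds, would produce a date far in the future.
const finishedAt = new Date(session.finishedAt)
console.log(finishedAt.toISOString())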
Flows
A flow is the verification journey you define for each session. It is an array of nodes — each representing a verification step — connected by outcomes that determine the path a user takes from start to finish.
Every flow is a directed acyclic graph (DAG). It starts at a single START node, branches through verification steps based on outcomes, and terminates at one or more END nodes. There are no loops — the user always moves forward through the graph.
The directed acyclic graph model
Flows are not linear checklists. They are graphs with branching paths. Each node produces one or more outcomes, and each outcome points to the next node. This lets you build flows that adapt to what happens during verification:
- A liveness check passes — proceed to document scan
- A liveness check detects a deepfake — route to a rejection endpoint
- A conversation node determines the user said their name — continue to the next step
- A conversation node determines the user refused — route to a different branch
START --- CONVERSATION --- LIVENESS_DETECTION --- DOCUMENT_SCAN --- END
              |                     |
              |                     +-- (deepfake) ---- END
              |
              +-- (refused) ---- END

The graph model gives you full control over error handling, fallback paths, and conditional logic without writing code beyond the flow definition itself.
Flow as configuration, not code
Flows are pure data — JSON arrays you pass to the API. You do not write code to implement verification logic. The AI models, camera handling, document parsing, and facial recognition are all handled by FaceSign. Your flow definition simply tells the platform what steps to run and how to connect them.
This means you can:
- Change verification requirements without redeploying your application
- A/B test flows by creating sessions with different flow definitions
Flow validation
FaceSign validates your flow at session creation time. If the flow is invalid, the API returns an error and the session is not created. Validation rules:
- Exactly one START node
- At least one END node
- No cycles (the graph must be acyclic)
- Every outcome must reference an existing node id in the flow
- All required fields for each node type must be present
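These rules are easy to mirror client-side so that obvious mistakes fail fast before the API call. The sketch below follows the outcome shapes described under "How nodes connect"; it is an approximation, not FaceSign's actual server-side validator, and it does not check per-type required fields.

// Approximate pre-flight check mirroring the validation rules above.
function outcomeTargets(node) {
  if (node.type === 'end') return []
  if (node.type === 'start') return [node.outcome]
  if (Array.isArray(node.outcomes)) return node.outcomes.map((o) => o.targetNodeId)
  return Object.values(node.outcomes ?? {})
}

function validateFlow(flow) {
  const errors = []
  const byId = new Map(flow.map((n) => [n.id, n]))
  if (flow.filter((n) => n.type === 'start').length !== 1) errors.push('exactly one START node required')
  if (!flow.some((n) => n.type === 'end')) errors.push('at least one END node required')
  for (const node of flow) {
    for (const target of outcomeTargets(node)) {
      if (!byId.has(target)) errors.push(`${node.id}: outcome targets unknown node "${target}"`)
      if (target === node.id) errors.push(`${node.id}: self-loops are rejected`)
    }
  }
  // Cycle check: depth-first search over outcome edges.
  const state = new Map() // id -> 'visiting' | 'done'
  const visit = (id) => {
    if (state.get(id) === 'done') return
    if (state.get(id) === 'visiting') {
      errors.push(`cycle detected at "${id}"`)
      return
    }
    state.set(id, 'visiting')
    for (const t of outcomeTargets(byId.get(id) ?? {})) {
      if (byId.has(t)) visit(t)
    }
    state.set(id, 'done')
  }
  for (const n of flow) visit(n.id)
  return errors
}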
Nodes
A node is a single step in a FaceSign verification flow. Each node performs one discrete action: greet the user, check liveness, scan a document, send an OTP, or end the session.
Every node has:
- A unique id you assign (e.g., "node_ask_name")
- A type that determines its behavior (e.g., conversation, liveness_detection)
- One or more outcomes that connect it to the next node
Node types
| Type | Purpose | Key outcomes |
|---|---|---|
| start | Entry point of every flow. Exactly one per flow. | outcome: 'nextNodeId' |
| end | Exit point. At least one per flow. Marks successful completion. | none |
| conversation | AI avatar speaks a prompt and branches based on the user's response. | outcomes: [{ id, condition, targetNodeId }] |
| recognition | 1:N face recognition to check if the user exists in your database. | { recognized, newUser, noFace } |
| liveness_detection | Anti-spoofing check to confirm a real person is present. | { livenessDetected, deepfakeDetected, noFace } |
| document_scan | Captures and verifies an identity document (passport, license, ID card). | { scanSuccess, userCancelled, scanTimeout } |
| face_scan | Captures the user's face and compares it against a reference image. | { passed, notPassed, cancelled, error } |
| face_compare | Compares two captured face sources for a match. | { match, noMatch, imageUnavailable } |
| enter_email | Collects the user's email address. | { emailEntered, canceled } |
| two_factor_email | Sends a one-time passcode via email for two-factor authentication. | { verified, delivery_failed, failed_unverified, cancelled, error } |
| two_factor_sms | Sends a one-time passcode via SMS for two-factor authentication. | Same as email OTP |
| data_validation | Routes the flow based on conditional logic against collected data. | Conditional array |
| permissions | Requests camera and microphone permissions from the user. | { permissionsGranted, permissionsDenied } |
For the complete field-level reference — parameters, outcome keys, configuration options — see the Node Reference. For the authoritative machine-readable reference, fetch the facesign://node-types MCP resource at runtime.
How nodes connect
Nodes connect through outcomes. The outcome model varies by node category.
Single-outcome nodes. START has exactly one outcome — a single outcome field pointing to the first verification step. END has no outcomes; it terminates the flow.
{ "id": "start_node", "type": "start", "outcome": "first_step" }
{ "id": "done", "type": "end" }Fixed-outcome nodes. Most verification nodes (liveness detection, document scan, face scan, recognition, enter email, two-factor) define a fixed set of named outcomes. Each outcome corresponds to a specific result — success, failure, cancellation, or error.
{
"id": "liveness_check",
"type": "liveness_detection",
"outcomes": {
"livenessDetected": "next_step",
"deepfakeDetected": "rejected",
"noFace": "retry_step"
}
}

Conditional-outcome nodes. CONVERSATION and DATA_VALIDATION nodes support dynamic branching. Instead of fixed outcome keys, you define an array of conditions. FaceSign's AI evaluates which condition matches and follows the corresponding path.
{
"id": "ask_purpose",
"type": "conversation",
"prompt": "Say: What brings you here today?",
"outcomes": [
{ "id": "tr_1", "condition": "The user wants to verify their identity", "targetNodeId": "verify_id" },
{ "id": "tr_2", "condition": "The user needs help with something else", "targetNodeId": "general_help" }
]
}

Conditions are evaluated in order. The first matching condition determines the next node.
If you need retry-on-unclear-input behavior, fold it into the conversation node's prompt text, not into outcome edges. Self-loops (an outcome whose targetNodeId equals its own node's id) are rejected by the validator.
Common flow patterns
Simple conversational verification:
start -> greeting -> recognition -> liveness -> post_id -> closing -> end

KYC onboarding with document scan:

start -> greeting -> conv_doc_prep -> document_scan -> face_compare -> closing -> end

Account recovery with knowledge-based questions:

start -> greeting -> recognition -> liveness -> [branch by recognition result] -> questions -> closing -> end

Step-up for a sensitive action:

start -> greeting -> recognition -> liveness -> two_factor_email -> intent_confirm -> closing -> end

Complete example flows
These are battle-tested, production-ready flows. Copy them directly into your integration.
English-only: recognition + liveness + conversational
A minimal flow using Say: prefixes for tight wording control. This is the most common pattern for English-only deployments.
const flow = [
{ id: 'start', type: 'start', outcome: 'greeting' },
{
id: 'greeting', type: 'conversation',
prompt: "Say: Hi! I just need to quickly verify your identity. It only takes a few seconds. Ready?",
outcomes: [{ id: 'g1', condition: 'User has responded or acknowledged', targetNodeId: 'recognition' }]
},
{
id: 'recognition', type: 'recognition',
outcomes: { recognized: 'liveness_known', newUser: 'liveness_new', noFace: 'liveness_new' }
},
{
id: 'liveness_known', type: 'liveness_detection',
outcomes: { livenessDetected: 'greet_known', deepfakeDetected: 'greet_known', noFace: 'greet_known' }
},
{
id: 'greet_known', type: 'conversation',
prompt: "The user was recognized. Greet by full name and ask for DOB. Say: 'Hey [full name], great to see you! Can you quickly confirm your date of birth?' Use their full name, not just first name.",
outcomes: [{ id: 'gk1', condition: 'User gave DOB', targetNodeId: 'closing' }]
},
{
id: 'liveness_new', type: 'liveness_detection',
outcomes: { livenessDetected: 'ask_name', deepfakeDetected: 'ask_name', noFace: 'ask_name' }
},
{
id: 'ask_name', type: 'conversation',
prompt: "Say: Welcome! What's your name?",
outcomes: [{ id: 'an1', condition: 'User provided name', targetNodeId: 'ask_dob' }]
},
{
id: 'ask_dob', type: 'conversation',
prompt: "Say: Can you confirm your date of birth for me? If they respond unclearly, ask once more gently.",
outcomes: [
{ id: 'dob_done', condition: 'User gave DOB, or the conversation has exceeded two exchanges', targetNodeId: 'closing' }
]
},
{
id: 'closing', type: 'conversation', doesNotRequireReply: true,
prompt: "Say: You're all set! Your identity's been verified. Thanks!",
outcomes: [{ id: 'c1', condition: 'The assistant has delivered the closing message', targetNodeId: 'end' }]
},
{ id: 'end', type: 'end' }
]

Pass this to client.session.create({ flow, avatarId, langs: ['en'], defaultLang: 'en' }).
The greeting node must have the avatar speaking for 5 seconds or more so the recognition node has enough video to work with. Short greetings cause recognition to fail with newUser even for returning users.
Multilingual: recognition + liveness + conversational
The same flow adapted for multilingual use. Prompts are descriptive rather than Say:, so the LLM can translate at runtime.
const flow = [
{ id: 'start', type: 'start', outcome: 'greeting' },
{
id: 'greeting', type: 'conversation',
prompt: "Warmly greet the user and tell them you need to quickly verify their identity. Explain that it only takes a few seconds. Ask if they're ready.",
outcomes: [{ id: 'g1', condition: 'User has responded or acknowledged', targetNodeId: 'recognition' }]
},
{
id: 'recognition', type: 'recognition',
outcomes: { recognized: 'liveness_known', newUser: 'liveness_new', noFace: 'liveness_new' }
},
{
id: 'liveness_known', type: 'liveness_detection',
outcomes: { livenessDetected: 'greet_known', deepfakeDetected: 'greet_known', noFace: 'greet_known' }
},
{
id: 'greet_known', type: 'conversation',
prompt: "The user was just recognized. Greet them by their full name and ask them to confirm their date of birth. Use their full name, not just their first name.",
outcomes: [{ id: 'gk1', condition: 'User gave DOB', targetNodeId: 'closing' }]
},
{
id: 'liveness_new', type: 'liveness_detection',
outcomes: { livenessDetected: 'ask_name', deepfakeDetected: 'ask_name', noFace: 'ask_name' }
},
{
id: 'ask_name', type: 'conversation',
prompt: "Welcome the user and ask for their full name.",
outcomes: [{ id: 'an1', condition: 'User provided name', targetNodeId: 'ask_dob' }]
},
{
id: 'ask_dob', type: 'conversation',
prompt: "Ask the user to confirm their date of birth. If they respond unclearly, ask once more gently.",
outcomes: [
{ id: 'dob_done', condition: 'User gave DOB, or the conversation has exceeded two exchanges', targetNodeId: 'closing' }
]
},
{
id: 'closing', type: 'conversation', doesNotRequireReply: true,
prompt: "Warmly tell the user that they are all set and their identity has been verified. Thank them.",
outcomes: [{ id: 'c1', condition: 'The assistant has delivered the closing message', targetNodeId: 'end' }]
},
{ id: 'end', type: 'end' }
]

Pass this to client.session.create({ flow, avatarId, langs: ['en', 'it', 'de', 'fr', 'es', 'pt'], defaultLang: 'en' }).
For multilingual demos, do NOT use the "Say: ..." prefix. Say: is a literal-speech directive — the avatar speaks that exact text verbatim. Descriptive prompts let the LLM paraphrase and translate into the resolved session language at runtime. See Languages & Localization for details.
Reports & Outcomes
When a user completes a FaceSign session, the platform produces a session report — a structured object attached to the session that contains everything that happened during verification. The report is the single source of truth for the session's outcome.
You access the report by retrieving the session via the API or by receiving it through webhooks.
The report accumulates data from the moment a user begins interacting with the FaceSign widget. Some fields are available immediately when the session completes; others arrive asynchronously 5–20 seconds later.
Report structure overview
The session report contains six top-level sections:
| Section | What it contains | Availability |
|---|---|---|
| transcript | Every message exchanged between the AI avatar and the user, in order | Immediate |
| nodeReports | Per-node outcomes — which path the flow took at each step | Delayed (5–20s after session ends) |
| aiAnalysis | Screenshot-based analysis: estimated age, gender, real-vs-virtual detection, behavioral observations | Immediate |
| videoAIAnalysis | Video-based fraud and liveness analysis with confidence scores across multiple criteria | Delayed (5–20s after session ends) |
| location | GeoIP-derived city, country, coordinates, and timezone | Immediate |
| device | Browser, operating system, platform, and device type (mobile/desktop) | Immediate |
Transcript
The transcript is an ordered array of every spoken exchange. Each entry includes who spoke (AI avatar or user), what was said, and a Unix millisecond timestamp. Because it preserves the exact conversation flow, the transcript works well for audit trails, compliance records, and downstream analysis.
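For example, rendering the transcript as an audit log might look like this sketch; the entry field names (role, text, timestamp) and the report.transcript path are assumptions, since this page does not pin down the exact schema:

// Sketch only: role, text, and timestamp are assumed field names.
// Timestamps are Unix milliseconds, per the Timestamps note above.
for (const entry of session.report.transcript) {
  const when = new Date(entry.timestamp).toISOString()
  console.log(`[${when}] ${entry.role}: ${entry.text}`)
}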
Node reports
Node reports tell you the outcome of each verification step in the flow. For every node the user passed through, you get the node id and type, the outcome selected (which transition was taken), and a timestamp.
Node reports are the primary way to answer questions like "Did liveness pass?", "Which conversation branch was taken?", or "Did the document scan succeed?"
Node reports arrive 5–20 seconds after the session ends. If you fetch the session immediately on completion, this field may be empty. Use the session.status webhook to know when the session completes, then either poll GET /sessions/:id or wait for the delayed data via subsequent webhooks.
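One way to absorb that delay is a short polling loop after the completion webhook; this sketch assumes a client.session.retrieve method and a report.nodeReports path, neither of which is confirmed by this page:

// Sketch: poll GET /sessions/:id until the delayed nodeReports field is
// populated. client.session.retrieve and report.nodeReports are assumed names.
async function waitForNodeReports(sessionId, attempts = 10, delayMs = 3000) {
  for (let i = 0; i < attempts; i++) {
    const session = await client.session.retrieve(sessionId)
    if (session.report?.nodeReports?.length) return session.report.nodeReports
    await new Promise((resolve) => setTimeout(resolve, delayMs))
  }
  throw new Error(`nodeReports not available for session ${sessionId}`)
}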
AI analysis
The AI analysis section provides observations derived from screenshots captured during the session: an age estimate (min/max range), gender detection, a real-person-or-virtual assessment, an overall natural-language summary, and detailed analysis sections (person analysis, location observations, behavior and mood assessment, real-vs-virtual reasoning). These observations come from the six parallel AI models and are available as soon as the session completes.
Video AI analysis
When video AI analysis is enabled, FaceSign performs a deeper analysis of the recorded video after the session ends. This covers fraud-specific criteria with individual confidence scores.
Video AI analysis is opt-in. Enable it by setting videoAIAnalysisEnabled: true at session creation. Results arrive asynchronously — listen for the analysis.video webhook event.
Location and device
Every session captures contextual signals: location (city, country ISO code, latitude/longitude, timezone, derived from GeoIP) and device (browser name/version, operating system, platform, mobile/desktop). These fields are available immediately and are useful for risk scoring, geo-fencing, and anomaly detection.
Immediate vs. delayed fields
Not all report data is available at the same time. Understanding the timing helps you build robust integrations:
| Timing | Fields | How to receive |
|---|---|---|
| Immediate (on session complete) | transcript, aiAnalysis, location, device, lang | session.status webhook or GET /sessions/:id |
| Delayed (5–20s after complete) | nodeReports, videoAIAnalysis | analysis.video / analysis.screenshot webhooks, or poll GET /sessions/:id |
| Media (varies) | User photo, document photo, user video | media.* webhook events with signed download URLs |
Media URLs in webhook payloads are signed and expire after 15 minutes. Download and store any media you need for permanent access immediately upon receipt.
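In practice that means the webhook handler should persist media on receipt, along the lines of this sketch; the payload shape (event.type, event.data.url, event.data.sessionId) is an assumption:

import { writeFile } from 'node:fs/promises'

// Sketch: download media as soon as a media.* event arrives, because the
// signed URL expires after 15 minutes. The payload shape is assumed.
async function handleMediaEvent(event) {
  if (!event.type.startsWith('media.')) return
  const res = await fetch(event.data.url) // signed URL, valid ~15 minutes
  if (!res.ok) throw new Error(`media download failed: ${res.status}`)
  const bytes = Buffer.from(await res.arrayBuffer())
  // Swap the local write for your blob store (S3, GCS, ...) in production.
  await writeFile(`media-${event.data.sessionId}-${event.type}.bin`, bytes)
}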
Interpreting outcomes
The report gives you raw signals. How you interpret them depends on your use case:
- Pass/fail decisions — Check node reports for the outcomes you care about. If liveness_detection resolved to livenessDetected and face_scan resolved to passed, the user is verified.
- Risk scoring — Combine AI analysis confidence levels, location data, and device signals to compute a risk score in your system.
- Audit and compliance — Store the full transcript, node reports, and media for regulatory records.
- Escalation — Route sessions with ambiguous outcomes (e.g., low face-match confidence) to human review.
FaceSign provides the data; your application owns the decision logic.
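A minimal pass/fail gate over node reports could look like the sketch below; it assumes each node report entry carries type and outcome fields, which this page describes informally but does not formally specify:

// Sketch: derive a verified/not-verified decision from node reports,
// assuming entries shaped like { id, type, outcome }. A real policy may
// also weigh videoAIAnalysis scores, location, and device signals.
function isVerified(nodeReports) {
  const outcomeOf = (type) => nodeReports.find((n) => n.type === type)?.outcome
  return outcomeOf('liveness_detection') === 'livenessDetected' &&
    outcomeOf('face_scan') === 'passed'
}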
How to receive results
Webhooks (recommended). FaceSign sends HTTP POST events to your webhook endpoint as the session progresses. The session.status event fires on every status transition. Media and analysis events fire as those artifacts become available. Webhooks are the best approach for production — real-time, no polling.
Polling. Call GET /sessions/:id to fetch the current session state at any time. Useful for debugging, one-off checks, or as a fallback if your webhook endpoint is temporarily unavailable.
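A minimal receiving endpoint might look like the sketch below, assuming Express; the event payload shape is an assumption, and any signature verification FaceSign requires is omitted here:

import express from 'express'

// Sketch: minimal webhook receiver. Event names (session.status, media.*,
// analysis.*) come from this page; the payload shape is assumed.
const app = express()
app.post('/webhooks/facesign', express.json(), (req, res) => {
  const event = req.body
  if (event.type === 'session.status' && event.data.status === 'complete') {
    // Immediate fields (transcript, aiAnalysis, location, device) are ready now;
    // delayed fields (nodeReports, videoAIAnalysis) arrive via later events.
    enqueueReportProcessing(event.data.sessionId) // your own job queue (hypothetical)
  }
  res.sendStatus(200) // acknowledge fast; do heavy work asynchronously
})

app.listen(8080)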