
fast2flow

fast2flow is a high-performance routing extension that routes incoming messages to the appropriate flow using deterministic token-based scoring (BM25) with optional LLM fallback.

Core concept: A user sends a message like “refund please” → fast2flow checks tenant-specific indexes → returns a routing directive (Dispatch, Respond, Deny, or Continue).

Key principles:

  • Deterministic first — Token-based BM25 scoring for predictable, explainable routing
  • Fail-open — Errors, timeouts, or missing indexes produce a Continue directive
  • Time-bounded — Hard timeout enforcement via time_budget_ms
  • Policy-driven — Runtime behavior changes without code changes

Incoming Message
┌──────────────────────────────────────────────────┐
│                fast2flow Pipeline                │
│                                                  │
│  ┌────────────────────────────────────────────┐  │
│  │ 1. Hook Filter                             │  │
│  │    Allow/deny lists, respond rules, policy │  │
│  └────────────────────────────────────────────┘  │
│                        │                         │
│  ┌────────────────────────────────────────────┐  │
│  │ 2. Index Lookup                            │  │
│  │    Load TF-IDF index for tenant scope      │  │
│  └────────────────────────────────────────────┘  │
│                        │                         │
│  ┌────────────────────────────────────────────┐  │
│  │ 3. Deterministic Strategy (BM25)           │  │
│  │    Token scoring with title boosting (2x)  │  │
│  └────────────────────────────────────────────┘  │
│                        │                         │
│  ┌────────────────────────────────────────────┐  │
│  │ 4. Confidence Gate                         │  │
│  │    min_confidence threshold check          │  │
│  └────────────────────────────────────────────┘  │
│                        │                         │
│  ┌────────────────────────────────────────────┐  │
│  │ 5. LLM Fallback (optional)                 │  │
│  │    OpenAI or Ollama for ambiguous cases    │  │
│  └────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────┘
Routing Directive (Dispatch / Respond / Deny / Continue)
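
The pipeline can be read as a chain of early returns in which every failure path ends in Continue. The Python sketch below is illustrative only and is not fast2flow's implementation: the function names (`route`, `score_candidates`, `llm_classify`), the argument shapes, and the way the deadline is checked between stages are all assumptions. Only the stage order, the directive types, `min_confidence`, and `time_budget_ms` come from the description above. The hook filter (stage 1) is omitted for brevity.

```python
import time
from typing import Callable, Optional


def continue_directive() -> dict:
    """The fail-open default: no decision, let the caller handle the message."""
    return {"type": "continue"}


def route(
    text: str,
    index: Optional[dict],
    score_candidates: Callable[[str, dict], list],
    min_confidence: float = 0.5,
    time_budget_ms: int = 50,
    llm_classify: Optional[Callable[[str, list], Optional[dict]]] = None,
) -> dict:
    """Hypothetical routing pipeline: deterministic first, fail-open on any problem."""
    deadline = time.monotonic() + time_budget_ms / 1000.0
    try:
        # Stage 2: a missing index means "no decision", not an error.
        if index is None:
            return continue_directive()
        # Stage 3: deterministic scoring, assumed to return
        # [(target, confidence), ...] sorted best first.
        candidates = score_candidates(text, index)
        if time.monotonic() > deadline:
            return continue_directive()
        # Stage 4: confidence gate.
        if candidates and candidates[0][1] >= min_confidence:
            target, confidence = candidates[0]
            return {
                "type": "dispatch",
                "target": target,
                "confidence": confidence,
                "reason": "BM25 match",
            }
        # Stage 5: optional LLM fallback for ambiguous cases.
        if llm_classify is not None:
            decision = llm_classify(text, candidates)
            if decision is not None and time.monotonic() <= deadline:
                return decision
        return continue_directive()
    except Exception:
        # Fail-open: an error in any stage never blocks the message.
        return continue_directive()
```

Note that this sketch only checks the deadline between stages; a hard timeout as described above would also need to interrupt a stage that is still running.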

Every routing decision produces one of four directives:

| Directive | Purpose | Fields |
|-----------|---------|--------|
| dispatch | Route to a specific flow | target, confidence, reason |
| respond | Return an immediate response | message |
| deny | Block the request | reason |
| continue | No decision (let the caller handle it) | (none) |

// Dispatch to a flow
{"type": "dispatch", "target": "support-pack:refund_request", "confidence": 0.92, "reason": "BM25 match"}
// Auto-respond without routing
{"type": "respond", "message": "Use the self-service refund form at /refund."}
// Block the request
{"type": "deny", "reason": "Denied by scope policy"}
// Pass through (fail-open default)
{"type": "continue"}
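
On the caller's side, handling a directive comes down to branching on its `type` field. The following Python sketch shows one way a caller might do that; `handle_directive` and the strings it returns are hypothetical, and only the directive types and field names are taken from the table above.

```python
import json


def handle_directive(raw: str) -> str:
    """Turn a routing directive (JSON string) into a description of the next action."""
    directive = json.loads(raw)
    kind = directive.get("type")
    if kind == "dispatch":
        return f"run flow {directive['target']} (confidence {directive['confidence']:.2f})"
    if kind == "respond":
        return f"reply: {directive['message']}"
    if kind == "deny":
        return f"blocked: {directive['reason']}"
    # "continue" and anything unrecognized fall through to default handling,
    # which mirrors the fail-open principle.
    return "no decision, use default handling"


print(handle_directive('{"type": "deny", "reason": "Denied by scope policy"}'))
```

Treating unknown types like `continue` is a design choice of this sketch: it keeps the caller working if new directive types are added later.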

fast2flow is distributed as a .gtpack artifact via GHCR:

# Pull from GHCR
oras pull ghcr.io/greentic-biz/providers/routing-hook/fast2flow.gtpack:latest
# Or reference a specific version
oras pull ghcr.io/greentic-biz/providers/routing-hook/fast2flow.gtpack:v0.4.6

The pack registers a post_ingress hook that intercepts messages before they reach any flow.

fast2flow ships three WASM components (targeting wasm32-wasip2):

| Component | Purpose | Operation |
|-----------|---------|-----------|
| Indexer | Builds a searchable TF-IDF index from flow metadata | build, update |
| Matcher | Fast BM25-based intent matching against the index | match |
| Router | Orchestrates the full routing pipeline | route |

These components are coordinated by three flows defined in the pack:

# flows/index.ygtc — Runs at deploy time to build indexes
# flows/match.ygtc — Runtime BM25 intent matching
# flows/route.ygtc — Full routing pipeline with LLM fallback

fast2flow indexes flows from your bundle’s .ygtc files. The indexer scans your bundle directory, extracts metadata (title, description, tags), and builds a TF-IDF index with BM25 scoring.

my-bundle/
├── packs/
│   ├── support-pack/
│   │   └── flows/
│   │       ├── refund.ygtc
│   │       ├── shipping.ygtc
│   │       └── faq.ygtc
│   └── hr-pack/
│       └── flows/
│           ├── leave.ygtc
│           └── booking.ygtc

Each flow file provides the metadata used for intent matching:

refund.ygtc
id: refund_request
title: Process Refund Request
description: Handle customer refund requests for orders and payments
type: messaging
tags:
  - refund
  - payment
  - billing
  - return
start: collect_info
nodes:
  collect_info:
    templating.handlebars:
      text: "Please provide your order number for the refund."
    routing:
      - out: true
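
To make the role of this metadata concrete, here is a small Python sketch of BM25 scoring over title, description, and tags. It is a generic Okapi BM25 implementation under stated assumptions, not fast2flow's indexer or matcher: the tokenizer, the `k1` and `b` values, and the choice to implement the 2x title boost by counting title tokens twice are all assumptions. Raw BM25 scores are also unbounded, so the step that maps them onto the 0.0–1.0 confidence scale is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumeric characters (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def flow_tokens(flow: dict) -> list[str]:
    """Flatten title, description, and tags; title tokens are repeated for a 2x weight."""
    title = tokenize(flow.get("title", ""))
    body = tokenize(flow.get("description", ""))
    tags = [tok for tag in flow.get("tags", []) for tok in tokenize(tag)]
    return title * 2 + body + tags


def bm25_rank(query: str, flows: list[dict], k1: float = 1.2, b: float = 0.75):
    """Return [(flow_id, score), ...] sorted by descending BM25 score."""
    docs = {f["id"]: Counter(flow_tokens(f)) for f in flows}
    n_docs = len(docs)
    avg_len = sum(sum(c.values()) for c in docs.values()) / max(n_docs, 1)
    # Document frequency: in how many flows each term appears.
    df = Counter(term for counts in docs.values() for term in counts)
    scores = []
    for flow_id, counts in docs.items():
        doc_len = sum(counts.values())
        score = 0.0
        for term in set(tokenize(query)):
            tf = counts.get(term, 0)
            if tf == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_len))
        scores.append((flow_id, score))
    return sorted(scores, key=lambda s: s[1], reverse=True)
```

With the refund flow above and an unrelated shipping flow, a query such as "refund please" ranks `refund_request` first because "refund" appears in its title, description, and tags, while "please" matches nothing and contributes no score.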

Use the CLI to build an index from your bundle:

greentic-fast2flow bundle index \
--bundle ./my-bundle \
--output ./state/indexes \
--tenant demo \
--team default \
--verbose

This produces:

  • index.json — TF-IDF index with term frequencies and document frequencies
  • intents.md — Human-readable intent documentation

To check that a bundle contains indexable flows, validate it:

greentic-fast2flow bundle validate --bundle ./my-bundle

Policies control routing behavior at runtime without code changes. They are JSON files loaded from /mnt/registry/fast2flow-policy.json or a custom path.

fast2flow-policy.json
{
  "stage_order": ["scope", "channel", "provider"],
  "default": {
    "min_confidence": 0.5,
    "llm_min_confidence": 0.5,
    "candidate_limit": 20
  },
  "scope_overrides": [],
  "channel_overrides": [],
  "provider_overrides": []
}

All rule fields are optional — only specified fields are applied:

| Field | Type | Description |
|-------|------|-------------|
| min_confidence | f32 | Minimum BM25 score to dispatch (0.0–1.0) |
| llm_min_confidence | f32 | Minimum LLM confidence to dispatch (0.0–1.0) |
| candidate_limit | usize | Maximum candidates to evaluate |
| allow_channels | string[] | Whitelist channels (null = allow all) |
| deny_channels | string[] | Blacklist channels |
| allow_providers | string[] | Whitelist providers (null = allow all) |
| deny_providers | string[] | Blacklist providers |
| allow_scopes | string[] | Whitelist scopes (null = allow all) |
| deny_scopes | string[] | Blacklist scopes |
| respond_rules | object[] | Auto-respond rules (keyword matching) |

Overrides are applied in stage order (scope → channel → provider) with priority sorting within each stage.

Scope override — stricter confidence for a VIP tenant:

{
  "id": "vip-tenant",
  "priority": 10,
  "scope": "tenant-vip",
  "rules": {
    "min_confidence": 0.8,
    "candidate_limit": 10
  }
}

Channel override — auto-respond on email channel:

{
  "id": "email-autorespond",
  "priority": 20,
  "channel": "email",
  "rules": {
    "respond_rules": [
      {
        "needle": "refund",
        "message": "Refund requests via email take 3–5 business days. Use chat for instant support.",
        "mode": "contains"
      }
    ]
  }
}

Provider override — restrict to specific provider:

{
  "id": "slack-only",
  "priority": 30,
  "provider": "slack",
  "rules": {
    "deny_providers": ["telegram"]
  }
}
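
The way overrides layer on top of the defaults can be sketched as a staged merge. The Python below is a rough model, not fast2flow's resolver: `resolve_rules` is a hypothetical name, and the choice to apply lower `priority` numbers first (so that higher numbers win within a stage) is an assumption, since the sort direction is not stated here. What it does take from the text is the stage order, the per-stage override lists, and the rule that only specified fields are applied.

```python
def resolve_rules(policy: dict, scope: str, channel: str, provider: str) -> dict:
    """Merge the default rules with every matching override, stage by stage."""
    context = {"scope": scope, "channel": channel, "provider": provider}
    rules = dict(policy.get("default", {}))
    for stage in policy.get("stage_order", ["scope", "channel", "provider"]):
        matching = [
            override
            for override in policy.get(f"{stage}_overrides", [])
            if override.get(stage) == context.get(stage)
        ]
        # Assumption: lower priority numbers are applied first,
        # so an override with a higher number wins within its stage.
        for override in sorted(matching, key=lambda o: o.get("priority", 0)):
            # Only the fields present in the override replace earlier values.
            rules.update(override.get("rules", {}))
    return rules


policy = {
    "stage_order": ["scope", "channel", "provider"],
    "default": {"min_confidence": 0.5, "llm_min_confidence": 0.5, "candidate_limit": 20},
    "scope_overrides": [
        {"id": "vip-tenant", "priority": 10, "scope": "tenant-vip",
         "rules": {"min_confidence": 0.8, "candidate_limit": 10}},
    ],
    "channel_overrides": [],
    "provider_overrides": [],
}
print(resolve_rules(policy, "tenant-vip", "email", "slack"))
```

For the VIP tenant, `min_confidence` and `candidate_limit` come from the override while `llm_min_confidence` keeps its default, because the override does not mention it.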

Auto-respond rules match text before the routing pipeline runs:

{
  "needle": "business hours",
  "message": "Our business hours are Mon–Fri 9AM–5PM UTC.",
  "mode": "contains"
}

Supported modes: exact, contains (default), regex.
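
The three modes can be illustrated with a short Python sketch. This is an approximation rather than fast2flow's matcher: `match_respond_rule` is a hypothetical helper, and the case-insensitive comparison for `exact` and `contains`, the first-match-wins order, and the use of Python's `re` syntax for `regex` are assumptions.

```python
import re
from typing import Optional


def match_respond_rule(text: str, rules: list[dict]) -> Optional[dict]:
    """Return a respond directive for the first matching rule, or None."""
    for rule in rules:
        needle = rule["needle"]
        mode = rule.get("mode", "contains")  # contains is the documented default
        if mode == "exact":
            hit = text.strip().lower() == needle.lower()
        elif mode == "regex":
            hit = re.search(needle, text) is not None
        else:
            hit = needle.lower() in text.lower()
        if hit:
            return {"type": "respond", "message": rule["message"]}
    return None


rules = [
    {
        "needle": "business hours",
        "message": "Our business hours are Mon–Fri 9AM–5PM UTC.",
        "mode": "contains",
    }
]
print(match_respond_rule("What are your business hours?", rules))
```

Returning `None` when nothing matches lets the caller fall through to the normal routing pipeline.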

# Print default policy
greentic-fast2flow policy print-default
# Validate a policy file
greentic-fast2flow policy validate --file ./my-policy.json

When the deterministic BM25 strategy produces low confidence scores, fast2flow can fall back to an LLM for classification.

| Provider | Environment Variables |
|----------|-----------------------|
| OpenAI | FAST2FLOW_OPENAI_API_KEY_PATH, FAST2FLOW_OPENAI_MODEL_PATH |
| Ollama | FAST2FLOW_OLLAMA_ENDPOINT_PATH, FAST2FLOW_OLLAMA_MODEL_PATH |
| Disabled | FAST2FLOW_LLM_PROVIDER=disabled (default) |

# Enable OpenAI fallback
FAST2FLOW_LLM_PROVIDER=openai \
FAST2FLOW_OPENAI_API_KEY_PATH=/run/secrets/openai-key \
greentic-fast2flow-routing-host < request.json

Bundle commands build and validate an index directly from a bundle directory:

# Build TF-IDF index from bundle
greentic-fast2flow bundle index \
--bundle ./my-bundle \
--output ./indexes \
--tenant demo \
--team default \
--generate-docs \
--verbose
# Validate bundle has indexable flows
greentic-fast2flow bundle validate --bundle ./my-bundle

Index commands build and inspect an index from a flow definitions JSON file:

# Build index from flow definitions JSON
greentic-fast2flow index build \
--scope tenant-a \
--flows flows.json \
--output /tmp/indexes
# Inspect a built index
greentic-fast2flow index inspect \
--scope tenant-a \
--input /tmp/indexes

Route commands simulate a routing decision against a built index:

# Simulate a routing decision
greentic-fast2flow route simulate \
--scope tenant-a \
--text "I need a refund" \
--indexes-path /tmp/indexes

Policy commands print and validate policy files:

# Print default policy template
greentic-fast2flow policy print-default
# Validate policy file
greentic-fast2flow policy validate --file policy.json

Environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| FAST2FLOW_LLM_PROVIDER | disabled | LLM provider: disabled, openai, ollama |
| FAST2FLOW_POLICY_PATH | /mnt/registry/fast2flow-policy.json | Policy file path |
| FAST2FLOW_TRACE_POLICY | (none) | Set to 1 to emit policy trace to stderr |
| FAST2FLOW_MIN_CONFIDENCE | 0.5 | Default minimum confidence threshold |
| FAST2FLOW_LLM_MIN_CONFIDENCE | 0.5 | Default LLM minimum confidence |
| FAST2FLOW_CANDIDATE_LIMIT | 20 | Default max candidates |
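
For tooling that wraps fast2flow, it can help to mirror these variables and their defaults in one place. The Python sketch below does that for the variables in the table; the `Fast2FlowEnv` class and `load_env` function are hypothetical names, and rejecting an unknown provider with an error is a choice made for this sketch, not documented fast2flow behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Fast2FlowEnv:
    llm_provider: str
    policy_path: str
    trace_policy: bool
    min_confidence: float
    llm_min_confidence: float
    candidate_limit: int


def load_env(env: Mapping[str, str] = os.environ) -> Fast2FlowEnv:
    """Read the documented variables, falling back to the documented defaults."""
    provider = env.get("FAST2FLOW_LLM_PROVIDER", "disabled")
    if provider not in {"disabled", "openai", "ollama"}:
        # Sketch-only behavior: surface typos early instead of guessing.
        raise ValueError(f"unknown FAST2FLOW_LLM_PROVIDER: {provider!r}")
    return Fast2FlowEnv(
        llm_provider=provider,
        policy_path=env.get("FAST2FLOW_POLICY_PATH", "/mnt/registry/fast2flow-policy.json"),
        trace_policy=env.get("FAST2FLOW_TRACE_POLICY") == "1",
        min_confidence=float(env.get("FAST2FLOW_MIN_CONFIDENCE", "0.5")),
        llm_min_confidence=float(env.get("FAST2FLOW_LLM_MIN_CONFIDENCE", "0.5")),
        candidate_limit=int(env.get("FAST2FLOW_CANDIDATE_LIMIT", "20")),
    )


print(load_env({"FAST2FLOW_LLM_PROVIDER": "openai", "FAST2FLOW_MIN_CONFIDENCE": "0.8"}))
```

Passing a plain mapping instead of reading `os.environ` directly keeps the loader easy to test.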

fast2flow is optimized for low-latency routing:

| Stage | Typical Latency |
|-------|-----------------|
| Hook filter (allow/deny) | < 0.1 ms |
| BM25 index lookup | < 1 ms |
| Policy resolution | < 0.1 ms |
| LLM fallback (if enabled) | 200–500 ms |

Best practices:

  1. Write descriptive titles — Title words get a 2x boost during scoring
  2. Use specific tags — Tags are the primary signal for BM25 matching
  3. Set appropriate thresholds — Start with min_confidence: 0.5 and tune up
  4. Use policies for overrides — Change behavior per scope/channel/provider without redeploying
  5. Monitor Continue rate — A high share of Continue directives indicates gaps in your flow coverage
  6. Keep LLM as fallback — Deterministic routing is faster and more predictable
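
As a starting point for monitoring the Continue rate, the share of each directive type can be computed from a batch of logged routing decisions. The Python sketch below assumes the directives have already been collected as dictionaries; how they are logged and the `directive_rates` helper itself are hypothetical.

```python
from collections import Counter


def directive_rates(directives: list[dict]) -> dict[str, float]:
    """Return the fraction of decisions per directive type."""
    counts = Counter(d.get("type", "continue") for d in directives)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: n / total for kind, n in counts.items()}


logged = [
    {"type": "dispatch", "target": "support-pack:refund_request", "confidence": 0.92},
    {"type": "continue"},
    {"type": "respond", "message": "Use the self-service refund form at /refund."},
    {"type": "dispatch", "target": "hr-pack:leave", "confidence": 0.71},
]
print(directive_rates(logged))
```

A rising `continue` fraction over time suggests messages that no indexed flow covers well.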