fast2flow

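fast2flow is a high-performance routing extension that routes incoming messages to the appropriate flow using deterministic, token-based scoring (BM25), with an optional LLM fallback.

Core concept: a user sends a message such as "refund please" → fast2flow checks the tenant-specific index → returns a routing directive (Dispatch, Respond, Deny, or Continue).

Key principles:

  • Deterministic first - token-based BM25 scoring makes routing predictable and explainable
  • Fail-open - errors, timeouts, or a missing index produce a Continue directive
  • Time-bounded - a hard timeout is enforced through time_budget_ms
  • Policy-driven - runtime behavior can change without code changes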
Incoming Message
┌──────────────────────────────────────────────────┐
│ fast2flow Pipeline │
│ │
│ ┌────────────────────────────────────────────┐ │
│ │ 1. Hook Filter │ │
│ │ Allow/deny lists, respond rules, policy │ │
│ └────────────────────────────────────────────┘ │
│ │ │
│ ┌────────────────────────────────────────────┐ │
│ │ 2. Index Lookup │ │
│ │ Load TF-IDF index for tenant scope │ │
│ └────────────────────────────────────────────┘ │
│ │ │
│ ┌────────────────────────────────────────────┐ │
│ │ 3. Deterministic Strategy (BM25) │ │
│ │ Token scoring with title boosting (2x) │ │
│ └────────────────────────────────────────────┘ │
│ │ │
│ ┌────────────────────────────────────────────┐ │
│ │ 4. Confidence Gate │ │
│ │ min_confidence threshold check │ │
│ └────────────────────────────────────────────┘ │
│ │ │
│ ┌────────────────────────────────────────────┐ │
│ │ 5. LLM Fallback (optional) │ │
│ │ OpenAI or Ollama for ambiguous cases │ │
│ └────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────┘
Routing Directive (Dispatch / Respond / Deny / Continue)
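The fail-open and time-bounded principles can be illustrated with a small wrapper. This is a minimal sketch, not fast2flow's implementation: `route_fail_open` and the three demo routers are hypothetical names, and a thread pool stands in for whatever mechanism the real pipeline uses to enforce `time_budget_ms`.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

CONTINUE = {"type": "continue"}


def route_fail_open(route_fn, message, time_budget_ms):
    """Run route_fn(message); return a Continue directive on error or timeout."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(route_fn, message)
    try:
        return future.result(timeout=time_budget_ms / 1000.0)
    except FutureTimeout:
        return dict(CONTINUE)  # time budget exceeded
    except Exception:
        return dict(CONTINUE)  # any routing error, e.g. a missing index
    finally:
        executor.shutdown(wait=False)  # do not block on a slow worker


# Hypothetical routers used only to exercise the wrapper.
def ok_router(message):
    return {"type": "dispatch", "target": "support-pack:refund_request",
            "confidence": 0.92, "reason": "BM25 match"}


def broken_router(message):
    raise FileNotFoundError("index missing for tenant")


def slow_router(message):
    time.sleep(0.2)
    return {"type": "deny", "reason": "too late"}


ok_result = route_fail_open(ok_router, "refund please", 500)
broken_result = route_fail_open(broken_router, "refund please", 500)
slow_result = route_fail_open(slow_router, "refund please", 20)
```

Note that the worker thread is not cancelled when the budget expires; the sketch only shows what the caller observes, which is a Continue directive instead of an error or a late answer.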

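Every routing decision produces one of the following four directives:

| Directive | Purpose | Fields |
| --- | --- | --- |
| dispatch | Route to a specific flow | target, confidence, reason |
| respond | Return a response immediately | message |
| deny | Block the request | reason |
| continue | Make no decision; the caller handles the message | |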
// Dispatch to a flow
{"type": "dispatch", "target": "support-pack:refund_request", "confidence": 0.92, "reason": "BM25 match"}
// Auto-respond without routing
{"type": "respond", "message": "Use the self-service refund form at /refund."}
// Block the request
{"type": "deny", "reason": "Denied by scope policy"}
// Pass through (fail-open default)
{"type": "continue"}
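A caller that receives these directives might validate them before acting. The following is a sketch under assumptions: `parse_directive` and `handle` are hypothetical helpers, and treating every field from the table above as required is a guess rather than documented behavior.

```python
import json

# Fields listed for each directive type in the table above.
# Assumption: every listed field is required.
DIRECTIVE_FIELDS = {
    "dispatch": ("target", "confidence", "reason"),
    "respond": ("message",),
    "deny": ("reason",),
    "continue": (),
}


def parse_directive(raw):
    """Parse a directive JSON string and check its type and fields."""
    directive = json.loads(raw)
    kind = directive.get("type")
    if kind not in DIRECTIVE_FIELDS:
        raise ValueError(f"unknown directive type: {kind!r}")
    missing = [f for f in DIRECTIVE_FIELDS[kind] if f not in directive]
    if missing:
        raise ValueError(f"{kind} directive is missing fields: {missing}")
    return directive


def handle(directive):
    """Return a short description of what a caller might do next."""
    kind = directive["type"]
    if kind == "dispatch":
        return f"run flow {directive['target']}"
    if kind == "respond":
        return f"reply: {directive['message']}"
    if kind == "deny":
        return f"blocked: {directive['reason']}"
    return "fall through to the caller's default handling"


dispatch = parse_directive(
    '{"type": "dispatch", "target": "support-pack:refund_request", '
    '"confidence": 0.92, "reason": "BM25 match"}'
)
print(handle(dispatch))
```

Keeping the field list in one dictionary means a new directive type only needs one new entry plus a branch in `handle`.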

fast2flow is distributed through GHCR as a .gtpack artifact:

# Pull from GHCR
oras pull ghcr.io/greentic-biz/providers/routing-hook/fast2flow.gtpack:latest
# Or reference a specific version
oras pull ghcr.io/greentic-biz/providers/routing-hook/fast2flow.gtpack:v0.4.6

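The pack registers a post_ingress hook that intercepts messages before they reach any flow.

fast2flow consists of three WASM components (targeting wasm32-wasip2):

| Component | Purpose | Operations |
| --- | --- | --- |
| Indexer | Builds a searchable TF-IDF index from flow metadata | build, update |
| Matcher | Performs fast BM25 intent matching against the index | match |
| Router | Orchestrates the full routing pipeline | route |

These components are coordinated by three flows defined in the pack: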

# flows/index.ygtc — Runs at deploy time to build indexes
# flows/match.ygtc — Runtime BM25 intent matching
# flows/route.ygtc — Full routing pipeline with LLM fallback

fast2flow indexes flows from the .ygtc files in your bundle. The indexer scans the bundle directory, extracts metadata (title, description, tags), and builds a TF-IDF index that is scored with BM25.

my-bundle/
├── packs/
│   ├── support-pack/
│   │   └── flows/
│   │       ├── refund.ygtc
│   │       ├── shipping.ygtc
│   │       └── faq.ygtc
│   └── hr-pack/
│       └── flows/
│           ├── leave.ygtc
│           └── booking.ygtc

Each flow file provides the metadata used for intent matching:

refund.ygtc
id: refund_request
title: Process Refund Request
description: Handle customer refund requests for orders and payments
type: messaging
tags:
  - refund
  - payment
  - billing
  - return
start: collect_info
nodes:
  collect_info:
    templating.handlebars:
      text: "Please provide your order number for the refund."
    routing:
      - out: true
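To show how metadata like this can feed token-based scoring, here is a minimal Okapi BM25 sketch. Everything in it is an assumption for illustration: the tokenizer, the `K1` and `B` constants, and applying the 2x title boost by counting title tokens twice are not taken from fast2flow, and the second flow is hypothetical. The raw scores are also not normalized to the 0.0–1.0 confidence range.

```python
import math
import re
from collections import Counter

K1, B = 1.2, 0.75   # common BM25 defaults; fast2flow's values are not stated
TITLE_BOOST = 2     # assumption: title tokens are counted twice


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(flows):
    """Build per-flow term frequencies plus document frequencies."""
    docs = {}
    for flow in flows:
        tf = Counter()
        for tok in tokenize(flow["title"]):
            tf[tok] += TITLE_BOOST
        body = flow["description"] + " " + " ".join(flow["tags"])
        for tok in tokenize(body):
            tf[tok] += 1
        docs[flow["id"]] = tf
    df = Counter(tok for tf in docs.values() for tok in tf)
    avg_len = sum(sum(tf.values()) for tf in docs.values()) / len(docs)
    return {"docs": docs, "df": df, "avg_len": avg_len}


def bm25_scores(index, query):
    """Score every flow in the index against the query text."""
    n = len(index["docs"])
    scores = {}
    for flow_id, tf in index["docs"].items():
        doc_len = sum(tf.values())
        score = 0.0
        for tok in tokenize(query):
            if tok not in tf:
                continue
            df = index["df"][tok]
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            norm = tf[tok] + K1 * (1 - B + B * doc_len / index["avg_len"])
            score += idf * tf[tok] * (K1 + 1) / norm
        scores[flow_id] = score
    return scores


FLOWS = [
    {"id": "refund_request", "title": "Process Refund Request",
     "description": "Handle customer refund requests for orders and payments",
     "tags": ["refund", "payment", "billing", "return"]},
    # Hypothetical second flow, added only so there is something to rank against.
    {"id": "track_shipment", "title": "Track Shipment",
     "description": "Check delivery status for orders",
     "tags": ["shipping", "delivery"]},
]

index = build_index(FLOWS)
scores = bm25_scores(index, "I need a refund")
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

Because "refund" appears in the title, description, and tags of the first flow, it dominates the ranking for this query, while the shipping flow shares no query tokens and scores zero.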

Build the index from a bundle with the CLI:

greentic-fast2flow bundle index \
  --bundle ./my-bundle \
  --output ./state/indexes \
  --tenant demo \
  --team default \
  --verbose

This produces:

  • index.json - TF-IDF index with term frequencies and document frequencies
  • intents.md - human-readable intent documentation

greentic-fast2flow bundle validate --bundle ./my-bundle

Policies control routing behavior at runtime without code changes. They are JSON files loaded from /mnt/registry/fast2flow-policy.json or from a custom path.

fast2flow-policy.json
{
  "stage_order": ["scope", "channel", "provider"],
  "default": {
    "min_confidence": 0.5,
    "llm_min_confidence": 0.5,
    "candidate_limit": 20
  },
  "scope_overrides": [],
  "channel_overrides": [],
  "provider_overrides": []
}

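All rule fields are optional; only fields that are explicitly set take effect:

| Field | Type | Description |
| --- | --- | --- |
| min_confidence | f32 | Minimum BM25 score required to trigger a dispatch (0.0–1.0) |
| llm_min_confidence | f32 | Minimum LLM confidence required to trigger a dispatch (0.0–1.0) |
| candidate_limit | usize | Maximum number of candidates evaluated |
| allow_channels | string[] | Channel allowlist (null = allow all) |
| deny_channels | string[] | Channel denylist |
| allow_providers | string[] | Provider allowlist (null = allow all) |
| deny_providers | string[] | Provider denylist |
| allow_scopes | string[] | Scope allowlist (null = allow all) |
| deny_scopes | string[] | Scope denylist |
| respond_rules | object[] | Auto-respond rules (keyword matching) |

Overrides are applied in stage order (scope → channel → provider) and sorted by priority within each stage.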

Scope override - stricter confidence for a VIP tenant:

{
  "id": "vip-tenant",
  "priority": 10,
  "scope": "tenant-vip",
  "rules": {
    "min_confidence": 0.8,
    "candidate_limit": 10
  }
}

Channel override - auto-respond on the email channel:

{
  "id": "email-autorespond",
  "priority": 20,
  "channel": "email",
  "rules": {
    "respond_rules": [
      {
        "needle": "refund",
        "message": "Refund requests via email take 3–5 business days. Use chat for instant support.",
        "mode": "contains"
      }
    ]
  }
}

Provider override - restrict to specific providers:

{
  "id": "slack-only",
  "priority": 30,
  "provider": "slack",
  "rules": {
    "deny_providers": ["telegram"]
  }
}
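The way defaults and overrides combine can be sketched as a staged merge. This is an illustration under assumptions, not the actual resolver: `resolve_rules` is a hypothetical helper, an override is assumed to match when its stage key equals the same key on the request, and lower priority numbers are assumed to apply first so that higher numbers win. The source does not state the direction of the priority ordering.

```python
STAGE_KEYS = {
    "scope": "scope_overrides",
    "channel": "channel_overrides",
    "provider": "provider_overrides",
}


def resolve_rules(policy, request):
    """Merge default rules with matching overrides, stage by stage."""
    rules = dict(policy["default"])
    for stage in policy["stage_order"]:
        value = request.get(stage)
        overrides = policy.get(STAGE_KEYS[stage], [])
        matching = [o for o in overrides
                    if value is not None and o.get(stage) == value]
        # Assumption: lower priority numbers apply first, higher numbers win.
        for override in sorted(matching, key=lambda o: o["priority"]):
            rules.update(override["rules"])  # only specified fields change
    return rules


POLICY = {
    "stage_order": ["scope", "channel", "provider"],
    "default": {"min_confidence": 0.5, "llm_min_confidence": 0.5,
                "candidate_limit": 20},
    "scope_overrides": [
        {"id": "vip-tenant", "priority": 10, "scope": "tenant-vip",
         "rules": {"min_confidence": 0.8, "candidate_limit": 10}},
    ],
    "channel_overrides": [],
    "provider_overrides": [
        {"id": "slack-only", "priority": 30, "provider": "slack",
         "rules": {"deny_providers": ["telegram"]}},
    ],
}

vip_rules = resolve_rules(
    POLICY, {"scope": "tenant-vip", "channel": "chat", "provider": "slack"})
plain_rules = resolve_rules(
    POLICY, {"scope": "tenant-a", "channel": "chat", "provider": "teams"})
print(vip_rules)
```

Since each override only updates the fields it names, `llm_min_confidence` keeps its default value for the VIP tenant even though two overrides apply.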

Auto-respond rules match the message text before the routing pipeline runs:

{
  "needle": "business hours",
  "message": "Our business hours are Mon–Fri 9AM–5PM UTC.",
  "mode": "contains"
}

Supported modes: exact, contains (default), regex
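The three modes could be evaluated roughly as follows. This is a sketch with hypothetical helper names (`rule_matches`, `auto_respond`); case-insensitive comparison for exact and contains, and first-match-wins ordering, are assumptions that the source does not confirm.

```python
import re


def rule_matches(rule, text):
    """Check one respond rule against the message text."""
    mode = rule.get("mode", "contains")  # contains is the documented default
    needle = rule["needle"]
    if mode == "exact":
        # Assumption: comparison ignores case and surrounding whitespace.
        return text.strip().lower() == needle.lower()
    if mode == "contains":
        return needle.lower() in text.lower()
    if mode == "regex":
        return re.search(needle, text) is not None
    raise ValueError(f"unsupported mode: {mode!r}")


def auto_respond(rules, text):
    """Return a respond directive for the first matching rule, else None."""
    for rule in rules:
        if rule_matches(rule, text):
            return {"type": "respond", "message": rule["message"]}
    return None


RULES = [
    {"needle": "business hours",
     "message": "Our business hours are Mon–Fri 9AM–5PM UTC.",
     "mode": "contains"},
    # Hypothetical regex rule, added only to exercise the regex mode.
    {"needle": r"^order\s+#\d+$",
     "message": "Looking up your order.",
     "mode": "regex"},
]

hours_reply = auto_respond(RULES, "What are your Business Hours?")
order_reply = auto_respond(RULES, "order #123")
no_reply = auto_respond(RULES, "refund please")
```

Returning None when nothing matches lets the caller continue into the regular routing pipeline.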

# Print default policy
greentic-fast2flow policy print-default
# Validate a policy file
greentic-fast2flow policy validate --file ./my-policy.json

When the deterministic BM25 strategy yields a low confidence score, fast2flow can fall back to an LLM for classification.
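The interplay between the confidence gate and the fallback can be sketched as a small decision function. This is a simplified illustration: `decide` is a hypothetical name, the LLM call is represented by an injected callable, and the "LLM fallback" reason string is made up for the example.

```python
def decide(bm25_match, rules, llm_classify=None):
    """Pick a directive from a BM25 match and an optional LLM fallback.

    bm25_match: (target, confidence) tuple, or None when nothing matched.
    llm_classify: optional zero-argument callable returning the same shape.
    """
    if bm25_match is not None:
        target, confidence = bm25_match
        if confidence >= rules["min_confidence"]:
            return {"type": "dispatch", "target": target,
                    "confidence": confidence, "reason": "BM25 match"}
    if llm_classify is not None:
        llm_match = llm_classify()
        if llm_match is not None:
            target, confidence = llm_match
            if confidence >= rules["llm_min_confidence"]:
                # Hypothetical reason string for illustration.
                return {"type": "dispatch", "target": target,
                        "confidence": confidence, "reason": "LLM fallback"}
    return {"type": "continue"}  # neither strategy was confident enough


RULES = {"min_confidence": 0.5, "llm_min_confidence": 0.5}

confident = decide(("support-pack:refund_request", 0.92), RULES)
unsure = decide(("support-pack:refund_request", 0.3), RULES)
rescued = decide(("support-pack:refund_request", 0.3), RULES,
                 llm_classify=lambda: ("support-pack:refund_request", 0.7))
```

The LLM callable is only invoked when the deterministic result fails the gate, which keeps the slow path off the common case.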

| Provider | Environment variables |
| --- | --- |
| OpenAI | FAST2FLOW_OPENAI_API_KEY_PATH, FAST2FLOW_OPENAI_MODEL_PATH |
| Ollama | FAST2FLOW_OLLAMA_ENDPOINT_PATH, FAST2FLOW_OLLAMA_MODEL_PATH |
| Disabled | FAST2FLOW_LLM_PROVIDER=disabled (default) |

# Enable OpenAI fallback
FAST2FLOW_LLM_PROVIDER=openai \
FAST2FLOW_OPENAI_API_KEY_PATH=/run/secrets/openai-key \
greentic-fast2flow-routing-host < request.json

# Build TF-IDF index from bundle
greentic-fast2flow bundle index \
  --bundle ./my-bundle \
  --output ./indexes \
  --tenant demo \
  --team default \
  --generate-docs \
  --verbose
# Validate bundle has indexable flows
greentic-fast2flow bundle validate --bundle ./my-bundle

# Build index from flow definitions JSON
greentic-fast2flow index build \
  --scope tenant-a \
  --flows flows.json \
  --output /tmp/indexes
# Inspect a built index
greentic-fast2flow index inspect \
  --scope tenant-a \
  --input /tmp/indexes

# Simulate a routing decision
greentic-fast2flow route simulate \
  --scope tenant-a \
  --text "I need a refund" \
  --indexes-path /tmp/indexes

# Print default policy template
greentic-fast2flow policy print-default
# Validate policy file
greentic-fast2flow policy validate --file policy.json

| Variable | Default | Description |
| --- | --- | --- |
| FAST2FLOW_LLM_PROVIDER | disabled | LLM provider: disabled, openai, or ollama |
| FAST2FLOW_POLICY_PATH | /mnt/registry/fast2flow-policy.json | Path to the policy file |
| FAST2FLOW_TRACE_POLICY | | Set to 1 to write a policy trace to stderr |
| FAST2FLOW_MIN_CONFIDENCE | 0.5 | Default minimum confidence threshold |
| FAST2FLOW_LLM_MIN_CONFIDENCE | 0.5 | Default minimum LLM confidence |
| FAST2FLOW_CANDIDATE_LIMIT | 20 | Default maximum number of candidates |
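A host application that wants to mirror these settings could read them as shown below. This is a sketch only: `load_settings` and the keys of the returned dictionary are hypothetical, and rejecting unknown provider values is an assumption about desirable behavior rather than something fast2flow is documented to do.

```python
import os

VALID_PROVIDERS = ("disabled", "openai", "ollama")


def load_settings(env=None):
    """Read fast2flow-style environment variables with the listed defaults."""
    env = os.environ if env is None else env
    provider = env.get("FAST2FLOW_LLM_PROVIDER", "disabled")
    if provider not in VALID_PROVIDERS:
        raise ValueError(f"unsupported LLM provider: {provider!r}")
    return {
        "llm_provider": provider,
        "policy_path": env.get("FAST2FLOW_POLICY_PATH",
                               "/mnt/registry/fast2flow-policy.json"),
        "trace_policy": env.get("FAST2FLOW_TRACE_POLICY") == "1",
        "min_confidence": float(env.get("FAST2FLOW_MIN_CONFIDENCE", "0.5")),
        "llm_min_confidence": float(
            env.get("FAST2FLOW_LLM_MIN_CONFIDENCE", "0.5")),
        "candidate_limit": int(env.get("FAST2FLOW_CANDIDATE_LIMIT", "20")),
    }


defaults = load_settings({})
custom = load_settings({
    "FAST2FLOW_LLM_PROVIDER": "openai",
    "FAST2FLOW_TRACE_POLICY": "1",
    "FAST2FLOW_MIN_CONFIDENCE": "0.8",
})
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests without touching the process environment.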

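fast2flow is optimized for low-latency routing:

| Stage | Typical latency |
| --- | --- |
| Hook filter (allow/deny) | < 0.1 ms |
| BM25 index lookup | < 1 ms |
| Policy resolution | < 0.1 ms |
| LLM fallback (if enabled) | 200–500 ms |

  1. Write descriptive titles - title tokens get a 2x TF-IDF weight, which improves scoring
  2. Use specific tags - tags are the primary signal for BM25 matching
  3. Set appropriate thresholds - start with min_confidence: 0.5 and tune from there
  4. Use policies for overrides - change behavior per scope/channel/provider without redeploying
  5. Monitor the Continue ratio - a high share of Continue outputs indicates gaps in your flow coverage
  6. Keep the LLM as a fallback - deterministic routing is faster and more predictable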
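Monitoring the Continue ratio can be as simple as counting directive types over a window of decisions. The sketch below assumes the directives have already been collected as dictionaries somewhere (how they are logged is not covered here), and `continue_ratio` is a hypothetical helper.

```python
from collections import Counter


def continue_ratio(directives):
    """Return the share of directives whose type is continue."""
    counts = Counter(d["type"] for d in directives)
    total = sum(counts.values())
    return counts["continue"] / total if total else 0.0


recent = [
    {"type": "dispatch", "target": "support-pack:refund_request",
     "confidence": 0.92, "reason": "BM25 match"},
    {"type": "continue"},
    {"type": "respond", "message": "Use the self-service refund form at /refund."},
    {"type": "deny", "reason": "Denied by scope policy"},
]

ratio = continue_ratio(recent)
print(f"continue ratio: {ratio:.2f}")
```

A ratio that climbs over time suggests that users are asking for things no indexed flow describes, which is a cue to add flows or improve titles and tags.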