Phone API Dispatch with Microsoft Call Center AI: AI Agents That Place Outbound Calls

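In call-center scenarios, an AI agent often needs to place phone calls itself so that it can invoke tools in real time and deliver interactive service. Microsoft's open-source Call Center AI project exposes a compact telephony API: an AI agent can start an outbound call with a single POST request, or the bot can be called directly from the configured phone number. The solution builds on Azure Communication Services and OpenAI GPT-4o-mini/nano models, and provides streaming voice conversation, retrieval-augmented generation (RAG), and resumption after a dropped call. It targets low- to medium-complexity scenarios such as insurance, IT support, and customer service.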

API Reference and Call Example

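The core endpoint is /call, which takes a JSON payload to start a call. The key parameters are:

bot_company: company name, such as "Contoso", injected into the conversation context.
bot_name: bot name, such as "Amélie", which makes the conversation sound more natural.
phone_number: the target user's number in E.164 format, such as "+11234567890".
task: description of the call's task, for example "Help the customer with their digital workplace. Assistant is working for the IT support department.", which guides the LLM's behavior.
agent_phone_number: the agent's number, used when transferring the call to a human.
claim: optional array describing the data schema; it supports the text, datetime, email, and phone_number types and is used to collect structured information, for example [{"name": "hardware_info", "type": "text"}].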

Example curl request:

curl --header 'Content-Type: application/json' --request POST --url https://your-domain/call --data '{
  "bot_company": "Contoso",
  "bot_name": "Amélie",
  "phone_number": "+11234567890",
  "task": "Help the customer...",
  "agent_phone_number": "+33612345678",
  "claim": [...]
}'
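
The same request can also be prepared from Python. The following is a minimal sketch using only the standard library, not the project's own client: the helper names build_call_payload and post_call are hypothetical, the E.164 check is a simple pattern (a leading "+" and at most 15 digits), and nothing is assumed about the response body beyond returning the HTTP status and the raw bytes.

```python
import json
import re
import urllib.request

# E.164: "+" followed by 2 to 15 digits, the first one non-zero.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")
# Claim field types listed in the article.
CLAIM_TYPES = {"text", "datetime", "email", "phone_number"}


def build_call_payload(phone_number, task, bot_company="Contoso",
                       bot_name="Amélie", agent_phone_number=None, claim=None):
    """Validate the inputs and return the JSON-serializable body for POST /call."""
    if not E164_PATTERN.fullmatch(phone_number):
        raise ValueError(f"phone_number is not E.164: {phone_number!r}")
    if agent_phone_number is not None and not E164_PATTERN.fullmatch(agent_phone_number):
        raise ValueError(f"agent_phone_number is not E.164: {agent_phone_number!r}")
    for field in claim or []:
        if field.get("type") not in CLAIM_TYPES:
            raise ValueError(f"unsupported claim type in field: {field!r}")
    payload = {
        "bot_company": bot_company,
        "bot_name": bot_name,
        "phone_number": phone_number,
        "task": task,
    }
    # Optional keys are only sent when a value was provided.
    if agent_phone_number is not None:
        payload["agent_phone_number"] = agent_phone_number
    if claim:
        payload["claim"] = claim
    return payload


def post_call(base_url, payload, timeout=10.0):
    """POST the payload to {base_url}/call; return (HTTP status, raw body bytes)."""
    request = urllib.request.Request(
        url=f"{base_url.rstrip('/')}/call",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read()
```

A typical use would be post_call("https://your-domain", build_call_payload("+11234567890", "Help the customer...")). Validating before sending keeps malformed numbers from ever reaching the telephony layer.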

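Responses arrive as real-time streamed events, including conversation history, claim updates, and reminders. The system handles speech-to-text (STT), LLM inference, and text-to-speech (TTS) automatically, and pushes events over WebSocket or Event Grid. A dropped call can be resumed automatically, which preserves session continuity.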

"Integrates inbound and outbound calls with a dedicated phone number, supports multiple languages and voice tones, and allows users to provide or receive information via SMS. Conversations are streamed in real-time to avoid delays, can be resumed after disconnections." This makes the API a good fit for tool-calling scenarios, such as triggering a phone call dynamically from within an LLM conversation.

Implementing Real-Time Telephony Tool-Calling

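In an AI agent (for example LangChain or a custom LLM loop), this API can be wrapped as a tool: when the agent detects an intent such as "call user +123", it calls /call with a dynamically generated task and claim. The bot uses GPT-4o-mini to process the real-time STT input, supports multiple languages (fr-FR, zh-CN, and others), and retrieves internal documents through RAG (Azure AI Search).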

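The key tool-calling flow:

Parse the user intent → extract phone_number and task.
POST /call → Azure Communication Services dials the number.
Listen to the streamed events: bot response → STT → LLM (with claim context) → TTS → playback.
Tool injection: the LLM can call internal tools to generate a to-do list or update the claim.
End of session: generate a synthesis (long and short summaries, satisfaction), reminders, and a report (/report/{phone_number}).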
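
The first two steps of this flow can be sketched in Python as follows. This is an illustrative outline only: the trigger phrase pattern, the default task text, and the function names parse_call_intent and handle_agent_output are assumptions made for the example, and the actual call placement is injected as a callable so the sketch does not depend on any particular HTTP client or agent framework.

```python
import re
from typing import Callable, Optional

# Hypothetical trigger phrase: "call user <E.164 number> [to|about|: <task>]".
CALL_INTENT = re.compile(
    r"call user\s+(?P<phone>\+[1-9]\d{1,14})(?:\s*(?:to|about|:)\s*(?P<task>.+))?",
    re.IGNORECASE,
)
# Fallback task used when the utterance names a number but no task (assumption).
DEFAULT_TASK = "Help the customer with their request."


def parse_call_intent(utterance: str) -> Optional[dict]:
    """Return {"phone_number", "task"} if the utterance asks for a call, else None."""
    match = CALL_INTENT.search(utterance)
    if match is None:
        return None
    task = (match.group("task") or DEFAULT_TASK).strip()
    return {"phone_number": match.group("phone"), "task": task}


def handle_agent_output(utterance: str, place_call: Callable[[dict], object]):
    """Dispatch to place_call (for example a POST /call wrapper) when an intent is found."""
    intent = parse_call_intent(utterance)
    if intent is None:
        return None  # No call intent: the agent loop carries on as usual.
    return place_call(intent)
```

Injecting place_call keeps the intent parsing testable without a phone line; in a real agent the callable would merge the intent with bot_company, bot_name, and claim before posting to /call.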

A custom voice can be provided through Azure Custom Neural Voice for brand consistency; moderation uses Azure Content Safety to filter harmful content.

Engineering Parameters and Threshold Configuration

For stability and low latency, tune the feature flags in App Configuration carefully (they are refreshed with a 60 s TTL):

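Parameter	Default	Recommended range	Purpose
`answer_hard_timeout_sec`	15	10-20	Hard timeout for the LLM answer; prevents the call from hanging.
`answer_soft_timeout_sec`	4	3	Soft timeout; sends a "please wait" prompt.
`phone_silence_timeout_sec`	20	15-25	Silence duration before a warning is played.
`vad_silence_timeout_ms`	500	400-600	Silence threshold for voice activity detection (VAD).
`vad_threshold`	0.5	0.4-0.6	VAD sensitivity; helps reject echo.
`recognition_retry_max`	3	2-4	Maximum number of STT retries.
`callback_timeout_hour`	3	1-4	Callback timeout.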
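
One way to keep overrides within these ranges is to check them before pushing them to App Configuration. The sketch below copies the defaults and recommended ranges from the table; the function name check_flags is hypothetical, and the extra rule that the soft timeout must stay below the hard timeout is an added sanity check, not something the project enforces.

```python
# name: (default, recommended_min, recommended_max), copied from the table above.
FEATURE_FLAGS = {
    "answer_hard_timeout_sec": (15, 10, 20),
    "answer_soft_timeout_sec": (4, 3, 3),
    "phone_silence_timeout_sec": (20, 15, 25),
    "vad_silence_timeout_ms": (500, 400, 600),
    "vad_threshold": (0.5, 0.4, 0.6),
    "recognition_retry_max": (3, 2, 4),
    "callback_timeout_hour": (3, 1, 4),
}


def check_flags(overrides):
    """Return warnings for unknown flags or values outside the recommended range."""
    warnings = []
    for name, value in overrides.items():
        if name not in FEATURE_FLAGS:
            warnings.append(f"{name}: unknown flag")
            continue
        _default, low, high = FEATURE_FLAGS[name]
        if not low <= value <= high:
            warnings.append(f"{name}={value} outside recommended range {low}-{high}")
    # Added sanity check (assumption): the wait prompt must fire before the abort.
    soft = overrides.get("answer_soft_timeout_sec", FEATURE_FLAGS["answer_soft_timeout_sec"][0])
    hard = overrides.get("answer_hard_timeout_sec", FEATURE_FLAGS["answer_hard_timeout_sec"][0])
    if soft >= hard:
        warnings.append("answer_soft_timeout_sec should be lower than answer_hard_timeout_sec")
    return warnings
```

Only overridden values are checked, so an empty override set produces no warnings even where a default sits outside the recommended range.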

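LLM configuration: prefer gpt-4o-mini (low cost, fast) and fall back to gpt-4o; RAG uses text-embedding-3-large configured for 1536-dimensional vectors (the model's native output is 3072 dimensions). Setting recording_enabled=true requires a pre-created Storage container. Metrics to monitor (Application Insights):

call.answer.latency: target < 2 s.
call.aec.droped/missed: echo-cancellation failure rate < 5%.
LLM token usage and latency distribution.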

Rollback strategy: if latency exceeds 5 s, switch to the nano model; sample the logs (OpenLLMetry) to keep costs under control.
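
The rollback rule can be expressed as a small decision function. This is a sketch under stated assumptions: the text does not say which latency statistic triggers the switch, so the example uses the 95th percentile (nearest-rank) of recent call.answer.latency samples, and the deployment names "primary-deployment" and "nano-deployment" are placeholders.

```python
import math

LATENCY_ROLLBACK_SEC = 5.0  # Rollback threshold from the text above.


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def choose_deployment(latencies_sec, primary="primary-deployment",
                      fallback="nano-deployment"):
    """Pick the fallback deployment when p95 latency exceeds the threshold."""
    if not latencies_sec:
        return primary  # No data yet: stay on the primary model.
    if percentile(latencies_sec, 95) > LATENCY_ROLLBACK_SEC:
        return fallback
    return primary
```

Using a high percentile rather than the mean makes the rule react to slow tail calls, which are the ones callers notice; a mean or a single-sample trigger would be equally valid readings of the rule.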

Deployment Checklist and Cost Optimization

Deployment steps (Azure serverless):

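Create a resource group, a Communication Services resource (with a system-assigned managed identity), and a phone number (voice + SMS).
Configure config.yaml: image_version (for example "0.1.0"), the LLM endpoint, and the language/voice.
Run make deploy name=your-rg → provisions Container Apps, Cosmos DB, Redis, and related resources.
Local development: make tunnel plus uv run local.py to test without a phone.
Production: multi-region, multiple replicas (2 vCPU / 2 GB), vNET private endpoints.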

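Cost estimate (1,000 calls × 10 min per month, USD): about $720 for the core services (ACS $40, OpenAI $58, Container Apps $160, and others), plus an optional $322 for Monitor. Optimizations:

PTUs (Provisioned Throughput Units) can cut LLM latency by about 50%.
Sample the logs, and upgrade the Search SKU only for large datasets.
Fine-tune the model on historical call data (after anonymization).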
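
For budgeting, the monthly estimate above can be converted into unit costs. The short calculation below uses only the figures quoted in the estimate (1,000 calls of 10 minutes, $720 core, $322 optional monitoring); the function name unit_costs is hypothetical.

```python
CALLS_PER_MONTH = 1000
MINUTES_PER_CALL = 10
CORE_COST_USD = 720        # Core services, from the estimate above.
MONITORING_COST_USD = 322  # Optional Monitor cost, from the estimate above.


def unit_costs(monthly_cost_usd, calls=CALLS_PER_MONTH,
               minutes_per_call=MINUTES_PER_CALL):
    """Return (cost per call, cost per minute) in USD for a monthly total."""
    per_call = monthly_cost_usd / calls
    per_minute = per_call / minutes_per_call
    return round(per_call, 3), round(per_minute, 4)


core_only = unit_costs(CORE_COST_USD)
with_monitoring = unit_costs(CORE_COST_USD + MONITORING_COST_USD)
```

With these inputs the core services come to $0.72 per call, or about $0.072 per minute; including monitoring raises this to roughly $1.04 per call.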

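Risks: the project is at the proof-of-concept stage and still needs tests, IaC, and a security audit. Implement multi-region deployment and GitOps before going to production.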

Sources:

Microsoft Call Center AI GitHub repository
Azure documentation (Communication Services, OpenAI, Speech).
