API-Driven AI Phone Agents: Engineering Outbound Calls with Call-Center-AI

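In call-center scenarios, letting an AI agent dial customers directly through an API speeds up automated responses and removes the manual dispatching bottleneck. With this API-driven approach to outbound calling, a developer needs only a single HTTP POST to configure the bot's identity, task objective, and data schema, and a real-time voice conversation starts right away. The conversation can cover complex interactions such as information gathering, problem diagnosis, and generating follow-up reminders.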

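The core implementation relies on Azure Communication Services to handle inbound and outbound calls, combined with real-time STT (Speech-to-Text) and TTS (Text-to-Speech) from Cognitive Services, with an OpenAI GPT-4o-mini or GPT-4o model driving the dialogue logic. The system runs as a serverless Container App and streams end to end: the user's speech is transcribed in real time and fed to the LLM, and the bot's reply is synthesized into speech and sent back immediately, which avoids the latency of traditional batch processing. The repo's tagline is "Send a phone call from AI agent, in an API call", which keeps the integration barrier low and suits scenarios such as insurance claims and IT support.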

The API call that actually places the call is shown below: a curl POST to the /call endpoint with a JSON payload:

data='{
  "bot_company": "Contoso",
  "bot_name": "Amélie",
  "phone_number": "+11234567890",
  "task": "Help the customer with their digital workplace. Assistant is working for the IT support department. The objective is to help the customer with their issue and gather information.",
  "agent_phone_number": "+33612345678",
  "claim": [
    {"name": "hardware_info", "type": "text"},
    {"name": "first_seen", "type": "datetime"},
    {"name": "building_location", "type": "text"}
  ]
}'
curl --header 'Content-Type: application/json' --request POST --url https://your-domain/call --data "$data"

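Key parameters:

bot_company and bot_name: define the agent's identity. They are injected into the prompt templates to make the conversation sound more natural.
phone_number: the number to call, in E.164 format.
task: a natural-language description of the task (written in English in the example above) that guides the LLM's behavior, for example "collect claim information and create to-dos". It covers the objective of the whole call.
agent_phone_number: optional; the fallback number used when transferring the call to a human agent.
claim: an array defining the data schema. Each item has a name (the field name), a type (text, datetime, email, or phone_number), and an optional description. The bot validates and fills in these fields to produce structured output.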

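This schema-driven collection mechanism is the highlight: the LLM does not just chat, it is also directed to extract the specified fields, which reduces the chance of missing information. In the IT support scenario, for example, it collects hardware_info and building_location, which are then stored in Cosmos DB for easy CRM integration.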
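The payload rules described above can be checked on the client side before any call is placed. The following is a minimal sketch, not part of the project: the helper names build_call_payload and build_call_request, the E.164 regular expression, and the decision to reject unknown claim types locally are all assumptions made for illustration. Only the field names, the four claim types, and the /call endpoint come from the text above.

```python
import json
import re
import urllib.request

# Assumption: a simple E.164 check ("+" followed by up to 15 digits, no leading zero).
E164 = re.compile(r"^\+[1-9]\d{1,14}$")
# Claim types listed in the article.
CLAIM_TYPES = {"text", "datetime", "email", "phone_number"}


def build_call_payload(bot_company, bot_name, phone_number, task, claim,
                       agent_phone_number=None):
    """Validate the inputs and return the JSON-serializable /call payload."""
    if not E164.match(phone_number):
        raise ValueError(f"phone_number is not E.164: {phone_number!r}")
    if agent_phone_number is not None and not E164.match(agent_phone_number):
        raise ValueError(f"agent_phone_number is not E.164: {agent_phone_number!r}")
    for item in claim:
        if not item.get("name"):
            raise ValueError("every claim field needs a name")
        if item.get("type") not in CLAIM_TYPES:
            raise ValueError(f"unsupported claim type: {item.get('type')!r}")
    payload = {
        "bot_company": bot_company,
        "bot_name": bot_name,
        "phone_number": phone_number,
        "task": task,
        "claim": claim,
    }
    if agent_phone_number is not None:  # optional human fallback number
        payload["agent_phone_number"] = agent_phone_number
    return payload


def build_call_request(base_url, payload):
    """Build (but do not send) the HTTP POST request for the /call endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/call",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send the request, pass the result of build_call_request to urllib.request.urlopen. Keeping construction and sending separate makes the validation testable without network access.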

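Real-time interaction parameters are tuned through feature flags in App Configuration, which supports hot updates (TTL 60s):

answer_hard_timeout_sec=15: hard timeout when the LLM does not respond; an error message is sent and the request is retried.
answer_soft_timeout_sec=4: soft timeout; a waiting sound is played (such as music plus a beep).
phone_silence_timeout_sec=20: threshold for user silence, after which the bot prompts the user.
vad_threshold=0.5 (range 0.1 to 1): voice activity detection sensitivity. Related settings: vad_silence_timeout_ms=500 and vad_cutoff_timeout_ms=250.
recognition_retry_max=3: maximum number of retries when STT fails. Related setting: recognition_stt_complete_timeout_ms=100.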

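These parameters directly affect the user experience: timeouts that are too low tend to cut the conversation off, while timeouts that are too high add latency. A reasonable approach is to start from the defaults and iterate using the Application Insights metrics call.answer.latency (the end-to-end delay between the user finishing speaking and the bot responding) and call.aec.droped (frames dropped by echo cancellation). With recording_enabled=true, recordings are stored in Azure Storage, which is convenient for QA.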
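To make the interplay of the soft and hard answer timeouts concrete, here is a minimal asyncio sketch. It is not the project's code: the function answer_with_timeouts and its callback on_soft_timeout are hypothetical names, and returning None on a hard timeout is an assumption standing in for "send an error message and retry". Only the two timeout values (4 s and 15 s) come from the parameter list above.

```python
import asyncio


async def answer_with_timeouts(answer, on_soft_timeout,
                               soft_timeout_sec=4.0, hard_timeout_sec=15.0):
    """Await an LLM answer, signalling a soft timeout and enforcing a hard one.

    answer: an awaitable that produces the LLM reply.
    on_soft_timeout: async callback, e.g. one that plays a waiting sound.
    Returns the reply, or None if the hard timeout is reached.
    """
    task = asyncio.ensure_future(answer)
    try:
        # shield() keeps the underlying task alive when the soft timeout fires.
        return await asyncio.wait_for(asyncio.shield(task), timeout=soft_timeout_sec)
    except asyncio.TimeoutError:
        await on_soft_timeout()
    try:
        # Wait for the remaining budget; wait_for cancels the task on timeout.
        return await asyncio.wait_for(task, timeout=hard_timeout_sec - soft_timeout_sec)
    except asyncio.TimeoutError:
        return None  # caller would send an error prompt and retry
```

The design point is that the soft timeout does not cancel the pending answer: the caller only fills the silence, while the hard timeout is the one that gives up.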

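State management is at the core of the engineering work. Every turn is stored in Cosmos DB, including messages (with persona, timestamp, and action), claim (the filled-in values), next (an action such as case_closed plus a justification), and reminders (with due_date_time and owner). After the call, GET /report/{phone_number} returns a JSON report or an HTML view, including a synthesis (short and long summaries, satisfaction). Interrupted calls can be resumed: the conversation history is restored from the Redis cache and the database, so the conversation picks up where it left off.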
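The per-call state described above can be pictured with a few dataclasses. This is an illustrative sketch only, not the project's actual models: the class names, the content and title fields, the missing_claim_fields helper, and the choice to percent-encode the phone number in the report URL are assumptions. The field names messages, claim, next, reminders, persona, timestamp, action, justification, due_date_time, and owner come from the text.

```python
from dataclasses import dataclass, field
from typing import Optional
from urllib.parse import quote


@dataclass
class Message:
    persona: str          # who spoke, e.g. "human" or "assistant" (assumed values)
    content: str          # assumed field holding the message text
    timestamp: str
    action: str = "talk"  # assumed default


@dataclass
class Reminder:
    title: str            # assumed field
    due_date_time: str
    owner: str


@dataclass
class NextAction:
    action: str           # e.g. "case_closed"
    justification: str


@dataclass
class CallState:
    phone_number: str
    messages: list = field(default_factory=list)   # list of Message
    claim: dict = field(default_factory=dict)      # field name -> collected value
    next: Optional[NextAction] = None
    reminders: list = field(default_factory=list)  # list of Reminder

    def missing_claim_fields(self, schema):
        """Return the schema field names that have no collected value yet."""
        return [f["name"] for f in schema if not self.claim.get(f["name"])]


def report_url(base_url, phone_number):
    """Build the report URL, percent-encoding the leading '+' as %2B."""
    return f"{base_url.rstrip('/')}/report/{quote(phone_number, safe='')}"
```

A helper like missing_claim_fields shows why the schema matters for resuming a call: after the history is restored, the remaining fields tell the bot what it still has to ask.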

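Deployment checklist (Azure prerequisites: Communication Services plus a phone number):

Create a resource group (e.g. cc-ai-rg) and a Communication Services resource (same name, with a system-managed identity).
Buy a phone number (inbound/outbound, with voice and SMS).
Configure config.yaml: fill in the endpoints and keys (OpenAI, Speech, Search for RAG) and image_version (e.g. main).
Run make deploy name=cc-ai-rg, which deploys the Container App, Cosmos DB, Redis, and other resources through Bicep IaC.
Local development: run make deploy-bicep deploy-post, expose the service publicly with devtunnel, and test with hot reload using uv run local.py.
Monitoring: Application Insights tracks LLM tokens and latency, with OpenLLMetry semantic metrics.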

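Risks and limitations: the project is at the proof-of-concept stage. The estimated cost for 1,000 calls of 10 minutes each is about 720 USD per month (dominated by Cosmos DB RU/s and Speech), and production adds cost for vNET and private endpoints. The latency bottleneck is the LLM's time to first token (TTFT); PTU or gpt-4o-mini is recommended. No LLM framework is used: the OpenAI SDK is called directly, with custom tools and retries. Optimization paths include fine-tuning on historical data, A/B testing prompts, and injecting domain knowledge through RAG (an AI Search index).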
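The cost estimate above can be broken down into unit costs with simple arithmetic. The sketch below only divides the quoted figure; it assumes nothing about how the 720 USD splits across services, and real costs will not scale linearly because some components (such as provisioned Cosmos DB throughput) are billed regardless of call volume.

```python
# Figures quoted in the article.
calls_per_month = 1000
minutes_per_call = 10
monthly_cost_usd = 720.0

total_minutes = calls_per_month * minutes_per_call   # 10,000 minutes
cost_per_minute = monthly_cost_usd / total_minutes   # naive average per minute
cost_per_call = monthly_cost_usd / calls_per_month   # naive average per call

print(f"{cost_per_minute:.3f} USD/min, {cost_per_call:.2f} USD/call")
# prints "0.072 USD/min, 0.72 USD/call"
```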

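With this approach you can quickly stand up a 24/7 AI call center: the API abstracts away the complexity, and parameterization keeps behavior controllable. Custom prompts (such as multiple variants of tts.hello_tpl) and the claim schema adapt it to insurance, IT, and customer service scenarios. Possible future extensions: IVR menus and SMS follow-ups (Twilio-compatible).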

Sources:

Microsoft Call-Center-AI GitHub repo (main facts and examples).
Project demo and architecture documentation.