Deploying AI Phone Agents via API: Serverless Call Center Automation

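In call centers, placing and answering calls by hand consumes a great deal of staff time. Microsoft's open-source call-center-ai project lets an AI agent handle phone conversations through a single API endpoint, covering both outbound and inbound calls. This API-first design favors generality: rather than binding tightly to a specific provider such as Twilio, it relies on Azure Communication Services as the telephony gateway and on OpenAI GPT models for the dialogue logic.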

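The core mechanism is the POST /call endpoint. Its JSON payload includes bot_company (e.g. "Contoso"), bot_name (e.g. "Amélie"), the target phone_number, a task description (e.g. "Help the customer with an IT support issue and collect claim information"), and an optional claim schema (e.g. [{"name":"hardware_info","type":"text"}]). The bot can be reached in two ways: the system assigns a dedicated number, and a user dialing it triggers the bot; or the API places the call itself, which automates outbound calling. For example, a single curl command triggers a call: "curl --header 'Content-Type: application/json' --request POST --url https://your-domain/call --data '{"bot_company":"Contoso","bot_name":"Amélie","phone_number":"+11234567890","task":"..."}'". This design provides 24/7 coverage for low-complexity calls (such as insurance claims or IT support), with multi-language support (fr-FR, zh-CN, and others) and custom voices (Azure Custom Neural Voice).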
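The same request can be issued from Python. The following is a minimal sketch that assumes only the endpoint and field names quoted above; the helper names build_call_payload and trigger_call, the example task text, and the base URL https://your-domain are placeholders chosen for illustration, and the shape of the response is not assumed.

```python
import json
import urllib.request


def build_call_payload(bot_company, bot_name, phone_number, task, claim=None):
    """Assemble the JSON body for POST /call (field names as quoted in the article)."""
    payload = {
        "bot_company": bot_company,
        "bot_name": bot_name,
        "phone_number": phone_number,
        "task": task,
    }
    # The claim schema is optional, so it is only included when provided.
    if claim is not None:
        payload["claim"] = claim
    return payload


def trigger_call(base_url, payload, timeout=10):
    """POST the payload to <base_url>/call and return (status code, raw body)."""
    request = urllib.request.Request(
        url=base_url.rstrip("/") + "/call",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


payload = build_call_payload(
    bot_company="Contoso",
    bot_name="Amélie",
    phone_number="+11234567890",
    task="Help the customer with an IT support issue and collect claim information",
    claim=[{"name": "hardware_info", "type": "text"}],
)
print(json.dumps(payload, ensure_ascii=False, indent=2))

# Uncomment to send the request to a real deployment (placeholder URL):
# status, body = trigger_call("https://your-domain", payload)
```

Keeping payload construction separate from the network call makes it possible to inspect or unit-test the body without a live deployment.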

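Deployment uses a cloud-native serverless architecture that runs primarily on Azure Container Apps, with Bicep IaC automating resource provisioning. The prerequisites are: create a resource group (e.g. ccai-demo), create a Communication Services resource (with system-assigned managed identity enabled), and purchase a phone number that supports voice and SMS. config.yaml then specifies image_version (e.g. ghcr.io/clemlesne/call-center-ai:0.1.0), the Azure OpenAI endpoint (gpt-4o-mini as the fast model), the Cognitive Services key, the Cosmos DB connection string, and so on. Running "make deploy name=my-rg-name" deploys the stack, and "make logs name=my-rg-name" shows the logs. For local development, use "make deploy-bicep deploy-post" together with devtunnel to expose the port; hot reload is supported for fast iteration.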

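Real-time voice interaction depends on streaming and on fault-tolerance parameters. Feature flags are adjusted dynamically through App Configuration, with no restart required: answer_hard_timeout_sec=15 (hard timeout for the LLM, which prevents the call from hanging), answer_soft_timeout_sec=4 (soft timeout, after which a waiting prompt is played), phone_silence_timeout_sec=20 (silence warning), vad_threshold=0.5 (voice activity detection threshold, range 0.1-1), and vad_silence_timeout_ms=500 (silence timeout). On the STT side, recognition_retry_max=3 and recognition_stt_complete_timeout_ms=100 make recognition more robust. Resuming after a dropped call relies on Cosmos DB, which persists the conversation, claim, reminders, and synthesis; the history for a caller can be viewed in the report at /report/{phone_number}. RAG is supported via AI Search (the index schema includes a vectors field of dimension 1536, using ADA embeddings).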
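Before pushing new values to App Configuration, it can help to check them for internal consistency. The sketch below is illustrative only: the flag names and defaults are the ones listed above, while the VoiceFlags class, the validate function, and the specific sanity rules (for example, that the soft timeout should be shorter than the hard timeout) are assumptions of this example, not checks the project is known to enforce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceFlags:
    # Defaults mirror the values quoted in the article.
    answer_hard_timeout_sec: int = 15
    answer_soft_timeout_sec: int = 4
    phone_silence_timeout_sec: int = 20
    vad_threshold: float = 0.5
    vad_silence_timeout_ms: int = 500
    recognition_retry_max: int = 3
    recognition_stt_complete_timeout_ms: int = 100


def validate(flags):
    """Return a list of problems; an empty list means the values look consistent."""
    problems = []
    # Range stated in the article for the VAD threshold.
    if not 0.1 <= flags.vad_threshold <= 1:
        problems.append("vad_threshold must be within 0.1-1")
    # Assumed rule: the waiting prompt should play before the hard cutoff.
    if flags.answer_soft_timeout_sec >= flags.answer_hard_timeout_sec:
        problems.append("answer_soft_timeout_sec should be below answer_hard_timeout_sec")
    # Assumed rule: all durations are strictly positive.
    for name in (
        "answer_hard_timeout_sec",
        "answer_soft_timeout_sec",
        "phone_silence_timeout_sec",
        "vad_silence_timeout_ms",
        "recognition_stt_complete_timeout_ms",
    ):
        if getattr(flags, name) <= 0:
            problems.append(f"{name} must be positive")
    if flags.recognition_retry_max < 0:
        problems.append("recognition_retry_max must not be negative")
    return problems


print(validate(VoiceFlags()))
print(validate(VoiceFlags(answer_soft_timeout_sec=20, vad_threshold=0.05)))
```

A check like this would run in a deployment script or CI step, so that an inconsistent combination is caught before it reaches live calls.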

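Parameter checklist for tuning a deployment:

LLM selection: use gpt-4o-mini as the fast model (low latency, and roughly 10-15x cheaper than gpt-4o); use gpt-4o as the insights model to analyze the synthesis.
RAG index: fields answer/context/question/vectors, so that domain-specific knowledge such as IT terminology or insurance policy details is injected into the conversation.
Claim schema: predefine fields such as caller_email, using the types text/datetime/phone_number, with validation supported (E164 format for phone numbers).
Prompt customization: tts.hello_tpl is a list from which one greeting is picked at random to avoid repetition; llm.system_tpl injects {date}/{phone_number}/{claim} as context.
Monitoring metrics (Application Insights): call.answer.latency (time from the end of the user's speech to the bot's response), call.aec.droped (frames dropped by echo cancellation), plus token usage and LLM latency. Set an alert threshold at latency > 5 s.
Fallback strategy: feature flags such as slow_llm_for_chat=true fall back to the slow model; calls can be transferred to a human agent via Communication Services; callback_timeout_hour=3 triggers an automatic callback.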
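The E164 validation mentioned for phone_number claims can be illustrated with a short check. This is a sketch of the general E.164 shape (a plus sign followed by up to 15 digits, the first being non-zero), not the project's own validator; the names E164_PATTERN and is_e164 are made up for this example.

```python
import re

# E.164 shape: "+" followed by 2 to 15 digits, the first of which is 1-9.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def is_e164(value):
    """Return True when the whole string has the E.164 shape."""
    return E164_PATTERN.fullmatch(value) is not None


for candidate in ("+11234567890", "11234567890", "+1 123 456 7890", "+0123456789"):
    print(candidate, is_e164(candidate))
```

A shape check of this kind only confirms formatting; it does not confirm that the number is assigned or reachable.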

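Cost estimate (1,000 calls of 10 minutes each per month, Sweden Central/West Europe): about $720 for the core services (Communication Services $40, OpenAI $58, Container Apps $160, AI Search $74, Speech $152, Cosmos DB $234), plus an optional $343 for Monitor. Production hardening with vNET and private endpoints adds cost, while elastic scaling lets capacity follow peak and off-peak traffic.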
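The per-call and per-minute cost follows directly from these figures. The arithmetic below uses only the rounded component amounts quoted above; they sum to $718, slightly under the quoted total of about $720, which is presumably a rounding effect. The variable names are illustrative.

```python
# Rounded monthly component costs in USD, as quoted in the article.
core = {
    "Communication Services": 40,
    "OpenAI": 58,
    "Container Apps": 160,
    "AI Search": 74,
    "Speech": 152,
    "Cosmos DB": 234,
}
optional_monitor = 343
calls_per_month = 1000
minutes_per_call = 10

core_total = sum(core.values())
total_with_monitor = core_total + optional_monitor
cost_per_call = core_total / calls_per_month
cost_per_minute = core_total / (calls_per_month * minutes_per_call)

print(f"core total: ${core_total}")
print(f"with Monitor: ${total_with_monitor}")
print(f"per call: ${cost_per_call:.3f}")
print(f"per minute: ${cost_per_minute:.4f}")
```

At this volume the core services work out to roughly 72 cents per 10-minute call, or about 7 cents per minute of conversation.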

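Risk control: the project is a proof of concept, so production use calls for adding unit tests, multi-region Cosmos DB, and CodeQL scanning. For sensitive data, combine RAG with Content Safety to filter jailbreak attempts and harmful content. Fine-tuning on historical calls can improve accuracy (after anonymization, using Azure AI Foundry).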

Sources: https://github.com/microsoft/call-center-ai (README and demo). Related searches confirm recent interest in the project; there is no in-depth HN discussion.
