Memori: A Layered Memory Engine for AI Agents

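Memori is an open-source, SQL-native memory engine designed for LLMs and AI agents. It addresses agent "amnesia" with an explicitly layered memory architecture, providing context retention across sessions, shared memory for multi-agent collaboration, and efficient update and query operations. The layered design mirrors human cognition: short-term memory (STM) holds the immediate working context, long-term memory (LTM) stores persistent knowledge, and a background agent dynamically promotes key information from LTM to STM, which avoids token blow-up and inefficient retrieval.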

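The core idea is that Memori's hierarchical memory does not depend on an expensive vector database. Instead it relies on the full-text search (FTS) and structured table schemas of standard SQL databases such as SQLite and PostgreSQL, for a reported cost saving of 80-90%, while still supporting semantic-level retrieval and entity-relationship mapping. Across multi-turn interactions an AI agent can therefore "remember" user preferences and project details (for example "the user is building a FastAPI project") and have the relevant context injected automatically into later queries, which improves the consistency and quality of its responses.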

Layered Memory Architecture and Agent Collaboration

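Memori's database schema defines the layers explicitly: chat_history stores the raw conversations, short_term_memory is the STM table (with importance_score, frequency_score, recency_score, and expires_at), and long_term_memory is the LTM table (with category, subcategory, retention_type, and reasoning). Two further tables, memory_entities and memory_relationships, support entity extraction and a relationship graph.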
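To make the table layout concrete, here is a minimal SQLite sketch of the three core tables. Only the table names and the score/metadata columns named above come from the article. The key columns, the namespace and content columns, the defaults, and the choice to put the two indexes on long_term_memory are assumptions for illustration, not Memori's actual DDL.

```python
import sqlite3

# Illustrative schema only: column lists beyond those named in the article are assumed.
SCHEMA = """
CREATE TABLE chat_history (
    chat_id    INTEGER PRIMARY KEY,
    namespace  TEXT NOT NULL DEFAULT 'default',
    user_input TEXT NOT NULL,
    ai_output  TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE short_term_memory (
    memory_id        INTEGER PRIMARY KEY,
    namespace        TEXT NOT NULL DEFAULT 'default',
    content          TEXT NOT NULL,
    importance_score REAL NOT NULL DEFAULT 0.5,
    frequency_score  REAL NOT NULL DEFAULT 0.0,
    recency_score    REAL NOT NULL DEFAULT 0.0,
    expires_at       TEXT
);
CREATE TABLE long_term_memory (
    memory_id        INTEGER PRIMARY KEY,
    namespace        TEXT NOT NULL DEFAULT 'default',
    content          TEXT NOT NULL,
    category         TEXT NOT NULL,
    subcategory      TEXT,
    retention_type   TEXT,
    reasoning        TEXT,
    importance_score REAL NOT NULL DEFAULT 0.5,
    created_at       TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_memory_importance ON long_term_memory (importance_score);
CREATE INDEX idx_memory_timestamp  ON long_term_memory (created_at);
"""


def create_schema(conn):
    """Create the illustrative tables and indexes on an open connection."""
    conn.executescript(SCHEMA)


def table_columns(conn, table):
    """Return the column names of a table, in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn = sqlite3.connect(":memory:")
create_schema(conn)
print(table_columns(conn, "short_term_memory"))
```

Because everything is plain SQL, the same statements can be adapted to PostgreSQL by swapping the column types.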

The workflow has three stages:

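Pre-call retrieval (context injection): LLM calls are intercepted through LiteLLM callbacks. In Conscious mode, the Conscious Agent injects the key STM memories once (limited to 3-5 items, about 150 tokens). In Auto mode, the Retrieval Agent runs an intelligent search for each user query (semantic strategy, returning the top 5, about 250 tokens). Combined mode uses both while capping the injected total at about 250 tokens to avoid context rot.
Post-call update (recording): the Memory Agent uses OpenAI Structured Outputs with Pydantic to extract entities, classify each memory (facts, preferences, skills, rules, or context), compute an importance_score, and write the result to LTM. A full-text index (FTS5) keeps lookups fast by turning each query into an index search rather than a table scan.
Background promotion: every 6 hours, the Conscious Agent analyzes patterns in LTM (frequency, recency, importance), promotes the essential memories to STM, and updates the analysis timestamp.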
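The promotion stage can be sketched as a ranking over LTM records. This is a simplified illustration, not Memori's implementation: the MemoryRecord class, the weighted-sum formula, and the weights 0.5/0.3/0.2 are all assumptions. Only the three score names come from the article.

```python
from dataclasses import dataclass


@dataclass
class MemoryRecord:
    """Hypothetical in-memory view of one long-term memory row."""
    content: str
    importance_score: float
    frequency_score: float
    recency_score: float


def promotion_score(m, w_importance=0.5, w_frequency=0.3, w_recency=0.2):
    """Blend the three scores into one ranking value (weights are assumed)."""
    return (w_importance * m.importance_score
            + w_frequency * m.frequency_score
            + w_recency * m.recency_score)


def select_for_promotion(long_term, limit=5):
    """Return the `limit` highest-scoring memories, best first."""
    ranked = sorted(long_term, key=promotion_score, reverse=True)
    return ranked[:limit]


ltm = [
    MemoryRecord("User is building a FastAPI project", 0.9, 0.8, 0.7),
    MemoryRecord("User once mentioned the weather", 0.1, 0.1, 0.2),
    MemoryRecord("User prefers JWT for authentication", 0.8, 0.6, 0.9),
]
promoted = select_for_promotion(ltm, limit=2)
for m in promoted:
    print(m.content)
```

In a real deployment a scheduler would run this selection on the 6-hour cadence and copy the selected rows into short_term_memory.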

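The design is well documented: the architecture diagram shows a closed loop from App to Interceptor to DB to LLM, and the sequence diagrams detail the agent interactions. Compared with flat vector storage, Memori's layering plus indexes (idx_memory_importance, idx_memory_timestamp) is reported to speed up queries by 2-4x, and the data is fully portable (for example via SQLite export).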

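In multi-agent scenarios, Memori isolates users and sessions by namespace (for example "production"). The CrewAI and AutoGen examples demonstrate shared memory: agents in a group chat share what they remember about a user's skills, so nothing has to be explained twice. The Swarms integration supports persistent multi-agent memory.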
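The following sketch shows how namespace isolation and full-text search can work together in plain SQLite. It assumes an FTS5-enabled SQLite build (the default in most Python distributions). The memory_fts table and the search_memories helper are hypothetical names for illustration, not part of Memori's API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# namespace is stored but not tokenized, so it is only used for exact filtering.
conn.execute(
    "CREATE VIRTUAL TABLE memory_fts USING fts5(content, namespace UNINDEXED)"
)
conn.executemany(
    "INSERT INTO memory_fts (content, namespace) VALUES (?, ?)",
    [
        ("User is skilled in Python and FastAPI", "crew_shared"),
        ("User prefers JWT for authentication", "crew_shared"),
        ("User is skilled in Rust", "other_team"),
    ],
)


def search_memories(conn, namespace, query, limit=5):
    """Full-text search restricted to one namespace, best match first."""
    rows = conn.execute(
        "SELECT content FROM memory_fts "
        "WHERE memory_fts MATCH ? AND namespace = ? "
        "ORDER BY rank LIMIT ?",
        (query, namespace, limit),
    )
    return [r[0] for r in rows]


hits = search_memories(conn, "crew_shared", "skilled")
print(hits)
```

Agents configured with the same namespace see the same rows, while a query from another namespace never returns them.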

Practical Parameters and Engineering Checklist

Deploying Memori takes a single call to memori.enable(), but production use needs careful tuning. The core configuration checklist follows:

1. Initialization parameters (Memori constructor)

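from memori import Memori
from memori.core.providers import ProviderConfig  # import path assumed; check your installed version

memori = Memori(
    database_connect="postgresql://user:pass@localhost/memori",  # prefer PostgreSQL with a pool of 20 connections
    conscious_ingest=True,  # enable STM injection, limit=3
    auto_ingest=True,       # enable dynamic retrieval, limit=5, min_relevance=0.7
    openai_api_key="sk-...", # model for the Retrieval/Memory Agents; gpt-4o-mini recommended
    provider_config=ProviderConfig.from_azure(...)  # Azure support
)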

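Threshold tuning: with auto_ingest, keep the Retrieval Agent's relevance_scores threshold between 0.6 and 0.8 to filter out noise; set STM expires_at to 7 days and combine it with recency_score decay (e^(-λt), λ = 0.02).
Namespace: set MEMORI_MEMORY__NAMESPACE="agent_team_1"; agents that should share memory use the same value.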
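As a worked example of the two tuning knobs above, the sketch below computes the exponential recency decay e^(-λt) and applies a relevance threshold. The article does not state the time unit of t, so the code leaves it generic. Both function names and the (text, score) candidate format are assumptions for illustration. With λ = 0.02 the score halves every ln(2)/0.02 ≈ 34.7 time units.

```python
import math


def recency_score(age, decay_rate=0.02):
    """Exponential decay e^(-decay_rate * age); the unit of age is not specified by the source."""
    return math.exp(-decay_rate * age)


def filter_by_relevance(candidates, min_relevance=0.7, limit=5):
    """Keep (text, score) pairs at or above the threshold, highest score first."""
    kept = [(text, score) for text, score in candidates if score >= min_relevance]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:limit]


half_life = math.log(2) / 0.02  # about 34.7 time units
print(round(recency_score(half_life), 3))

candidates = [
    ("User is building a FastAPI project", 0.91),
    ("User mentioned the weather", 0.35),
    ("User prefers JWT for authentication", 0.74),
]
print(filter_by_relevance(candidates))
```

Raising min_relevance toward 0.8 trades recall for less noise in the injected context, and lowering it toward 0.6 does the opposite.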

2. Monitoring and rollback strategy

Metrics: track conversation volume, memory growth (db_size < 1 GB per month), agent success rate (> 95%), and context tokens (< 4k).

Health Check：

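def health_check():
    """Report basic health signals for the memory layer.

    db_manager, short_term_memory, get_stats_24h and memori are placeholders
    for objects your application provides; they are not defined here.
    """
    return {
        "db_conn": db_manager.ping(),                        # is the database reachable?
        "stm_size": len(short_term_memory.query(limit=10)),  # STM rows found, capped at 10
        "ltm_growth": get_stats_24h(),                       # LTM growth over the last 24 hours
        "modes": {"conscious": memori.conscious_ingest},     # which ingest modes are active
    }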

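Fallback: when an agent fails, degrade to a direct FTS search (limit=3); when the database connection drops, reconnect with exponential backoff.
Retention: set retention_policy="30_days" to clean up low-importance LTM automatically.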
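The reconnect strategy mentioned in the fallback item can be sketched as follows. This is a generic pattern, not Memori code: reconnect_with_backoff, its parameters, and the flaky_connect stand-in are all hypothetical. The sleep function is injectable so the example runs instantly.

```python
import time


def reconnect_with_backoff(connect, max_attempts=5, base_delay=0.5,
                           max_delay=30.0, sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(min(max_delay, base_delay * 2 ** attempt))


# Demonstration with a fake connection that fails twice before succeeding.
delays = []
calls = {"n": 0}


def flaky_connect():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("database unavailable")
    return "connected"


result = reconnect_with_backoff(flaky_connect, sleep=delays.append)
print(result, delays)
```

In production, adding random jitter to each delay helps avoid many clients retrying at the same instant.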

3. Multi-agent integration example (CrewAI)

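from crewai import Agent, Task, Crew
from memori import Memori

# One shared namespace so every agent in the crew reads and writes the same memories
memori = Memori(conscious_ingest=True, namespace="crew_shared")
memori.enable()

# llm is omitted so CrewAI falls back to its default model; pass llm=... to override
researcher = Agent(role="Researcher", goal="...", backstory="...", memory=True)  # Memori is used automatically
writer = Agent(role="Writer", goal="...", backstory="...", memory=True)
# The crew shares memory via the namespace
crew = Crew(agents=[researcher, writer], tasks=[...])
result = crew.kickoff()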

Efficient operations: writes go through batched inserts; queries take under 100 ms (FTS plus indexes); background tasks run asynchronously and do not block.

4. Risk limits and optimization

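Performance bottlenecks: beyond 10k memories, move to sharded PostgreSQL; if hybrid retrieval needs pure embeddings, they are integrated in the v3 beta.
Consistency: namespaces plus ACID transactions prevent conflicts; on errors the system degrades gracefully by disabling advanced features.
Cost: with no vector database, 1M tokens cost about $0.3; monitor token usage and optimize it (essential plus relevant memories).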

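For a practical deployment, see the FastAPI multi-user app at examples/multiple-users/fastapi_multiuser_app.py, where cross-session memory (for example "last time we used JWT for authentication") can be tested through Swagger.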

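Memori's layered engine turns AI agents from "forgetful" into ones that learn over time, and it is particularly well suited to enterprise multi-agent systems. Parameterized configuration plus monitoring keeps scaling reliable.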

Sources:

GitHub: https://github.com/GibsonAI/Memori
Architecture docs: https://memorilabs.ai/docs/open-source/architecture