AI Emotional Companionship: Engineering Long-Conversation Memory and Personalized Personas - 迪斯科星球

1. The Amnesiac AI Companion: When Every Conversation Starts from Scratch

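By default, most current AI chat assistants share a fundamental problem: they have no long-term memory. Each time a new conversation starts, the AI is a "stranger". It does not remember the user's name, what was discussed last time, or the user's personality and preferences. For emotional companionship this "amnesia" is fatal, because users cannot build an emotional bond with a "friend" who has to get to know them again every time.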

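Long-conversation memory and a personalized persona are the two key capabilities for solving this problem. Long-conversation memory lets the AI remember key information from past interactions (the user's family situation, recent worries, preferred chat style). A personalized persona lets the AI talk to the user with a consistent personality and tone instead of a one-size-fits-all assistant voice.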

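```mermaid
flowchart TB
    Input[User message] --> Context[Context construction]
    Context --> Memory[Long-term memory retrieval]
    Context --> Persona[Persona injection]
    Context --> History[Recent conversation history]
    Memory --> Merge[Merge into full prompt]
    Persona --> Merge
    History --> Merge
    Merge --> LLM[Large language model]
    LLM --> Response[Generate reply]
    Response --> Update[Update memory store]
    subgraph MemoryMgmt[Memory management]
        Update --> Extract[Extract key information]
        Extract --> Store[Write to memory store]
        Store --> Decay[Memory decay<br/>important information kept longer]
    end
```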

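2. Core Mechanisms of Long-Conversation Memory

2.1 Layered Memory Architecture

Memory is split into three layers: working memory (the context of the current conversation), short-term memory (key information from the last 7 days), and long-term memory (the user profile and major events). Working memory goes into the prompt directly. Short-term memory is retrieved by relevance before being added. Long-term memory is retrieved only when it is relevant to the current topic.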
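The three layers can be sketched as a small context-assembly routine. This is a minimal illustration, not the article's implementation: the `Note` type, the `is_profile` flag, the word-overlap `relevance` function, and the `short_limit` parameter are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List

SHORT_TERM_WINDOW = 7 * 86400  # the 7-day short-term window from the text, in seconds


@dataclass
class Note:
    """Hypothetical stored fact; not a type from the article."""
    text: str
    created_at: float  # Unix seconds
    is_profile: bool   # True for profile facts and major events (long-term layer)


def layer_of(note: Note, now: float) -> str:
    """Classify a stored note into a memory layer."""
    if note.is_profile:
        return "long_term"
    if now - note.created_at <= SHORT_TERM_WINDOW:
        return "short_term"
    return "expired"


def relevance(text: str, topic_words: List[str]) -> int:
    """Toy relevance: how many topic words occur in the text."""
    lowered = text.lower()
    return sum(1 for word in topic_words if word.lower() in lowered)


def build_context(working: List[str], notes: List[Note],
                  topic_words: List[str], now: float,
                  short_limit: int = 3) -> List[str]:
    """Assemble prompt lines: long-term facts, short-term facts, live dialogue."""
    short = [n for n in notes if layer_of(n, now) == "short_term"]
    long_term = [n for n in notes if layer_of(n, now) == "long_term"]

    # Short-term: rank by relevance and keep the best few.
    short.sort(key=lambda n: relevance(n.text, topic_words), reverse=True)
    short_lines = [n.text for n in short[:short_limit]]

    # Long-term: include only when it matches the current topic.
    long_lines = [n.text for n in long_term if relevance(n.text, topic_words) > 0]

    # Working memory always goes in verbatim.
    return long_lines + short_lines + working
```

A real system would likely replace the word-overlap score with embedding similarity; the layering logic stays the same.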

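2.2 Memory Extraction and Decay

After each conversation, the AI extracts key information from the exchange (for example, "the user has a cat named Xiaohua (小花)" or "the user is preparing for an exam") and writes it to the memory store. Memories decay: unimportant information (such as "the weather is nice today") is down-weighted automatically after 7 days, while important information (such as "the user's mother is in hospital") is retained long term. Importance is judged automatically by the AI.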

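```mermaid
sequenceDiagram
    participant User
    participant System as Memory system
    participant LLM as Large language model
    User->>System: "My Xiaohua hasn't been eating much lately"
    System->>System: Retrieve memory: user has a cat named Xiaohua
    System->>LLM: Prompt + memory: user's cat Xiaohua + recent conversation
    LLM->>User: "Xiaohua has had a poor appetite lately. How many days has it lasted? Any other symptoms?"
    User->>System: "About three days, and some sneezing"
    System->>System: Extract: Xiaohua is sick, sneezing, 3 days
    System->>System: Store: importance=high (pet health)
    Note over System: In the next conversation<br/>the AI will remember that Xiaohua is sick
```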
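The decay rule in this section can be made concrete with a short worked example. The sketch below assumes a linear decay and uses the 7-day, 30-day, and 365-day windows from the article's implementation; the function name `time_decay` and the string importance labels are illustrative choices.

```python
DAY = 86400  # seconds per day

# Decay window per importance level, in seconds.
DECAY_WINDOWS = {"low": 7 * DAY, "medium": 30 * DAY, "high": 365 * DAY}


def time_decay(age_seconds: float, importance: str) -> float:
    """Linear decay from 1.0 (brand new) to 0.0 (window fully elapsed)."""
    window = DECAY_WINDOWS[importance]
    return max(0.0, 1.0 - age_seconds / window)


# A low-importance memory is at half weight after 3.5 days and at zero after 7,
# while a high-importance memory of the same age has lost under 1% of its weight.
half_week = 3.5 * DAY
low_weight = time_decay(half_week, "low")    # 0.5
high_weight = time_decay(half_week, "high")  # about 0.99
```

Under this scheme a memory is never deleted by decay alone; it only ranks lower at retrieval time.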

3. Production-Grade Implementation

3.1 Layered Memory Manager

```python
import logging
import time
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List

logger = logging.getLogger(__name__)


class MemoryImportance(Enum):
    LOW = 1     # everyday trivia, decays over 7 days
    MEDIUM = 2  # personal preferences, decays over 30 days
    HIGH = 3    # major events, kept long term


@dataclass
class MemoryEntry:
    """A single memory entry."""
    content: str                      # memory text
    importance: MemoryImportance      # importance level
    category: str                     # family / pet / health / work / hobby
    created_at: float                 # creation time (Unix seconds)
    last_accessed: float              # last retrieval time (Unix seconds)
    access_count: int = 0             # number of retrievals
    source_conversation_id: str = ""  # ID of the source conversation


class LayeredMemoryManager:
    """Layered memory manager.

    Design considerations:
    - Working memory: the full context of the current conversation
    - Short-term memory: important information from the last 7 days
    - Long-term memory: the user profile and key events
    - Memory decay: unimportant information loses weight over time
    - Relevance retrieval: memories related to the current topic rank higher
    """

    # Decay window per importance level, in seconds. Entries are never deleted
    # here; once the window has passed, the time term of the score is zero.
    DECAY_RATES = {
        MemoryImportance.LOW: 7 * 86400,      # 7 days
        MemoryImportance.MEDIUM: 30 * 86400,  # 30 days
        MemoryImportance.HIGH: 365 * 86400,   # 1 year (effectively long term)
    }

    # (category, keywords, importance) rules. The keyword lists are translated
    # from the original Chinese ones; plain substring matching is crude in
    # English (for example, "cat" also matches "education").
    EXTRACTION_RULES = [
        ("pet", ["cat", "dog", "pet", "adopted"], MemoryImportance.MEDIUM),
        ("health", ["sick", "hospitalized", "surgery", "unwell", "hospital"],
         MemoryImportance.HIGH),
        ("family", ["mom", "dad", "child", "family", "husband", "wife"],
         MemoryImportance.HIGH),
    ]

    def __init__(self) -> None:
        self._memories: Dict[str, List[MemoryEntry]] = {}  # user_id -> memories
        self._user_profiles: Dict[str, Dict] = {}          # user profiles

    def store(self, user_id: str, content: str,
              importance: MemoryImportance, category: str) -> None:
        """Store one memory for a user."""
        now = time.time()
        entry = MemoryEntry(
            content=content,
            importance=importance,
            category=category,
            created_at=now,
            last_accessed=now,
        )
        self._memories.setdefault(user_id, []).append(entry)
        logger.info("Stored memory: user=%s, category=%s, importance=%s",
                    user_id, category, importance.name)

    def retrieve(self, user_id: str, current_topic: str = "",
                 limit: int = 10) -> List[MemoryEntry]:
        """Return the top `limit` memories ranked by composite score."""
        all_memories = self._memories.get(user_id, [])
        if not all_memories:
            return []

        # Score every memory, then keep the top N.
        scored = [(m, self._compute_score(m, current_topic)) for m in all_memories]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        result = [m for m, _ in scored[:limit]]

        # Record the access so frequently used memories earn a small bonus later.
        now = time.time()
        for m in result:
            m.last_accessed = now
            m.access_count += 1
        return result

    def _compute_score(self, memory: MemoryEntry, current_topic: str) -> float:
        """Compute the composite score of a memory."""
        now = time.time()

        # 1. Time decay: linear from 1 down to 0 over the decay window.
        age_seconds = now - memory.created_at
        decay_threshold = self.DECAY_RATES[memory.importance]
        time_decay = max(0.0, 1 - age_seconds / decay_threshold)

        # 2. Importance weight, normalized to (0, 1].
        importance_weight = memory.importance.value / MemoryImportance.HIGH.value

        # 3. Access-frequency bonus, capped at 0.2.
        access_bonus = min(0.2, memory.access_count * 0.05)

        # 4. Topic relevance (simplified: the category name appears in the topic).
        relevance = 0.5  # medium relevance by default
        if current_topic and memory.category in current_topic:
            relevance = 1.0

        return time_decay * importance_weight + access_bonus + relevance * 0.3

    def extract_and_store(self, user_id: str, conversation: List[Dict]) -> List[str]:
        """Extract key information from a conversation and store it.

        Design considerations:
        - Combine rule-based and LLM-based extraction
        - Rules extract high-certainty information (names, places, dates)
        - An LLM extracts semantic information (emotional state, preference changes)
        Only the rule-based keyword path is implemented here.
        """
        extracted = []
        for msg in conversation:
            if msg.get("role", "") != "user":
                continue
            content = msg.get("content", "")
            lowered = content.lower()
            # A message may match several rules and is then stored once per rule.
            for category, keywords, importance in self.EXTRACTION_RULES:
                if any(kw in lowered for kw in keywords):
                    extracted.append(f"[{category}] {content[:100]}")
                    self.store(user_id, content[:200], importance, category)
        return extracted


@dataclass
class PersonaConfig:
    """AI persona configuration."""
    name: str = "Xiaonuan"  # 小暖 in the original
    personality: str = "gentle, empathetic, occasionally humorous"
    speaking_style: str = ("warm in tone, occasionally uses everyday metaphors, "
                           "avoids lecturing")
    boundaries: str = ("gives no medical diagnoses, does not replace professional "
                       "counseling, and suggests calling a hotline in a crisis")


class PersonaAwareResponder:
    """Persona-aware reply generator."""

    def build_system_prompt(self, persona: PersonaConfig,
                            memories: List[MemoryEntry]) -> str:
        """Build a system prompt containing the persona and the memories."""
        memory_text = "\n".join(
            f"- {m.content} ({m.category}, {m.importance.name})"
            for m in memories[:5]  # cap the memory count to keep the prompt short
        )
        return (
            f"You are {persona.name}, a {persona.personality} AI companion.\n"
            f"Speaking style: {persona.speaking_style}\n"
            f"Boundaries: {persona.boundaries}\n\n"
            f"You know the following about the user:\n"
            f"{memory_text if memory_text else 'No history yet'}\n\n"
            f"Based on the above, reply naturally as {persona.name}. "
            f"If the user mentions something you already know, show naturally "
            f"that you remember it, without deliberately listing facts."
        )
```

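4. Boundary Analysis and Architectural Trade-offs

4.1 Privacy Risks of Memory

Long-term memory holds a large amount of personal information, and a leak would have serious consequences. Memory data must be encrypted at rest, and users must be able to view, edit, and delete all of their memories. A stricter approach is "memory forgetting": the user can ask the AI to forget specific information, and the system must then delete it from the memory store completely.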
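The view and forget operations described above could look roughly like the following. This is a hedged sketch: `ForgettableMemoryStore` and its methods are hypothetical names, storage is a plain in-process dictionary, and encryption at rest is deliberately left out because it depends on the storage backend.

```python
from typing import Dict, List


class ForgettableMemoryStore:
    """Hypothetical in-memory store illustrating view and forget operations."""

    def __init__(self) -> None:
        self._by_user: Dict[str, List[str]] = {}

    def add(self, user_id: str, content: str) -> None:
        self._by_user.setdefault(user_id, []).append(content)

    def view(self, user_id: str) -> List[str]:
        """Return a copy so callers cannot mutate the store by accident."""
        return list(self._by_user.get(user_id, []))

    def forget(self, user_id: str, keyword: str) -> int:
        """Hard-delete every memory containing `keyword`; return the count removed."""
        if not keyword:
            # An empty keyword would match everything; require forget_all instead.
            raise ValueError("keyword must be non-empty")
        memories = self._by_user.get(user_id, [])
        kept = [m for m in memories if keyword not in m]
        removed = len(memories) - len(kept)
        if kept:
            self._by_user[user_id] = kept
        else:
            self._by_user.pop(user_id, None)
        return removed

    def forget_all(self, user_id: str) -> int:
        """Delete everything stored for a user; return the count removed."""
        return len(self._by_user.pop(user_id, []))
```

In a deployed system, "complete" deletion would also have to reach any derived copies such as search indexes, caches, and backups, which this sketch does not model.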

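4.2 Prompt Length Under Memory Injection

Each memory takes roughly 50-100 tokens, so 10 memories occupy 500-1,000 tokens. When the token budget is tight, memories and conversation history compete for prompt space. The solution is to adjust the number of memories dynamically: inject more memories when the conversation history is short, and fewer when it is long.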
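The dynamic adjustment can be sketched as a simple budget calculation. The function below is an illustration under stated assumptions: `reserved_tokens` (space held back for the system prompt and the reply) and the cap of 10 memories are hypothetical defaults, and `tokens_per_memory` uses the upper end of the 50-100 token estimate so the budget errs on the safe side.

```python
def memory_slots(total_budget: int, history_tokens: int,
                 reserved_tokens: int = 500,
                 tokens_per_memory: int = 100,
                 max_memories: int = 10) -> int:
    """Return how many memories fit after history and reserved space are taken."""
    remaining = total_budget - reserved_tokens - history_tokens
    if remaining <= 0:
        return 0
    return min(max_memories, remaining // tokens_per_memory)


# Short history: the cap of 10 memories applies.
short_history_slots = memory_slots(total_budget=4000, history_tokens=500)
# Long history: only 3 memories still fit.
long_history_slots = memory_slots(total_budget=4000, history_tokens=3200)
```

The result can be passed as the `limit` of a retrieval call, so the highest-scoring memories are the ones that survive when space is short.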

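4.3 Persona Consistency

The AI persona must stay consistent across conversations. It cannot be gentle yesterday and cold today. But LLM output is stochastic, so a perfectly consistent persona is hard to guarantee. The solution is to constrain the persona explicitly in the system prompt and to run a persona-consistency check after generation.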
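One way to picture the post-generation check is a rule-based filter with a bounded retry. Everything in this sketch is an assumption made for illustration: the phrase list, the function names, the retry count, and the fallback reply. A production check might instead use a classifier or a second LLM call.

```python
from typing import Callable, List, Sequence

# Hypothetical phrases that break a warm, non-lecturing, non-diagnosing persona.
BANNED_PHRASES = ("as an ai language model", "you should have", "i diagnose")


def persona_violations(reply: str,
                       banned: Sequence[str] = BANNED_PHRASES) -> List[str]:
    """Return the banned phrases found in a reply (case-insensitive)."""
    lowered = reply.lower()
    return [phrase for phrase in banned if phrase in lowered]


def reply_with_check(generate: Callable[[], str],
                     max_attempts: int = 3,
                     fallback: str = "I'm here with you. Tell me more?") -> str:
    """Regenerate until a reply passes the check, else return a safe fallback."""
    for _ in range(max_attempts):
        reply = generate()
        if not persona_violations(reply):
            return reply
    return fallback
```

Here `generate` stands in for whatever function calls the model; bounding the retries keeps latency and cost predictable when the model keeps drifting out of character.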

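5. Summary

Long-conversation memory and a personalized persona are the foundational capabilities of AI emotional companionship. The layered memory architecture balances memory completeness against prompt length limits, and the decay mechanism keeps important information around for the long term. Persona injection lets the AI talk to the user with a consistent personality and build an emotional bond.

Suggested rollout path: first, implement simple rule-based memory extraction (keyword matching) and validate the store-and-retrieve flow. Second, add memory decay and importance levels, and tune retrieval relevance. Third, introduce LLM-assisted semantic extraction to improve extraction accuracy. Fourth, add a memory management interface so users can view and delete their own memory data.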
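The third step, LLM-assisted extraction, might be sketched as follows. The prompt wording, the JSON schema, and the `llm` callable (any function that takes a prompt string and returns the model's text) are assumptions for illustration and are not tied to a specific provider's API.

```python
import json
from typing import Callable, Dict, List

ALLOWED_IMPORTANCE = {"LOW", "MEDIUM", "HIGH"}

# Hypothetical extraction prompt; the user message is appended to it.
EXTRACTION_PROMPT = (
    "Extract facts about the user worth remembering from the message below. "
    'Reply with a JSON list of objects with the keys "content", "category" and '
    '"importance" (LOW, MEDIUM or HIGH). Reply with [] if nothing is worth '
    "remembering.\n\nMessage: "
)


def extract_with_llm(message: str, llm: Callable[[str], str]) -> List[Dict[str, str]]:
    """Ask the model for structured facts and keep only well-formed items."""
    raw = llm(EXTRACTION_PROMPT + message)
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []  # malformed output: store nothing rather than guess
    if not isinstance(items, list):
        return []

    facts = []
    for item in items:
        if not isinstance(item, dict):
            continue
        content = item.get("content")
        category = item.get("category")
        importance = item.get("importance")
        if (isinstance(content, str) and content
                and isinstance(category, str)
                and isinstance(importance, str)
                and importance in ALLOWED_IMPORTANCE):
            facts.append({"content": content, "category": category,
                          "importance": importance})
    return facts
```

Validating the model's output before storing it matters because a single malformed or hallucinated entry would otherwise be replayed into every later prompt.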
