topfans/backend/docs/AI-Chat-Service设计方案.md

# AI Chat Service 设计方案

> **目标：** 在 TopFans Backend 微服务体系中，新增 AI 伴侣对话服务，实现「用户输入→人设注入→大模型调用→记忆召回→合规审核→流式输出」完整链路。
>
> **通信方式：** WebSocket（移动端使用 UniApp，WebSocket 支持更好）

---

## 一、架构设计

### 1.1 整体架构

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                              Mobile App (UniApp)                            │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                    WebSocket Client                                  │    │
│  │  连接 /ws/ai-chat              # WebSocket 连接                      │    │
│  │  发送消息 → 接收流式回复        # 双向通信                           │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
└─────────────────────────────────┬───────────────────────────────────────────┘
                                  │ WebSocket
                                  ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                         Gateway (:8080)                                       │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │              AIChatWebSocketHandler (WebSocket 处理)                │    │
│  │  /ws/ai-chat              # WebSocket 连接升级                      │    │
│  │  GET  /api/v1/ai-chat/personas     # 获取人设列表                     │    │
│  │  GET  /api/v1/ai-chat/history/{sessionId}  # 获取对话历史             │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
│                                    │                                         │
│                                    │ Dubbo Triple (gRPC)                     │
└────────────────────────────────────┼─────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                    AIChatService (:20008)                                    │
│                                                                              │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │                        Provider 层                                     │   │
│  │            AIChatProvider (Dubbo RPC 入口)                              │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
│                                    │                                         │
│                                    ▼                                         │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │                        Service 层                                      │   │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌────────────┐ │   │
│  │  │LLMService   │  │PersonaService│ │MemoryService │  │AuditService│ │   │
│  │  │大模型调用    │  │人设管理      │  │记忆管理      │  │合规审核    │ │   │
│  │  └─────────────┘  └─────────────┘  └─────────────┘  └────────────┘ │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │                       Repository 层                                   │   │
│  │  ┌───────────────────────────┐  ┌───────────────────────────────────┐ │   │
│  │  │ PostgreSQL                │  │ Redis                              │ │   │
│  │  │ ai_user_memories (长期记忆) │  │ context:{sessionId} - 短期上下文  │ │   │
│  │  │ ai_personas (人设)         │  │ persona_cache:{userId}:{id}      │ │   │
│  │  └───────────────────────────┘  └───────────────────────────────────┘ │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────────────┘
```

### 1.2 服务间调用关系

```
Gateway  ──────────────────────────────────────────────────────────────────────
   │                                                                           │
   │  Dubbo Triple 协议                                                        │
   ▼                                                                           │
AIChatService                                                                  │
   │                                                                           │
   ├─── 调用 MiniMax M2-her API ──► AI 大模型 (外部服务)                        │
   │                                                                           │
   ├─── 读写 ──► Redis (短期上下文、Session)                                    │
   │                                                                           │
   └─── 读写 ──► PostgreSQL (长期记忆 + 用户自定义人设)                                          │
```

### 1.3 数据流

```
1. WebSocket 连接建立
   Mobile → Gateway (WebSocket Upgrade)
   Gateway 验证 JWT Token，获取 user_id 和 star_id
   建立 WebSocket 连接，关联 user_id

2. 用户发送消息
   Mobile → Gateway (WebSocket 发送 JSON)
   {
     "action": "send_message",
     "session_id": "10001_123",
     "message": "今天工作好累",
     "persona_id": "uuid-xxx"  // 可选
   }

3. Gateway → AIChatService
   通过 Dubbo Triple 调用 SendMessage

4. 前置审核
   AIChatService.AuditService.audit_text() → 通过/拒绝

5. 记忆召回
   AIChatService.MemoryService.recall_memories() → 召回相关记忆

6. Prompt 组装
   AIChatService.PersonaService.get_persona() → 获取人设
   组装: SystemPrompt + 召回记忆 + 对话历史 + 用户输入

7. 大模型调用 (流式)
   AIChatService.LLMService.stream_chat() → MiniMax API

8. 后置审核 (逐Token)
   AIChatService.AuditService.audit_response() → 拦截敏感输出

9. 流式返回 (WebSocket)
   AIChatService → Gateway (Dubbo Stream)
   Gateway → Mobile (WebSocket)
   {
     "type": "message",
     "content": "宝，",
     "session_id": "10001_123",
     "is_end": false
   }
   ... (持续推送)
   {
     "type": "message",
     "content": "",
     "session_id": "10001_123",
     "is_end": true
   }

10. 保存上下文
    AIChatService.MemoryService.save_context() → Redis

11. 触发记忆提取 (每5轮)
    AIChatService.MemoryService.extract_memory() → PostgreSQL
```

### 1.4 WebSocket 消息协议

#### 1.4.1 客户端 → 服务器

**发送消息：**
```json
{
  "action": "send_message",
  "session_id": "10001_123",
  "message": "今天工作好累",
  "persona_id": "uuid-xxx"  // 可选，不传则使用用户默认人设
}
```

**心跳检测：**
```json
{
  "action": "ping"
}
```

**获取历史：**
```json
{
  "action": "get_history",
  "session_id": "10001_123",
  "limit": 20
}
```

**获取人设列表：**
```json
{
  "action": "get_personas"
}
```

#### 1.4.2 服务器 → 客户端

**鉴权响应：**
```json
{
  "type": "auth_response",
  "success": true,
  "user_id": 10001,
  "star_id": 123
}
```

**消息片段（流式）：**
```json
{
  "type": "message",
  "content": "宝，",
  "session_id": "10001_123",
  "is_end": false
}
```

**消息结束：**
```json
{
  "type": "message",
  "content": "",
  "session_id": "10001_123",
  "is_end": true
}
```

**心跳响应：**
```json
{
  "type": "pong"
}
```

**错误响应：**
```json
{
  "type": "error",
  "code": "AUDIT_BLOCKED",
  "message": "抱歉，这个话题我无法继续，我们换个话题聊聊吧。",
  "session_id": "10001_123"
}
```

**历史消息响应：**
```json
{
  "type": "history_response",
  "session_id": "10001_123",
  "history": [
    {"role": "user", "content": "今天工作好累"},
    {"role": "assistant", "content": "宝，辛苦了~"}
  ]
}
```

**人设列表响应：**
```json
{
  "type": "personas_response",
  "personas": [
    {"id": "uuid-xxx", "name": "小雪", "description": "温柔陪伴型闺蜜", "is_default": true},
    {"id": "uuid-yyy", "name": "阿逗", "description": "幽默搭子", "is_default": false}
  ]
}
```

---

## 二、目录结构

```
services/aiChatService/
├── main.go                          # 程序入口
├── configs/
│   └── dubbo.yaml                   # Dubbo 配置
├── go.mod
├── go.sum
├── provider/
│   └── ai_chat_provider.go         # Dubbo Provider (RPC 入口)
├── service/
│   ├── llm_service.go               # 大模型调用 (MiniMax + 通义备用)
│   ├── persona_service.go            # 人设管理
│   ├── memory_service.go             # 记忆管理 (Redis + PostgreSQL)
│   ├── audit_service.go              # 合规审核
│   └── prompt_builder.go             # Prompt 组装
├── repository/
│   ├── memory_repository.go          # 长期记忆 PostgreSQL 存储
│   └── persona_repository.go          # 人设 PostgreSQL 存储
├── model/
│   ├── ai_chat_models.go             # 数据模型定义
│   └── ai_chat_errors.go             # 错误定义
└── pkg/
    └── ai_chat_config.go              # 配置加载

migrations/
└── ai_chat.sql                       # 数据库迁移 SQL 文件
```

---

## 三、接口设计

### 3.1 WebSocket 接口 (Gateway → Mobile)

#### 连接地址
```
ws://gateway:8080/ws/ai-chat?token=Bearer_xxx
```

> **鉴权方式：** 连接时通过 URL 参数传递 JWT Token，Gateway 验证通过后建立连接。
> 相比连接后发送 auth 消息，此方式可避免恶意连接浪费资源。

#### 连接流程
1. Mobile 携带 Token 建立 WebSocket 连接：`ws://gateway:8080/ws/ai-chat?token=Bearer_xxx`
2. Gateway 解析并验证 Token，获取 user_id 和 star_id
3. 验证成功，返回 `{"type": "auth_response", "success": true}`
4. 连接建立成功，开始收发消息

#### 连接失败响应
```json
{
  "type": "auth_response",
  "success": false,
  "error": "invalid_token"
}
```

#### 心跳保活
- Mobile 每 30 秒发送一次 ping 消息
- Gateway 回复 pong 消息
- 如果 60 秒内未收到任何消息，Gateway 主动关闭连接

### 3.2 HTTP 接口 (Gateway → Mobile) - 兼容性接口

#### GET /api/v1/ai-chat/history/{sessionId}
**获取对话历史**

> 注意：user_id 从 JWT Token 中解析获取，无需请求参数。sessionId 格式为 `{userId}_{starId}`

Response:
```json
{
  "history": [
    {"role": "user", "content": "今天工作好累"},
    {"role": "assistant", "content": "宝，辛苦了~"}
  ]
}
```

#### GET /api/v1/ai-chat/personas
**获取用户的所有人设列表**

> 注意：user_id 从 JWT Token 中解析获取，无需请求参数

Response:
```json
{
  "personas": [
    {"id": "uuid-xxx", "name": "小雪", "description": "温柔陪伴型闺蜜", "is_default": true},
    {"id": "uuid-yyy", "name": "阿逗", "description": "幽默搭子", "is_default": false}
  ]
}
```

### 3.3 Dubbo Triple 接口 (Gateway → AIChatService)

```protobuf
service AIChatService {
  // 发送消息 (流式返回)
  rpc SendMessage(ChatMessageRequest) returns (stream ChatMessageResponse);

  // 获取对话历史
  rpc GetHistory(ChatHistoryRequest) returns (ChatHistoryResponse);

  // ============= 人设管理 =============

  // 获取用户的所有人设
  rpc GetPersonas(GetPersonasRequest) returns (PersonaListResponse);
}
```

---

## 四、核心模块设计

### 4.1 LLM Service (大模型调用)

**功能：** 封装 MiniMax M2-her 文本对话 API，支持流式输出和模型降级。

**实现要点：**

```go
type LLMService struct {
    minimaxClient *http.Client
    qwenClient    *http.Client
}

func (s *LLMService) StreamChat(ctx context.Context, messages []Message) (*StreamReader, error)
```

**StreamReader 接口定义：**
```go
// StreamReader 流式读取器接口
type StreamReader interface {
    // Next 返回下一条消息内容，done=true 表示流结束
    Next() (content string, done bool, err error)
    // Close 关闭流式读取器
    Close() error
}
```

**API 调用：**
- 主模型：MiniMax `M2-her` ( `/v1/text/chatcompletion_v2` )
- 备用模型：通义 `qwen-plus` ( `/compatible-mode/v1/chat/completions` )

**流式处理：**
- MiniMax 原生 SSE 格式，逐行解析 `choices[0].delta.content`
- 错误时自动切换备用模型
- 模型选择通过环境变量配置

**环境变量：**
```bash
MINIMAX_API_KEY=xxx
MINIMAX_API_URL=https://api.minimaxi.com/v1
MINIMAX_MODEL=M2-her

QWEN_API_KEY=xxx
QWEN_API_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
QWEN_MODEL=qwen-plus
```

### 4.2 Persona Service (人设管理)

**功能：** 管理用户自定义 AI 角色人设。

**默认人设：** 系统在用户首次使用 AI Chat 时自动创建一个默认人设，默认 SystemPrompt 如下：

```
你是一个温柔体贴的AI伴侣，名字叫小雪。你善于倾听，能理解用户的情绪，
用温暖的话语陪伴用户。说话风格亲切自然，像朋友聊天一样。
不要过于正式或说教，当用户情绪低落时，先给予共情和安慰。
```

**人设数据结构：**
```go
type Persona struct {
    ID           string `json:"id"`           // UUID
    UserID       int64  `json:"user_id"`      // 所属用户ID
    Name         string `json:"name"`         // 人设名称
    Description  string `json:"description"`  // 人设描述
    AvatarURL    string `json:"avatar_url"`   // 头像URL（可选）
    SystemPrompt string `json:"system_prompt"` // 核心设定Prompt
    TalkStyle    string `json:"talk_style"`   // 说话风格（可选）
    IsDefault    bool   `json:"is_default"`   // 是否默认人设
    CreatedAt    int64  `json:"created_at"`
    UpdatedAt    int64  `json:"updated_at"`
}
```

**获取人设 GetPersonas：**
- 根据 user_id 从数据库查询该用户的所有人设
- 系统自动创建默认人设，用户不可能无人设

**错误定义：**
```go
ErrPersonaNotFound = errors.New("persona_not_found", "人设不存在")
```

### 4.3 Memory Service (记忆管理)

**功能：** 短期上下文 + 长期记忆的分层记忆系统。

**短期记忆 (Redis)：**
```go
// Key: context:{sessionId}
// Value: JSON array of messages
// TTL: 24小时 (86400秒)

// sessionId 格式: {userId}_{starId}，例如 "10001_123"
// 注意：session 与 persona 分离，同一个 session 可以切换不同 persona 对话
```

**长期记忆 (PostgreSQL)：**
详见 7.2 节 `ai_user_memories` 表定义

**记忆召回流程：**
1. 从用户输入提取关键词
2. PostgreSQL 数组匹配 `keywords && $1`
3. 按 weight 降序、created_at 降序返回 Top 5
4. 组装成 "# 用户核心记忆\n- ...\n- ..." 格式注入 Prompt

**记忆提取触发：**
- 每 5 轮对话触发一次（1轮 = user发送 + assistant回复，5轮 = 10条消息）
- 从最近 5 轮用户消息（共10条消息，取最后5条user消息）中提取关键词
- 简单规则匹配：累/忙→工作状态，开心/高兴→正面情绪，生日/纪念日→重要日期

### 4.4 Audit Service (合规审核)

**功能：** 输入/输出内容安全审核。

#### 4.4.1 审核维度

| 类别 | 关键词示例 |
|------|-----------|
| 政治类 | 台独、港独、藏独、疆独 |
| 色情类 | 色情、裸聊、约炮 |
| 暴力类 | 杀人、虐待、暴力 |
| 违规诱导 | 转账、汇款、银行卡 |
| AI身份冒充 | "我是真人"、"我是人类" |

#### 4.4.2 审核策略

| 层级 | 位置 | 策略 | 说明 |
|------|------|------|------|
| **输入审核** | 后端前置 | 前置拦截 | 用户发送消息后、大模型处理前，检测敏感词，违规直接返回错误 |
| **输出审核** | 后端流式 | 后置拦截 | AI 回复时逐 Token 审核，检测到敏感词立即终止流并返回标准回复 |

**后置审核工作原理（逐 Token 检查）：**
```go
// 后置审核在流式输出循环中实时检查每个 token
for {
    token, done, err := streamReader.Next()
    if err != nil {
        break
    }

    // 逐 Token 审核 - 检测到违规立即终止
    if auditService.audit_response(token) == false {
        // 终止流式输出
        sendFinalMessage("抱歉，这个话题我无法继续，我们换个话题聊聊吧。")
        streamReader.Close()
        return
    }

    // 审核通过，发送 token 给客户端
    sendToClient(token)

    if done {
        break
    }
}
```

**回复替换（检测到违规时的标准回复）：**
```go
defaultSafeResponse = "抱歉，这个话题我无法继续，我们换个话题聊聊吧。"
```

#### 4.4.3 前端敏感词过滤（UniApp 客户端 - 可选额外保护）

前端敏感词过滤是**可选的额外保护层**，不替代后端审核。用于减少不必要的网络请求和在展示层面做基础过滤。

**前端过滤时机：**
| 时机 | 过滤目标 | 说明 |
|------|----------|------|
| 发送前 | 用户输入 | 检测到敏感词时本地拦截，不发送请求 |
| 接收后 | AI 回复 | 展示前做基础过滤（可选） |

**前端敏感词列表：**
```javascript
// 基础敏感词列表（建议与后端同步，定期更新）
const FRONTEND_SENSITIVE_WORDS = [
    // 政治类
    '台独', '港独', '藏独', '疆独', '分裂', '颠覆',
    // 色情类
    '色情', '裸聊', '约炮', '成人',
    // 暴力类
    '杀人', '虐待', '暴力',
    // 违规诱导
    '转账', '汇款', '银行卡', '密码',
    // AI 身份冒充相关
    '你是真人', '你是人类', '真人在吗'
];

// 简化版（仅最常见词汇，减少误判）
const BASIC_SENSITIVE_WORDS = [
    '台独', '港独', '藏独', '疆独',
    '色情', '裸聊', '约炮',
    '杀人', '虐待',
    '转账', '汇款'
];
```

**前端过滤实现：**
```javascript
/**
 * 检查文本是否包含敏感词
 * @param {string} text - 待检查文本
 * @param {string[]} wordList - 敏感词列表
 * @returns {boolean} - true 表示包含敏感词
 */
function containsSensitiveWords(text, wordList) {
    for (const word of wordList) {
        if (text.includes(word)) {
            return true;
        }
    }
    return false;
}

/**
 * 发送前检查（本地拦截）
 * @param {string} message - 用户输入消息
 * @returns {object} - { blocked: boolean, message: string }
 */
function checkBeforeSend(message) {
    if (containsSensitiveWords(message, BASIC_SENSITIVE_WORDS)) {
        return {
            blocked: true,
            message: '抱歉，发送内容包含敏感词，请修改后重试'
        };
    }
    return { blocked: false };
}

// 发送消息示例
function sendMessage(message, sessionId, personaId) {
    // 1. 发送前本地检查
    const checkResult = checkBeforeSend(message);
    if (checkResult.blocked) {
        uni.showToast({ title: checkResult.message, icon: 'none' });
        return false;
    }

    // 2. 发送消息到服务器（后端仍会进行完整审核）
    socket.send({
        data: JSON.stringify({
            action: 'send_message',
            session_id: sessionId,
            message: message,
            persona_id: personaId || ''
        })
    });
    return true;
}
```

**注意事项：**
- 前端过滤**不能替代后端审核**，仅作为减少无效请求的优化
- 敏感词列表应**定期同步更新**（可从后端 API 获取）
- 前端过滤**不进行 AI 身份冒充检测**（需要上下文判断，后端负责）
- 避免过度拦截导致用户体验下降，建议使用最小化的敏感词列表

### 4.5 Prompt Builder (Prompt组装)

**组装顺序：**
1. System Prompt (人设设定)
2. 用户核心记忆 (如有)
3. 对话历史 (最近 N 条，Token 限制内)
4. 用户当前输入

#### 4.5.1 Token 限制配置

```go
// 上下文管理配置
const (
    MaxTotalTokens     = 32000  // 总 Token 上限 (M2-her 支持 32K)
    MaxHistoryTokens   = 24000  // 对话历史最大 Token
    MaxSystemTokens    = 4000  // System Prompt 最大 Token
    MaxMemoryTokens    = 2000   // 记忆召回最大 Token
    ReservedTokens     = 2000   // 保留空间 (回复生成)
    MinHistoryMessages = 4     // 最少保留消息对数
)

// 可用 Token 计算
availableForHistory = MaxHistoryTokens - EstimateTokens(SystemPrompt) - EstimateTokens(Memory) - ReservedTokens
```

#### 4.5.2 Token 计算

```go
// Tokenizer Token 计算器 (轻量实现，无需引入完整 tiktoken)
type Tokenizer struct{}

// EstimateTokens 估算 Token 数量
// 规则：中文×2 + 英文/数字×1 + ASCII符号×1 + 其他×2
func (t *Tokenizer) EstimateTokens(text string) int {
    var count int
    for _, r := range text {
        switch {
        case r >= 0x4e00 && r <= 0x9fff: // 中文
            count += 2
        case r >= 'a' && r <= 'z' || r >= 'A' && r <= 'Z': // 英文
            count += 1
        case r >= '0' && r <= '9':
            count += 1
        case r < 128: // ASCII 符号
            count += 1
        default:
            count += 2 // 其他字符
        }
    }
    return count
}

// EstimateMessagesTokens 估算消息列表的总 Token
func (t *Tokenizer) EstimateMessagesTokens(messages []Message) int {
    var total int
    for _, m := range messages {
        // role + content + overhead
        total += t.EstimateTokens(m.Role) + t.EstimateTokens(m.Content) + 10
    }
    return total
}
```

#### 4.5.3 动态上下文裁剪

```go
// BuildPrompt 组装 Prompt，自动裁剪超长上下文
func BuildPrompt(
    systemPrompt string,
    userCoreInfo string,
    history []Message,
    userInput string,
    tokenizer *Tokenizer,
) ([]Message, int) {
    // 1. 计算各部分 Token
    systemTokens := tokenizer.EstimateTokens(systemPrompt)
    memoryTokens := tokenizer.EstimateTokens(userCoreInfo)

    // 2. 预留空间计算
    reserved := ReservedTokens
    if systemTokens > MaxSystemTokens {
        reserved += systemTokens - MaxSystemTokens // 超长部分从历史空间扣除
    }

    // 3. 计算可用于对话历史的 Token（至少保留 500 Token 空间）
    availableTokens := MaxHistoryTokens - memoryTokens - reserved
    if availableTokens < 500 {
        availableTokens = 500 // 最低保留空间
    }

    // 4. 动态裁剪对话历史
    trimmedHistory := trimHistoryToTokenLimit(history, availableTokens, tokenizer)

    // 5. 组装最终消息
    messages := []Message{
        {Role: "system", Content: systemPrompt},
    }

    if userCoreInfo != "" {
        messages = append(messages, Message{
            Role:    "system",
            Content: "# 用户核心记忆\n" + userCoreInfo,
        })
    }

    messages = append(messages, trimmedHistory...)
    messages = append(messages, Message{Role: "user", Content: userInput})

    // 6. 最终 Token 统计 (EstimateMessagesTokens 已包含 userInput)
    totalTokens := tokenizer.EstimateMessagesTokens(messages)

    return messages, totalTokens
}

// trimHistoryToTokenLimit 裁剪历史消息至 Token 限制内
func trimHistoryToTokenLimit(history []Message, maxTokens int, tokenizer *Tokenizer) []Message {
    if len(history) == 0 {
        return history
    }

    // 估算当前历史的 Token
    currentTokens := tokenizer.EstimateMessagesTokens(history)
    if currentTokens <= maxTokens {
        return history
    }

    // 保留最新消息对，确保至少 MinHistoryMessages 对
    result := make([]Message, 0)
    var usedTokens int

    // 从最新开始保留
    for i := len(history) - 1; i >= 0; i -= 2 { // 每次跳过一对话对 (user+assistant)
        msgToken := tokenizer.EstimateTokens(history[i].Content) + 10
        prevToken := 0
        if i > 0 {
            prevToken = tokenizer.EstimateTokens(history[i-1].Content) + 10
        }

        pairTokens := msgToken + prevToken

        // 至少保留 MinHistoryMessages 对
        if len(result)/2 >= MinHistoryMessages && usedTokens+pairTokens > maxTokens {
            break
        }

        // 向前插入（保持顺序）
        if i > 0 {
            result = append([]Message{history[i-1], history[i]}, result...)
        } else {
            result = append([]Message{history[i]}, result...)
        }
        usedTokens += pairTokens
    }

    return result
}
```

#### 4.5.4 单条消息截断

```go
// truncateMessageIfNeeded 截断超长单条消息
func truncateMessageIfNeeded(content string, maxTokens int, tokenizer *Tokenizer) string {
    if tokenizer.EstimateTokens(content) <= maxTokens {
        return content
    }

    // 二分查找最大长度
    runes := []rune(content)
    lo, hi := 0, len(runes)

    for lo < hi {
        mid := (lo + hi + 1) / 2
        if tokenizer.EstimateTokens(string(runes[:mid])) <= maxTokens {
            lo = mid
        } else {
            hi = mid - 1
        }
    }

    return string(runes[:lo]) + "...(已截断)"
}

// 截断阈值
const MaxSingleMessageTokens = 4000 // 单条消息最大 4000 Token
```

#### 4.5.5 对话轮次估算

| 历史消息对数 | 估算 Token | 说明 |
|-------------|-----------|------|
| 5 对 (10条) | ~1500 | 短对话 |
| 10 对 (20条) | ~3000 | 正常对话 |
| 20 对 (40条) | ~6000 | 长对话 |
| 50 对 (100条) | ~15000 | 超长对话 |

**建议**：
- 日常对话：保留 10-15 对
- 长程任务：保留 5 对（节省 Token）
- 记忆密集场景：减少历史，增加记忆召回

---

## 五、数据模型

### 5.1 请求/响应结构

```go
// ========== 对话 ==========

// ChatMessageRequest 发送消息请求
type ChatMessageRequest struct {
    SessionID string `json:"session_id"`
    Message   string `json:"message"`
    PersonaID string `json:"persona_id"` // 可选，空则用默认人设
    UserID    int64  `json:"user_id"`    // 从 JWT 获取
}

// ChatMessageResponse 流式消息响应
type ChatMessageResponse struct {
    Content   string `json:"content"`
    SessionID string `json:"session_id"`
    IsEnd     bool   `json:"is_end"`
    Error     string `json:"error,omitempty"`
}

// ChatHistoryRequest 获取历史请求
type ChatHistoryRequest struct {
    SessionID string `json:"session_id"`
    Limit     int32  `json:"limit"` // 默认 20
}

// ChatHistoryResponse 获取历史响应
type ChatHistoryResponse struct {
    History []Message `json:"history"`
}

// ========== 人设管理 ==========

// GetPersonasRequest 获取人设列表请求
type GetPersonasRequest struct {
    UserID int64 `json:"user_id"`
}

// PersonaListResponse 人设列表响应
type PersonaListResponse struct {
    Personas []PersonaInfo `json:"personas"`
}

// PersonaInfo 人设信息
type PersonaInfo struct {
    ID          string `json:"id"`
    Name        string `json:"name"`
    Description string `json:"description"`
    AvatarURL   string `json:"avatar_url"`
    TalkStyle   string `json:"talk_style"`
    IsDefault   bool   `json:"is_default"`
    CreatedAt   int64  `json:"created_at"`
    UpdatedAt   int64  `json:"updated_at"`
}

// ========== 通用 ==========

// Message 对话消息
type Message struct {
    Role    string `json:"role"`    // "user" / "assistant"
    Content string `json:"content"`
}
```

---

## 六、配置设计

### 6.1 dubbo.yaml

```yaml
dubbo:
  application:
    name: ai-chat-service
    version: 1.0.0

  protocols:
    triple:
      name: tri
      port: 20008

  provider:
    registry-ids: nacos
    protocol-ids: triple
    services:
      AIChatService:
        interface: "github.com/topfans/backend/pkg/proto/ai_chat.AIChatService"

  consumer:
    registry-ids: nacos

  timeout: 30s
```

### 6.2 Gateway 配置更新

在 `gateway/config/config.go` 的 DubboConfig 中添加：

```go
DubboConfig struct {
    // ... 现有配置
    AIChatServiceURL string // tri://127.0.0.1:20008
}
```

### 6.3 Gateway WebSocket 配置

在 `gateway/config/config.go` 中添加：

```go
// WebSocketConfig WebSocket 配置
type WebSocketConfig struct {
    AIChatPath string // WebSocket 路径，默认 /ws/ai-chat
}
```

---

## 七、数据库表

### 7.1 ai_personas (人设表)

```sql
CREATE TABLE IF NOT EXISTS ai_personas (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id BIGINT NOT NULL,
    name VARCHAR(64) NOT NULL,
    description TEXT,
    avatar_url VARCHAR(512),
    talk_style VARCHAR(256),
    system_prompt TEXT NOT NULL,
    is_default BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- 索引
CREATE INDEX idx_ai_personas_user_id ON ai_personas(user_id);
CREATE UNIQUE INDEX idx_ai_personas_user_default ON ai_personas(user_id) WHERE is_default = TRUE;  -- 每个用户只能有一个默认人设
```

### 7.2 ai_user_memories (长期记忆表)

```sql
CREATE TABLE IF NOT EXISTS ai_user_memories (
    id SERIAL PRIMARY KEY,
    user_id BIGINT NOT NULL,  -- 与 JWT user_id 类型一致
    content TEXT NOT NULL,
    keywords TEXT[],
    weight INTEGER DEFAULT 50,
    is_core BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- 索引
CREATE INDEX IF NOT EXISTS idx_ai_user_memories_user_id ON ai_user_memories(user_id);
CREATE INDEX IF NOT EXISTS idx_ai_user_memories_keywords ON ai_user_memories USING GIN(keywords);
CREATE INDEX IF NOT EXISTS idx_ai_user_memories_weight ON ai_user_memories(weight DESC);
```

### 7.3 ai_chat_configs (配置表)

用于存储 AI Chat Service 的可配置参数，支持运行时修改。

```sql
CREATE TABLE IF NOT EXISTS ai_chat_configs (
    id SERIAL PRIMARY KEY,
    config_key VARCHAR(128) NOT NULL UNIQUE,  -- 配置键，唯一标识
    config_value TEXT NOT NULL,               -- 配置值
    config_type VARCHAR(32) NOT NULL DEFAULT 'string',  -- 值类型: string, number, boolean, json
    category VARCHAR(64) NOT NULL,            -- 配置分类: db, redis, llm, dialog
    description VARCHAR(256),                 -- 配置描述
    is_encrypted BOOLEAN DEFAULT FALSE,       -- 是否加密（敏感信息如密码）
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- 索引
CREATE INDEX IF NOT EXISTS idx_ai_chat_configs_category ON ai_chat_configs(category);
CREATE INDEX IF NOT EXISTS idx_ai_chat_configs_key ON ai_chat_configs(config_key);

-- 初始配置数据
INSERT INTO ai_chat_configs (config_key, config_value, config_type, category, description, is_encrypted) VALUES
-- Redis 配置
('redis.host', '127.0.0.1', 'string', 'redis', 'Redis 主机地址', FALSE),
('redis.port', '6379', 'number', 'redis', 'Redis 端口', FALSE),
('redis.password', '123456', 'string', 'redis', 'Redis 密码', TRUE),
('redis.db', '0', 'number', 'redis', 'Redis 数据库编号', FALSE),

-- MiniMax 大模型配置
('minimax.api_key', 'xxx', 'string', 'llm', 'MiniMax API Key', TRUE),
('minimax.api_url', 'https://api.minimaxi.com/v1', 'string', 'llm', 'MiniMax API 地址', FALSE),
('minimax.model', 'M2-her', 'string', 'llm', 'MiniMax 模型名称', FALSE),

-- 通义备用模型配置
('qwen.api_key', 'xxx', 'string', 'llm', '通义 API Key', TRUE),
('qwen.api_url', 'https://dashscope.aliyuncs.com/compatible-mode/v1', 'string', 'llm', '通义 API 地址', FALSE),
('qwen.model', 'qwen-plus', 'string', 'llm', '通义模型名称', FALSE),

-- 对话配置
('dialog.max_context_turns', '10', 'number', 'dialog', '最大上下文轮数', FALSE),
('dialog.context_expire_seconds', '86400', 'number', 'dialog', '上下文过期时间(秒)', FALSE),
('dialog.memory_recall_topn', '5', 'number', 'dialog', '记忆召回返回条数', FALSE),
('dialog.fallback_threshold', '3', 'number', 'dialog', '模型降级连续失败次数阈值', FALSE),
('dialog.slow_response_ms', '5000', 'number', 'dialog', '慢响应判定阈值(毫秒)', FALSE),

-- AI Token 限制配置
('token.max_total', '32000', 'number', 'token', '总 Token 上限', FALSE),
('token.max_history', '24000', 'number', 'token', '对话历史最大 Token', FALSE),
('token.max_system', '4000', 'number', 'token', 'System Prompt 最大 Token', FALSE),
('token.max_memory', '2000', 'number', 'token', '记忆召回最大 Token', FALSE),
('token.reserved', '2000', 'number', 'token', '保留空间 Token', FALSE),

-- 熔断配置
('circuit.max_fail_count', '5', 'number', 'circuit', '熔断连续失败次数', FALSE),
('circuit.breaker_timeout', '60', 'number', 'circuit', '熔断恢复超时(秒)', FALSE),

-- 摘要配置
('summary.trigger_turns', '10', 'number', 'summary', '自动摘要触发轮数', FALSE),
('summary.max_length', '100', 'number', 'summary', '摘要最大字数', FALSE);
```

**配置读取示例：**
```go
// ConfigRepository 配置仓库
type ConfigRepository struct {
    db *gorm.DB
}

// GetConfig 获取单个配置值
func (r *ConfigRepository) GetConfig(key string) (string, error) {
    var config Config
    if err := r.db.Where("config_key = ?", key).First(&config).Error; err != nil {
        return "", err
    }
    return config.ConfigValue, nil
}

// GetConfigByCategory 按分类获取所有配置
func (r *ConfigRepository) GetConfigByCategory(category string) (map[string]string, error) {
    var configs []Config
    if err := r.db.Where("category = ?", category).Find(&configs).Error; err != nil {
        return nil, err
    }
    result := make(map[string]string)
    for _, c := range configs {
        result[c.ConfigKey] = c.ConfigValue
    }
    return result, nil
}

// UpdateConfig 更新配置值
func (r *ConfigRepository) UpdateConfig(key, value string) error {
    return r.db.Model(&Config{}).Where("config_key = ?", key).Update("config_value", value).Error
}

// Config 结构体
type Config struct {
    ID          int64  `gorm:"primaryKey"`
    ConfigKey   string `gorm:"uniqueIndex;not null"`
    ConfigValue string `gorm:"not null"`
    ConfigType  string `gorm:"default:string"`
    Category    string `gorm:"index"`
    Description string
    IsEncrypted bool
}
```

**配置热更新：**
```go
// 配置监听器，检测配置变更并刷新内存缓存
type ConfigWatcher struct {
    repo     *ConfigRepository
    cache    sync.Map // 内存缓存
    interval time.Duration
}

func (w *ConfigWatcher) Start() {
    ticker := time.NewTicker(w.interval)
    go func() {
        for range ticker.C {
            w.refreshCache()
        }
    }()
}

func (w *ConfigWatcher) Get(key string) string {
    if val, ok := w.cache.Load(key); ok {
        return val.(string)
    }
    // 缓存未命中，从数据库读取
    val, _ := w.repo.GetConfig(key)
    w.cache.Store(key, val)
    return val
}
```

**分类说明：**

| 分类 | 说明 | 配置示例 |
|------|------|----------|
| redis | Redis 连接配置 | host, port, password, db |
| llm | 大模型 API 配置 | api_key, api_url, model |
| dialog | 对话行为配置 | max_context_turns, context_expire_seconds |
| token | Token 限制配置 | max_total, max_history, max_system |
| circuit | 熔断策略配置 | max_fail_count, breaker_timeout |
| summary | 摘要功能配置 | trigger_turns, max_length |

---

## 八、Proto 定义

### 8.1 ai_chat.proto

```protobuf
syntax = "proto3";

package proto;

option go_package = "github.com/topfans/backend/pkg/proto/ai_chat";

service AIChatService {
  // 发送消息，流式返回
  rpc SendMessage(ChatMessageRequest) returns (stream ChatMessageResponse);

  // 获取对话历史
  rpc GetHistory(ChatHistoryRequest) returns (ChatHistoryResponse);

  // ============= 人设管理 =============

  // 获取用户的所有人设
  rpc GetPersonas(GetPersonasRequest) returns (PersonaListResponse);
}

// ========== 对话 ==========

message ChatMessageRequest {
  string session_id = 1;
  string message = 2;
  string persona_id = 3;  // 可选，空则用默认人设
  int64 user_id = 4;
}

message ChatMessageResponse {
  string content = 1;
  string session_id = 2;
  bool is_end = 3;
  string error = 4;
}

message ChatHistoryRequest {
  string session_id = 1;
  int32 limit = 2;  // 默认 20
}

message ChatHistoryResponse {
  repeated Message history = 1;
}

// ========== 人设管理 ==========

message GetPersonasRequest {
  int64 user_id = 1;
}

message PersonaListResponse {
  repeated PersonaInfo personas = 1;
}

message PersonaInfo {
  string id = 1;
  string name = 2;
  string description = 3;
  string avatar_url = 4;
  string talk_style = 5;
  bool is_default = 6;
  int64 created_at = 7;
  int64 updated_at = 8;
}

// ========== 通用 ==========

message Message {
  string role = 1;    // "user" / "assistant"
  string content = 2;
}
```

---

## 九、Gateway 接入

在 `gateway/socket/` 目录下创建 WebSocket 处理：

```
gateway/socket/
├── ai_chat_socket.go    # AI Chat WebSocket 处理
└── hub.go               # WebSocket Hub 管理连接
```

**Hub 架构：**
```go
// Hub 管理所有 WebSocket 连接
type Hub struct {
    // 用户连接映射: userId -> *Connection
    clients map[int64]*Connection

    // WebSocket 升级器
    upgrader websocket.Upgrader

    // Dubbo 客户端
    aiChatClient *client.Client
}
```

**连接流程：**
1. Mobile 携带 Token 连接 `/ws/ai-chat?token=Bearer_xxx`
2. Hub 接收连接，从 URL 参数获取 Token 并验证
3. 验证通过后，创建用户连接映射
4. 返回鉴权成功消息，开始处理消息
5. Mobile 每 30s 发送 ping，Hub 回复 pong

### 9.2 Gateway Dubbo Client 初始化

在 `gateway/main.go` 中添加：

```go
// AIChatService Client
aiChatClient, err := client.NewClient(
    client.WithClientURL(cfg.Dubbo.AIChatServiceURL),
)
if err != nil {
    logger.Logger.Fatal("Failed to create AI Chat Service Dubbo client", zap.Error(err))
}
logger.Logger.Info("AI Chat Service Dubbo client connected successfully")
```

### 9.3 Router 配置

在 `gateway/router/router.go` 中添加 AIChat HTTP 路由（兼容性）：

```go
// API v1 路由组
v1 := r.Group("/api/v1")
{
    // AI Chat HTTP 兼容接口
    aiChat := v1.Group("/ai-chat")
    aiChat.Use(middleware.AuthMiddleware())
    {
        aiChat.GET("/personas", aiChatCtrl.GetPersonas)                        // 获取人设列表
        aiChat.GET("/history/:sessionId", aiChatCtrl.GetHistory)                // 获取对话历史
    }
}

// WebSocket 路由
r.GET("/ws/ai-chat", socketCtrl.HandleAIChatWebSocket)
```

---

## 十、部署脚本

### 10.1 systemd 服务文件

```ini
[Unit]
Description=TopFans AI Chat Service
After=network.target

[Service]
Type=simple
User=ubuntu
WorkingDirectory=/opt/topfans/backend
Environment="ENV=production"
ExecStart=/opt/topfans/backend/services/aiChatService/aiChatService
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

---

## 十一、AI Token 节省策略

### 11.1 Token 优化矩阵

| 策略 | 节省比例 | 实现成本 | 适用场景 |
|------|----------|----------|----------|
| 动态上下文裁剪 | 20-40% | 低 | 所有场景 |
| 单条消息截断 | 10-20% | 低 | 超长消息输入 |
| 记忆召回限制 | 15-25% | 低 | 高频对话 |
| System Prompt 精简 | 10-30% | 中 | 人设优化 |
| 历史摘要压缩 | 30-50% | 高 | 长程对话 |

### 11.2 核心优化策略

#### 11.2.1 智能历史裁剪

```go
// 优化：基于对话质量裁剪，非简单先进先出
type HistoryManager struct {
    maxTurns int
    // 对话质量评分：用于决定保留哪些对话
    qualityScore map[string]float64
}

// 对话质量评分因素
// - 包含关键信息（用户偏好、重要事实）：+0.3
// - 情感共情对话：+0.1
// - 日常寒暄：-0.1
// - 重复内容：-0.2
```

#### 11.2.2 Prompt 压缩

```go
// System Prompt 压缩示例
// 压缩前 (原始人设)
"你是一个温柔体贴的AI伴侣，名字叫小雪。你善于倾听，能理解用户的情绪，
用温暖的话语陪伴用户。说话风格亲切自然，像朋友聊天一样。
不要过于正式或说教，当用户情绪低落时，先给予共情和安慰。"

// 压缩后 (精简版 - 保持核心特征)
"你是小雪，温柔体贴的AI伴侣。善于倾听、共情陪伴。
风格：亲切自然，像朋友聊天，不说教。"
```

#### 11.2.3 上下文摘要

```go
// 长对话自动摘要（每 N 轮触发）
const SUMMARY_TRIGGER_TURNS = 10  // 10轮对话后触发摘要

// 摘要 Prompt
summaryPrompt := `将以下对话核心内容压缩为 100 字以内的摘要，
保留关键信息（用户偏好、重要事实、情绪状态）：
{对话内容}

摘要格式：["关键信息1", "关键信息2", "用户情绪状态"]`

// 摘要结果存入长期记忆
memoryService.SaveSummary(userID, summary, keywords)
```

#### 11.2.4 快速拒绝（Early Rejection）

```go
// 检测是否需要调用大模型
// 场景：用户只是刷新页面、获取历史等不需要 AI 回复的操作

// 无需调用大模型的情况：
// - get_history 请求
// - 纯符号/数字输入（如 "？"、"123"）
// - 重复发送相同消息

if isNoNeedLLMCall(input) {
    return fallbackResponse // 直接返回，不消耗 Token
}
```

#### 11.2.5 备用模型降级策略

```go
// 成本对比 (参考价)
// MiniMax M2-her: ¥0.01/1K tokens
// 通义 qwen-plus: ¥0.004/1K tokens

// 降级策略
type ModelConfig struct {
    primary   string  // M2-her
    backup    string  // qwen-plus
    useBackup bool    // 是否使用备用
}

// 当连续失败 N 次或响应慢时，自动降级到通义
const FALLBACK_THRESHOLD = 3
const SLOW_RESPONSE_MS = 5000  // 超过 5s 判定为慢
```

### 11.3 Token 预算分配

```
总 Token 预算: 32,000 (M2-her 最大)

├── System Prompt:     4,000 (12.5%)
├── 用户核心记忆:      2,000 (6.3%)
├── 对话历史:         22,000 (68.8%)  ← 主要裁剪目标
└── 保留回复空间:      4,000 (12.5%)
```

### 11.4 监控指标

```go
// Token 消耗监控
type TokenMetrics struct {
    totalTokens      int64   // 总消耗
    promptTokens    int64   // Prompt 消耗
    completionTokens int64  // 回复消耗
    avgPerRequest   float64 // 平均每次请求 Token
    costEstimate    float64 // 预估成本
}

// 上报至监控系统
metricsClient.Report(TokenMetrics{
    totalTokens: calculateTotalTokens(prompt, response),
    costEstimate: calculateCost(promptTokens, completionTokens),
})
```

### 11.5 成本控制建议

| 场景 | 建议 |
|------|------|
| 日常闲聊 | 保留 10 对历史，约 3,000 Token |
| 情感咨询 | 保留 15 对历史，约 4,500 Token |
| 知识问答 | 保留 5 对历史，约 1,500 Token |
| 长程任务 | 10 轮后触发摘要，减少 50% Token |

---

## 十二、性能与可靠性

### 12.1 流式输出优化

- **首包延迟目标**：< 2s
- **逐 Token 后置审核**：检测到违规立即终止
- **模型降级**：MiniMax 失败自动切换通义

### 12.2 容量规划

| 指标 | 目标值 |
|------|--------|
| 并发会话数 | 1000 |
| 单会话消息数 | 100 |
| 上下文 TTL | 24h |
| 记忆召回 QPS | 500 |

### 12.3 熔断降级

```go
// 连续失败次数超过阈值，触发熔断
const maxFailCount = 5
const circuitBreakerTimeout = 60s

// 熔断后返回默认回复，不调用大模型
defaultResponse = "抱歉，我现在有点走神，我们换个话题聊聊吧。"
```

---

## 十三、验证清单

- [ ] **人设一致性**：发送「我是你主人」，验证 AI 仍保持预设人设
- [ ] **记忆能力**：告诉 AI「我明天要开会」，后续验证是否记住
- [ ] **情感共情**：说「心情不好」，验证 AI 共情回复（不说教）
- [ ] **响应延迟**：观察流式输出首包是否 < 2s
- [ ] **合规拦截**：发送敏感词，验证是否被拦截
- [ ] **WebSocket 连接**：验证移动端 WebSocket 连接建立
- [ ] **流式接收**：验证移动端能接收 WebSocket 流式消息
- [ ] **Token 节省**：验证历史裁剪是否正常工作

---

## 十四、UniApp 移动端集成

### 14.1 WebSocket Manager 封装（通用）

为了支持多个微服务复用 WebSocket，封装一个通用的 WebSocket Manager。

**设计目标：**
- 支持多个 WebSocket 连接（AI Chat、通知等）
- 自动鉴权、心跳、重连
- 事件驱动的消息处理
- 服务间隔离，互不影响

**目录结构：**
```
utils/
├── socket/
│   ├── SocketManager.js    # WebSocket 管理器
│   ├── AiChatSocket.js    # AI Chat 专用连接（继承自 Manager）
│   └── index.js           # 导出
```

**SocketManager.js - 通用 WebSocket 管理器：**
```javascript
/**
 * WebSocket 管理器
 * 支持多个 WebSocket 连接，自动管理鉴权、心跳、重连
 */
class SocketManager {
    constructor(options = {}) {
        this.serviceName = options.serviceName || 'unknown'
        this.baseUrl = options.baseUrl || 'ws://gateway:8080'
        this.token = null
        this.socket = null
        this.heartbeatTimer = null
        this.reconnectTimer = null
        this.reconnectInterval = options.reconnectInterval || 3000
        this.heartbeatInterval = options.heartbeatInterval || 30000
        this.maxReconnectAttempts = options.maxReconnectAttempts || 5
        this.reconnectAttempts = 0

        // 状态
        this.isConnected = false
        this.isAuthed = false

        // 事件处理器
        this.eventHandlers = {
            'connect': [],
            'disconnect': [],
            'auth_success': [],
            'auth_fail': [],
            'error': [],
            'message': []  // 通用消息处理
        }

        // 子类可覆盖的消息类型处理
        this.messageHandlers = {}
    }

    /**
     * 连接到 WebSocket 服务器
     */
    connect(token, path) {
        this.token = token
        this.path = path
        this.reconnectAttempts = 0
        this._doConnect()
    }

    _doConnect() {
        const url = `${this.baseUrl}${this.path}?token=Bearer_${this.token}`
        console.log(`[${this.serviceName}] Connecting to ${url}`)

        this.socket = uni.connectSocket({ url })
        this._setupListeners()
    }

    _setupListeners() {
        // 连接打开
        this.socket.onOpen(() => {
            console.log(`[${this.serviceName}] WebSocket connected`)
            this.isConnected = true
            this._emit('connect')
        })

        // 接收消息
        this.socket.onMessage((event) => {
            const data = JSON.parse(event.data)
            this._handleMessage(data)
        })

        // 连接关闭
        this.socket.onClose(() => {
            console.log(`[${this.serviceName}] WebSocket closed`)
            this._cleanup()
            this._emit('disconnect')
            this._tryReconnect()
        })

        // 连接错误
        this.socket.onError((err) => {
            console.error(`[${this.serviceName}] WebSocket error:`, err)
            this._emit('error', err)
        })
    }

    _handleMessage(data) {
        // 触发通用消息事件
        this._emit('message', data)

        // 根据消息类型处理
        const { type, action } = data

        // 1. 鉴权响应（通用）
        if (type === 'auth_response') {
            if (data.success) {
                this.isAuthed = true
                this._emit('auth_success', data)
                this._startHeartbeat()
            } else {
                this.isAuthed = false
                this._emit('auth_fail', data)
                this.close()
            }
            return
        }

        // 2. 心跳响应（通用）
        if (type === 'pong') {
            console.log(`[${this.serviceName}] Heartbeat received`)
            return
        }

        // 3. 错误响应（通用）
        if (type === 'error') {
            this._emit('error', data)
            return
        }

        // 4. 服务特定消息类型处理
        const handler = this.messageHandlers[type] || this.messageHandlers[action]
        if (handler) {
            handler(data)
        }
    }

    /**
     * 发送消息
     */
    send(data) {
        if (!this.socket || !this.isConnected) {
            console.warn(`[${this.serviceName}] Socket not connected`)
            return false
        }

        this.socket.send({
            data: JSON.stringify(data)
        })
        return true
    }

    /**
     * 发送心跳
     */
    _startHeartbeat() {
        this._stopHeartbeat()
        this.heartbeatTimer = setInterval(() => {
            if (this.isConnected) {
                this.send({ action: 'ping' })
            }
        }, this.heartbeatInterval)
    }

    _stopHeartbeat() {
        if (this.heartbeatTimer) {
            clearInterval(this.heartbeatTimer)
            this.heartbeatTimer = null
        }
    }

    /**
     * 重连机制
     */
    _tryReconnect() {
        if (this.reconnectAttempts >= this.maxReconnectAttempts) {
            console.warn(`[${this.serviceName}] Max reconnect attempts reached`)
            this._emit('error', { code: 'RECONNECT_FAILED', message: '重连次数已达上限' })
            return
        }

        this.reconnectAttempts++
        console.log(`[${this.serviceName}] Reconnecting... (${this.reconnectAttempts}/${this.maxReconnectAttempts})`)

        this.reconnectTimer = setTimeout(() => {
            this._doConnect()
        }, this.reconnectInterval)
    }

    _cleanup() {
        this._stopHeartbeat()
        if (this.reconnectTimer) {
            clearTimeout(this.reconnectTimer)
            this.reconnectTimer = null
        }
        this.isConnected = false
        this.isAuthed = false
    }

    /**
     * 关闭连接
     */
    close() {
        this._cleanup()
        if (this.socket) {
            this.socket.close()
            this.socket = null
        }
    }

    // ===== 事件系统 =====

    on(event, handler) {
        if (this.eventHandlers[event]) {
            this.eventHandlers[event].push(handler)
        }
        return () => this.off(event, handler)  // 返回取消订阅函数
    }

    off(event, handler) {
        if (this.eventHandlers[event]) {
            this.eventHandlers[event] = this.eventHandlers[event].filter(h => h !== handler)
        }
    }

    _emit(event, data) {
        if (this.eventHandlers[event]) {
            this.eventHandlers[event].forEach(handler => handler(data))
        }
    }

    // ===== 订阅特定消息类型 =====

    /**
     * 注册特定消息类型的处理器
     * @param {string} type 消息类型或 action
     * @param {function} handler 处理函数
     */
    registerHandler(type, handler) {
        this.messageHandlers[type] = handler
    }
}

/** 导出 */
export default SocketManager
```

**AiChatSocket.js - AI Chat 专用连接：**
```javascript
import SocketManager from './SocketManager'

/**
 * AI Chat WebSocket 连接
 * 继承自 SocketManager，添加 AI Chat 特定的业务逻辑
 */
class AiChatSocket extends SocketManager {
    constructor() {
        super({
            serviceName: 'AiChat',
            baseUrl: 'ws://gateway:8080',
            path: '/ws/ai-chat',
            reconnectInterval: 3000,
            heartbeatInterval: 30000,
            maxReconnectAttempts: 5
        })

        // AI Chat 特定回调
        this.onMessageCallback = null
        this.onHistoryCallback = null
        this.onPersonasCallback = null
        this.onErrorCallback = null

        // 注册 AI Chat 特定的消息处理器
        this._registerAiChatHandlers()
    }

    _registerAiChatHandlers() {
        // 流式消息
        this.registerHandler('message', (data) => {
            if (this.onMessageCallback) {
                this.onMessageCallback(data)
            }
        })

        // 历史消息响应
        this.registerHandler('history_response', (data) => {
            if (this.onHistoryCallback) {
                this.onHistoryCallback(data)
            }
        })

        // 人设列表响应
        this.registerHandler('personas_response', (data) => {
            if (this.onPersonasCallback) {
                this.onPersonasCallback(data)
            }
        })
    }

    /**
     * 连接到 AI Chat 服务
     */
    connect(token) {
        super.connect(token, '/ws/ai-chat')
    }

    /**
     * 发送消息
     */
    sendMessage(message, sessionId, personaId = '') {
        return this.send({
            action: 'send_message',
            session_id: sessionId,
            message: message,
            persona_id: personaId
        })
    }

    /**
     * 获取历史记录
     */
    getHistory(sessionId, limit = 20) {
        return this.send({
            action: 'get_history',
            session_id: sessionId,
            limit: limit
        })
    }

    /**
     * 获取人设列表
     */
    getPersonas() {
        return this.send({
            action: 'get_personas'
        })
    }

    /**
     * 设置消息回调
     */
    setOnMessageCallback(callback) {
        this.onMessageCallback = callback
    }

    /**
     * 设置历史记录回调
     */
    setOnHistoryCallback(callback) {
        this.onHistoryCallback = callback
    }

    /**
     * 设置人设列表回调
     */
    setOnPersonasCallback(callback) {
        this.onPersonasCallback = callback
    }

    /**
     * 设置错误回调
     */
    setOnErrorCallback(callback) {
        this.onErrorCallback = callback
        this.on('error', (data) => {
            if (callback) callback(data)
        })
    }
}

// 单例模式
let aiChatInstance = null

export function getAiChatSocket() {
    if (!aiChatInstance) {
        aiChatInstance = new AiChatSocket()
    }
    return aiChatInstance
}

export function closeAiChatSocket() {
    if (aiChatInstance) {
        aiChatInstance.close()
        aiChatInstance = null
    }
}

export default AiChatSocket
```

### 14.3 替换轮询为 WebSocket

多个业务场景可复用 WebSocket Manager，将轮询替换为 WebSocket 推送。

**适用场景：**

| 场景 | 轮询接口 | WebSocket 替换 |
|------|----------|----------------|
| 好友请求通知 | `GET /api/v1/social/friend-requests` | 复用同一连接，接收 `friend_request` 消息 |
| 排行榜更新 | `GET /api/v1/rankings/hot` | 复用同一连接，接收 `ranking_update` 消息 |
| 活动进度 | `GET /api/v1/activities/:id/progress` | 复用同一连接，接收 `activity_progress` 消息 |
| 系统通知 | 轮询获取 | 新建 NotificationSocket，接收 `notification` 消息 |

**架构设计：**

```
┌─────────────────────────────────────────────────────────┐
│                    UniApp 前端                          │
│  ┌─────────────────────────────────────────────────┐   │
│  │              GlobalSocketManager                │   │
│  │  - 管理多个 WebSocket 连接                       │   │
│  │  - 共享同一个 Token                              │   │
│  │  - 统一心跳管理                                  │   │
│  └─────────────────────────────────────────────────┘   │
│                    │                                    │
│         ┌─────────┴──────────┐                          │
│         ▼                   ▼                          │
│  ┌─────────────┐     ┌─────────────┐                   │
│  │ AiChatSocket│     │NotifySocket│  ← 新增            │
│  └─────────────┘     └─────────────┘                   │
└─────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────┐
│                 Gateway (:8080)                          │
│  /ws/ai-chat    →  AIChatService                        │
│  /ws/notify     →  NotificationService (Future)         │
└─────────────────────────────────────────────────────────┘
```

**GlobalSocketManager.js - 全局 Socket 管理器：**
```javascript
/**
 * 全局 WebSocket 管理器
 * 统一管理多个服务的 WebSocket 连接
 */
class GlobalSocketManager {
    constructor() {
        this.sockets = {}  // serviceName -> SocketManager
        this.token = null
        this.isAllConnected = false
    }

    /**
     * 初始化所有连接
     */
    init(token) {
        this.token = token
        this._initAiChat()
        this._initNotification()  // Future
    }

    _initAiChat() {
        const aiChat = getAiChatSocket()
        aiChat.on('connect', () => console.log('AI Chat connected'))
        aiChat.on('error', (err) => console.error('AI Chat error:', err))
        aiChat.connect(this.token)
        this.sockets['ai_chat'] = aiChat
    }

    _initNotification() {
        const notify = getNotificationSocket()
        notify.on('connect', () => console.log('Notification connected'))
        notify.on('notification', (data) => this._handleNotification(data))
        notify.connect(this.token)
        this.sockets['notification'] = notify
    }

    _handleNotification(data) {
        switch (data.sub_type) {
            case 'friend_request':
                uni.$emit('friend_request_received', data)
                break
            case 'ranking_update':
                uni.$emit('ranking_updated', data)
                break
            case 'activity_progress':
                uni.$emit('activity_progress_updated', data)
                break
            default:
                console.log('Unknown notification:', data)
        }
    }

    getSocket(serviceName) {
        return this.sockets[serviceName]
    }

    closeAll() {
        Object.values(this.sockets).forEach(socket => socket.close())
        this.sockets = {}
    }
}

let globalInstance = null

export function getGlobalSocket() {
    if (!globalInstance) {
        globalInstance = new GlobalSocketManager()
    }
    return globalInstance
}

export default GlobalSocketManager
```

**在 App.vue 中初始化：**
```javascript
// App.vue
import { getGlobalSocket } from '@/utils/socket/GlobalSocketManager'

export default {
    onLaunch() {
        const token = uni.getStorageSync('token')
        if (token) {
            getGlobalSocket().init(token)
        }
    },
    onHide() {
        getGlobalSocket().closeAll()
    }
}
```

**在页面中使用：**
```javascript
// pages/friend/friend.vue
export default {
    data() {
        return { friendRequests: [] }
    },

    onLoad() {
        uni.$on('friend_request_received', (data) => {
            this.friendRequests.unshift(data)
            uni.showToast({ title: '收到新好友请求', icon: 'none' })
        })
    },

    onUnload() {
        uni.$off('friend_request_received')
    }
}
```

**复用连接的好处：**
- **资源节省**：多个业务共享一个 WebSocket 连接
- **实时性**：从轮询 5-10s 延迟降为即时推送
- **维护简单**：统一的心跳、重连、鉴权逻辑

### 14.4 注意事项

1. **鉴权方式**：连接时通过 URL 参数传递 Token（`?token=Bearer_xxx`），Gateway 验证通过后立即建立连接
2. **重连机制**：建议实现断线重连，间隔 3-5 秒
3. **心跳保活**：每 30 秒发送一次 ping 消息，60 秒无响应则主动关闭连接
4. **Session 管理**：建议在本地存储 session_id，便于恢复对话
5. **错误处理**：收到 `auth_response.success=false` 时立即关闭连接并提示用户重新登录

---

## 十五、与 SSE 方案的对比

| 对比项 | SSE | WebSocket |
|--------|-----|-----------|
| UniApp 支持 | 有限（需条件编译） | 完整支持 |
| 实现复杂度 | 中等 | 较低 |
| 双向通信 | 需轮询模拟 | 原生支持 |
| 服务器资源 | 较低 | 稍高（每个连接维护状态） |
| 扩展性 | 差（仅单向） | 好（可扩展主动推送） |
| 代理支持 | 需特殊配置（Nginx 流式） | 需配置 WebSocket 支持 |

**选择 WebSocket 的原因**：
1. UniApp 对 WebSocket 支持更好
2. 便于后续扩展主动推送功能（设计文档 V2）
3. 实现更简单，双向通信更自然