docs: 文档修改
This commit is contained in:
parent
aba8ec6ba7
commit
9d7e360845
File diff suppressed because it is too large
Load Diff
@ -49,6 +49,184 @@
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## 〇·一、本期实施范围(2026-06-08 追加)
|
||||||
|
|
||||||
|
> **状态**: 本期聚焦看板 7 RPC + 事件采集框架两个范围,业务侧 5 服务集成在本期内,OLAP/SDK/实时/采样/多赛季/灰度发布策略等延后。本节不替换原设计,只标注"本期做 vs 不做",并给出阶段划分与特性清单,作为实施依据。
|
||||||
|
|
||||||
|
### 0.1.1 范围矩阵
|
||||||
|
|
||||||
|
| # | 类别 | 项目 | 本期 | 来源章节 |
|
||||||
|
|---|------|------|------|----------|
|
||||||
|
| 1 | **看板 7 RPC** | `GetTodayOverview` / `Get7DayIncomeCurve` / `GetExhibitionIncomeSummary` / `GetLikeIncomeByLevel` / `GetTopAssetsByEarning` / `GetAssetLevelDistribution` / `GetAssetUpgradeProgress` | ✅ | §2.3、§3.4-3.5、§4.2 |
|
||||||
|
| 2 | **事件采集** | `TrackEvent` / `BatchTrackEvent` + EventSink 抽象 + Worker 批量落库 | ✅ | §2.2、§3.2、§4.3 |
|
||||||
|
| 3 | **预聚合表** | `metric_weekly_user_income` / `metric_recent_level_ups` / `metric_upcoming_level_ups` | ✅ | §3.5 |
|
||||||
|
| 4 | **物化视图** | `mv_daily_user_income` / `mv_daily_exhibition_revenue` / `mv_daily_like_income` / `mv_asset_level_distribution` | ✅ | §3.4 |
|
||||||
|
| 5 | **分区管理** | events 表按日分区 + 7 天预创建 + 30 天清理 | ✅ | §3.3、§4.5 |
|
||||||
|
| 6 | **Gateway 路由** | 7 个 `/api/v1/dashboard/*` 路由 + JWT 鉴权 + 粉丝身份校验 | ✅ | §2.4-2.6、§4.2 |
|
||||||
|
| 7 | **业务侧集成** | 5 个服务改造(social/asset/gallery/task/user)逐步加 TrackEvent | ✅ | §5.7 |
|
||||||
|
| 8 | **可观测性** | Prometheus 指标(§4.7 全量)+ healthz + refresh_log | ✅ | §4.7、§4.8 |
|
||||||
|
| 9 | **OLAP 双写** | ClickHouse 集成 | ❌ | §3.10、§6.2 |
|
||||||
|
| 10 | **实时通道** | 同步 TrackEvent 实时通道 | ❌ | §3.10、§6.2 |
|
||||||
|
| 11 | **SDK 端点** | HTTP `/api/v1/dashboard/track` 暴露 | ❌ | §2.4、§3.10 |
|
||||||
|
| 12 | **采样** | 客户端采样 + `EnableSampling` | ❌ | §3.8、§4.9 |
|
||||||
|
| 13 | **多赛季** | `season_id` 字段 / 赛季 Tab | ❌ | §3.10、§6.2 |
|
||||||
|
| 14 | **灰度发布** | 10%→100% 灰度策略 + Stage 1-6 步骤 | ❌(保留机制但简化执行) | §5.8 |
|
||||||
|
| 15 | **完整告警实现** | 钉钉/飞书 webhook(10 条规则) | ❌(指标埋点保留,告警由监控平台处理) | §4.8 |
|
||||||
|
| 16 | **Channel 满载 Level 2-5** | 加大缓冲 / Worker 并发 / 解耦预聚合 / 降级 / 采样 | ❌(仅 Level 1 监控) | §4.9 |
|
||||||
|
|
||||||
|
### 0.1.2 阶段划分(4 阶段 / 事件先看板后)
|
||||||
|
|
||||||
|
| 阶段 | 名称 | 周期(估) | 关键产出 | 关键验证 |
|
||||||
|
|------|------|-----------|---------|---------|
|
||||||
|
| **P1** | 服务骨架 | 2-3 天 | go.mod / proto / main.go / config / healthz / 启动接入 go.work / 10 个 SQL 迁移 | `go build` 通过 + `go test ./...` 通过 + 契约测试(proto json schema 校验)+ DB 迁移可重放 + P1 末预检查清单通过 |
|
||||||
|
| **P2** | 事件采集 | 4-5 天 | events 表(已建)+ TrackEvent/BatchTrackEvent RPC + EventSink 接口 + event_flusher Worker + 3 个预聚表 + partitioner | 单元测试 + 集成测试(dockertest 真实 PG)+ 业务侧 socialService 一个服务联调通 |
|
||||||
|
| **P3** | 看板 7 RPC | 4-5 天 | 4 个 MV + 7 个 RPC service + main.go 启动预热 7 cache 逻辑 + gateway 7 路由 + Redis 5min TTL | 集成测试(7 RPC 全覆盖)+ gateway 端到端 + 性能测试(k6/wrk 缓存命中/未命中 P99)+ 前端切流量 + cache hit rate > 80% |
|
||||||
|
| **P4** | 业务侧补全 | 2-3 天 | 4 个服务集成(galleryService→taskService→assetService→userService)+ 各服务单元测试 + 联调 | 各服务 `go test` 通过 + 看板数据逐步出现 7 个事件类型 |
|
||||||
|
|
||||||
|
**关键依赖 / 决策:**
|
||||||
|
|
||||||
|
- **P2 不依赖 P3**:事件写入和看板读取在 service 层完全解耦(service 写 events,dashboard 读 MV),但都需要 PG schema + proto + config
|
||||||
|
- **P3 启动时预热**:dashboard 7 个 RPC 各走独立 cache key + 启动 hook 预热 7 个 cache(防冷启动雪崩)
|
||||||
|
- **P4 顺序**:galleryService(exhibition.start/end)→ taskService(exhibition.revenue)→ assetService(asset.mint/level_up)→ userService(crystal.change)。先做高频路径(gallery 展出是核心场景),后做低频路径
|
||||||
|
|
||||||
|
**对外接口冻结点:**
|
||||||
|
|
||||||
|
- P1 末:proto 文件定稿(7 RPC + 2 事件 RPC,事件 RPC 内部可用)
|
||||||
|
- P2 末:TrackEvent RPC 内部可用,业务侧可接入
|
||||||
|
- P3 末:前端可切流量
|
||||||
|
|
||||||
|
### 0.1.3 特性清单(每阶段具体做什么)
|
||||||
|
|
||||||
|
#### P1 · 服务骨架
|
||||||
|
|
||||||
|
**目标:** 起个空服务能跑起来,迁移能跑通。
|
||||||
|
|
||||||
|
| 任务 | 涉及文件 | 备注 |
|
||||||
|
|------|---------|------|
|
||||||
|
| go.mod 创建 | `backend/services/statisticService/go.mod` | 沿用 taskService 的依赖(pkg/database / pkg/redis / pkg/dubbo / pkg/metrics) |
|
||||||
|
| go.work 集成 | `backend/go.work` | 加 `./services/statisticService` |
|
||||||
|
| proto 定义 | `backend/proto/event.proto`、`backend/proto/statistic.proto` | **关键:冻结 proto**,7 RPC + 2 事件 RPC |
|
||||||
|
| 配置 | `backend/services/statisticService/config/statistic_config.go` | 端口 20009 + schema=statistic + Redis + 刷新间隔 + Channel 配置 + 4 个 EnableXxx=false(OLAPDualWrite / RealtimeChannel / SDKEndpoint / Sampling) |
|
||||||
|
| main.go | `backend/services/statisticService/main.go` | Dubbo triple 注册 + HTTP healthz + 启动 Worker hook |
|
||||||
|
| DB 迁移 10 个 SQL | `backend/migrations/2026_06_08_001_statistic_events.sql` 等 10 个 | events 分区表 + 4 MV + 3 预聚表 + refresh_log + 初始 7 天分区 |
|
||||||
|
| healthz 端点 | `backend/services/statisticService/handler/healthz.go` | 暴露 6 个关键指标(DB ping / Redis ping / Worker 状态 / 事件 channel 大小 / 预聚表行数) |
|
||||||
|
| metrics 埋点骨架 | `backend/services/statisticService/metrics/metrics.go` | §4.7 全部 20+ 指标先声明,本期不全部埋数据 |
|
||||||
|
| 启动验证 | 本地 + CI | `go build` + `go test ./...` + DB 迁移重放 + healthz 200 |
|
||||||
|
|
||||||
|
**P1 末预检查清单**(P2 开工前必须验证):
|
||||||
|
|
||||||
|
| # | 检查项 | 来源 | 验证方式 |
|
||||||
|
|---|--------|------|---------|
|
||||||
|
| 1 | socialService 有 `LikeAsset` 方法(或等价的写 like_income_log 入口) | §2.2 事件类型 | `grep -r "like_income_log" backend/services/socialService/` |
|
||||||
|
| 2 | galleryService 有 `PlaceAsset`(开始展出)/ `RemoveFromSlot`(结束展出)方法 | §2.2 | `grep -rn "^func.*PlaceAsset\|^func.*RemoveFromSlot" backend/services/galleryService/service/` |
|
||||||
|
| 3 | taskService 有 `OnExhibitionCompleted` 方法(派发展览收益) | §2.2 | `grep -n "OnExhibitionCompleted" backend/services/taskService/service/revenue_service.go` |
|
||||||
|
| 4 | assetService 有 `CreateMintOrder`(铸造)/ `CheckUpgrade` 或 `logLevelChange`(升级)方法 | §2.2 | `grep -n "CreateMintOrder\|CheckUpgrade\|logLevelChange" backend/services/assetService/service/{mint_service,asset_level_service}.go`;**实际等级变更点需 P1 末向 assetService 同学确认** |
|
||||||
|
| 5 | userService 有 `UpdateCrystalBalance` 方法(水晶账本变动) | §2.2 | `grep -n "UpdateCrystalBalance" backend/services/userService/service/user_service.go` |
|
||||||
|
| 6 | public 库的 `assets` 表有 `level` 字段、`status='active'` 软删除状态、`deleted_at IS NULL` 约定(MV4 假设) | §3.4 MV4 | `\d+ public.assets` 或 DBeaver 看 schema |
|
||||||
|
| 7 | public 库的 `like_income_log` / `crystal_log` / `level_up_log` 表存在 | §3.4、§3.5 | `\dt public.*` |
|
||||||
|
| 8 | Dubbo triple 端口 20009 未被占用 | §1.4 | `lsof -i :20009` |
|
||||||
|
| 9 | 业务侧 5 个服务的 Dubbo 注册名(`tri://...:20000/20001/20002/20003/20006`)可达 | §1.2、§5.5 | `nc -zv` 或本地 e2e 调通 |
|
||||||
|
| 10 | PostgreSQL `statistic` schema 创建权限 OK | §3.1 | 用业务账号 `CREATE SCHEMA statistic` 测试 |
|
||||||
|
|
||||||
|
**核对不过则不进 P2。**
|
||||||
|
|
||||||
|
#### P2 · 事件采集框架
|
||||||
|
|
||||||
|
**目标:** 服务能收事件、能落库、能维护预聚表;socialService 一个业务方联调通。
|
||||||
|
|
||||||
|
| 模块 | 涉及文件 | 备注 |
|
||||||
|
|------|---------|------|
|
||||||
|
| **model** | `model/event.go` | Event 结构 + JSONB 序列化 |
|
||||||
|
| **repository** | `repository/event_repo.go` | 批量 INSERT ON CONFLICT DO NOTHING + event_id 去重 |
|
||||||
|
| **repository** | `repository/metric_repo.go` | 3 个预聚表读写 |
|
||||||
|
| **sink** | `sink/event_sink.go`(接口)+ `sink/channel_sink.go`(本期实现) | EventSink 接口预留 OLAP/实时/采样实现点 |
|
||||||
|
| **service** | `service/event_service.go` | 校验(event_id 格式 / user_id 存在 / event_type 白名单 / properties < 1KB)+ 推入 channel |
|
||||||
|
| **worker** | `worker/event_flusher.go` | 攒批 100 条/1s + 批量落库 + 触发 metric_recent_level_ups 同步更新 |
|
||||||
|
| **worker** | `worker/metric_recent_level_ups_updater.go` | 同步更新(被 event_flusher 调用) |
|
||||||
|
| **worker** | `worker/metric_upcoming_level_ups_updater.go` | 15min ticker(不依赖 event) |
|
||||||
|
| **worker** | `worker/metric_weekly_user_income_updater.go` | 5min ticker,pg_try_advisory_lock 防多实例重复 |
|
||||||
|
| **worker** | `worker/partitioner.go` | 启动时 + 每天 00:05 创建未来 7 天 / 每天 00:30 清理 30 天前 |
|
||||||
|
| **proto handler** | `handler/track_event.go` | TrackEvent / BatchTrackEvent RPC 实现 |
|
||||||
|
| **业务侧集成** | socialService 的 `like_income_log` 写入后 | 调用 `statisticClient.TrackEvent` 异步(fire-and-forget) |
|
||||||
|
| **测试** | `service/event_service_test.go`、`worker/event_flusher_test.go`、`integration/track_event_test.go` | 单元 + 集成(dockertest 真实 PG) |
|
||||||
|
|
||||||
|
**EventSink 接口设计(关键):**
|
||||||
|
|
||||||
|
```go
|
||||||
|
type EventSink interface {
|
||||||
|
Submit(ctx context.Context, e *event.Event) error
|
||||||
|
SubmitBatch(ctx context.Context, es []*event.Event) error
|
||||||
|
Close() error
|
||||||
|
}
|
||||||
|
// 本期实现:ChannelEventSink(推到 channel,由 event_flusher 消费)
|
||||||
|
// 未来扩展:KafkaEventSink / ClickHouseDualWriteSink / SamplingEventSink
|
||||||
|
```
|
||||||
|
|
||||||
|
#### P3 · 看板 7 RPC
|
||||||
|
|
||||||
|
**目标:** 7 个看板端到端通,前端可切流量。
|
||||||
|
|
||||||
|
| 模块 | 涉及文件 | 备注 |
|
||||||
|
|------|---------|------|
|
||||||
|
| **model** | `model/metric.go` | 7 个响应结构对齐 proto |
|
||||||
|
| **repository** | `repository/dashboard_repo.go` | 7 个聚合 SQL(读 MV/预聚表,公共 schema 关联 assets/exhibitions) |
|
||||||
|
| **service** | `service/dashboard_service.go` | 7 个 RPC 业务逻辑 + Redis 5min TTL + 缓存穿透防护 + 跨服务调用降级(userService.crystal_balance 失败时用 stale 标记) |
|
||||||
|
| **service** | `service/metric_service.go` | MV 刷新协调(被 materializer 调用) |
|
||||||
|
| **worker** | `worker/materializer.go` | 4 个 MV 用 REFRESH MATERIALIZED VIEW CONCURRENTLY + pg_try_advisory_lock + 错开执行时间(避免 DB 瞬时压力) + refresh_log 记录 |
|
||||||
|
| **cache** | `service/cache.go` | 封装 pkg/redis,cache key 格式 `dash:{rpc}:{star_id}:{user_id}` |
|
||||||
|
| **gateway 路由** | `backend/gateway/router/router.go` | 7 个 GET 路由 |
|
||||||
|
| **gateway controller** | `backend/gateway/controller/statistic_controller.go` | 7 个方法 + JWT 鉴权 + 粉丝身份校验(调 userService.fan_profile)+ 响应包装 `{code:200, data:resp}` |
|
||||||
|
| **业务侧消费** | gateway 调 7 个 RPC,前端 dashboardApi 已固定 | 切流量前确认前端 mock 数据准确性 |
|
||||||
|
| **测试** | `service/dashboard_service_test.go`(7 RPC 单测)、`integration/dashboard_rpc_test.go`(7 RPC dockertest)、`integration/gateway_test.go`(端到端) | 100% RPC 覆盖 |
|
||||||
|
|
||||||
|
**看板冷启动优化(启动 hook 预热):**
|
||||||
|
|
||||||
|
```go
|
||||||
|
// main.go 启动时
|
||||||
|
go func() {
|
||||||
|
for _, starID := range getTopNStars(100) { // 取前 100 star
|
||||||
|
for _, rpc := range dashboardRPCs {
|
||||||
|
go callAndCache(rpc, starID, sampleUserID)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}()
|
||||||
|
```
|
||||||
|
|
||||||
|
#### P4 · 业务侧补全(4 个服务)
|
||||||
|
|
||||||
|
| 顺序 | 服务 | 事件类型 | 集成位置 |
|
||||||
|
|------|------|---------|---------|
|
||||||
|
| 1 | galleryService | `exhibition.start`、`exhibition.end` | `PlaceAsset`(开始)/ `RemoveFromSlot`(结束)后 |
|
||||||
|
| 2 | taskService | `exhibition.revenue` | `OnExhibitionCompleted` 调用后 |
|
||||||
|
| 3 | assetService | `asset.mint`、`asset.level_up` | `CreateMintOrder`(铸造)/ `CheckUpgrade`(升级)后 |
|
||||||
|
| 4 | userService | `crystal.change` | `UpdateCrystalBalance` 调用后 |
|
||||||
|
|
||||||
|
**集成模式(统一封装):**
|
||||||
|
|
||||||
|
```go
|
||||||
|
// 各服务引入 pkg/statistic 客户端
|
||||||
|
type StatisticClient interface {
|
||||||
|
TrackEvent(ctx context.Context, e *event.Event) error
|
||||||
|
BatchTrackEvent(ctx context.Context, es []*event.Event) error
|
||||||
|
}
|
||||||
|
// 业务调用方用 defer + recover 保证 fire-and-forget 不影响主流程
|
||||||
|
defer func() {
|
||||||
|
if r := recover(); r != nil {
|
||||||
|
log.Errorf("statistic track panic: %v", r)
|
||||||
|
}
|
||||||
|
go statisticClient.TrackEvent(bgCtx, buildEvent(...))
|
||||||
|
}()
|
||||||
|
```
|
||||||
|
|
||||||
|
**每个服务必做:**
|
||||||
|
|
||||||
|
- 加 `statisticClient` 注入
|
||||||
|
- 在指定业务方法后加 `defer go TrackEvent`
|
||||||
|
- 单测验证 `TrackEvent` 被调用(用 mock client)
|
||||||
|
- 联调验证事件能落库 + 看板数据能出现
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## 一、架构总览
|
## 一、架构总览
|
||||||
|
|
||||||
### 1.1 服务定位
|
### 1.1 服务定位
|
||||||
|
|||||||
Loading…
Reference in New Issue
Block a user