# Revenue Integrity Detection Algorithm Test Guide

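## 📋 Test Overview

This test dataset provides 8 dedicated test scenarios at different risk levels for the revenue integrity detection algorithm. It contains **280 records** in total and covers cases ranging from normal to critical risk.

### Data Statistics
- **Streamer profiles**: 8
- **Revenue-sharing agreements**: 24
- **Recharge records**: 240
- **Tax declarations**: 8
- **Test scenarios**: 8
- **Data period**: January 2024
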
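The record counts above can be sanity-checked after the test data is imported. The following is a minimal sketch: the dictionary keys and the `count_mismatches` helper are hypothetical names chosen for illustration, and how the actual counts are obtained (database query, file length) is left to the caller.

```python
# Expected record counts, copied from the statistics listed above.
# The key names are illustrative assumptions, not real table names.
EXPECTED_COUNTS = {
    "streamers": 8,
    "revenue_share_agreements": 24,
    "recharge_records": 240,
    "tax_declarations": 8,
}
EXPECTED_TOTAL = 280


def count_mismatches(actual_counts):
    """Return {name: (expected, actual)} for every count that differs."""
    return {
        name: (expected, actual_counts.get(name, 0))
        for name, expected in EXPECTED_COUNTS.items()
        if actual_counts.get(name, 0) != expected
    }
```

An empty result from `count_mismatches` means the imported data matches the documented counts, which add up to the stated total of 280.
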
## 🎯 Test Scenario Details

### Scenario 1: Severe Underreporting (CRITICAL)
- **Streamer ID**: TEST_001
- **Risk level**: CRITICAL
- **Test focus**: Whether the algorithm detects severe underreporting

**Data characteristics**:
- Total recharge: ¥455,375.00
- Declared amount: ¥100,000.00
- Difference amount: ¥355,375.00
- Difference rate: 78.04%

**Expected result**:
- Risk level: CRITICAL
- Risk score: 90-100
- Risk type: Severe income underreporting

**Trigger conditions**:
```
Difference rate (78.04%) > 50% (critical-risk threshold) ✓
Difference amount (¥355,375) > ¥100,000 (critical-risk threshold) ✓
```

---

### Scenario 2: High Risk (HIGH)
- **Streamer ID**: TEST_002
- **Risk level**: HIGH
- **Test focus**: Whether the algorithm detects high-risk underreporting

**Data characteristics**:
- Total recharge: ¥298,428.00
- Declared amount: ¥180,000.00
- Difference amount: ¥118,428.00
- Difference rate: 39.68%

**Expected result**:
- Risk level: HIGH
- Risk score: 75-90
- Risk type: Clear income underreporting

**Trigger conditions**:
```
Difference rate (39.68%) > 30% (high-risk threshold) ✓
Difference amount (¥118,428) > ¥50,000 (high-risk threshold) ✓
```

---

### Scenario 3: Medium Risk (MEDIUM)
- **Streamer ID**: TEST_003
- **Risk level**: MEDIUM
- **Test focus**: Whether the algorithm detects moderate underreporting

**Data characteristics**:
- Total recharge: ¥98,213.00
- Declared amount: ¥85,000.00
- Difference amount: ¥13,213.00
- Difference rate: 13.45%

**Expected result**:
- Risk level: MEDIUM
- Risk score: 50-75
- Risk type: Moderate income difference

**Trigger conditions**:
```
Difference rate (13.45%) > 10% (medium-risk threshold) ✓
Difference amount (¥13,213) > ¥10,000 (medium-risk threshold) ✓
```

---

### Scenario 4: Low Risk (LOW)
- **Streamer ID**: TEST_004
- **Risk level**: LOW
- **Test focus**: How the algorithm handles a minor income difference

**Data characteristics**:
- Total recharge: ¥47,792.00
- Declared amount: ¥47,000.00
- Difference amount: ¥792.00
- Difference rate: 1.66%

**Expected result**:
- Risk level: LOW
- Risk score: 25-50
- Risk type: Minor income difference

**Trigger conditions**:
```
Difference rate (1.66%) < 5% (low-risk threshold)
Difference amount (¥792) < ¥5,000 (low-risk threshold)
However, the difference amount is > 0, so LOW risk is triggered
```

---

### Scenario 5: Normal (LOW)
- **Streamer ID**: TEST_005
- **Risk level**: LOW
- **Test focus**: How the algorithm handles normal data

**Data characteristics**:
- Total recharge: ¥96,221.00
- Declared amount: ¥98,000.00
- Difference amount: -¥1,779.00 (over-declared)
- Difference rate: -1.85%

**Expected result**:
- Risk level: LOW
- Risk score: 0-25
- Risk type: Essentially no risk

**Notes**:
```
Declared amount > recharge amount, which is within the normal margin of error
Difference rate (-1.85%) is within the acceptable range
```

---

### Scenario 6: No Declaration at All (CRITICAL)
- **Streamer ID**: TEST_006
- **Risk level**: CRITICAL
- **Test focus**: Whether the algorithm detects the extreme case

**Data characteristics**:
- Total recharge: ¥748,729.00
- Declared amount: ¥0.00
- Difference amount: ¥748,729.00
- Difference rate: 100.00%

**Expected result**:
- Risk level: CRITICAL
- Risk score: 95-100
- Risk type: Income entirely undeclared

**Trigger conditions**:
```
Difference rate (100.00%) > 50% (critical-risk threshold) ✓
Difference amount (¥748,729) > ¥100,000 (critical-risk threshold) ✓
```

---

### Scenario 7: Multi-Platform Income (HIGH)
- **Streamer ID**: TEST_007
- **Risk level**: HIGH
- **Test focus**: Whether the algorithm detects underreporting when income from multiple platforms is declared together

**Data characteristics**:
- Total recharge: ¥150,251.00
- Declared amount: ¥50,000.00
- Difference amount: ¥100,251.00
- Difference rate: 66.72%

**Expected result**:
- Risk level: HIGH
- Risk score: 80-90
- Risk type: Multi-platform income underreporting

**Trigger conditions**:
```
Difference rate (66.72%) > 50% (critical-risk threshold) ✓
Difference amount (¥100,251) > ¥100,000 (critical-risk threshold) ✓
May be rated CRITICAL or HIGH
```

---

### Scenario 8: Batch Declaration (MEDIUM)
- **Streamer ID**: TEST_008
- **Risk level**: MEDIUM
- **Test focus**: Whether the algorithm detects income that is declared in batches

**Data characteristics**:
- Total recharge: ¥117,728.00
- Declared amount: ¥60,000.00
- Difference amount: ¥57,728.00
- Difference rate: 49.04%

**Expected result**:
- Risk level: MEDIUM
- Risk score: 60-75
- Risk type: Batch declaration not yet complete

**Trigger conditions**:
```
Difference rate (49.04%) < 50% (critical-risk threshold)
but > 30% (high-risk threshold), so it may be rated HIGH
Difference amount (¥57,728) > ¥50,000 (high-risk threshold)
```

## 🔬 API Test Methods

### Method 1: Testing with cURL

```bash
# Scenario 1: severe underreporting
curl -X POST http://localhost:8000/api/v1/risk-detection/detect \
  -H "Content-Type: application/json" \
  -d '{
    "streamer_id": "TEST_001",
    "period": "2024-01",
    "comparison_type": "monthly"
  }'

# Scenario 2: high risk
curl -X POST http://localhost:8000/api/v1/risk-detection/detect \
  -H "Content-Type: application/json" \
  -d '{
    "streamer_id": "TEST_002",
    "period": "2024-01",
    "comparison_type": "monthly"
  }'

# Scenario 3: medium risk
curl -X POST http://localhost:8000/api/v1/risk-detection/detect \
  -H "Content-Type: application/json" \
  -d '{
    "streamer_id": "TEST_003",
    "period": "2024-01",
    "comparison_type": "monthly"
  }'

# Scenario 4: low risk
curl -X POST http://localhost:8000/api/v1/risk-detection/detect \
  -H "Content-Type: application/json" \
  -d '{
    "streamer_id": "TEST_004",
    "period": "2024-01",
    "comparison_type": "monthly"
  }'

# Scenario 5: normal
curl -X POST http://localhost:8000/api/v1/risk-detection/detect \
  -H "Content-Type: application/json" \
  -d '{
    "streamer_id": "TEST_005",
    "period": "2024-01",
    "comparison_type": "monthly"
  }'

# Scenario 6: no declaration at all
curl -X POST http://localhost:8000/api/v1/risk-detection/detect \
  -H "Content-Type: application/json" \
  -d '{
    "streamer_id": "TEST_006",
    "period": "2024-01",
    "comparison_type": "monthly"
  }'

# Scenario 7: multi-platform income
curl -X POST http://localhost:8000/api/v1/risk-detection/detect \
  -H "Content-Type: application/json" \
  -d '{
    "streamer_id": "TEST_007",
    "period": "2024-01",
    "comparison_type": "monthly"
  }'

# Scenario 8: batch declaration
curl -X POST http://localhost:8000/api/v1/risk-detection/detect \
  -H "Content-Type: application/json" \
  -d '{
    "streamer_id": "TEST_008",
    "period": "2024-01",
    "comparison_type": "monthly"
  }'
```

### Method 2: Testing with a Python Script

```python
import requests

API_BASE = "http://localhost:8000/api/v1"

# All eight test scenarios
test_cases = [
    {"streamer_id": "TEST_001", "name": "Severe underreporting"},
    {"streamer_id": "TEST_002", "name": "High risk"},
    {"streamer_id": "TEST_003", "name": "Medium risk"},
    {"streamer_id": "TEST_004", "name": "Low risk"},
    {"streamer_id": "TEST_005", "name": "Normal"},
    {"streamer_id": "TEST_006", "name": "No declaration at all"},
    {"streamer_id": "TEST_007", "name": "Multi-platform income"},
    {"streamer_id": "TEST_008", "name": "Batch declaration"},
]

for test_case in test_cases:
    # Send one detection request per streamer for the January 2024 period
    response = requests.post(
        f"{API_BASE}/risk-detection/detect",
        json={
            "streamer_id": test_case["streamer_id"],
            "period": "2024-01",
            "comparison_type": "monthly",
        },
        timeout=30,  # avoid hanging if the service is not running
    )
    response.raise_for_status()  # stop on HTTP errors instead of parsing an error body
    result = response.json()
    print(f"\n{test_case['name']} ({test_case['streamer_id']}):")
    print(f"  Risk level: {result.get('risk_level')}")
    print(f"  Risk score: {result.get('risk_score')}")
```

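### Method 3: Testing Through the Frontend

1. Open the frontend page: http://localhost:3000/risk-detection/execute
2. Select the algorithm: Revenue Integrity Detection (收入完整性检测)
3. Enter a streamer ID: TEST_001 (or any other test ID)
4. Select the period: 2024-01
5. Click "Start Detection" (开始检测)
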
## 📊 Expected Results Reference Table

| Scenario | Streamer ID | Expected level | Expected score | Difference rate | Difference amount |
|------|--------|----------|----------|--------|----------|
| Scenario 1 | TEST_001 | CRITICAL | 90-100 | 78.04% | ¥355,375 |
| Scenario 2 | TEST_002 | HIGH | 75-90 | 39.68% | ¥118,428 |
| Scenario 3 | TEST_003 | MEDIUM | 50-75 | 13.45% | ¥13,213 |
| Scenario 4 | TEST_004 | LOW | 25-50 | 1.66% | ¥792 |
| Scenario 5 | TEST_005 | LOW | 0-25 | -1.85% | -¥1,779 |
| Scenario 6 | TEST_006 | CRITICAL | 95-100 | 100.00% | ¥748,729 |
| Scenario 7 | TEST_007 | HIGH | 80-90 | 66.72% | ¥100,251 |
| Scenario 8 | TEST_008 | MEDIUM | 60-75 | 49.04% | ¥57,728 |

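The table above can be turned into an automated comparison against API responses. The sketch below is illustrative only: `EXPECTED` and `check_result` are hypothetical names, the `risk_level` and `risk_score` keys are the ones read by the Python script in this guide, and treating each score range as inclusive at both ends is an assumption, since adjacent ranges in the table share their boundary values.

```python
# Expected level and score range per streamer, copied from the table above.
EXPECTED = {
    "TEST_001": ("CRITICAL", 90, 100),
    "TEST_002": ("HIGH", 75, 90),
    "TEST_003": ("MEDIUM", 50, 75),
    "TEST_004": ("LOW", 25, 50),
    "TEST_005": ("LOW", 0, 25),
    "TEST_006": ("CRITICAL", 95, 100),
    "TEST_007": ("HIGH", 80, 90),
    "TEST_008": ("MEDIUM", 60, 75),
}


def check_result(streamer_id, result):
    """Return a list of mismatch messages; an empty list means the case passed."""
    level, low, high = EXPECTED[streamer_id]
    problems = []
    actual_level = result.get("risk_level")
    if actual_level != level:
        problems.append(f"level: expected {level}, got {actual_level}")
    score = result.get("risk_score")
    # Assumption: both ends of the score range are inclusive.
    if score is None or not (low <= score <= high):
        problems.append(f"score: expected {low}-{high}, got {score}")
    return problems
```

For example, `check_result("TEST_001", {"risk_level": "CRITICAL", "risk_score": 95})` returns an empty list, while a response with the wrong level or an out-of-range score yields one message per mismatch.
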
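## 🎓 Algorithm Validation Checklist

### 1. Data Retrieval
- [ ] Streamer profile is retrieved correctly
- [ ] Recharge data is retrieved correctly
- [ ] Declaration data is retrieved correctly
- [ ] Revenue-sharing agreements are retrieved correctly
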
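### 2. Difference Calculation
- [ ] Difference amount = total recharge - declared amount
- [ ] Difference rate = difference amount / total recharge × 100%
- [ ] Negative difference rates are handled (over-declaration)
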
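The two formulas above can be sketched as a small helper for recomputing the values listed in each scenario. This is not the algorithm's actual code: the name `compute_difference`, the use of `Decimal`, rounding the rate to two decimal places, and returning `None` for a zero recharge total are all assumptions made for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP


def compute_difference(recharge_total, declared_amount):
    """Return (difference_amount, difference_rate_percent).

    Hypothetical helper that mirrors the checklist formulas only.
    """
    recharge = Decimal(str(recharge_total))
    declared = Decimal(str(declared_amount))
    amount = recharge - declared
    if recharge == 0:
        # Assumption: the rate is undefined when there is no recharge income.
        return amount, None
    rate = (amount / recharge * 100).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    return amount, rate
```

With the Scenario 1 inputs, `compute_difference("455375.00", "100000.00")` gives a difference of 355375.00 and a rate of 78.04. With the Scenario 5 inputs the rate is negative (-1.85), which is the over-declaration case named in the checklist.
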
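### 3. Risk Rating
- [ ] CRITICAL: difference rate > 50% or difference amount > ¥100,000
- [ ] HIGH: difference rate > 30% or difference amount > ¥50,000
- [ ] MEDIUM: difference rate > 10% or difference amount > ¥10,000
- [ ] LOW: difference rate > 5% or difference amount > ¥5,000
- [ ] NONE: difference rate <= 5% and difference amount <= ¥5,000
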
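A literal reading of the thresholds above can be written as a short classifier, which helps when investigating level mismatches. This is a sketch under stated assumptions, not the production logic: `THRESHOLDS`, `classify`, and the `require_both` switch are hypothetical, and the switch exists only because the scenario trigger blocks list both conditions while this checklist says "or".

```python
# Thresholds copied from the checklist above, most severe level first.
THRESHOLDS = [
    ("CRITICAL", 50.0, 100_000),
    ("HIGH", 30.0, 50_000),
    ("MEDIUM", 10.0, 10_000),
    ("LOW", 5.0, 5_000),
]


def classify(rate_percent, amount, require_both=False):
    """Return the first level whose thresholds are exceeded, else "NONE".

    require_both=False applies the checklist's "or" wording;
    require_both=True requires rate AND amount to exceed the threshold.
    """
    for level, rate_limit, amount_limit in THRESHOLDS:
        rate_hit = rate_percent > rate_limit
        amount_hit = amount > amount_limit
        hit = (rate_hit and amount_hit) if require_both else (rate_hit or amount_hit)
        if hit:
            return level
    return "NONE"
```

Under the "or" reading, TEST_002 and TEST_007 classify as CRITICAL, TEST_008 as HIGH, and TEST_004 and TEST_005 as NONE, which differs from the expected levels in this guide. With `require_both=True`, TEST_002 matches its expected HIGH, but the other four still differ. The real algorithm presumably applies further rules, so a mismatch here is a prompt to check the threshold configuration rather than proof of a bug.
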
### 4. Evidence Chain
- [ ] Income summary evidence
- [ ] Declaration summary evidence
- [ ] Difference analysis evidence
- [ ] Risk detail evidence

### 5. Boundary Conditions
- [ ] Zero declaration
- [ ] Over-declaration
- [ ] Large differences
- [ ] Small differences

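Each boundary condition above appears to be exercised by one of the existing test streamers. The mapping below is an inference from the scenario data rather than something the dataset states, and `BOUNDARY_CASES` and `difference_rate` are hypothetical names used only for this sketch.

```python
# (boundary condition, streamer ID, total recharge, declared amount)
# The pairing of conditions with streamers is an assumption based on the data.
BOUNDARY_CASES = [
    ("zero declaration", "TEST_006", 748_729.00, 0.00),
    ("over-declaration", "TEST_005", 96_221.00, 98_000.00),
    ("large difference", "TEST_001", 455_375.00, 100_000.00),
    ("small difference", "TEST_004", 47_792.00, 47_000.00),
]


def difference_rate(recharge_total, declared_amount):
    """Difference rate in percent, rounded to two decimals."""
    return round((recharge_total - declared_amount) / recharge_total * 100, 2)


# Recompute the rate for each boundary case, keyed by condition name.
rates = {
    name: difference_rate(recharge, declared)
    for name, _streamer, recharge, declared in BOUNDARY_CASES
}
```

The recomputed rates span the full range the algorithm has to handle: 100% for zero declaration, a negative value for over-declaration, and values near both ends of the positive range.
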
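## 🔍 Troubleshooting

### Issue 1: Streamer Not Found
- **Symptom**: The API returns the error "找不到主播信息" (streamer information not found)
- **Cause**: The database has no record for that streamer
- **Fix**: Check that the data in streamers.json has been imported

### Issue 2: Risk Level Mismatch
- **Symptom**: The actual risk level differs from the expected one
- **Cause**: A problem in the threshold configuration or the calculation logic
- **Fix**: Check the threshold configuration and the difference formulas

### Issue 3: Empty Data
- **Symptom**: Recharge or declaration data comes back empty
- **Cause**: Wrong parameters or a period that does not match the data
- **Fix**: Check that the period uses the YYYY-MM format
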
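For Issue 3, the period string can be validated before the request is sent. This is a minimal sketch: `is_valid_period` is a hypothetical helper, and it checks only the YYYY-MM shape, not whether data exists for that month.

```python
import re

# Four digits, a hyphen, then a month from 01 to 12.
PERIOD_PATTERN = re.compile(r"[0-9]{4}-(0[1-9]|1[0-2])")


def is_valid_period(period):
    """Return True if period is a string in YYYY-MM format."""
    return isinstance(period, str) and PERIOD_PATTERN.fullmatch(period) is not None
```

`is_valid_period("2024-01")` is true, while `"2024-1"`, `"2024/01"`, and `"2024-13"` are all rejected.
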
## 📝 Test Report Template

```markdown
# Revenue Integrity Detection Algorithm Test Report

## Test Environment
- Test date: 2025-11-28
- Algorithm version: v1.0
- Test data: 8 scenarios

## Test Results

### Scenario 1: Severe underreporting (TEST_001)
- Input: TEST_001, 2024-01
- Expected: CRITICAL (score 90-100)
- Actual: [actual result]
- Result: ✓ Pass / ✗ Fail

### Scenario 2: High risk (TEST_002)
- Input: TEST_002, 2024-01
- Expected: HIGH (score 75-90)
- Actual: [actual result]
- Result: ✓ Pass / ✗ Fail

... (remaining scenarios)

## Summary
- Pass rate: X/8 (XX%)
- Number of issues: X
- Recommendations: [suggested improvements]
```

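The pass-rate line in the template's summary can be produced from per-scenario outcomes. The helper below is a hypothetical sketch: `summarize` and its input shape (a mapping from scenario name to pass/fail) are assumptions, and the percentage is rounded to a whole number to match the `XX%` placeholder.

```python
def summarize(outcomes):
    """Build the pass-rate line from {scenario_name: passed} outcomes."""
    total = len(outcomes)
    passed = sum(1 for ok in outcomes.values() if ok)
    # Guard against an empty run so the division cannot fail.
    rate = (passed / total * 100) if total else 0.0
    return f"- Pass rate: {passed}/{total} ({rate:.0f}%)"
```

For a run in which 6 of the 8 scenarios pass, this returns `- Pass rate: 6/8 (75%)`.
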
## 📚 Related Documentation

- [Algorithm documentation](01-RevenueIntegrityAlgorithm.md)
- [API documentation](http://localhost:8000/api/v1/docs)
- [Frontend UI](http://localhost:3000/risk-detection/execute)

---

- **Last updated**: 2025-11-28 00:33
- **Test data version**: v1.0
- **Status**: ✅ Ready