deep-risk/backend/tests/TEST_README.md

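# Test Cases for the Revenue Integrity Detection Algorithm

## 📋 Overview

This test suite covers the revenue integrity detection algorithm (`RevenueIntegrityAlgorithm`) with 19 test scenarios spanning normal cases, error cases, and boundary values.

## 🏗️ Test Architecture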

```
app/tests/
├── conftest.py                      # pytest configuration and shared fixtures
└── test_revenue_integrity.py        # Main test file

run_tests.py                         # Test runner script
```

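## 📊 Test Scenario Coverage

### 1. Normal Scenario Tests (3 cases)
- ✅ **Revenue matches (with revenue-sharing contract)**: 45,000 yuan recharged, 45,000 yuan declared; no risk expected
- ✅ **Revenue matches (no revenue-sharing contract)**: no contract, but revenue matches; no risk expected
- ✅ **Slight difference (low risk)**: the difference is within the acceptable range

### 2. Under-Reporting Tests (3 cases)
- ⚠️ **Moderate under-reporting**: difference rate 28.57%, risk level MEDIUM
- 🚨 **High under-reporting**: difference rate 42.86%, risk level HIGH
- 🔴 **Severe under-reporting**: difference rate 71.43%, risk level CRITICAL

### 3. Over-Reporting Tests (1 case)
- ⚠️ **Slight over-reporting**: declared revenue exceeds recharges, but the risk is low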

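### 4. Boundary Value Tests (5 cases)
- 📌 **No recharge data**: a declaration without recharges; high risk
- 📌 **No declaration data**: recharges without a declaration; CRITICAL risk
- 📌 **Exact match**: zero difference; no risk
- 📌 **Large-amount boundary**: difference of 100,000 yuan (exactly at the CRITICAL threshold)
- 📌 **Rate boundary**: difference rate of 50% (exactly at the CRITICAL threshold)

### 5. Error Handling Tests (4 cases)
- ❌ **Missing streamer ID**: returns the UNKNOWN risk level
- ❌ **Missing period parameter**: parameter validation error
- ❌ **Invalid period format**: format validation error
- ❌ **Streamer not found**: data-not-found error

### 6. Real-World Business Scenario Tests (3 cases)
- 🏢 **New streamer without history**: handling of a small amount of data
- 🏢 **Enterprise streamer**: enterprise entity identified by a unified social credit code
- 🏢 **Monthly reconciliation**: large data volume, simulating a realistic month with 30 recharges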
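The "invalid period format" case above depends on how a period string is validated. The following is a minimal sketch, assuming periods use the `YYYY-MM` form seen in the test data (for example `"2024-01"`). The function name `parse_period` and the choice of `ValueError` are hypothetical; the algorithm's real validation code is not shown in this document.

```python
import calendar
import re
from datetime import date
from typing import Tuple

# Assumption: a period is a four-digit year, a hyphen, and a month 01-12.
_PERIOD_RE = re.compile(r"(\d{4})-(0[1-9]|1[0-2])")


def parse_period(period: str) -> Tuple[date, date]:
    """Return the first and last day of a YYYY-MM period, or raise ValueError."""
    if not isinstance(period, str) or not _PERIOD_RE.fullmatch(period):
        raise ValueError(f"invalid period {period!r}; expected YYYY-MM")
    year, month = (int(part) for part in period.split("-"))
    # monthrange returns (weekday of first day, number of days in month).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)
```

Returning the date range makes it easy to filter recharges and declarations for the period, while malformed input such as `"2024/01"` or `"2024-13"` is rejected before any query runs.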

## 🚀 Quick Start

### 1. Install Dependencies

```bash
# Install pytest and async test support
pip install pytest pytest-asyncio pytest-html pytest-cov

# Install project dependencies
pip install -r requirements.txt
```

### 2. Run All Tests

```bash
# Use the runner script (recommended)
python run_tests.py

# Or call pytest directly
pytest app/tests/test_revenue_integrity.py -v
```

### 3. Run Specific Tests

```bash
# Run the normal scenario tests
python run_tests.py -k "normal"

# Run the under-reporting tests
python run_tests.py -k "under_reporting"

# Run the boundary value tests
python run_tests.py -k "boundary"

# Run the error handling tests
python run_tests.py -k "error"
```

### 4. Generate Detailed Reports

```bash
# Generate an HTML report
python run_tests.py --html=test_report.html

# Generate a coverage report
python run_tests.py --cov

# Generate an HTML coverage report
python run_tests.py --cov --html=coverage_report.html
```

## 📝 Test Data

### Mock Data Structures

#### 1. Streamer Information (StreamerInfo)
```python
{
    "streamer_id": "ZB_TEST_001",
    "streamer_name": "测试主播",  # "test streamer"
    "entity_type": "individual",  # individual/enterprise
    "tax_registration_no": "TAX123456789",
    "unified_social_credit_code": None,
    "id_card_no": "110101199001011234",
    "phone_number": "13800138000",
    "bank_account_no": "6222021234567890123",
    "bank_name": "中国工商银行",  # Industrial and Commercial Bank of China
}
```

#### 2. Recharge Data (PlatformRecharge)
```python
from datetime import datetime

{
    "recharge_id": "RC001",
    "user_name": "测试用户",  # "test user"
    "amount": 10000.0,
    "time": datetime(2024, 1, 15, 10, 30, 0),
    "payment_method": "bank_transfer",
    "status": "success",
}
```

#### 3. Tax Declaration (TaxDeclaration)
```python
from datetime import date

{
    "declaration_id": "TAX001",
    "taxpayer_name": "测试主播",  # "test streamer"
    "tax_period": "2024-01",
    "sales_revenue": 45000.0,
    "declaration_date": date(2024, 2, 15),
    "tax_rate": 0.03,
}
```

#### 4. Revenue-Sharing Contract (RevenueSharingContract)
```python
from datetime import date

{
    "streamer_ratio": 70.0,    # Streamer's share: 70%
    "platform_ratio": 30.0,    # Platform's share: 30%
    "contract_start_date": date(2024, 1, 1),
    "contract_end_date": date(2024, 12, 31),
}
```

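## 🎯 Risk Level Rules

### Risk Level Definitions

| Risk level | Difference rate threshold | Difference amount threshold | Description |
|---------|-----------|-------------|------|
| **CRITICAL** | > 50% | > 100,000 yuan | Severe risk; may indicate major tax problems |
| **HIGH** | > 30% | > 50,000 yuan | High risk; requires close attention |
| **MEDIUM** | > 10% | > 10,000 yuan | Moderate risk; requires further verification |
| **LOW** | > 5% | > 5,000 yuan | Low risk; keep monitoring |
| **NONE** | ≤ 5% | ≤ 5,000 yuan | Normal; revenue is essentially consistent |

Note: the boundary tests in this suite expect CRITICAL when the difference is exactly 100,000 yuan or the rate is exactly 50%, so the implementation appears to treat those thresholds as inclusive even though the table writes `>`.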
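The thresholds can be read as a lookup that walks from the highest level downward. The sketch below is one possible reading, not the algorithm's actual code. The function name `classify_risk`, the local `RiskLevel` enum, the rule that either column alone is enough to trigger a level, and the inclusive comparisons (chosen so that the boundary tests expecting CRITICAL at exactly 50% or 100,000 yuan would pass) are all assumptions.

```python
from enum import Enum


class RiskLevel(Enum):
    """Hypothetical stand-in for the project's RiskLevel enum."""
    NONE = "none"
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


# (level, rate threshold in percent, amount threshold in yuan), highest first.
THRESHOLDS = [
    (RiskLevel.CRITICAL, 50.0, 100_000.0),
    (RiskLevel.HIGH, 30.0, 50_000.0),
    (RiskLevel.MEDIUM, 10.0, 10_000.0),
    (RiskLevel.LOW, 5.0, 5_000.0),
]


def classify_risk(difference_rate: float, difference: float) -> RiskLevel:
    """Return the first level whose rate OR amount threshold is reached.

    Assumption: comparisons are inclusive and either threshold suffices.
    With inclusive comparisons, exactly 5% maps to LOW, whereas the table's
    NONE row reads "<= 5%"; the real implementation may differ at that edge.
    """
    for level, rate_threshold, amount_threshold in THRESHOLDS:
        if difference_rate >= rate_threshold or difference >= amount_threshold:
            return level
    return RiskLevel.NONE
```

Under these assumptions, `classify_risk(50.0, 100000.0)` returns `RiskLevel.CRITICAL`, which is what the two boundary tests assert.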

### Formulas

```python
# Expected revenue (when a revenue-sharing contract exists)
expected_revenue = recharge_total * (streamer_ratio / 100)

# Difference
difference = abs(declared_revenue - expected_revenue)

# Difference rate, in percent
difference_rate = (difference / expected_revenue) * 100
```
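To make the arithmetic concrete, here is a small sketch that applies the formulas to the figures used in the example tests. The function `compute_difference` is hypothetical. Two behaviours are assumptions: using the full recharge total as expected revenue when there is no contract (inferred from the boundary example, where 200,000 yuan recharged and 100,000 yuan declared gives a 100,000 yuan difference), and returning `None` as the rate when expected revenue is zero, since the formula would otherwise divide by zero.

```python
from typing import Optional, Tuple


def compute_difference(
    recharge_total: float,
    declared_revenue: float,
    streamer_ratio: Optional[float] = None,
) -> Tuple[float, float, Optional[float]]:
    """Return (expected_revenue, difference, difference_rate_percent).

    streamer_ratio is a percentage such as 70.0; None means no contract.
    """
    if streamer_ratio is None:
        # Assumption: without a contract, all recharges count as revenue.
        expected = recharge_total
    else:
        expected = recharge_total * (streamer_ratio / 100)

    difference = abs(declared_revenue - expected)

    if expected == 0:
        # Assumption: the rate is undefined when nothing was expected.
        return expected, difference, None

    rate = difference / expected * 100
    # Round to cents / two decimals to avoid float noise in comparisons.
    return round(expected, 2), round(difference, 2), round(rate, 2)
```

For the severe under-reporting example (500,000 yuan recharged, 70% share, 100,000 yuan declared), this yields an expected revenue of 350,000 yuan, a difference of 250,000 yuan, and a rate of 71.43%.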

## 🧪 Test Cases in Detail

### Example 1: Severe Under-Reporting

```python
from datetime import date, datetime

import pytest

# RiskLevel comes from the project's own modules (import path not shown here).
# This is a method of a test class; the self._create_* helpers live on that class.


@pytest.mark.asyncio
async def test_critical_under_reporting(self, algorithm, mock_db_session, streamer_info, mock_contract_data):
    """Scenario 6: severe under-reporting (CRITICAL risk)"""
    # 500,000 yuan recharged, 70% share, 350,000 yuan expected, 100,000 yuan declared
    # Difference 250,000 yuan, difference rate 71.43%

    recharge_data = [
        {"recharge_id": f"RC{i}", "user_name": "测试", "amount": 100000.0,
         "time": datetime(2024, 1, 15), "payment_method": "bank_transfer"}
        for i in range(1, 6)
    ]

    declaration_data = [
        {"declaration_id": "TAX001", "taxpayer_name": "测试主播",
         "tax_period": "2024-01", "sales_revenue": 100000.0,
         "declaration_date": date(2024, 2, 15)}
    ]

    # Query results are consumed in order: streamer, recharges, declarations, contract
    mock_db_session.execute.side_effect = [
        self._create_streamer_result(streamer_info),
        self._create_recharge_result(recharge_data),
        self._create_declaration_result(declaration_data),
        self._create_contract_result(mock_contract_data),
    ]

    # Run the detection
    context = self._create_context("ZB_TEST_001", "2024-01", mock_db_session)
    result = await algorithm.detect(context)

    # Assertions
    assert result.risk_level == RiskLevel.CRITICAL
    assert result.risk_score >= 85.0
    assert "严重风险" in result.description  # "严重风险" means "severe risk"
    assert result.risk_data["difference"] >= 100000.0
```

### Example 2: Boundary Value

```python
from datetime import date, datetime
from unittest.mock import MagicMock

import pytest

# RiskLevel comes from the project's own modules (import path not shown here).


@pytest.mark.asyncio
async def test_large_amount_boundary(self, algorithm, mock_db_session, streamer_info):
    """Scenario 11: large-amount boundary (difference of exactly 100,000 yuan)"""
    # 200,000 yuan recharged, 100,000 yuan declared, difference 100,000 yuan, rate 50%
    # This value sits exactly at the CRITICAL amount threshold

    recharge_data = [
        {"recharge_id": "RC001", "user_name": "测试", "amount": 200000.0,
         "time": datetime(2024, 1, 15), "payment_method": "bank_transfer"}
    ]

    declaration_data = [
        {"declaration_id": "TAX001", "taxpayer_name": "测试主播",
         "tax_period": "2024-01", "sales_revenue": 100000.0,
         "declaration_date": date(2024, 2, 15)}
    ]

    mock_db_session.execute.side_effect = [
        self._create_streamer_result(streamer_info),
        self._create_recharge_result(recharge_data),
        self._create_declaration_result(declaration_data),
        MagicMock(scalar_one_or_none=lambda: None),  # no revenue-sharing contract
    ]

    # Run the detection
    context = self._create_context("ZB_TEST_001", "2024-01", mock_db_session)
    result = await algorithm.detect(context)

    # Assertion: exactly at the CRITICAL threshold
    assert result.risk_level == RiskLevel.CRITICAL
    assert result.risk_data["difference"] == 100000.0
```

## 🔧 Extending the Tests

### Adding a New Test Scenario

1. **Add a new method in test_revenue_integrity.py**:

```python
@pytest.mark.asyncio
async def test_new_scenario(self, algorithm, mock_db_session, streamer_info):
    """New scenario: description"""
    # Prepare the test data
    recharge_data = [...]
    declaration_data = [...]

    # Mock the database queries
    mock_db_session.execute.side_effect = [...]

    # Run the detection
    context = self._create_context("ZB_TEST_001", "2024-01", mock_db_session)
    result = await algorithm.detect(context)

    # Assert on the result
    assert result.risk_level == RiskLevel.XXX
    assert ...
```

2. **Add new mock data in conftest.py**:

```python
@pytest.fixture
def mock_new_data():
    """Mock data of a new type"""
    return {
        # Data structure definition
    }
```
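For scenarios that need many records, such as the monthly reconciliation case with 30 recharges, a fixture can delegate to a small builder function. The sketch below is hypothetical: `build_recharges` is not part of the test suite, and the 1,500 yuan amount and `user_{i}` names are chosen only for illustration. The field names follow the PlatformRecharge mock structure shown earlier.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List


def build_recharges(count: int, amount: float, start: datetime) -> List[Dict[str, Any]]:
    """Build `count` successful recharge records, one per day from `start`."""
    return [
        {
            "recharge_id": f"RC{i:03d}",
            "user_name": f"user_{i}",
            "amount": amount,
            "time": start + timedelta(days=i - 1),
            "payment_method": "bank_transfer",
            "status": "success",
        }
        for i in range(1, count + 1)
    ]


# 30 recharges of 1,500 yuan each, totalling 45,000 yuan for January 2024.
monthly = build_recharges(30, 1500.0, datetime(2024, 1, 1, 10, 30))
```

In conftest.py, a function decorated with `@pytest.fixture` could simply return the result of such a builder, which keeps large data sets out of the test bodies.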

## 📈 Test Report

### Sample Output After a Test Run

```
===========================================
Revenue Integrity Detection Algorithm Test Runner
===========================================

Running command: python -m pytest app/tests/test_revenue_integrity.py -v

collected 19 items

app/tests/test_revenue_integrity.py::TestRevenueIntegrityNormal::test_revenue_match_with_contract PASSED [  5%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityNormal::test_revenue_match_without_contract PASSED [ 10%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityNormal::test_slight_difference_low_risk PASSED [ 15%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityUnderReporting::test_medium_under_reporting PASSED [ 21%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityUnderReporting::test_high_under_reporting PASSED [ 26%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityUnderReporting::test_critical_under_reporting PASSED [ 31%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityOverReporting::test_over_reporting_low_risk PASSED [ 36%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityEdgeCases::test_no_recharge_data PASSED [ 42%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityEdgeCases::test_no_declaration_data PASSED [ 47%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityEdgeCases::test_zero_difference PASSED [ 52%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityEdgeCases::test_large_amount_boundary PASSED [ 57%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityEdgeCases::test_rate_boundary_50_percent PASSED [ 63%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityErrorHandling::test_missing_streamer_id PASSED [ 68%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityErrorHandling::test_missing_period PASSED [ 73%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityErrorHandling::test_invalid_period_format PASSED [ 78%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityErrorHandling::test_streamer_not_found PASSED [ 84%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityBusinessScenarios::test_monthly_reconciliation PASSED [ 89%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityBusinessScenarios::test_new_streamer_no_history PASSED [ 94%]
app/tests/test_revenue_integrity.py::TestRevenueIntegrityBusinessScenarios::test_enterprise_streamer PASSED [100%]

===========================================
✓ All tests passed!
===========================================
```

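## ⚠️ Notes

1. **Async tests**: all tests are asynchronous and must be marked with `@pytest.mark.asyncio`
2. **Database mocking**: the database session is simulated with `AsyncMock`, so no real database connection is made
3. **Data isolation**: each test uses its own mock data to avoid cross-test contamination
4. **Risk level assertions**: make sure the asserted risk level matches the test scenario
5. **Boundary value tests**: pay particular attention to the boundary values of the difference rate and amount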
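The database mocking and data isolation notes can be illustrated with a short, self-contained sketch. It shows how an `AsyncMock` session returns queued results in call order through `side_effect`. The helper `make_result` and the coroutine `run_queries` are hypothetical stand-ins for the suite's `_create_*_result` helpers and for the algorithm, whose real code is not shown here.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


def make_result(value):
    """Build a result object whose scalar_one_or_none() returns `value`."""
    result = MagicMock()
    result.scalar_one_or_none.return_value = value
    return result


async def run_queries(session):
    """Await two queries in order, as code under test might do."""
    streamer = (await session.execute("streamer query")).scalar_one_or_none()
    contract = (await session.execute("contract query")).scalar_one_or_none()
    return streamer, contract


# A fresh mock per test keeps the queued results isolated from other tests.
session = AsyncMock()
session.execute.side_effect = [
    make_result({"streamer_id": "ZB_TEST_001"}),  # first await
    make_result(None),                            # second await: no contract
]

streamer, contract = asyncio.run(run_queries(session))
```

Because `side_effect` is a list, each awaited `execute` call consumes the next item, so the order of the list must match the order in which the code under test issues its queries.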

## 🐛 FAQ

### Q: The test run fails with "ModuleNotFoundError"
**A**: Make sure you are running from the project root, and check your PYTHONPATH setting
```bash
export PYTHONPATH=/path/to/deeprisk-claude-1/backend:$PYTHONPATH
```

### Q: Async tests raise errors
**A**: Make sure `pytest-asyncio` is installed and the correct marker is used
```bash
pip install pytest-asyncio
```

### Q: How do I debug a specific test?
**A**: Use pytest's debugging options
```bash
pytest app/tests/test_revenue_integrity.py::TestRevenueIntegrityNormal::test_revenue_match_with_contract -v -s
```

## 📚 References

- [pytest documentation](https://docs.pytest.org/)
- [pytest-asyncio documentation](https://pytest-asyncio.readthedocs.io/)
- [Python unittest.mock](https://docs.python.org/3/library/unittest.mock.html)
- [SQLAlchemy asyncio support](https://docs.sqlalchemy.org/en/14/orm/extensions/asyncio.html)

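## 🤝 Contributing

New test cases are welcome! Please make sure that:
1. The test scenario is clearly defined
2. The mock data is complete
3. The assertions are accurate
4. The existing code style is followed

---

**Created**: 2024-11-30
**Version**: 1.0.0
**Author**: Claude Code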