Test Failure Investigator Skill

When to use this skill

Use this skill when one or more unit tests are failing and you need to find the root cause. It provides a systematic workflow for diagnosing test failures from test result XML files and log output.

How to use this skill

1. Ask the user which test suite(s) they want to investigate.
2. Follow the Detailed instructions section to enable logging, run the tests, and analyse the results.

Detailed instructions

Gathering information

Before investigating, ask the user the following:

1. Which test suite(s) should be investigated? Examples:
   - ores.accounts.tests - tests for the accounts component
   - ores.refdata.tests - tests for the reference data component
   - ores.cli.tests - tests for the CLI component
   - rat - run all tests
2. Are they investigating a specific test failure, or all failures in a suite?
3. If investigating a specific test, what is the test name?

Enabling test logging

Test logging is disabled by default. Before investigating failures, enable it by following the instructions in the CMake Runner Skill under "Configuring test logging".

Quick reference for enabling logging:

# Enable logging at debug level
cmake --preset linux-clang-debug -DORES_TEST_LOG_LEVEL=debug

# Or trace level for more detail
cmake --preset linux-clang-debug -DORES_TEST_LOG_LEVEL=trace -DORES_TEST_LOG_CONSOLE=ON

Running the failing tests

Run only the specific test suite(s) under investigation to minimise noise:

# Run a specific test suite
cmake --build --preset linux-clang-debug --target test_COMPONENT.tests

# Example: run accounts tests
cmake --build --preset linux-clang-debug --target test_ores.accounts.tests

# Or run all tests
cmake --build --preset linux-clang-debug --target rat

Replace COMPONENT with the actual component name (e.g. ores.accounts, ores.refdata).
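
When the investigation is scripted, the substitution of COMPONENT can be done programmatically. The following is a minimal sketch, not part of the project: the function name build_test_command is hypothetical, and it only assembles the command shown above without running it.

```python
import shlex


def build_test_command(component: str, preset: str = "linux-clang-debug") -> list[str]:
    """Assemble the cmake invocation for one test suite.

    Pass "rat" to get the run-all-tests target instead of a single suite.
    """
    target = "rat" if component == "rat" else f"test_{component}.tests"
    return ["cmake", "--build", "--preset", preset, "--target", target]


# Prints: cmake --build --preset linux-clang-debug --target test_ores.accounts.tests
print(shlex.join(build_test_command("ores.accounts")))
```

The returned list can be handed to subprocess.run as-is, which avoids shell quoting problems.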

Parsing test results

After running tests, use the parse_test_results.py script to get a summary of failures:

./scripts/parse_test_results.py build/output/linux-clang-debug/publish/bin

Understanding the output

The script produces output in three sections:

1. File summary

Found 3 test-results files:
  - build/output/linux-clang-debug/publish/bin/test-results-ores.accounts.tests.xml
  - build/output/linux-clang-debug/publish/bin/test-results-ores.cli.tests.xml
  - build/output/linux-clang-debug/publish/bin/test-results-ores.refdata.tests.xml

Looking for logs in: build/output/linux-clang-debug/publish/log
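
The file summary suggests two conventions: results files are named test-results-SUITE.xml, and the log directory sits next to the bin directory. The sketch below reproduces that lookup under those assumptions. It is not the code of parse_test_results.py, and the helper names find_results and suite_name are hypothetical.

```python
from pathlib import Path


def find_results(bin_dir: Path) -> tuple[list[Path], Path]:
    """Return the results files in bin_dir and the sibling log directory."""
    results = sorted(bin_dir.glob("test-results-*.xml"))
    # Assumption: logs live in ../log relative to the bin directory.
    log_dir = bin_dir.parent / "log"
    return results, log_dir


def suite_name(results_file: Path) -> str:
    """Derive the suite name from a results file name (Python 3.9+)."""
    return results_file.name.removeprefix("test-results-").removesuffix(".xml")


files, log_dir = find_results(Path("build/output/linux-clang-debug/publish/bin"))
print(f"Found {len(files)} test-results files; logs expected in {log_dir}")
```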

2. Per-suite results

For each test suite, you will see:

================================================================================
Processing: test-results-ores.accounts.tests.xml
================================================================================
Test Suite: ores.accounts.tests
Catch2 Version: 3.8.0
RNG Seed: 123456789
Modification Time: 2025-01-26 10:30:45
Total Tests: 42
Passed: 40
Failed: 2
Skipped: 0
Total Duration: 1.234s

  Test Suite Log: build/output/linux-clang-debug/publish/log/ores.accounts.tests/ores.accounts.tests.log
  Errors/Warnings in test suite log:
    Line 156: [2025-01-26 10:30:44.123] [ERROR] Database connection failed
    Line 234: [2025-01-26 10:30:45.456] [WARN] Retry attempt 3 of 5

Found 2 failed test(s):

FAILURE #1 (account_creation_with_invalid_currency):
  Name: account_creation_with_invalid_currency
  Tags: [domain][accounts]
  File: /path/to/projects/ores.accounts/tests/domain_account_tests.cpp
  Line: 156
  Duration: 0.023s
  Exception: [/path/to/file.cpp:42] Expected: valid currency code, Got: "XXX"

  Test Case Log: build/output/linux-clang-debug/publish/log/ores.accounts.tests/domain_account_tests/account_creation_with_invalid_currency.log
  Errors/Warnings in test case log:
    Line 12: [ERROR] Currency lookup failed for code: XXX
    Line 15: [WARN] Falling back to default currency handling

Key information to extract:

- Exception: The assertion or error that caused the failure
- File/Line: Where the test is defined (navigate here to see the test code)
- Test Case Log: Individual log file for this specific test
- Errors/Warnings: Log entries that may indicate the root cause
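
To pull these fields out of a results file by hand, something like the following may help. It is a sketch that assumes the files are written by Catch2's XML reporter; the element and attribute names (TestCase, OverallResult, Exception, rng-seed) reflect that format as I understand it and should be checked against a real results file. SAMPLE is a hand-written fragment modelled on the output above, not real test output.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; values mirror the example output in this document.
SAMPLE = """<Catch2TestRun name="ores.accounts.tests" rng-seed="123456789" catch2-version="3.8.0">
  <TestCase name="account_creation_with_invalid_currency" tags="[domain][accounts]"
            filename="/path/to/projects/ores.accounts/tests/domain_account_tests.cpp" line="156">
    <Exception filename="/path/to/file.cpp" line="42">Expected: valid currency code</Exception>
    <OverallResult success="false" durationInSeconds="0.023"/>
  </TestCase>
  <TestCase name="account_creation" tags="[domain][accounts]"
            filename="/path/to/projects/ores.accounts/tests/domain_account_tests.cpp" line="100">
    <OverallResult success="true" durationInSeconds="0.010"/>
  </TestCase>
</Catch2TestRun>"""


def failed_cases(xml_text: str) -> list[dict]:
    """Return name, location and exception text for each failed test case."""
    root = ET.fromstring(xml_text)
    failures = []
    for case in root.iter("TestCase"):
        result = case.find("OverallResult")
        if result is None or result.get("success") != "false":
            continue
        exc = case.find(".//Exception")  # may be nested inside a Section
        failures.append({
            "name": case.get("name"),
            "file": case.get("filename"),
            "line": int(case.get("line", "0")),
            "exception": (exc.text or "").strip() if exc is not None else None,
        })
    return failures


for failure in failed_cases(SAMPLE):
    print(f"{failure['name']} ({failure['file']}:{failure['line']}): {failure['exception']}")
```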

3. Overall summary

================================================================================
OVERALL SUMMARY
================================================================================
Total Files Processed: 3
Valid XML Files: 3
Invalid/Skipped Files: 0
Total Test Suites: 3
Total Tests: 150
Total Passed: 148
Total Failed: 2
Total Skipped: 0
Total Duration: 5.678s
Pass Rate: 98.67%
Fail Rate: 1.33%
Failed Test Cases: 2
================================================================================
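
The pass and fail rates are consistent with dividing each count by the total number of tests. A small sketch of that arithmetic follows; how the real script treats skipped tests is not stated here, so using Total Tests as the denominator is an assumption, and the function name rates is hypothetical.

```python
def rates(total: int, passed: int, failed: int) -> tuple[float, float]:
    """Return (pass rate, fail rate) as percentages rounded to two decimals."""
    if total == 0:
        return 0.0, 0.0
    return round(100 * passed / total, 2), round(100 * failed / total, 2)


# Counts from the example summary: 150 tests, 148 passed, 2 failed.
print(rates(150, 148, 2))  # (98.67, 1.33)
```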

Common scenarios

No logs found

If the script reports "No test logs found!", logging is disabled. Enable it as described in "Enabling test logging" above, then re-run the tests.

XML parse errors

If XML files cannot be parsed, the test executable may have crashed. The script will dump the entire test suite log to help diagnose the crash.
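
A quick way to confirm that a results file is truncated or malformed is to attempt a parse and catch the error. This is a minimal sketch using the standard library, not the script's own check; the function name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML, False if it is empty or truncated."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# A results file cut short by a crash typically has unclosed elements.
print(is_well_formed('<Catch2TestRun name="x"><TestCase name="a">'))  # False
```

A file that fails this check points to a crash or an interrupted run rather than an ordinary assertion failure, so the suite-level log is the place to look next.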

Reading individual test logs

For deeper investigation, read the specific test case log file directly. Log files are organised as follows (see CMake Runner Skill section "Output directory layout" for details):

build/output/linux-clang-debug/publish/log/
  COMPONENT.tests/                    <- Test suite directory
    COMPONENT.tests.log               <- Suite-level log
    TEST_CATEGORY/                    <- Test category from filename
      TEST_NAME.log                   <- Individual test log

Example path for a specific test:

build/output/linux-clang-debug/publish/log/ores.accounts.tests/domain_account_tests/account_serialization_to_json.log

Investigation workflow

Follow this systematic approach:

1. Parse results: Run parse_test_results.py to identify failing tests
2. Review exceptions: Check the exception messages for immediate clues
3. Read test code: Navigate to the test line to understand what the test expects
4. Check test log: Read the individual test case log for ERROR/WARN entries
5. Check suite log: If the test log is not helpful, check the suite-level log for broader context
6. Increase log level: If more detail is needed, reconfigure with ORES_TEST_LOG_LEVEL=trace and re-run
7. Add targeted logging: If still unclear, add temporary debug logging to the code under test
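
When checking a test or suite log by hand, a short filter can list the ERROR/WARN entries with their line numbers, in the same shape as the script's "Errors/Warnings" output. This sketch assumes severity appears as a bracketed [ERROR] or [WARN] token, as in the sample output above; the function name find_problems is hypothetical.

```python
import re

# Assumption: log lines carry the level as a bracketed token, e.g. "[ERROR]".
LEVEL = re.compile(r"\[(ERROR|WARN)\]")


def find_problems(lines) -> list[tuple[int, str]]:
    """Return (1-based line number, text) for every ERROR or WARN line."""
    return [
        (number, line.rstrip())
        for number, line in enumerate(lines, start=1)
        if LEVEL.search(line)
    ]


sample = [
    "[INFO] Starting test\n",
    "[ERROR] Currency lookup failed for code: XXX\n",
    "[DEBUG] Retrying\n",
    "[WARN] Falling back to default currency handling\n",
]
for number, text in find_problems(sample):
    print(f"Line {number}: {text}")
```

To scan a real log, pass an open file object instead of the sample list, since iterating a file yields its lines.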

Cleanup

After investigation, disable logging to improve test performance:

cmake --preset linux-clang-debug -DORES_TEST_LOG_LEVEL=OFF