Mutation Testing

Mutation testing verifies test quality by introducing small code changes (mutations) and checking if tests catch them. It answers the question: "Are my tests actually testing my code?"

Concept

Traditional code coverage measures which lines are executed, but not whether assertions verify behavior. Mutation testing fills this gap:

Mutator changes code (e.g., > becomes >=)
Tests run against mutated code
If tests fail → mutation "killed" (tests caught the bug)
If tests pass → mutation "survived" (test gap exists)

A high mutation score indicates tests that truly verify behavior, not just execute code paths.
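The killed/survived distinction can be sketched in plain PHP, without any mutation tool. This is a minimal illustration, not how Pest works internally: the free-shipping rule, the two "suites" (lists of input/expected pairs), and the helper names `suitePasses` and `isKilled` are all hypothetical. Both suites execute the same line of code, so line coverage is identical, yet only one of them notices the mutation.

```php
<?php
declare(strict_types=1);

// Hypothetical rule for illustration: an order ships free above 50.
$original = fn (int $total): bool => $total > 50;
// The same rule after a comparison-operator mutation (> becomes >=).
$mutant = fn (int $total): bool => $total >= 50;

// A "test suite" here is a list of [input, expected] pairs.
$weakSuite = [[100, true], [10, false]];                 // never touches the boundary
$strongSuite = [[100, true], [10, false], [50, false]];  // pins the boundary

// Returns true when every case passes against the given implementation.
function suitePasses(callable $rule, array $suite): bool
{
    foreach ($suite as [$input, $expected]) {
        if ($rule($input) !== $expected) {
            return false;
        }
    }

    return true;
}

// A mutant is killed when a suite that passes on the original fails on the mutant.
function isKilled(callable $original, callable $mutant, array $suite): bool
{
    return suitePasses($original, $suite) && !suitePasses($mutant, $suite);
}

var_dump(isKilled($original, $mutant, $weakSuite));   // bool(false): mutant survived
var_dump(isKilled($original, $mutant, $strongSuite)); // bool(true): mutant killed
```

The single extra case at the boundary value (50) is what separates the two suites.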

Running Mutation Tests

task tests:mutation              # Run mutation testing

Phexium uses Pest's built-in mutation testing (--mutate flag).

CI/CD Considerations

Mutation testing is disabled in the GitLab CI pipeline to conserve CI/CD minutes. GitLab.com free-tier accounts have a limited CI/CD minute quota, and mutation testing is computationally expensive: it runs the test suite repeatedly, once per mutation.

The job definition exists in .gitlab-ci.yml but is commented out to preserve pipeline minutes for essential checks (quality, unit tests, integration tests, acceptance tests).

Workaround: Run mutation testing manually before significant releases or when improving test quality:

task tests:mutation              # Run locally

This approach balances test quality verification with CI resource constraints. Good moments to run mutation testing:

Before major releases
After adding significant domain logic
When test coverage reports show gaps
During dedicated test improvement sprints

Common Mutation Types

Arithmetic Operators

Original	Mutation
`+`	`-`
`-`	`+`
`*`	`/`
`/`	`*`
`++`	`--`
`--`	`++`

Comparison Operators

Original	Mutation
`>`	`>=`
`<`	`<=`
`>=`	`>`
`<=`	`<`
`==`	`!=`
`!=`	`==`
`===`	`!==`

Boolean Logic

Original	Mutation
`true`	`false`
`false`	`true`
`&&`	`||`
`||`	`&&`
`!$a`	`$a`

Return Values

Original	Mutation
`return $x`	`return null`
`return true`	`return false`
`return 0`	`return 1`
`return ""`	`return "mutated"`

Interpreting Results

Mutation Score

Mutation Score: 85%
- 100 mutations generated
- 85 killed
- 15 survived

An 85% mutation score means 85% of artificial bugs were caught by tests.
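The arithmetic behind that figure can be sketched as follows. The function name `mutationScore` is hypothetical, and the sketch counts only the two categories shown in the example above (killed and survived); a real tool may report further categories and weigh them differently.

```php
<?php
declare(strict_types=1);

// Percentage of generated mutations that the test suite killed.
function mutationScore(int $killed, int $survived): float
{
    $total = $killed + $survived;
    if ($total === 0) {
        throw new InvalidArgumentException('No mutations were generated');
    }

    return 100 * $killed / $total;
}

echo mutationScore(85, 15), "%\n"; // 85%
```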

What Survived Mutations Mean

Each survived mutation indicates one of:

Cause	Action
Missing test case	Add test for that behavior
Weak assertion	Strengthen assertion (check specific values)
Dead code	Remove unreachable code
Equivalent mutation	Ignore (mutation produces same behavior)
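Equivalent mutations are the least intuitive row in this table, so a small sketch may help. The `larger` helper below is hypothetical. Changing `>` to `>=` only affects inputs where both arguments are equal, and in that case either branch returns the same value, so the mutant behaves identically to the original.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns the larger of two integers.
function larger(int $a, int $b): int
{
    return $a > $b ? $a : $b;
}

// Mutant: > becomes >=. The condition only differs when $a === $b,
// and there both branches return the same value.
function largerMutant(int $a, int $b): int
{
    return $a >= $b ? $a : $b;
}

// Exhaustively compare both versions over a small input range.
$identical = true;
foreach (range(-3, 3) as $a) {
    foreach (range(-3, 3) as $b) {
        if (larger($a, $b) !== largerMutant($a, $b)) {
            $identical = false;
        }
    }
}

var_dump($identical); // bool(true): no test can kill this mutant
```

Because no input distinguishes the two versions, a surviving mutation of this kind does not point to a test gap.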

Improving Mutation Score

Example: Boundary Condition

Survived mutation:

// Original code
if ($length <= 5) {
    throw new InvalidArgumentException('Too short');
}

// Mutation (survived - tests still pass)
if ($length < 5) {
    throw new InvalidArgumentException('Too short');
}

Problem: No test verifies the boundary case (length exactly 5).

Fix: Add boundary test:

test('Title with exactly 5 characters is rejected', function (): void {
    expect(fn () => Title::fromString('12345'))
        ->toThrow(InvalidArgumentException::class);
});

test('Title with 6 characters is accepted', function (): void {
    $title = Title::fromString('123456');
    expect($title->toString())->toBe('123456');
});
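For context, here is a minimal sketch of a value object that would satisfy both tests. It is an assumption, not the project's actual `Title` class: only `fromString()` and `toString()` appear in the tests, and the constant name and the use of `strlen()` are illustrative choices.

```php
<?php
declare(strict_types=1);

// Hypothetical value object; only fromString() and toString() appear in the tests.
final class Title
{
    // Assumed rule: a title needs more than 5 characters.
    private const MIN_EXCLUSIVE_LENGTH = 5;

    private function __construct(private string $value)
    {
    }

    public static function fromString(string $value): self
    {
        // strlen() counts bytes; a real implementation may prefer mb_strlen().
        if (strlen($value) <= self::MIN_EXCLUSIVE_LENGTH) {
            throw new InvalidArgumentException('Too short');
        }

        return new self($value);
    }

    public function toString(): string
    {
        return $this->value;
    }
}
```

With the two boundary tests in place, mutating the comparison in `fromString()` in either direction makes one of them fail.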

Example: Boolean Logic

Survived mutation:

// Original
if ($isActive && $hasPermission) { ... }

// Mutation (survived)
if ($isActive || $hasPermission) { ... }

Problem: Tests don't cover the case where only one condition is true.

Fix: Add tests for each combination:

test('Access denied when active but no permission', function (): void {
    $user = UserMother::activeWithoutPermission();
    expect($user->canAccess())->toBeFalse();
});

test('Access denied when has permission but inactive', function (): void {
    $user = UserMother::inactiveWithPermission();
    expect($user->canAccess())->toBeFalse();
});
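To see why these two cases matter, the sketch below enumerates all four input combinations and records where `&&` and `||` disagree. The closures are stand-ins for the condition inside `canAccess()`, whose real implementation is not shown in this document.

```php
<?php
declare(strict_types=1);

// Stand-ins for the original condition and its mutated form.
$original = fn (bool $isActive, bool $hasPermission): bool => $isActive && $hasPermission;
$mutant = fn (bool $isActive, bool $hasPermission): bool => $isActive || $hasPermission;

// Collect the input combinations on which the two versions disagree.
$killingInputs = [];
foreach ([true, false] as $isActive) {
    foreach ([true, false] as $hasPermission) {
        if ($original($isActive, $hasPermission) !== $mutant($isActive, $hasPermission)) {
            $killingInputs[] = [$isActive, $hasPermission];
        }
    }
}

// Only the two mixed combinations distinguish && from ||.
var_dump($killingInputs === [[true, false], [false, true]]); // bool(true)
```

A suite that only checks "both true" and "both false" passes against either operator, which is exactly how this mutation survives.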

Focus Areas

High-Value Targets (aim for 80%+ mutation score)

Business logic in handlers and entities
Value Object validation (boundaries, formats)
State transitions (entity status changes)
Calculations (prices, dates, quantities)

Lower Priority (mutation score less critical)

Infrastructure code (repository implementations)
Simple getters without logic
Framework glue code
Third-party library wrappers