TIL: Reliable Pytest Testing with Mocks (by Example)

2025-07-19

Suppose we have the following class in ./src/grammar.py:

# ./src/grammar.py
import logging
import language_tool_python

logger = logging.getLogger(__name__)


class GrammarMetric:
    """Compute grammar quality scores for text summaries using LanguageTool.

    Because the public API is rate-limited, this metric runs LanguageTool
    locally, which requires Java. To install it on macOS, use

        brew install openjdk
        sudo ln -sfn /opt/homebrew/opt/openjdk/libexec/openjdk.jdk /Library/Java/JavaVirtualMachines/openjdk.jdk

    On Linux, use

        sudo apt update && sudo apt install default-jre

    """

    default_language = "en-US"

    def __init__(self, language: str | None = None):
        self.language = language or self.default_language
        self.server = language_tool_python.LanguageTool(self.language)

    def __call__(self, data: list[str]) -> list[float]:
        results = []
        for text in data:
            matches = self.server.check(text)
            n_words = len(text.split()) or 1  # avoid dividing by zero on empty text
            results.append(float(max(0, 1 - len(matches) / n_words)))
        return results

This works, but running tests on it requires downloading LanguageTool resources and Java. That’s slow and flaky in CI, so let’s mock it.

We’ll:

Create shared fixtures and a skipifci mark.
Write a mock for LanguageTool.
Test the mock to avoid false confidence.
Test the actual GrammarMetric using the mock.

Definitions:

Mock and `MagicMock`:

A mock is a fake object used in place of a real one during testing. It lets you simulate behavior and test code in isolation. MagicMock is a mock class from Python’s unittest.mock module that automatically creates fake attributes and methods and tracks how they’re used.

The MagicMock object creates attributes and methods dynamically when you access them. When you access an attribute (e.g., mock.some_method), MagicMock automatically returns another MagicMock instance, and calling it (mock.some_method()) returns yet another one.

When you pass keyword arguments to the MagicMock initialiser, like MagicMock(foo=123), it sets those as attributes on the mock. So mock = MagicMock(foo=123) means mock.foo will return 123 instead of a new MagicMock. This lets you control or override parts of the mock’s behavior up front.
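
Here is a small sketch of that behavior, using only the standard library. The names foo and some_method are placeholders, not part of any real API.

```python
from unittest.mock import MagicMock

mock = MagicMock(foo=123)

# Keyword arguments become plain attributes.
assert mock.foo == 123

# Any other attribute is created on first access and is itself a MagicMock.
assert isinstance(mock.some_method, MagicMock)

# Calling it returns another MagicMock (its return_value) and records the call.
result = mock.some_method("hello", flag=True)
assert isinstance(result, MagicMock)
mock.some_method.assert_called_once_with("hello", flag=True)

# Repeated access hands back the same child mock, so recorded calls accumulate.
assert mock.some_method is mock.some_method
```

The call tracking is what lets a test check later how the code under test used its dependency.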

MagicMock objects have two special attributes that control what’s returned:

return_value: Sets what a mock should return when called. For example, mock.method.return_value = 42 makes mock.method() return 42.
side_effect: Lets you control more complex behavior, such as raising exceptions, returning different values each time, or calling a function. For example, mock.method.side_effect = ValueError("oops") makes mock.method() raise an error.
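
The two attributes can be tried side by side in a quick sketch. The method names (method, fail, sequence, double) are made up for illustration.

```python
from unittest.mock import MagicMock

mock = MagicMock()

# return_value: the same fixed result for every call.
mock.method.return_value = 42
fixed = mock.method()

# side_effect as an exception instance: the call raises it.
mock.fail.side_effect = ValueError("oops")
caught = None
try:
    mock.fail()
except ValueError as err:
    caught = str(err)

# side_effect as an iterable: one value per call, in order.
mock.sequence.side_effect = [1, 2, 3]
values = [mock.sequence() for _ in range(3)]

# side_effect as a function: it receives the call's arguments,
# and its return value becomes the result of the call.
mock.double.side_effect = lambda x: x * 2
doubled = mock.double(21)
```

The function form is the one used later in this post, where check needs to build a fresh list of matches from the text it receives.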

Other Terms:

patch: A function from unittest.mock that temporarily replaces a real object (like a class or function) with a mock during a test. It’s often used with a with block or as a decorator.
Fixture: A reusable setup function in pytest that provides test data or objects to tests. Defined using @pytest.fixture.
@pytest.mark.skipif: A decorator that skips a test when a condition is true, such as skipping slow or flaky tests in CI by checking "CI" in os.environ.
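
To make patch concrete before we use it for real, here is a minimal sketch that patches os.getcwd from the standard library. The helper current_dir_name and the fake paths are hypothetical. Fixtures and skipif need pytest, so they are left to the steps that follow.

```python
import os
from unittest.mock import patch


def current_dir_name() -> str:
    """Hypothetical helper: last component of the working directory."""
    return os.path.basename(os.getcwd())


# As a context manager: os.getcwd is replaced only inside the with block.
with patch("os.getcwd", return_value="/tmp/fake_project") as fake_getcwd:
    inside = current_dir_name()
    fake_getcwd.assert_called_once()

# After the block, the real function is back in place.
restored = os.getcwd is not fake_getcwd


# As a decorator: the mock is passed to the function as an extra argument.
@patch("os.getcwd", return_value="/tmp/other")
def run_decorated(fake_getcwd):
    return current_dir_name()


decorated = run_decorated()
```

Either form undoes the replacement automatically, which is why a patch inside a fixture does not leak into other tests.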

Step 1: Shared Fixtures and Marks

In ./tests/conftest.py:

# ./tests/conftest.py
import os
import pytest
from functools import partial

@pytest.fixture
def n_samples() -> int:
    return 5

@pytest.fixture
def mock_data(n_samples) -> list[str]:
    return ["This is some mock data." for _ in range(n_samples)]

# skipifci(reason="...") expands to pytest.mark.skipif("CI" in os.environ, reason="...")
skipifci = partial(pytest.mark.skipif, "CI" in os.environ)

Step 2: Decide What to Mock (and What to Ignore)

We want to avoid calling LanguageTool.check() in CI. But how does it behave?

In a Python shell:

>>> import language_tool_python
>>> lt = language_tool_python.LanguageTool("en-US")
>>> matches = lt.check("I can grammar good")
>>> type(matches), type(matches[0])
# (list, language_tool_python.Match)

You can inspect a Match object with:

>>> dir(matches[0])
# ... many attributes, but we don’t use them in our code

If we want to inspect this object more, we could use the following for-loop:

>>> result = matches[0]
>>> for k in dir(result): print(f"{k} ({type(getattr(result, k)).__name__}): {repr(getattr(result, k)):.32}...")
# ... one line per attribute: name, type, and a truncated repr

Since our code only checks the number of matches, not their content, returning a list of dummy MagicMock instances is sufficient.
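
A quick sketch shows why the content of each match is irrelevant. The score function below is a standalone copy of the formula from GrammarMetric.__call__, written here only for illustration.

```python
from unittest.mock import MagicMock


def score(text: str, matches: list) -> float:
    """Formula from GrammarMetric.__call__: only len(matches) is used."""
    return float(max(0, 1 - len(matches) / len(text.split())))


text = "This is some mock data."  # 5 whitespace-separated words

one = score(text, [MagicMock()])                      # 1 - 1/5
three = score(text, [MagicMock() for _ in range(3)])  # 1 - 3/5
floor = score("Oops", [MagicMock(), MagicMock()])     # 1 - 2/1 is negative, clamped to 0
```

Nothing ever reads an attribute of the dummy objects, so anything with a length would do as the list contents.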

Step 3: Write the Mocks

In ./tests/test_grammar.py:

# ./tests/test_grammar.py
import random
from unittest.mock import MagicMock, patch

import language_tool_python
import pytest
from conftest import skipifci

from src.grammar import GrammarMetric

## Mock language_tool_python.LanguageTool
#

@pytest.fixture
def mock_check():
    def _mock_check(text, *args, **kwargs):
        return [
            MagicMock(message="mock", sentence=text, ruleId="MOCK_RULE")
            for _ in range(random.randint(1, 3))
        ]
    return _mock_check


@pytest.fixture
def mock_language_tool(mock_check):
    with patch("src.grammar.language_tool_python.LanguageTool") as MockLT:
        instance = MockLT.return_value
        instance.check.side_effect = mock_check
        yield instance

Why this works:

patch targets the location where LanguageTool is used (src.grammar).
The mock object has the minimal interface needed to test downstream code.
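
The same mechanics can be reproduced without LanguageTool installed. In this sketch, heavy_tool, Tool, and Metric are hypothetical stand-ins for language_tool_python, LanguageTool, and GrammarMetric. The real class is made to raise, so the example only works if the patch takes effect.

```python
import sys
import types
from unittest.mock import patch

# Stand-in for a heavy third-party module (hypothetical name).
heavy_tool = types.ModuleType("heavy_tool")


class RealTool:
    def __init__(self, language):
        raise RuntimeError("would download resources here")


heavy_tool.Tool = RealTool
sys.modules["heavy_tool"] = heavy_tool  # lets patch() resolve "heavy_tool.Tool"


class Metric:
    """Mirrors GrammarMetric: the class is looked up when __init__ runs."""

    def __init__(self):
        self.server = heavy_tool.Tool("en-US")

    def __call__(self, data):
        return [len(self.server.check(text)) for text in data]


with patch("heavy_tool.Tool") as MockTool:
    instance = MockTool.return_value  # what Tool("en-US") returns
    instance.check.side_effect = lambda text: ["match"] * 2
    metric = Metric()  # no RuntimeError: the class is mocked
    counts = metric(["a b c", "d e f"])
    MockTool.assert_called_once_with("en-US")

# Outside the with block the real class is restored.
restored = heavy_tool.Tool is RealTool
```

MockTool.return_value is the object the code under test receives when it instantiates the class, which is why the fixture configures check on it and not on MockTool itself.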

Step 4: Test the Mock

We want to make sure our mock returns the same kind of object as the real one, so downstream code doesn’t break.

# ./tests/test_grammar.py
# ...

## Mock language_tool_python.LanguageTool
#
# ...

@skipifci(reason="Requires real LanguageTool")
def test_mock_check_matches_real(mock_check):
    real = language_tool_python.LanguageTool("en-US")
    text = "Me fail English? That's unpossible!"

    for result in [real.check(text), mock_check(text)]:
        assert isinstance(result, list)
        for r in result:
            assert hasattr(r, "message")

Step 5: Test the Metric Object

Write tests that work with and without the mock:

# ./tests/test_grammar.py
# ...

## Mock language_tool_python.LanguageTool
#
# ...

## Test GrammarMetric
#

def _run_metric(metric, data):
    scores = metric(data)
    assert isinstance(scores, list)
    assert all(isinstance(s, float) for s in scores)
    assert len(scores) == len(data)


def test_grammar_metric_mocked(mock_data, mock_language_tool):
    metric = GrammarMetric()
    _run_metric(metric, mock_data)


@skipifci(reason="Requires LanguageTool download")
def test_grammar_metric_real(mock_data):
    metric = GrammarMetric()
    _run_metric(metric, mock_data)

Summary

This structure lets you:

Use real LanguageTool locally to verify your mock
Skip flaky or expensive tests in CI
Keep tests fast and deterministic without compromising coverage

Use this pattern any time you need to test around a costly external dependency.
