Writing Custom Assertion Diagnostic Functions

When validating data in Python, plain assert statements often fall short: they tell you that a check failed, but rarely why. Custom assertion diagnostic functions produce context-rich error messages that show the values involved, which makes failures faster to debug.

🧠 Understanding the Problem with Standard Assertions

Standard Python assertions provide minimal information when they fail:

Basic assert statement: Raises a bare AssertionError with no message
assert with message: Shows only the string you supply, so actual values appear only when you format them in by hand at each call site
No structured output: You must manually inspect variables to understand what went wrong

Custom diagnostic functions solve these limitations by generating detailed, human-readable failure reports.

⚙️ Anatomy of a Custom Assertion Diagnostic Function

A well-designed diagnostic function typically includes these components:

A clear function name that describes what is being validated (e.g., assert_valid_email, check_range)
Input parameters for the actual value, expected value, and optional context
Comparison logic that determines pass or fail
A descriptive failure message that shows what was expected versus what was received
Optional metadata like timestamps, data types, or source locations

🛠️ Building Your First Custom Diagnostic Function

Start with a simple pattern that checks equality and provides rich feedback:

Function name: assert_equals_with_diagnostics
Parameters: actual, expected, label="Value"
Logic: Compare actual to expected using ==
On failure: Return a string showing the label, expected value, actual value, and their data types
On success: Return None, so callers can test the result with a simple "is not None" check

This pattern can be extended to check ranges, data types, list lengths, dictionary keys, and more.
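
The pattern above could be sketched roughly as follows. This is a minimal illustration rather than a definitive implementation: the exact message wording and the use of repr() for the values are assumptions, not something the outline prescribes.

```python
def assert_equals_with_diagnostics(actual, expected, label="Value"):
    """Return None when actual == expected, otherwise a diagnostic string."""
    if actual == expected:
        return None
    # repr() (via !r) keeps 5 and "5" visually distinct in the report.
    return (
        f"{label} mismatch: "
        f"expected {expected!r} ({type(expected).__name__}), "
        f"got {actual!r} ({type(actual).__name__})"
    )


print(assert_equals_with_diagnostics("5", 5, label="Row count"))
# Row count mismatch: expected 5 (int), got '5' (str)
```

Including the type names is what makes a string-versus-number mix-up, common when reading CSV data, visible at a glance.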

📊 Comparison: Standard Assert vs Custom Diagnostic

Feature	Standard Assert	Custom Diagnostic Function
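Error message	Missing, or whatever string the caller writes	Dynamic and context-rich
Shows actual value	Only if formatted into the message by hand	Yes
Shows expected value	Only if formatted into the message by hand	Yes
Shows data types	Only if formatted into the message by hand	Yes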
Reusable across tests	Limited	Fully reusable
Readable for debugging	Poor	Excellent

🕵️ Advanced Diagnostic Techniques

Once comfortable with basic comparisons, enhance your functions with these techniques:

Type checking: Validate that inputs are the expected data type before comparing values
Boundary analysis: For numeric checks, show how far off the actual value is from the expected range
Collection inspection: For lists or dictionaries, highlight which specific elements or keys differ
Nested reporting: For complex structures, recursively generate diagnostic messages for each level
Conditional messages: Tailor the output based on the type of failure (missing key, wrong type, out of range)
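
Several of these techniques (type checking, collection inspection, nested reporting, and conditional messages) can be combined in one recursive helper. The sketch below is one possible shape, assuming nested dicts and lists such as parsed JSON; the name diff_structures and the "root" path prefix are hypothetical.

```python
def diff_structures(actual, expected, path="root"):
    """Return a list of messages describing where actual differs from expected."""
    # Type checking first: comparing values of different types is rarely useful.
    if type(actual) is not type(expected):
        return [
            f"{path}: wrong type, expected {type(expected).__name__}, "
            f"got {type(actual).__name__}"
        ]
    if isinstance(expected, dict):
        problems = []
        for key in expected:
            if key not in actual:
                problems.append(f"{path}.{key}: missing key")
            else:
                # Recurse so the path records each nesting level.
                problems.extend(
                    diff_structures(actual[key], expected[key], f"{path}.{key}")
                )
        for key in actual:
            if key not in expected:
                problems.append(f"{path}.{key}: unexpected key")
        return problems
    if isinstance(expected, list):
        if len(actual) != len(expected):
            return [f"{path}: expected {len(expected)} items, got {len(actual)}"]
        problems = []
        for index, (a, e) in enumerate(zip(actual, expected)):
            problems.extend(diff_structures(a, e, f"{path}[{index}]"))
        return problems
    if actual != expected:
        return [f"{path}: expected {expected!r}, got {actual!r}"]
    return []


expected = {"sensor": {"id": "A1", "readings": [70, 72]}, "unit": "F"}
actual = {"sensor": {"id": "A1", "readings": [70, "72"]}, "site": "lab"}
for line in diff_structures(actual, expected):
    print(line)
# root.sensor.readings[1]: wrong type, expected int, got str
# root.unit: missing key
# root.site: unexpected key
```

Returning a list rather than a single string lets the caller decide whether to print every problem or only the first.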

🧩 Practical Patterns for Common Scenarios

Pattern 1: Value Range Checker
- Accepts value, min_val, max_val, and name
- Returns a message if value is outside the allowed range
- Includes the actual value, the boundary that was violated, and by how much
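
A minimal sketch of Pattern 1, assuming inclusive bounds and numeric inputs. The function name check_range and the message wording are illustrative choices.

```python
def check_range(value, min_val, max_val, name="value"):
    """Return None if min_val <= value <= max_val, else a diagnostic string."""
    if value < min_val:
        return f"{name} = {value} is below the minimum {min_val} by {min_val - value}"
    if value > max_val:
        return f"{name} = {value} is above the maximum {max_val} by {value - max_val}"
    return None


print(check_range(105, 60, 100, name="temperature"))
# temperature = 105 is above the maximum 100 by 5
```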

Pattern 2: Dictionary Key Validator
- Accepts data_dict, required_keys, and dict_name
- Checks for missing keys and unexpected keys
- Returns a structured report listing missing and extra keys separately
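
Pattern 2 might look like the sketch below. It assumes the keys are mutually sortable (for example, all strings) so the report can list them in a stable order; check_keys is a hypothetical name.

```python
def check_keys(data_dict, required_keys, dict_name="data"):
    """Return None if the keys match exactly, else a report of missing and extra keys."""
    required = set(required_keys)
    present = set(data_dict)
    missing = sorted(required - present)  # required but absent
    extra = sorted(present - required)    # present but not expected
    if not missing and not extra:
        return None
    return f"{dict_name}: missing keys {missing}, unexpected keys {extra}"


sensor_data = {"temperature": 72, "humidity": 45}
print(check_keys(sensor_data, ["temperature", "pressure"], "sensor_data"))
# sensor_data: missing keys ['pressure'], unexpected keys ['humidity']
```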

Pattern 3: List Content Comparator
- Accepts actual_list, expected_list, and list_name
- Identifies items present in one list but not the other
- Reports differences in a clear, itemized format
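
One way to sketch Pattern 3 is shown below. It compares membership only, so it deliberately ignores ordering and duplicate counts; that simplification, the name compare_lists, and the "-" / "+" markers are assumptions made for illustration.

```python
def compare_lists(actual_list, expected_list, list_name="list"):
    """Return None if both lists hold the same items, else an itemized report."""
    # Membership tests work for unhashable items too and keep the original order.
    missing = [item for item in expected_list if item not in actual_list]
    unexpected = [item for item in actual_list if item not in expected_list]
    if not missing and not unexpected:
        return None
    lines = [f"{list_name} differs:"]
    lines += [f"  - missing: {item!r}" for item in missing]
    lines += [f"  + unexpected: {item!r}" for item in unexpected]
    return "\n".join(lines)


print(compare_lists(["ID001", "ID003"], ["ID001", "ID002"], "ids"))
# ids differs:
#   - missing: 'ID002'
#   + unexpected: 'ID003'
```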

🔄 Integrating Diagnostics into Your Workflow

Custom diagnostic functions work best when integrated into a validation pipeline:

Step 1: Define your diagnostic functions in a dedicated module (e.g., validation_helpers.py)
Step 2: Import them into your test scripts or data processing code
Step 3: Call the diagnostic function instead of a plain assert
Step 4: Check the return value — if it is not None, log or print the diagnostic message
Step 5: Use the detailed message to quickly identify and fix the issue
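
The steps above might come together as in this sketch. To keep it self-contained, the helper is defined inline where a real project would import it from validation_helpers.py; the validate_reading function, the logger name, and the thresholds are hypothetical.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
logger = logging.getLogger("validation")


# Steps 1-2: in a real project this would live in validation_helpers.py
# and be imported rather than defined here.
def check_range(value, min_val, max_val, name="value"):
    if min_val <= value <= max_val:
        return None
    return f"{name} = {value} is outside [{min_val}, {max_val}]"


def validate_reading(reading):
    """Run every check on one reading and return the list of failure messages."""
    # Step 3: call the diagnostic functions instead of plain asserts.
    results = [
        check_range(reading["temperature"], 60, 100, "temperature"),
        check_range(reading["humidity"], 0, 100, "humidity"),
    ]
    # Step 4: anything that is not None is a failure message worth logging.
    failures = [message for message in results if message is not None]
    for message in failures:
        logger.warning(message)
    return failures


failures = validate_reading({"temperature": 105, "humidity": 45})
print(failures)
# ['temperature = 105 is outside [60, 100]']
```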

✅ Best Practices for Writing Diagnostic Functions

Keep messages concise but complete: Include enough detail to debug, but avoid overwhelming noise
Use consistent formatting: Standardize how values, types, and differences are displayed
Return None on success: This makes it easy to check pass/fail with a simple conditional
Make functions composable: Build small diagnostic functions that can be combined for complex validations
Document expected behavior: Add docstrings explaining what each function checks and what its output looks like
Test your diagnostics: Verify that your failure messages are accurate and helpful by intentionally triggering failures
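
Two of these practices, composability and testing your diagnostics, can be illustrated together. In the sketch below, two small checks are combined with "or", which works because a passing check returns None; the function names are hypothetical.

```python
def check_type(value, expected_type, name="value"):
    """Return None if value is an instance of expected_type, else a diagnostic string."""
    if isinstance(value, expected_type):
        return None
    return (
        f"{name}: expected {expected_type.__name__}, "
        f"got {type(value).__name__} ({value!r})"
    )


def check_positive(value, name="value"):
    """Return None if value > 0, else a diagnostic string."""
    if value > 0:
        return None
    return f"{name}: expected a positive number, got {value!r}"


def check_positive_int(value, name="value"):
    """Compose the two checks, stopping at the first failure."""
    # "or" short-circuits, so check_positive never sees a non-int value.
    return check_type(value, int, name) or check_positive(value, name)


# Trigger each failure on purpose and confirm the message names the real problem.
assert check_positive_int(3, "count") is None
assert check_positive_int("3", "count") == "count: expected int, got str ('3')"
assert check_positive_int(-2, "count") == "count: expected a positive number, got -2"
```

The short-circuit matters here: without it, comparing the string "3" with 0 would raise a TypeError instead of producing a diagnostic.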

🚀 Taking It Further

Once comfortable with basic custom diagnostics, explore these enhancements:

Logging integration: Write diagnostic messages directly to a log file with timestamps
Color-coded output: Use terminal color codes to highlight differences in values
Aggregate reporting: Collect multiple diagnostic failures and present them as a single report
Threshold-based warnings: Instead of failing immediately, collect warnings for values that are close to boundaries
Custom exception classes: Create specialized exception types that carry diagnostic information
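
As a rough illustration of aggregate reporting combined with a custom exception class, the sketch below collects the messages returned by earlier checks and raises once with a numbered report. ValidationError and raise_if_failed are hypothetical names, and the sample messages are hard-coded stand-ins for real check results.

```python
class ValidationError(AssertionError):
    """Hypothetical exception that carries every collected diagnostic message."""

    def __init__(self, failures):
        self.failures = list(failures)
        report = "\n".join(
            f"  {i}. {msg}" for i, msg in enumerate(self.failures, start=1)
        )
        super().__init__(f"{len(self.failures)} validation failure(s):\n{report}")


def raise_if_failed(results):
    """Raise ValidationError if any result in the iterable is a failure message."""
    failures = [message for message in results if message is not None]
    if failures:
        raise ValidationError(failures)


try:
    raise_if_failed([
        None,  # a check that passed
        "temperature = 105 is above the maximum 100 by 5",
        "sensor_data: missing keys ['pressure']",
    ])
except ValidationError as error:
    print(error)
# 2 validation failure(s):
#   1. temperature = 105 is above the maximum 100 by 5
#   2. sensor_data: missing keys ['pressure']
```

Because this uses raise rather than an assert statement, the check still runs when Python is started with the -O flag, which strips assert statements.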

Custom assertion diagnostic functions transform debugging from a guessing game into a structured, efficient process. They empower you to understand exactly what went wrong and why, saving time and reducing frustration during data validation tasks.

The examples below take the other common approach: instead of returning a message, each function raises AssertionError itself, so it can be dropped into a Python test in place of a plain assert.

🛠️ Example 1: Basic custom assertion message

This example shows how to add a simple custom message to an assertion.

value = 10
expected = 5

assert value == expected, f"Expected {expected}, but got {value}"

📤 Output: AssertionError: Expected 5, but got 10

📐 Example 2: Checking value is within a range

This example demonstrates a custom diagnostic function that checks if a number falls within an acceptable range.

def assert_in_range(value, low, high):
    message = f"Value {value} is not between {low} and {high}"
    assert low <= value <= high, message

temperature = 105
assert_in_range(temperature, 60, 100)

📤 Output: AssertionError: Value 105 is not between 60 and 100

🔍 Example 3: Validating dictionary keys exist

This example shows how to check that required keys are present in a data dictionary.

def assert_has_keys(data, required_keys):
    missing = [key for key in required_keys if key not in data]
    message = f"Missing required keys: {missing}"
    assert len(missing) == 0, message

sensor_data = {"temperature": 72, "humidity": 45}
assert_has_keys(sensor_data, ["temperature", "pressure"])

📤 Output: AssertionError: Missing required keys: ['pressure']

📊 Example 4: Validating data types in a list

This example demonstrates a custom function that checks all items in a list match an expected data type.

def assert_all_type(items, expected_type):
    bad_items = [item for item in items if not isinstance(item, expected_type)]
    message = f"Items {bad_items} are not type {expected_type.__name__}"
    assert len(bad_items) == 0, message

readings = [10, 20, "thirty", 40]
assert_all_type(readings, int)

📤 Output: AssertionError: Items ['thirty'] are not type int

📋 Example 5: Comparing two CSV-like row structures

This example shows a practical diagnostic function for comparing expected and actual data rows.

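def assert_row_match(actual_row, expected_row, row_number):
    # zip() stops at the shorter row, so compare lengths first
    assert len(actual_row) == len(expected_row), (
        f"Row {row_number}: expected {len(expected_row)} columns, got {len(actual_row)}"
    )
    for col, (actual, expected) in enumerate(zip(actual_row, expected_row)):
        # Fail on the first differing column and report its zero-based index
        assert actual == expected, (
            f"Row {row_number}, Column {col}: expected '{expected}', got '{actual}'"
        )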

expected = ["ID001", "Alice", "Engineer"]
actual = ["ID001", "Alice", "Manager"]
assert_row_match(actual, expected, 5)

📤 Output: AssertionError: Row 5, Column 2: expected 'Engineer', got 'Manager'

📊 Comparison Table: Custom Assertion Functions

Function	Purpose	When to Use
`assert value == expected, message`	Simple value comparison	Basic equality checks
`assert_in_range(value, low, high)`	Range validation	Numeric sensor readings
`assert_has_keys(data, keys)`	Key presence check	JSON or dictionary validation
`assert_all_type(items, type)`	Type consistency check	List or column data validation
`assert_row_match(actual, expected, row)`	Row-by-row comparison	CSV or table data validation