You’ve built your data warehouse and your core reporting infrastructure.
That’s a huge win. But too often, a critical step is either missed or undervalued: Data Testing. I frequently encounter setups where teams have “not yet” added robust data quality tests, or sometimes not even considered them.
In my view, this is a significant oversight. Every time a stakeholder spots some dodgy-looking metrics or outdated mapping files, the confidence and trust in your hard work might crumble a bit.
Unfortunately, trust is like your fitness level: it takes months or years to reach a decent level, but you can destroy it within a few moments or days.
Here is a tangible guide on how to get started with data testing, how to maximize its impact, and how to make it as actionable as possible.
Frameworks Make Data Testing Non-Negotiable
Writing tests and quality checks for your data transformations used to be a pain point, often requiring custom scripts and workarounds. You could do it, but it was tedious.
With modern frameworks like dbt, implementing these checks has become extremely simple. There is no good reason not to use the built-in testing capabilities these tools provide. Please do it.
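For context, a minimal sketch of what this looks like in dbt: the built-in generic tests are declared in a .yml properties file next to your models (the model and column names here are hypothetical).

models:
  - name: orders                # hypothetical model
    columns:
      - name: order_id
        data_tests:
          - unique              # no order counted twice
          - not_null
      - name: status
        data_tests:
          - accepted_values:
              values: ['placed', 'shipped', 'returned']

Running dbt test (or dbt build) then executes all of these checks against your warehouse.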
Stratify Your Tests with Severity Levels
When creating your tests, I recommend splitting them into clear severity levels. Typically, classifying them as Critical and Minor is sufficient, though a Medium level can also be incorporated if needed.
Critical vs. Non-Critical Tests
In dbt, tests have a severity config. You can set tests to simply warn you (printing logs to the console without breaking anything) or to fail and block the pipeline.
models:
  - name: large_table
    columns:
      - name: slightly_unreliable_column
        data_tests:
          - not_null:
              tags: [minor]
              config:
                severity: error
                error_if: ">1000"
                warn_if: ">10"
I personally find explicit severity levels expressed through tags even better, as they leave more room for adjustment (a minor test might still deserve its own dedicated thresholds for “warning” and “failure”). The levels are valuable because they dictate a differentiated response, urgency, and Service Level Agreement (SLA); a tagging sketch follows the list below:
Critical Failure: This requires an immediate reaction. If you’ve implemented a Write-Audit-Publish pattern, only these critical failures should block the data pipeline entirely.
Minor Failure: This can likely wait until the afternoon or even the next day. These non-critical failures should not block the pipeline.
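To make that split concrete, here is the tagging sketch (model and column names are hypothetical): the critical test blocks on any failure, while the minor test only warns, and only once the failure count is clearly more than noise.

models:
  - name: orders                     # hypothetical model
    columns:
      - name: order_id
        data_tests:
          - not_null:
              tags: [critical]
              config:
                severity: error      # any failure blocks the pipeline
      - name: coupon_code
        data_tests:
          - not_null:
              tags: [minor]
              config:
                severity: warn       # never blocks
                warn_if: ">50"       # dedicated threshold: only alert beyond 50 failing rows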
Varying Run Intervals
For minor issues that don’t demand immediate attention, you might also consider adjusting the run interval. If your pipeline runs every hour, why test for minor issues in every run if you won’t jump on the topic immediately? Testing minor things once a day or even once a week can be totally sufficient; it’s a case-by-case decision.
This prevents alert spam and maintains a good signal-to-noise ratio.
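One way to wire up the different intervals (a sketch; the tag names are whatever you defined above) is to select tests by tag in separate scheduled jobs:

# hourly job: only the tests that are allowed to block the pipeline
dbt test --select tag:critical

# daily or weekly job: everything tagged as minor
dbt test --select tag:minor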
Alerting and Stakeholder Information: The Trust-Building Measures
Use Purpose-Built Alerting Tools
You must have proper alerting in place. If you are using dbt, I highly recommend exploring Elementary, the open-source package. It provides excellent Slack or Microsoft Teams alerts that include:
Actual context: Example records and the SQL statement needed to replicate the issue.
Actionable insights: The ability to tag specific people directly in the alert.
This saves you from having to write and maintain your own custom logging and alerting systems.
Inform Your Stakeholders Proactively
Think about how you inform your stakeholders. If the data is stale, wouldn’t you rather tell them before they notice it has been broken for three days?
While it might sound counterintuitive, proactively telling stakeholders that something has failed is actually a trust-building measure. It demonstrates that you are aware of the issue and are owning the topic.
Smart Alert Routing for Ownership
You can take this a step further by routing alerts based on the failure type:
If the failure points to faulty input (e.g., from the sales team’s CRM system), route the alert to the sales channel to request an immediate fix from the source.
If it’s a technical issue due to a backend change, route it to the relevant tech team.
Where applicable, this shifts the responsibility to the team best equipped to address the root cause.
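Tools like Elementary offer their own routing options (check their docs for the exact configuration). As a generic, hedged sketch, you could attach the owning team’s channel to each model via dbt’s meta property, where alert_channel is a hypothetical key your alerting layer would need to read:

models:
  - name: stg_crm_opportunities              # hypothetical model fed by the sales team's CRM
    meta:
      alert_channel: "#sales-data-quality"   # hypothetical key, resolved by your alerting layer
  - name: fct_bookings                       # hypothetical model owned by the backend team
    meta:
      alert_channel: "#backend-alerts"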
Essential Data Testing Principles
Don’t Test for Testing’s Sake
Avoid building tests simply to have a high test count. Start from the end: what situations do you want to avoid?
Duplicated revenue?
Counting orders twice?
Data that is critically out-of-date?
Start with the most critical and embarrassing stuff first. And whenever you spend a significant amount of time debugging a problem, you must write a test for it immediately afterward to prevent its recurrence.
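To make that concrete, here is a sketch targeting exactly those three situations (source, model, and column names are hypothetical): unique catches duplicated revenue and double-counted orders, and a source freshness check catches critically out-of-date data.

sources:
  - name: shop                     # hypothetical source
    tables:
      - name: raw_orders
        loaded_at_field: _loaded_at
        freshness:
          warn_after: {count: 6, period: hour}
          error_after: {count: 24, period: hour}

models:
  - name: fct_orders               # hypothetical model
    columns:
      - name: order_id
        data_tests:
          - unique                 # an order counted twice fails here
          - not_null

Note that the freshness part is checked with dbt source freshness rather than dbt test.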
Use Documentation (Doc Strings) Extensively
This is a highly recommended practice: use doc strings within the test itself or in its description to explain the following:
Why are you testing this specific condition?
What is the implication if the test fails? (Are reports affected? Is an operational workflow like invoicing affected?)
What to do about it (Who should you reach out to? How was this issue caused and solved in the past?)
This documentation is immensely helpful, especially for tests that don’t fail frequently, or if a coworker last touched the code a year or two ago.
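Here is a sketch of how this can look in dbt (newer dbt versions support a description directly on a data test; on older versions, the same notes can live in the column description or a meta block; model and column names are hypothetical):

models:
  - name: fct_invoices
    columns:
      - name: invoice_amount
        data_tests:
          - not_null:
              description: >
                Why: the nightly invoicing workflow reads this column directly.
                If it fails: affected invoices are skipped and finance reports undercount revenue.
                What to do: check the payments ingestion first; in the past this was caused by a
                delayed upstream export (reach out to the payments team).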
AI as a Tool, Not a Decision-Maker
While I use AI assistance and vibe-code frequently, do not let AI decide what and how to test.
You can use AI to write the test code itself, but you must bring the domain knowledge to the table:
AI might suggest testing that the order value is between $0 and $1 billion. But with domain knowledge of an e-commerce shop that only sells socks & underwear, you know an order over $500 is suspicious and should be flagged.
If you sell flight tickets, a transaction showing 500 passengers is dodgy and clearly indicates an input error or a faulty transformation.
AI will often generate a massive volume of redundant and nonsensical tests that lead to more noise than signal. You need to decide the parameters based on real-world business constraints.
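As a sketch of what a domain-informed boundary can look like, using the accepted_range test from the dbt_utils package (model, column, and threshold are hypothetical and should come from your own business knowledge):

models:
  - name: fct_orders                # hypothetical socks & underwear shop
    columns:
      - name: order_value
        data_tests:
          - dbt_utils.accepted_range:
              min_value: 0
              max_value: 500        # domain knowledge, not a generic "0 to 1 billion"
              config:
                severity: warn      # suspicious orders deserve a look, not a blocked pipeline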
The Rule of Thumb: Manage Alert Fatigue
If you have an alert that you consistently do not react to, turn it off or delete the test. You want to avoid alert fatigue. Either fix the issue later, or if it’s genuinely not important, remove the check.
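In dbt, switching such a check off is a one-line config change (snippet from a hypothetical column definition):

        data_tests:
          - not_null:
              config:
                enabled: false      # nobody has reacted to this alert in months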
Owning Data Quality: Getting in the Driver’s Seat
Well-set-up tests empower you to be in the driver’s seat. They enable you to proactively charge stakeholders with improving data quality in their source systems, instead of only juggling reactive answers about why an output looks weird.
A great example from my marketing background is URL parameter mapping.
A new campaign launches with new URL parameters.
The underlying mapping that associates parameters with channels is now outdated, and the channel field ends up empty.
The marketing team sees that their new channel is missing or that the data is wrong, and they come to you.
Instead, write a test that flags when more than a certain share (say, 5-10%) of a core metric (purchases, clicks, etc.) is not matched by the current mapping. Set this up as a non-critical, non-blocking check.
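Here is a sketch of such a check as a dbt singular test, i.e. a .sql file in your tests folder (fct_purchases and channel are hypothetical names; tune the 5% threshold to your traffic):

-- tests/assert_channel_mapping_covers_purchases.sql
-- Fails only when more than 5% of purchases have no mapped channel.
{{ config(severity='warn', tags=['minor']) }}

with stats as (
    select
        count(*)                                          as total_purchases,
        sum(case when channel is null then 1 else 0 end)  as unmapped_purchases
    from {{ ref('fct_purchases') }}
)

select *
from stats
where unmapped_purchases > 0.05 * total_purchases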
When the test flags a potential new parameter, you can reach out to the marketing team and ask:
Hey, you launched a new campaign.
a) How is it working out?
b) What should we call this channel?
You move from being the reactive party whose dashboard is “broken” to the proactive owner of data quality, anticipating issues and leading the solution.
Think about testing, and do it wisely.
I hope these tips are helpful.