
Data quality: dimensions, validation and acceptance criteria

DataSupplier · 15 min read

A dataset is only as useful as it is reliable. When you source external data, quality is not a vague aspiration; it is something you can define, measure and make contractually binding. This guide sets out the dimensions of data quality, how to validate a dataset, and how to write acceptance criteria that protect your project.

Available across the EU. DataSupplier sources and delivers this data in all 27 European Union countries — including Germany, France, Spain, Italy, the Netherlands and Poland — and across the EEA, in the format and cadence you need.

Why quality must be defined, not assumed

Most data disputes are quality disputes: the data arrived, but it was incomplete, stale, inconsistent or did not match the agreed schema. The way to avoid this is to define quality up front, in measurable terms, and tie it to acceptance. Quality defined late is quality you cannot enforce.

The core dimensions of data quality

  • Completeness: are all expected records and fields present?
  • Accuracy: do values correctly reflect the real world?
  • Consistency: are values coherent within and across datasets?
  • Timeliness: is the data fresh enough for the use case?
  • Validity: do values conform to the expected format and ranges?
  • Uniqueness: are there unintended duplicates?

These dimensions are reflected in standards such as ISO/IEC 25012, and they give a shared vocabulary for describing what "good" means for a specific dataset.

How to validate a dataset

Validation turns dimensions into checks. Profiling reveals distributions, null rates, ranges and outliers. Schema validation confirms structure and types. Referential checks confirm relationships hold. Cross-source reconciliation compares against a trusted reference. Sampling and manual review catch issues automated checks miss. For recurring feeds, these checks should run on every delivery, not just the first.
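As a rough illustration of the first three techniques, the sketch below profiles one field, validates rows against a schema, and runs a referential check. It assumes records arrive as plain Python dictionaries; the function names, the `orders` sample and the `schema` mapping are all hypothetical and not part of any particular tool.

```python
def profile_column(rows, field):
    """Profile one field: null rate, distinct count, min and max of non-null values."""
    values = [row.get(field) for row in rows]
    present = [v for v in values if v is not None and v != ""]
    return {
        "null_rate": 1 - len(present) / len(values) if values else 0.0,
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


def schema_violations(rows, schema):
    """Return (row_index, field, problem) for missing fields or wrong types.

    Null values are not flagged here; they are a completeness question.
    """
    problems = []
    for i, row in enumerate(rows):
        for field, expected_type in schema.items():
            if field not in row:
                problems.append((i, field, "missing"))
            elif row[field] is not None and not isinstance(row[field], expected_type):
                problems.append((i, field, "wrong type"))
    return problems


def orphan_keys(child_rows, key, parent_ids):
    """Referential check: child keys that have no matching parent record."""
    return sorted({r[key] for r in child_rows if r[key] not in parent_ids})


# Hypothetical sample delivery and agreed schema.
orders = [
    {"order_id": 1, "customer_id": "C1", "amount": 20.0},
    {"order_id": 2, "customer_id": "C2", "amount": None},
    {"order_id": "3", "customer_id": "C9", "amount": 15.0},
]
schema = {"order_id": int, "customer_id": str, "amount": float}
known_customers = {"C1", "C2"}

amount_profile = profile_column(orders, "amount")
violations = schema_violations(orders, schema)
orphans = orphan_keys(orders, "customer_id", known_customers)
```

Here the profile reports that one in three `amount` values is null, the schema check flags the string-typed `order_id` in the third row, and the referential check reports `C9` as a customer that does not exist in the parent set. Cross-source reconciliation and manual sampling would sit alongside these checks rather than replace them.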

Setting acceptance criteria

Acceptance criteria are the thresholds at which data is considered fit to use, expressed as measurable targets: for example, completeness above a defined percentage, freshness within a defined window, zero schema violations, and a duplicate rate below a defined threshold. They should be specific enough to test objectively, and agreed before delivery so both sides know what "accepted" means.
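One way to keep criteria objectively testable is to write them down as data rather than prose. The sketch below is a minimal example of that idea; the `Criterion` class, the metric names and every threshold value are illustrative assumptions, since the real figures are whatever the two parties agree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    metric: str       # name of the measured value
    op: str           # ">=" (at least) or "<=" (at most)
    threshold: float

    def __post_init__(self):
        if self.op not in (">=", "<="):
            raise ValueError(f"unsupported operator: {self.op}")

    def passes(self, value):
        return value >= self.threshold if self.op == ">=" else value <= self.threshold


# Hypothetical thresholds, for illustration only.
CRITERIA = [
    Criterion("completeness_pct", ">=", 98.0),
    Criterion("freshness_hours", "<=", 24.0),
    Criterion("schema_violations", "<=", 0),
    Criterion("duplicate_rate_pct", "<=", 0.5),
]


def evaluate(measured, criteria):
    """Return a list of failure messages; an empty list means the delivery is accepted."""
    failures = []
    for c in criteria:
        if c.metric not in measured:
            # A metric nobody measured cannot count as a pass.
            failures.append(f"{c.metric}: not measured")
        elif not c.passes(measured[c.metric]):
            failures.append(
                f"{c.metric}: {measured[c.metric]} (required {c.op} {c.threshold})"
            )
    return failures


# Measurements for one hypothetical delivery.
measured = {
    "completeness_pct": 99.2,
    "freshness_hours": 30.0,
    "schema_violations": 0,
    "duplicate_rate_pct": 0.1,
}
failures = evaluate(measured, CRITERIA)
accepted = not failures
```

In this example the delivery is rejected on freshness alone, and the failure message states both the measured value and the agreed limit, which is the kind of evidence that makes an acceptance decision easy to discuss with a supplier.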

Quality in recurring supply

For ongoing feeds, quality is a process, not a one-off gate. That means monitoring against the agreed criteria, alerting on breaches, a defined remediation path, and a change process for when the source itself changes. Service-level agreements and data contracts make these expectations explicit and enforceable.
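A change process needs a trigger, and a simple one is detecting when a delivery's structure drifts from what was agreed. The following sketch assumes both the agreed and the delivered schema are available as field-to-type-name mappings; the `schema_drift` function and the sample field names are hypothetical.

```python
def schema_drift(agreed, delivered):
    """Compare two field -> type-name mappings and report the differences."""
    agreed_fields, delivered_fields = set(agreed), set(delivered)
    return {
        "added": sorted(delivered_fields - agreed_fields),
        "removed": sorted(agreed_fields - delivered_fields),
        "retyped": sorted(
            f for f in agreed_fields & delivered_fields if agreed[f] != delivered[f]
        ),
    }


# Hypothetical agreed schema and the schema observed in a new delivery.
agreed = {"id": "int", "price": "float", "updated_at": "timestamp"}
delivered = {"id": "int", "price": "string", "currency": "string"}

drift = schema_drift(agreed, delivered)
needs_change_request = any(drift.values())
```

Any non-empty entry in `drift` is a reason to pause and go through the change process rather than let the new structure flow silently into downstream systems.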

The role of a managed supply partner

A managed approach builds validation and acceptance into the supply process: profiling and checking data on the way in, transforming it to meet the agreed schema, and documenting quality so the buyer can trust, and audit, what they receive. This is especially valuable when combining multiple sources, where inconsistencies are most likely to appear.

The six dimensions, applied

The dimensions are only useful when turned into concrete checks for a specific dataset. Completeness becomes a null-rate threshold per field and an expected record count; accuracy becomes validation against a trusted reference; consistency becomes cross-field and cross-source rules; timeliness becomes a freshness window; validity becomes format and range checks; uniqueness becomes a duplicate threshold. Written this way, “good quality” stops being an opinion and becomes something you can test on every delivery.
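To make the mapping concrete, the sketch below computes one metric per dimension for a small in-memory dataset. It is an illustration under stated assumptions: the helper names, the sample records, the email pattern and the `reference` lookup are all invented for this example, and each helper assumes at least one row.

```python
import re
from datetime import datetime, timedelta, timezone


def null_rate(rows, field):
    """Completeness: share of rows where the field is missing or empty."""
    return sum(1 for r in rows if r.get(field) in (None, "")) / len(rows)


def match_rate(rows, key, field, reference):
    """Accuracy: share of rows whose value equals the trusted reference value."""
    return sum(1 for r in rows if reference.get(r[key]) == r[field]) / len(rows)


def rule_violations(rows, rule):
    """Consistency: number of rows that break a cross-field rule."""
    return sum(1 for r in rows if not rule(r))


def max_age(rows, field, now):
    """Timeliness: age of the stalest record."""
    return max(now - r[field] for r in rows)


def invalid_rate(rows, field, pattern):
    """Validity: share of non-empty values that do not match the pattern."""
    values = [r[field] for r in rows if r.get(field) not in (None, "")]
    if not values:
        return 0.0
    return sum(1 for v in values if not re.fullmatch(pattern, v)) / len(values)


def duplicate_rate(rows, key):
    """Uniqueness: share of rows that repeat an already-seen key."""
    return 1 - len({r[key] for r in rows}) / len(rows)


# Hypothetical delivery and trusted reference.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
rows = [
    {"id": "A1", "country": "DE", "email": "a@example.com", "start": 2019, "end": 2021,
     "updated": now - timedelta(hours=2)},
    {"id": "A2", "country": "FR", "email": "", "start": 2020, "end": 2018,
     "updated": now - timedelta(hours=30)},
    {"id": "A2", "country": "ES", "email": "not-an-email", "start": 2021, "end": 2022,
     "updated": now - timedelta(hours=5)},
    {"id": "A3", "country": "IT", "email": "c@example.com", "start": 2022, "end": 2023,
     "updated": now - timedelta(hours=1)},
]
reference = {"A1": "DE", "A2": "FR", "A3": "IT"}

report = {
    "completeness_null_rate": null_rate(rows, "email"),
    "accuracy_match_rate": match_rate(rows, "id", "country", reference),
    "consistency_violations": rule_violations(rows, lambda r: r["start"] <= r["end"]),
    "timeliness_max_age": max_age(rows, "updated", now),
    "validity_invalid_rate": invalid_rate(rows, "email", r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "uniqueness_duplicate_rate": duplicate_rate(rows, "id"),
}
```

Each entry in `report` is a number (or a duration) that can be compared directly with an agreed threshold, which is what turns a dimension into a check.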

Automating quality checks

For recurring feeds, quality must be enforced automatically, not inspected by hand. A practical pipeline profiles each delivery, runs the dimension checks, compares the results against the agreed acceptance criteria, and alerts on breaches before data reaches consumers. Combined with the remediation path and change process that recurring supply already requires, this makes quality a controlled process rather than a recurring surprise.
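The shape of such a pipeline can be sketched as a single gate function that either publishes a delivery or holds it back and raises an alert. Everything here is an assumption made for illustration: `process_delivery`, the `measure` function, the two criteria and the use of plain lists as stand-ins for a publishing target and an alerting channel.

```python
def process_delivery(rows, measure, criteria, publish, alert):
    """Measure a delivery, compare it with the criteria, then publish or hold it back.

    criteria maps a metric name to a function that returns True when the value passes.
    A metric that was not measured counts as a breach.
    """
    metrics = measure(rows)
    breaches = {
        name: metrics.get(name)
        for name, passes in criteria.items()
        if name not in metrics or not passes(metrics[name])
    }
    if breaches:
        alert(breaches)   # the data is held back and never reaches consumers
        return False
    publish(rows)
    return True


def measure(rows):
    """Hypothetical measurement step: row count and duplicate rate on 'id'."""
    ids = [r["id"] for r in rows]
    return {
        "row_count": len(rows),
        "duplicate_rate": (1 - len(set(ids)) / len(ids)) if ids else 0.0,
    }


# Hypothetical acceptance criteria for this feed.
criteria = {
    "row_count": lambda n: n >= 3,
    "duplicate_rate": lambda rate: rate == 0.0,
}

published, alerts = [], []   # stand-ins for a real sink and a real alert channel

good_delivery = [{"id": 1}, {"id": 2}, {"id": 3}]
bad_delivery = [{"id": 1}, {"id": 1}, {"id": 2}]

good_ok = process_delivery(good_delivery, measure, criteria, published.extend, alerts.append)
bad_ok = process_delivery(bad_delivery, measure, criteria, published.extend, alerts.append)
```

The design choice worth noting is that publishing only happens on the success path, so a failed check cannot be skipped by accident; the second delivery produces an alert about its duplicate rate and nothing from it is published.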

Key takeaways

  • Define quality in measurable terms before delivery, not after.
  • Use the standard dimensions: completeness, accuracy, consistency, timeliness, validity, uniqueness.
  • Turn dimensions into automated checks that run on every delivery.
  • Write acceptance criteria you can test objectively, and back them with SLAs.

Sources & further reading

  • ISO/IEC 25012: Data quality model.
  • ISO 8000: Data quality.
  • DAMA-DMBOK: Data Management Body of Knowledge, data-quality dimensions.
  • European Commission: data-quality guidance within the European data spaces.