Reference data and code lists

DataSupplier·12 min read

Reference data is the unglamorous backbone that makes datasets interoperable. This guide covers what reference data and code lists are, and why they matter when working with external data.

What reference data is

Reference data is the set of standard codes, classifications and identifiers (country codes, currencies, industry codes, units) that give meaning and consistency to other data. It changes slowly but underpins everything.

Why it matters

Combining datasets requires shared reference data; mismatched classifications or identifiers silently break joins and aggregations. Good reference data is what makes integration work.
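To make the "silent break" concrete, here is a minimal sketch in plain Python. The rows, the lookup table and the function name total_by_region are all hypothetical; the only real-world detail is that GB and GBR are the alpha-2 and alpha-3 ISO 3166-1 codes for the same country. A join that simply skips unmatched keys would report a total with no hint that anything is missing, so this version returns the unmatched rows alongside the totals.

```python
from collections import defaultdict

# Hypothetical revenue rows from one source, keyed by ISO 3166-1 alpha-2 codes.
revenue = [
    {"country": "DE", "amount": 120.0},
    {"country": "FR", "amount": 80.0},
    {"country": "GB", "amount": 50.0},
]

# Hypothetical region lookup from another source that uses alpha-3 for one row.
region_by_country = {"DE": "EU", "FR": "EU", "GBR": "Non-EU"}


def total_by_region(rows, lookup):
    """Inner-join rows to the lookup and sum amounts per region.

    Rows whose code is missing from the lookup are returned separately
    instead of being dropped without a trace.
    """
    totals = defaultdict(float)
    unmatched = []
    for row in rows:
        region = lookup.get(row["country"])
        if region is None:
            unmatched.append(row)
            continue
        totals[region] += row["amount"]
    return dict(totals), unmatched


totals, unmatched = total_by_region(revenue, region_by_country)
print(totals)     # {'EU': 200.0}
print(unmatched)  # [{'country': 'GB', 'amount': 50.0}]
```

Without the unmatched list, the "Non-EU" region would simply be absent from the result, which is exactly the kind of error that goes unnoticed in an aggregate.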

The data landscape

  • Classifications: industry (NACE), product (CPV, HS), geography (NUTS).
  • Identifiers: country (ISO 3166), currency (ISO 4217), entity (LEI).
  • Mappings: crosswalks between schemes.
  • Versions: revisions over time.

Mappings and versions

Different sources use different schemes, so crosswalks between them are essential. Classifications are also revised over time, so version handling matters for time series.
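One way to handle versions is to decode each code against the revision that was in force when the observation was recorded. The sketch below assumes a made-up industry scheme with two revisions; the names SchemeVersion, VERSIONS, version_for and decode, and all codes, labels and dates, are illustrative and not taken from any real classification.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional


@dataclass
class SchemeVersion:
    """One revision of a classification, valid over a half-open date range."""
    name: str
    valid_from: date
    valid_to: Optional[date]  # None means the revision is still in force
    labels: Dict[str, str]


# Hypothetical revisions of a made-up scheme; codes and dates are illustrative.
VERSIONS = [
    SchemeVersion("rev1", date(2000, 1, 1), date(2010, 1, 1), {"72": "Computing"}),
    SchemeVersion("rev2", date(2010, 1, 1), None,
                  {"62": "Programming", "63": "Information services"}),
]


def version_for(observed_on: date) -> SchemeVersion:
    """Return the revision in force on the observation date."""
    for v in VERSIONS:
        if v.valid_from <= observed_on and (v.valid_to is None or observed_on < v.valid_to):
            return v
    raise LookupError(f"no scheme version covers {observed_on}")


def decode(code: str, observed_on: date) -> str:
    """Decode a code against the revision that applied when it was recorded."""
    version = version_for(observed_on)
    if code not in version.labels:
        raise LookupError(f"code {code!r} is not defined in {version.name}")
    return f"{version.name}:{version.labels[code]}"


print(decode("72", date(2005, 6, 30)))  # rev1:Computing
print(decode("62", date(2015, 6, 30)))  # rev2:Programming
```

Decoding "72" with a 2015 date raises LookupError in this sketch, which is the desired behaviour: a code from a retired revision should fail loudly rather than be matched against the wrong version.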

Sourcing considerations

Standards bodies and official sources provide authoritative reference data, much of it open. Keeping it current and mapped is the work.

In a managed model

A managed partner can maintain reference data and mappings so sourced datasets align consistently.

Crosswalks and versions

Different sources use different classification schemes, so crosswalks between them are essential. Schemes are also revised, so time series need version handling to stay comparable across revisions. A mismatch between, say, two industry classifications silently breaks joins and aggregations. Maintaining current, mapped reference data is the quiet backbone that lets sourced datasets align.
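A crosswalk is rarely a clean one-to-one table: a code in one scheme may split across several codes in another. The following sketch assumes a hypothetical crosswalk between two invented schemes, A and B, with illustrative allocation weights. The CROSSWALK table and the convert function are placeholders for whatever mapping a real project maintains.

```python
from collections import defaultdict

# Hypothetical crosswalk from scheme A to scheme B. A source code may map to
# several target codes; the weights are illustrative allocation shares.
CROSSWALK = {
    "A10": [("B1", 1.0)],
    "A20": [("B1", 0.25), ("B2", 0.75)],
}


def convert(values_by_a, crosswalk):
    """Re-express values keyed by scheme A codes in scheme B codes.

    Returns the converted totals plus the source codes with no mapping,
    so gaps in the crosswalk are reported rather than silently dropped.
    """
    converted = defaultdict(float)
    unmapped = []
    for code, value in values_by_a.items():
        targets = crosswalk.get(code)
        if not targets:
            unmapped.append(code)
            continue
        for target, weight in targets:
            converted[target] += value * weight
    return dict(converted), unmapped


converted, unmapped = convert({"A10": 100.0, "A20": 40.0, "A30": 5.0}, CROSSWALK)
print(converted)  # {'B1': 110.0, 'B2': 30.0}
print(unmapped)   # ['A30']
```

Whether splitting by fixed weights is appropriate depends on the data. For counts or identifiers, a one-to-many mapping may need manual review instead of proportional allocation.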

Authoritative and largely open

Standards bodies and official sources provide authoritative reference data, much of it open: Eurostat (NACE, NUTS, CPV), ISO codes, the WCO Harmonised System and GLEIF’s LEI. The work is keeping it current and mapped, so it stays a reliable anchor for integration rather than a source of silent error.
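Keeping a local copy current usually starts with knowing what changed in a new release. As a rough sketch, assuming each code list can be loaded as a plain mapping of code to label, a comparison might look like the following. The function diff_code_lists and the sample codes are hypothetical placeholders, not drawn from any published list.

```python
def diff_code_lists(local, published):
    """Compare a local copy of a code list with a newly published one.

    Both arguments map code -> label. Returns sorted lists of added and
    removed codes, plus codes whose label changed.
    """
    added = sorted(published.keys() - local.keys())
    removed = sorted(local.keys() - published.keys())
    relabelled = sorted(
        code for code in local.keys() & published.keys()
        if local[code] != published[code]
    )
    return {"added": added, "removed": removed, "relabelled": relabelled}


# Hypothetical code lists; the codes and labels are placeholders.
local = {"X1": "Alpha", "X2": "Beta", "X3": "Gamma"}
published = {"X1": "Alpha", "X2": "Beta (revised)", "X4": "Delta"}

print(diff_code_lists(local, published))
# {'added': ['X4'], 'removed': ['X3'], 'relabelled': ['X2']}
```

Removed and relabelled codes are the ones most likely to need a crosswalk update, since existing data may still carry them.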

Key takeaways

  • Reference data gives meaning and consistency to other data.
  • Mismatched classifications silently break joins.
  • Crosswalks between schemes and version handling are essential.
  • Standards bodies provide authoritative, often open reference data.

Sources & further reading

  • Eurostat: NACE, NUTS and CPV classifications.
  • ISO: country, currency and unit codes.
  • GLEIF: Legal Entity Identifier.
  • WCO: Harmonised System codes.