Geospatial data formats and standards (GeoJSON, GeoParquet, OGC)
A great deal of external data is fundamentally spatial, tied to a place. Delivering it well means understanding geospatial formats, coordinate systems and standards. This guide covers the essentials for buyers and integrators.
Why geospatial is different
Spatial data carries geometry and a coordinate reference system, and datasets can only be combined when both align. Getting formats and projections right determines whether layers overlay correctly or end up offset from one another.
Key formats
- GeoJSON: simple, web-friendly vector format for moderate volumes.
- GeoParquet: columnar, compressed vector format for large analytical datasets.
- Shapefile and GeoPackage: established vector formats with broad tool support.
- Raster formats (e.g. GeoTIFF, Cloud-Optimized GeoTIFF): for imagery and gridded data.
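To make the simplest of these formats concrete, the following is a minimal sketch of a GeoJSON FeatureCollection built with Python's standard library. The helper name make_point_feature and the sample coordinates and properties are illustrative assumptions, not part of any standard tooling. GeoJSON (RFC 7946) stores positions as longitude first, then latitude, in WGS 84.

```python
import json

def make_point_feature(lon, lat, properties):
    """Build a GeoJSON Feature for a single point (hypothetical helper).

    GeoJSON positions are [longitude, latitude] in WGS 84.
    """
    # A range check catches some, but not all, swapped lon/lat pairs.
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError("coordinates out of range; check lon/lat order")
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": dict(properties),
    }

# Illustrative values only (roughly central Brussels).
feature = make_point_feature(4.3517, 50.8466, {"name": "example site"})
collection = {"type": "FeatureCollection", "features": [feature]}
geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

The same structure scales poorly to millions of features because it is uncompressed text, which is where a columnar format such as GeoParquet becomes the better fit.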
Coordinate reference systems
Every spatial dataset has a coordinate reference system (CRS), such as WGS 84. Mixing datasets in different CRSs without reprojection produces misaligned results. Confirming and harmonising the CRS is a basic but critical step.
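As a rough illustration of the confirmation step, the sketch below refuses to combine layers whose declared CRS identifiers differ. The function name check_same_crs and the layer-name-to-identifier mapping are hypothetical and stand in for whatever metadata your tooling exposes. EPSG:4326 is the common identifier for WGS 84 latitude/longitude; EPSG:27700 (British National Grid) is used only as an example of a national grid.

```python
def check_same_crs(layers):
    """Return the shared CRS identifier, or raise if layers disagree.

    `layers` maps a layer name to a CRS identifier string such as
    "EPSG:4326". This mapping is a hypothetical metadata record,
    not the API of any particular library.
    """
    if not layers:
        raise ValueError("no layers given")
    missing = [name for name, crs in layers.items() if not crs]
    if missing:
        raise ValueError(f"layers without a declared CRS: {missing}")
    # Normalise case and whitespace so "epsg:4326" matches "EPSG:4326".
    distinct = {crs.strip().upper() for crs in layers.values()}
    if len(distinct) > 1:
        raise ValueError(
            f"mixed CRSs, reproject before combining: {sorted(distinct)}"
        )
    return distinct.pop()

layers = {"stores": "EPSG:4326", "catchments": "epsg:4326"}
print(check_same_crs(layers))
```

A string comparison like this only catches declared mismatches. It cannot detect a dataset whose metadata is wrong, which is why checking alignment on known points remains worthwhile.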
OGC and interoperability
The Open Geospatial Consortium (OGC) defines standards for data formats and web services (such as WMS for map images, WFS for vector features and the newer OGC API family) that make spatial data interoperable across tools and vendors. In Europe, the INSPIRE framework shapes how public spatial data is shared.
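As an example of what such a service looks like to an integrator, the sketch below builds a request URL for an OGC API - Features endpoint using the standard's /collections/{collectionId}/items path and its bbox and limit query parameters. The base URL, the collection name "buildings" and the helper name items_url are placeholders assumed for illustration; no request is sent.

```python
from urllib.parse import urlencode

def items_url(base_url, collection_id, bbox, limit=100):
    """Build an OGC API - Features items request URL (hypothetical helper).

    bbox is (min_lon, min_lat, max_lon, max_lat). Unless another CRS is
    requested, the standard interprets bbox as WGS 84 longitude/latitude.
    """
    min_x, min_y, max_x, max_y = bbox
    # min_x may exceed max_x for boxes crossing the antimeridian,
    # so only the latitude ordering is validated here.
    if min_y > max_y:
        raise ValueError("bbox latitudes are out of order")
    query = urlencode(
        {"bbox": f"{min_x},{min_y},{max_x},{max_y}", "limit": limit},
        safe=",",  # keep the commas in bbox readable
    )
    return f"{base_url.rstrip('/')}/collections/{collection_id}/items?{query}"

# Placeholder endpoint and collection, for illustration only.
url = items_url("https://example.org/ogcapi/", "buildings", (4.2, 50.7, 4.5, 50.9), limit=10)
print(url)
```

A conforming server would typically answer such a request with a GeoJSON FeatureCollection, which is part of what makes the two a natural pairing for web integration.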
Delivery considerations
For analytics at scale, loading GeoParquet into a warehouse is efficient; for web and integration use, GeoJSON or OGC API services fit; for imagery, cloud-optimized rasters allow partial reads. Match the format to the consumer and the data volume.
Sourcing and governance
Spatial data that pinpoints individuals or premises can constitute personal data under rules such as the GDPR, so aggregation may be needed before delivery. Provenance, CRS and licensing should be documented so downstream users can trust and combine the data.
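One common way to aggregate is to snap points to a grid and publish only per-cell counts, suppressing cells with too few records. The sketch below assumes that approach; the function name aggregate_to_grid, the 0.01 degree cell size and the minimum count of 5 are illustrative choices, not legal or regulatory thresholds.

```python
import math
from collections import Counter

def aggregate_to_grid(points, cell_deg=0.01, min_count=5):
    """Count points per grid cell and drop cells below min_count.

    points: iterable of (lon, lat) in degrees.
    Returns {(col, row): count}, where col and row are integer cell
    indices (floor of lon / cell_deg and lat / cell_deg).
    """
    counts = Counter(
        (math.floor(lon / cell_deg), math.floor(lat / cell_deg))
        for lon, lat in points
    )
    # Suppress sparse cells that could single out an individual or premises.
    return {cell: n for cell, n in counts.items() if n >= min_count}

# Made-up sample: five nearby points and one isolated point.
points = [
    (4.3511, 50.8461), (4.3522, 50.8462), (4.3533, 50.8463),
    (4.3544, 50.8464), (4.3555, 50.8465),
    (4.9055, 50.1005),
]
print(aggregate_to_grid(points))
```

The isolated point falls in a cell of its own and is dropped from the output, while the five clustered points survive as a single count. Whether a given cell size and threshold are sufficient depends on the data and the applicable rules.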
Coordinate reference systems in practice
The most common geospatial error is mixing coordinate reference systems. A dataset in WGS 84 (used by GPS and GeoJSON) will not overlay correctly on one in a national grid or a Web Mercator tileset unless it is reprojected. Before combining layers, confirm each dataset’s CRS, reproject everything to one chosen system, and verify alignment on known points. Coordinate order (GeoJSON, for example, stores longitude first, then latitude), axis conventions and datum differences are small details that silently corrupt analysis when ignored.
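The reproject-then-verify routine can be sketched with the standard spherical Web Mercator (EPSG:3857) forward formula. This is a hand-rolled illustration only: in practice a projection library such as pyproj would normally do this work, and this sketch handles no datum shifts or national grids. The function names and the control point are assumptions made for the example.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, as used by EPSG:3857
MAX_LAT = 85.0511           # approximate latitude limit of Web Mercator

def wgs84_to_web_mercator(lon, lat):
    """Project WGS 84 longitude/latitude (degrees) to Web Mercator metres."""
    if not -180.0 <= lon <= 180.0 or not -MAX_LAT <= lat <= MAX_LAT:
        raise ValueError("coordinate outside the Web Mercator domain")
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y

def max_offset_m(points_a, points_b):
    """Largest distance in metres between matching control points."""
    return max(
        math.hypot(ax - bx, ay - by)
        for (ax, ay), (bx, by) in zip(points_a, points_b)
    )

# One illustrative control point, given as (lon, lat).
control = [(4.3517, 50.8466)]
correct = [wgs84_to_web_mercator(lon, lat) for lon, lat in control]
# Simulate a dataset whose coordinates were read in lat/lon order.
swapped = [wgs84_to_web_mercator(lat, lon) for lon, lat in control]

print(f"offset from swapped axes: {max_offset_m(correct, swapped) / 1000:.0f} km")
```

An offset of thousands of kilometres on a control point is easy to spot. Datum differences are harder because they usually shift points by metres rather than kilometres, so the tolerance used in a real check should reflect the accuracy the analysis needs.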
Choosing a format by use
For large analytical workloads, GeoParquet brings columnar efficiency to vector data; for web and integration, GeoJSON or live OGC API services fit; for established GIS tooling, GeoPackage is robust; for imagery and grids, cloud-optimized GeoTIFF allows partial reads without downloading whole scenes. Matching the format to the consumer and volume, rather than converting everything to one default, keeps geospatial delivery efficient and interoperable.
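The guidance above can be written down as a small lookup, which some teams find useful as a starting default in delivery tooling. The function name suggest_format and the category labels ("analytics", "web", "gis") are assumptions introduced for this sketch; the mapping simply restates the recommendations in the text.

```python
# Suggested vector format per use, restating the guidance in the text.
VECTOR_BY_USE = {
    "analytics": "GeoParquet",
    "web": "GeoJSON or an OGC API service",
    "gis": "GeoPackage",
}

def suggest_format(data_kind, use=None):
    """Suggest a delivery format (hypothetical helper).

    data_kind: "vector" or "raster".
    use: "analytics", "web" or "gis" (only consulted for vector data).
    """
    if data_kind == "raster":
        # Imagery and grids: partial reads without downloading whole scenes.
        return "Cloud-Optimized GeoTIFF"
    if data_kind == "vector":
        try:
            return VECTOR_BY_USE[use]
        except KeyError:
            raise ValueError(f"unknown use for vector data: {use!r}") from None
    raise ValueError(f"unknown data kind: {data_kind!r}")

print(suggest_format("vector", "analytics"))
print(suggest_format("raster"))
```

A lookup like this is only a default. Volume, update frequency and the consumer's existing tooling can all justify a different choice.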
Key takeaways
- Spatial data needs a known, harmonised coordinate reference system.
- Use GeoParquet for scale, GeoJSON or OGC API for web and integration.
- OGC standards, together with INSPIRE in Europe, underpin geospatial interoperability.
- Aggregate where spatial data could identify individuals or premises.
Sources & further reading
- Open Geospatial Consortium (OGC): standards and OGC API.
- GeoParquet specification.
- European Commission: INSPIRE Directive for spatial data.
- EUR-Lex: Regulation (EU) 2016/679 (GDPR).