Data Sources for geospatial and remote sensing analysis
This page signposts useful data sources for geospatial ecology in R, including best options and supporting evidence.
- rstac rapid search, access and download of spatiotemporal Earth observation data via SpatioTemporal Asset Catalog (STAC) APIs, including the Microsoft Planetary Computer (see MPC data catalogue). This is usually the most efficient way to access most satellite-derived datasets for TESS Lab research. Data sources and products include Landsat, MODIS and ESA Sentinel data; ESA WorldCover; fire (e.g., burned area); weather and climate (e.g., ECMWF, including ERA5); and infrastructure such as building footprints.
- rsi simplified access to geospatial data from STAC (SpatioTemporal Asset Catalog) sources and calculation of spectral indices. It also gives access to a global Digital Elevation Model (DEM): by default the Copernicus 30 m DEM, which is generally considered to be the best non-commercial product ([1], [2], [3]; see also [4], [5]), although NASADEM is also very good.
- Weather and climate
- chirps API client for the Climate Hazards Center ‘CHIRPS’ and ‘CHIRTS’ data. ‘CHIRPS’ is a quasi-global (50°S – 50°N), high-resolution (0.05 arc-degrees) rainfall data set, which incorporates satellite imagery and in-situ station data to create gridded rainfall time series for trend analysis and seasonal drought monitoring. Many independent studies have highlighted that CHIRPS outperforms nearly all other precipitation products in Africa across most metrics ([1], [2], [3], [4], [5], [6], [7], [8], [9]). TAMSAT performs similarly, with slightly better correspondence for daily totals, slightly worse correspondence for multi-day totals, and a slightly finer spatial resolution.
- ‘CHIRTS’ is a quasi-global (60°S – 70°N), high-resolution data set of daily maximum and minimum temperatures.
- This Geospatial Data Catalog signposts over 18,000 free geospatial datasets, giving for each a description (including key variables), the time-series coverage and the spatial resolution. It was built by Rob Johnsen, who builds geospatial platforms for the World Bank, and you can add new datasets.
- Chewie for efficient download, management and manipulation of the GEDI 2A, 2B, 1B, 4A spaceborne LiDAR products.
- rayvista a small plugin for the rayshader package, to create 3D visualisations of any location on Earth.
- chmloader global Canopy Height Model (CHM) at 1 m spatial resolution (after Tolan et al., 2024).
- rnaturalearth facilitates access to Natural Earth vector and raster data, including physical (e.g., coastline, lakes, glaciated areas) and cultural (e.g., country boundaries, airports, roads, railroads) features.
- OpenStreetMap gives access to OpenStreetMap raster images.
- mapme.biodiversity allows you to download and process a number of open datasets related to biodiversity conservation, providing efficient routines and parallelization options. Datasets include, among others, Global Forest Watch, ESA/Copernicus land cover, WorldClim and NASA FIRMS.
- marmap for downloading, manipulating, and plotting bathymetric and topographic data in R (querying NOAA’s ETOPO1 database).
- osfr facilitates access to open research materials and data in the Open Science Framework (OSF).
- cshapes has historical country boundaries (1886-today).
- gbifdb a high-performance interface to the Global Biodiversity Information Facility (GBIF). ‘gbifdb’ offers enhanced performance for R users running large-scale analyses on servers and cloud computing providers, with full support for arbitrary ‘SQL’ or ‘dplyr’ operations on the complete ‘GBIF’ data tables.
- rgeedim to search, composite, and download ‘Google Earth Engine’ imagery, using ‘reticulate’ bindings for the ‘Python’ module ‘geedim’, with documentation here. NB: there is an alternative way to extract assets using GDAL, but the process is quite involved (creating a Google Cloud bucket, getting a key into the system variables, etc.).
- If using tmap, this web explorer shows the different basemaps available (“Esri.WorldImagery” is one of the true colour satellite imagery options).
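The STAC-based packages listed above (rstac, rsi) work by querying a catalogue's search endpoint with a small set of filters: which collection, where, and when. As a language-neutral illustration, the sketch below builds such a query in Python without sending it. The endpoint URL, the collection id `landsat-c2-l2`, the bounding box and the helper name `build_stac_search` are assumptions chosen for illustration and should be checked against the MPC data catalogue.

```python
import json

# Assumption: public STAC API root of the Microsoft Planetary Computer.
MPC_STAC_ROOT = "https://planetarycomputer.microsoft.com/api/stac/v1"


def build_stac_search(collection, bbox, start, end, limit=10):
    """Return (url, body) for a STAC item search. Nothing is sent over the network."""
    west, south, east, north = bbox
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must lie in [-180, 180]")
    if not (-90 <= south < north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south < north <= 90")
    body = {
        "collections": [collection],
        "bbox": [west, south, east, north],  # WGS84, longitude/latitude order
        "datetime": f"{start}/{end}",        # closed interval of RFC 3339 timestamps
        "limit": limit,                      # maximum items per page of results
    }
    return f"{MPC_STAC_ROOT}/search", body


url, body = build_stac_search(
    collection="landsat-c2-l2",         # assumed collection id
    bbox=(36.0, -2.0, 37.0, -1.0),      # hypothetical area of interest
    start="2023-06-01T00:00:00Z",
    end="2023-06-30T23:59:59Z",
)
print(url)
print(json.dumps(body, indent=2))
```

In rstac the same ingredients appear as the `collections`, `bbox`, `datetime` and `limit` arguments of `stac_search()`, so a query worked out this way should translate directly to R.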
- Other geospatial data sets (for importing to R)
- Soil properties:
- Globally from SoilGrids (interactive website including manual download, and also a service for linking with R), and
- Africa-specific iSDA (interactive website, and guidance on accessing data via AWS), which slightly outperforms SoilGrids for most soil properties in Africa and draws on a larger number of training samples. Note that the 30 m spatial resolution needs to be used cautiously because not all predictor covariates were available at that resolution.
- Available/Free fine-resolution optical data
- UK Airborne Survey Data from 1981 to present (ca. 130 TB) is available from the UK Centre for Environmental Data Analysis (CEDA). (Pre-2018 data from here and post-2019 data from here). For help email helpdesk@neodaas.ac.uk.
- Global: Via the Copernicus Data Warehouse, ESA provides free access for research use to optical image data of limited extent (e.g., GeoEye, WorldView 2 and WorldView 3, with up to 8 bands and ground sampling distances down to 0.3 m).
- Historical imagery from the declassified CORONA (up to 1.8 m GSD, 1960-1972) and HEXAGON (up to 0.6 m GSD, 1971-1984) spy satellite programmes, covering most of the globe apart from Greenland, Australia and Antarctica, usually with panchromatic data (Info 1, Info 2, download via the USGS; distribution efforts are ongoing).
- Global: Planet SkySat Public Ortho Imagery, data from Planet Labs Inc. SkySat satellites for the short-term “Skybox for Good” Beta programme in 2015 and a few limited other times/locations. Available as a 5-band Multispectral/Pan collection and a Pansharpened RGB collection, with ca. 0.8-2 m ground sampling distance. See also more SkySat sample data via ESA.
- Global: ‘Functional Map of the World’, a global mapping effort with >1 million images from >200 countries, 3-8 bands, varying ground sampling distances, and ~3.5 TB of data hosted on AWS.
- Global: SpaceNet Challenge Datasets free WorldView-3 optical data for limited extents (100 locations), 8-band, 0.3 m ground sampling distance, multi-temporal.
- Global: WorldStrat Dataset free coverage of ~10,000 km² from Airbus SPOT 6/7 satellites, ground sampling distance of 1.5 m in panchromatic and 6 m in multispectral.
- USA: National Agriculture Imagery Program (NAIP), 1-2 m ground sampling distance, 3-4 bands, >9 million km² surveyed yearly.
- Digital Earth Africa data catalogue.
- Google Earth Engine Data Catalogue:
- inc. PMLv2 gridded vegetation gross primary productivity and evapotranspiration product at 500 m grain.
- Google Earth Engine Community Catalog: including hundreds of geospatial datasets that aren’t in the core GEE collections.
- Road network density: The Global Roads Inventory Project (GRIP) dataset, and Microsoft Global Roads dataset (millions more km of roads).
- Database of Global Administrative Areas (GADM) (see also https://www.geoboundaries.org/, sometimes useful to cross-check GADM)
- PlanetLabs’ Education and Research program offers free (but commercially restricted) access to optical data from their satellite constellation (PlanetScope and RapidEye products with ca. 3-5 m spatial resolution).
- AidData has lots of geospatial datasets.
- Dryad open data platform for research, with datasets including fine-resolution satellite data, flora and fauna occurrence, geospatial GDP, malaria risk maps, etc.
- Zenodo open data platform for research, with datasets on climate and environmental variables, road infrastructure, and geospatial GDP estimates.
- Harvard Dataverse, a repository for researchers to publish their data, including air pollution, population grids, etc.
- The Humanitarian Data Exchange is one of the best resources for geospatial data such as population estimates, indices of socio-economic deprivation, etc.
- Global Health Data Exchange is one of the best resources for geospatial health and well-being data, including pollution, child mortality, vaccinations, malaria incidence and educational attainment.
