An interactive web atlas for exploring predicted soil bacterial diversity across Sub-Saharan Africa, derived from 16S rRNA amplicon sequencing of ~810 soil samples across 9 countries.
Built with R and Shiny. Deployed on Posit Connect.
This atlas serves spatially continuous predictions of Hill diversity indices and evenness at 0.05° resolution (~5 km), estimated from Bayesian hierarchical models fitted to soil microbiome data from BioProject PRJNA807934. Users can query predicted diversity values at any location in Sub-Saharan Africa by clicking the map or hovering the cursor over it.
Benin, Botswana, Côte d'Ivoire, Kenya, Mozambique, Namibia, South Africa, Zambia, Zimbabwe
| Layer | Symbol | Unit |
|---|---|---|
| Species richness | q = 0 | ASVs |
| Shannon diversity | q = 1 | Effective species |
| Simpson diversity | q = 2 | Effective species |
| Evenness | logE | - |
Each layer is served as a mean prediction together with a 90% posterior predictive interval (PPI), which reflects the uncertainty of the underlying Bayesian model.
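For reference, the Hill numbers listed in the table can be computed from a vector of ASV counts. The sketch below is a minimal base R illustration, not the atlas code: `hill_number()` is a hypothetical helper, the counts are made up, and the evenness line assumes the convention logE = log(D1 / D0), which this README does not confirm.

```r
# Hypothetical helper: Hill number of order q from raw ASV counts.
hill_number <- function(counts, q) {
  p <- counts[counts > 0] / sum(counts)   # relative abundances, zeros dropped
  if (isTRUE(all.equal(q, 1))) {
    # q = 1 is defined as a limit: the exponential of Shannon entropy
    return(exp(-sum(p * log(p))))
  }
  sum(p^q)^(1 / (1 - q))
}

counts <- c(50, 30, 15, 5)      # made-up ASV counts for one sample
D0 <- hill_number(counts, 0)    # species richness (number of ASVs present)
D1 <- hill_number(counts, 1)    # Shannon diversity (effective species)
D2 <- hill_number(counts, 2)    # Simpson diversity (effective species)

# ASSUMPTION: evenness taken as log(D1 / D0); the atlas may define logE differently.
logE <- log(D1 / D0)
```

For a perfectly even community all three orders equal the richness, so logE would be 0 under the assumed convention; uneven communities give D0 ≥ D1 ≥ D2.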
.
├── global.R # Startup: loads rasters, defines metadata & helpers
├── ui.R # Full-screen map layout with floating panels
├── server.R # Reactive logic: layer swap, map click, cursor tooltip
├── R/
│ ├── mod_map.R # Map Shiny module
│ ├── mod_predict.R # Prediction Shiny module
│ └── utils.R # Utility functions
├── www/
│ └── atlas.css # Custom CSS (dark forest-green theme)
├── data/ # Pre-computed raster predictions (not tracked by git)
│ ├── pred_mean_D0.tif
│ ├── pred_mean_D1.tif
│ ├── pred_mean_D2.tif
│ ├── pred_mean_logE.tif
│ ├── pred_mean_D0_even.tif
│ ├── pred_ppi90_D0.tif
│ ├── pred_ppi90_D1.tif
│ ├── pred_ppi90_D2.tif
│ ├── pred_ppi90_logE.tif
│ └── pred_ppi90_D0_even.tif
└── sda.Rproj
Note: The data/ directory is excluded from version control (see .gitignore) because of file sizes. Raster files must be obtained separately; see Data below.
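Because the rasters are not tracked by git, it can help to confirm they are all in place before launching the app. The following is a minimal sketch in base R; the file names come from the directory tree above, while `missing_rasters()` is a hypothetical helper that is not part of the repository.

```r
# File names taken from the directory tree above.
layers <- c("D0", "D1", "D2", "logE", "D0_even")
expected <- c(paste0("pred_mean_", layers, ".tif"),
              paste0("pred_ppi90_", layers, ".tif"))

# Hypothetical helper: returns the expected rasters that are absent from data_dir.
missing_rasters <- function(data_dir = "data") {
  expected[!file.exists(file.path(data_dir, expected))]
}

absent <- missing_rasters("data")
if (length(absent) > 0) {
  message("Missing raster files: ", paste(absent, collapse = ", "))
}
```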
- R ≥ 4.2
- The following packages:
install.packages(c(
"shiny", "leaflet", "leaflet.extras",
"terra", "sf", "bslib"
))
terra requires GDAL and PROJ system libraries. On Ubuntu/Debian:
sudo apt-get install libgdal-dev libproj-dev
On macOS with Homebrew:
brew install gdal proj
Clone the repository and place the raster files in data/, then from the project root:
shiny::runApp(".", launch.browser = TRUE)
Or open any of global.R, ui.R, server.R in RStudio and click Run App.
Fix the port to keep a stable URL across restarts during development:
shiny::runApp(".", port = 4242, launch.browser = TRUE)
Raster predictions are archived on Zenodo:
The pre-computed raster predictions are generated by a Bayesian hierarchical model (Student-t likelihood, non-centred parametrisation) fitted in Stan via CmdStanR. The modelling pipeline is maintained in a separate repository.
Raster specifications:
- CRS: EPSG:4326 (WGS 84)
- Resolution: 0.05° (~5 km at the equator)
- Format: Cloud-Optimised GeoTIFF (COG)
- Coverage: Sub-Saharan Africa
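A downloaded raster can be checked against these specifications with terra. This is a sketch under assumptions: `matches_spec()` is a hypothetical helper, and the raster is built in memory as a stand-in for a real file such as data/pred_mean_D0.tif.

```r
library(terra)

# Hypothetical helper: TRUE when a SpatRaster matches the CRS and resolution listed above.
matches_spec <- function(r, res_deg = 0.05, tol = 1e-8) {
  info <- terra::crs(r, describe = TRUE)   # data.frame including authority and code
  is_wgs84 <- isTRUE(info$authority == "EPSG") && isTRUE(info$code == "4326")
  is_res <- all(abs(terra::res(r) - res_deg) < tol)
  is_wgs84 && is_res
}

# Stand-in raster; in practice this would be terra::rast("data/pred_mean_D0.tif").
r <- terra::rast(xmin = 30, xmax = 31, ymin = -1, ymax = 0,
                 resolution = 0.05, crs = "EPSG:4326")
matches_spec(r)
```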
To convert local TIF files to COG format before deployment:
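library(terra)
# Collect every GeoTIFF in data/
tif_files <- list.files("data", pattern = "\\.tif$", full.names = TRUE)
for (f in tif_files) {
  # Write the COG to a temporary file first so the original is never left half-written
  tmp <- paste0(f, ".tmp.tif")
  r <- terra::rast(f)
  terra::writeRaster(r, tmp, filetype = "COG", overwrite = TRUE)
  # Replace the original and report whether the rename succeeded
  if (file.rename(tmp, f)) {
    message("Converted: ", basename(f))
  } else {
    warning("Could not replace: ", basename(f))
  }
}
The atlas is deployed on Posit Connect. To push an update: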
rsconnect::deployApp(
appDir = ".",
appName = "africa-soil-diversity-atlas"
)
| Setting | Recommended value |
|---|---|
| RAM | ≥ 2 GB (4 GB preferred) |
| Max worker processes | ≥ 2 |
| Connection timeout | 60 s |
| Idle timeout | 120 s |
These are set under App => Settings => Runtime in the Connect dashboard.
Single-language stack: the entire pipeline from raster loading to UI rendering is in R, with no Python or JavaScript framework dependencies beyond what Leaflet and Shiny provide.
Pre-computed rasters over on-the-fly prediction: all diversity indices are pre-computed on a fixed grid. The app performs only fast terra::extract() lookups at clicked or hovered coordinates, keeping response time sub-second regardless of model complexity.
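The lookup described above can be sketched as follows. The example uses a small in-memory raster as a stand-in for a prediction file; `lookup_prediction()` and the layer name `pred_mean` are hypothetical, and the one-layer-per-file layout is an assumption.

```r
library(terra)

# Stand-in for a prediction raster: a 1 x 1 degree tile at 0.05 degree resolution.
# ASSUMPTION: one layer per file; the real rasters may be organised differently.
r <- terra::rast(xmin = 30, xmax = 31, ymin = -1, ymax = 0,
                 resolution = 0.05, crs = "EPSG:4326")
values(r) <- seq_len(ncell(r))
names(r) <- "pred_mean"

# Hypothetical helper: value of the cell containing (lon, lat), NA outside coverage.
lookup_prediction <- function(r, lon, lat) {
  vals <- terra::extract(r, cbind(lon, lat))
  vals[, "pred_mean"]
}

lookup_prediction(r, lon = 30.025, lat = -0.025)  # centre of the top-left cell
lookup_prediction(r, lon = 10, lat = 50)          # outside the tile, so NA
```

In a Shiny server the coordinates would typically come from a Leaflet click event, which exposes `lng` and `lat` fields; the map's output ID is not stated in this README.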
Student-t Bayesian model: the underlying statistical model uses a Student-t likelihood rather than a Normal one. The choice proved decisive for this dataset: switching reversed several ecological conclusions and improved ELPD (expected log pointwise predictive density) by ~1,470 units relative to Normal-likelihood models.
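To give a rough sense of why the likelihood choice can matter, the sketch below compares the log-density of a few standardised residuals under each distribution. It is an illustration only, not the atlas model (which lives in a separate repository); the residual values and the 4 degrees of freedom are assumptions chosen for the example.

```r
# Log-density of standardised residuals under each likelihood.
# ASSUMPTION: 4 degrees of freedom, chosen only for illustration.
resid <- c(0.5, 2, 8)                        # typical, moderate, extreme residual
ll_normal  <- dnorm(resid, log = TRUE)
ll_student <- dt(resid, df = 4, log = TRUE)

data.frame(resid, ll_normal, ll_student)
```

The extreme residual is penalised far more heavily under the Normal likelihood, so a handful of outlying samples can dominate the fit; the heavier tails of the Student-t limit that influence.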
Bivariate evenness model: richness and evenness are modelled jointly, which allows the detection of aridity effects that move the two indices in opposite directions, a result not detectable with univariate models.
If you use this atlas or the underlying data in your work, please cite:
[Citation to be added upon publication]
The source sequencing data are available at NCBI BioProject PRJNA807934.