Crandore Hub

TemporalModelR

Temporally Explicit Species Distribution Modelling

Increases the ease of implementing a temporally-explicit modeling methodology when building ecological niche and species distribution models. Provides functions to assist with three major steps of temporally-explicit models: (i) preprocessing species and environmental data and generating suitable background or pseudoabsence data, (ii) building a niche model and generating temporally-explicit predictions from that model, and (iii) model postprocessing to explore spatiotemporal trends in model predictions. Methodological and theoretical foundations are described in Ingenloff and Peterson (2021) <doi:10.1111/2041-210X.13564>, Franklin (2010, ISBN:9780521700023), Peterson et al. (2011, ISBN:9780691136882), Blonder (2018) <doi:10.1111/ecog.03187>, Senay et al. (2013) <doi:10.1371/journal.pone.0071218>, and Li and Zhang (2024) <doi:10.48550/arXiv.2404.05933>.

README

Temporally Explicit Species Distribution Modeling in R
================
Connor Hughes, Mariana Castaneda-Guzman, Luis E. Escobar

- [Background](#background)
- [Package description](#package-description)
- [Installing the package](#installing-the-package)
- [Workflow in TemporalModelR](#workflow-in-temporalmodelr)
  - [Preprocessing](#preprocessing)
  - [Modeling](#modeling)
  - [Postprocessing](#postprocessing)
- [Citation](#citation)
- [Note on AI usage](#note-on-ai-usage)
- [Contributing](#contributing)
- [Getting help](#getting-help)

<!-- README.md is generated from README.Rmd. Please edit that file -->

<!-- badges: start -->

[![R-CMD-check](https://github.com/CJHughes926/TemporalModelR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/CJHughes926/TemporalModelR/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->

## Background

Ecological niche modeling (ENM) and species distribution modeling (SDM)
are widely used to reconstruct and predict species environmental
tolerances and geographic distributions, expanding ecological and
biogeographic theory. Despite their widespread use, one aspect that has
received relatively little attention is the temporally dynamic
environment.

Environmental conditions shaping species distributions vary over time,
including climate, resource availability, and habitat availability. As a
result, the suitability of any given location may change as key
environmental conditions become more or less favorable for a species.
Most existing implementations of ENM assume a static relationship
between species and environment: occurrences are pooled across years,
predictors are time-averaged, and a single model is fit to the combined
dataset. This works well when relationships are genuinely stable, but it
carries three costs when they aren’t:

1.  **Temporal mismatch.** Pairing each occurrence with long-term
    average conditions can misrepresent the environment the species
    actually experienced at the moment it was recorded, leading to
    over-generalized niche estimates and higher rates of both
    extrapolation and omission.
2.  **Lost training data.** Standard spatial rarefaction retains one
    point per pixel, discarding temporally independent observations from
    the same location. When environments at a location have changed over
    time, those repeat observations carry unique information about the
    species niche.
3.  **No temporal dynamics in the output.** Static models produce a
    single predictive surface and cannot directly answer questions about
    how suitable conditions for a species have changed over time, or
    what environmental drivers are responsible for those changes.

Temporally-explicit ENM offer an alternative practice which preserve the
time stamp on each observation. Each occurrence is paired with the
environmental conditions it experienced *at the time step it was
recorded*, rather than against a long-term average or other static
variable. The resulting models are sensitive to year-to-year (or
season-to-season) variation in the species-environment relationship, and
predictions are projected as time series rather than as a single static
surface.

**TemporalModelR** is an R package for building temporally explicit SDMs
end-to-end, from raw occurrence records to time-resolved trend analyses.
It provides a consistent and replicable workflow for implementing
temporally explicit ENM.

<br>

## Package description

**TemporalModelR** operates across two complementary dimensions to
support temporally explicit modeling:

- **E-space (Environmental Space)** represents a location in terms of
  its environmental conditions, independent of geography. A species
  niche in E-space is its tolerance for a given set of environmental
  variables.
- **G-space (Geographic Space)** represents a location in terms of where
  it physically sits on the landscape. Every point in G-space has a
  corresponding location in E-space.

Because environmental conditions change over time, the same point in
G-space may move through E-space across years, decades, or seasons.
Traditional, temporally-static workflows assume both G-space and E-space
to be stable in time. TemporalModelR assumes only that the *niche*
(i.e., the species tolerance in E-space) is stable across the study
period, while the E-space coordinates of any G-space location may change
over time. Species observations are therefore matched to E-space data
correct in both space and time, the niche is modeled in time-independent
E-space. The resulting niche estimate is projected back onto G-space at
explicit time periods to produce temporally dynamic predictions across
time rather than a single static surface.

The package structures the SDM workflow around three phases:

- **Preprocessing**: aligning environmental rasters to a common grid,
  rarefying occurrence data in space and time, extracting environmental
  values matched to observation time, scaling rasters to a standardized
  range, partitioning into spatially and/or temporally independent
  cross-validation folds, and generating fold-stratified pseudoabsences
  for presence/absence models.
- **Modeling**: fitting one of four supported algorithms (GLM, GAM,
  random forest, hypervolume) per cross-validation fold, with automatic
  threshold selection and per-fold evaluation metrics in both E-space
  (time-independent) and G-space (time-specific).
- **Postprocessing**: combining per-fold predictions into a consensus
  binary stack for more straightforward assessments, classifying each
  pixel’s temporal trajectory via changepoint detection into
  never-suitable, always-suitable, no pattern, increasing, decreasing,
  or fluctuating in suitability, and aggregating patterns across
  user-defined spatial units like administrative regions or watersheds.

Every function in the package is built around the concept of **time
placeholders** in raster filenames (`forest_cover_2005.tif`,
`pr_seas_2003_Spring.tif`). Patterns map clean predictor names to
filenames; `time_cols` defines which columns in your occurrence data
correspond to those placeholders. For example, a column in species
occurrence data which labeled `year` with subsequent values `2001`,
`2002`, `2005`, and a user defined variable name as `forest_cover_year`,
would match to rasters `forest_cover_2001`, `forest_cover_2002`, and
`forest_cover_2005`. Likewise, a databse with both columns `year` and
`month` and a user defined variable name of `pr_m_year_month` would
match to rasters like `pr_m_1997_09`.

This makes the workflow flexible to whether your time scale is annual,
monthly, daily, or anything in between, so long as you have rasters data
at the same temporal level. It also allows different predictors to vary
at different time scales within the same model, flexibly allowing
temporally static variables (e.g. elevation) to be processed in the same
suite and used in the same models as temporally dynamic variables, which
may themselves be measured at different time steps (e.g. annual forest
cover estimates, monthly precipitation).

<br>

## Installing the package

Note: Internet connection is required to install the package.

The development version of TemporalModelR can be installed using the
code below.

``` r
# install.packages("devtools")
devtools::install_github("CJHughes926/TemporalModelR")
```

<br>

## Workflow in TemporalModelR

A typical TemporalModelR workflow follows these steps:

1.  **Preprocess** — align rasters, rarefy occurrences, extract and
    optionally scale environmental values, partition into folds, and
    generate pseudoabsences (if needed)
2.  **Model** — fit a GLM, GAM, random forest, or hypervolume model,
    assess time-independent model performance. Project the fitted model
    across space and time, assess temporally-explicit model performance
3.  **Postprocess** — summarize predictions to consensus surfaces,
    identify temporal patterns, and aggregate by spatial unit

A brief description of each step is presented below. For full
walkthrough of each phase, see the vignettes hosted on the
[TemporalModelR site](https://CJHughes926.github.io/TemporalModelR/)
under the *Articles* menu, also linked inline throughout this README.

The package ships a small synthetic dataset that every vignette runs
against, so the full workflow can be reproduced end-to-end without
downloading any external data. The dataset includes raw and aligned
environmental rasters, an occurrence point database, intermediate point
files, and pre-computed `data()` objects from each phase of the
workflow. Reading the data set vignette first is recommended, as it
provides the shared context every other vignette refers back to.

> For a description of the synthetic data and the bundled `data()`
> objects used in the workflow vignettes, see the [About the Example
> Dataset](https://cjhughes926.github.io/TemporalModelR/articles/V1_dataset.html)
> vignette.

<br>

### Preprocessing

The preprocessing pipeline transforms raw occurrence records and
environmental rasters into the structured inputs that downstream models
expect. Six functions cover this phase. `raster_align()` standardizes
rasters to a common projection, extent, and resolution to prevent
misalignment in future analyses. `spatiotemporal_rarefaction()` subsets
data to one point per pixel per time step, reducing pseudoreplication
from your data but preserving truly unique temporally independent
observations from the same location. `temporally_explicit_extraction()`
extracts environmental values from the raster that matches each point’s
observation time step, producing a table of observations with their
relevant E-space conditions correct in both space and time, along with
the means and standard deviations of each variable across the data set.
`scale_rasters()` optionally uses those scaling parameters to z-score
every raster, ensuring variables share a comparable range and contribute
evenly to subsequent modeling algorithms sensitive to that.
`spatiotemporal_partition()` builds cross-validation folds via one of
four methods: purely spatial, purely temporal, spatiotemporal (combining
both), or random as a baseline. Finally, `generate_absences()` produces
fold-stratified pseudoabsences relevant for presence/absence models.

> For a detailed workflow showcasing each preprocessing function, see
> the [Preprocessing temporally explicit
> data](https://cjhughes926.github.io/TemporalModelR/articles/V2_Preprocessing.html)
> vignette.

<br>

### Modeling

TemporalModelR supports four modeling algorithms, all of which share the
same data requirements (`partition_result`, `pseudoabsence_result` for
presnce/absence models, a `model_formula` or `model_vars`). One model is
fit per cross-validation fold; each is evaluated on its held-out test
set, and for any continuous models a threshold is selected and applied
to produce only binary results.

- **`build_temporal_glm()`** fits one generalized linear model per fold
  with a binomial link function. Supports any combination of linear,
  polynomial, and interactive terms via standard R formula syntax.
- **`build_temporal_gam()`** fits one generalized additive model per
  fold via `mgcv::gam()`. Allows for complex nonlinear relationships
  between environmental gradients and species presence.
- **`build_temporal_rf()`** fits one random forest classifier per fold
  via `randomForest::randomForest()`. Uses ensembles of decision trees
  to capture complex nonlinear relationships and interactions.
- **`build_temporal_hv()`** constructs n-dimensional hypervolumes per
  fold via Gaussian kernel density estimation or one-class support
  vector machine. The presence-only option; no pseudoabsences required.

Each model builder produces evaluation metrics across folds in E-space
(time-independent), capturing overall niche estimation quality
regardless of when the test points were observed. E-space evaluation
gives a stable picture of model performance even in years where
occurrence records are sparse.

Once a model is fit, `generate_spatiotemporal_predictions()` projects it
across user-defined time steps. The function accepts any of the four
model types via the `model_result` argument and writes one prediction
raster per fold per time step alongside per-time-step evaluation
metrics. The same call handles single-timestep and multi-timestep
workflows given the relevant raster data is available: for example
`time_steps = data.frame(year = 1:15)` produces 15 annual maps;
`expand.grid(year = 1:15, season = "Spring")` produces 15 spring
snapshots; `expand.grid(year = 1:15, season = c("Spring", "Summer"))`
produces 30 maps spanning two seasons.

Once predictions are projected into G-space, time-specific assessment
metrics become available, including cumulative binomial probability
(CBP) tests assessing whether predictions in a given time step are
better than random. G-space evaluation complements the E-space metrics
from model fitting and both may be considered in model evaluation:
G-space is robust and specific in years with substantial sample sizes
but loses meaning when few records exist for a given time step, while
E-space gives a more stable overall picture independent of time.
Reporting both gives the most complete view of model performance.

> For a full walkthrough of each modeling algorithm, including formula
> syntax, parameter choices, G-space projection, and diagnostic
> interpretation, see the four modeling vignettes: [Modeling with a
> GLM](https://cjhughes926.github.io/TemporalModelR/articles/V3a_GLM.html),
> [Modeling with a
> GAM](https://cjhughes926.github.io/TemporalModelR/articles/V3b_GAM.html),
> [Modeling with a Random
> Forest](https://cjhughes926.github.io/TemporalModelR/articles/V3c_RF.html),
> and [Modeling with a
> Hypervolume](https://cjhughes926.github.io/TemporalModelR/articles/V3d_HV.html).

<br>

### Postprocessing

The postprocessing phase converts the per-fold prediction rasters into
interpretable summaries. `summarize_raster_outputs()` collapses
fold-level predictions into a binary consensus stack (where a
user-specified number of folds agree on suitability) and a frequency
raster (proportion of time steps each pixel was suitable) which explored
the stability of G-space suitability projections across time.
`analyze_temporal_patterns()` uses the `fastcpd` package to detect
structured changes in each pixel’s suitability across the time series,
classifying trajectories as never-suitable, always-suitable, no pattern,
increasing, decreasing, or fluctuating. The changepoint analysis
accounts for temporal autocorrelation by including each pixel’s
previous-time-step value as a predictor in the change point detection
model; users can optionally enable spatial autocorrelation, which adds
the proportion of a pixel’s first-order (queen’s case) neighbors
predicted as suitable as an additional covariate.
`analyze_trends_by_spatial_unit()` aggregates pixel-level patterns
across user-defined polygons (administrative units, watersheds,
ecoregions), producing useful tables summarizing losses and gains in
suitability over time across a these regions, and is useful for
identifying regional variation in suitability trends.

> For the full postprocessing workflow including consensus threshold
> selection, changepoint diagnostics, and zone aggregation, see the
> [Postprocessing
> predictions](https://cjhughes926.github.io/TemporalModelR/articles/V4_Postprocessing.html)
> vignette.

<br>

## Citation

If you use TemporalModelR in your research, please cite:

Hughes, C., Castaneda-Guzman, M., & Escobar, L.E. (2026).
*TemporalModelR: An R Package for temporally-explicit species
distribution modeling.* \[Journal information to be added\]

<br>

## Note on AI usage

To maintain high standards of code quality and documentation, we have
used AI LLM tools while constructing code for this package. We used
these tools for grammatical polishing and exploring technical
implementation strategies for specialized functions. We manually checked
and tested all code and documentation refined with these tools.

<br>

## Contributing

We welcome contributions to improve `TemporalModelR`. To maintain the
integrity and performance of the package, we follow a few core
principles:

- **Quality over quantity**: we prioritize well-thought-out, stable
  improvements over frequent, minor changes. Please ensure your code is
  well-documented and follows the existing style of the package.
- **Minimal dependencies**: one of the goals of TemporalModelR is to
  remain efficient. We prefer solutions that use base R or existing
  dependencies. Proposals that introduce new package dependencies will
  be evaluated for their necessity.
- **AI-assisted code**: if you use AI tools to generate code
  alternatives or improvements, please manually verify the logic and
  accuracy of the output and demonstrate the benefit in your Pull
  Request.
- **Testing**: new features should include examples, and tests should be
  performed to ensure they work as intended and do not break existing
  workflows.

If you have an idea for a major change, please open an Issue first to
discuss it with the maintainers.

<br>

## Getting help

- Report bugs or request features: [GitHub
  Issues](https://github.com/CJHughes926/TemporalModelR/issues)
- Ask questions: [GitHub
  Discussions](https://github.com/CJHughes926/TemporalModelR/discussions)

Versions across snapshots

VersionRepositoryFileSize
0.2.0 rolling linux/jammy R-4.5 TemporalModelR_0.2.0.tar.gz 4.4 MiB
0.2.0 rolling linux/noble R-4.5 TemporalModelR_0.2.0.tar.gz 5.7 MiB
0.2.0 rolling source/ R- TemporalModelR_0.2.0.tar.gz 4.4 MiB
0.2.0 latest linux/jammy R-4.5 TemporalModelR_0.2.0.tar.gz 4.4 MiB
0.2.0 latest linux/noble R-4.5 TemporalModelR_0.2.0.tar.gz 5.7 MiB
0.2.0 latest source/ R- TemporalModelR_0.2.0.tar.gz 4.4 MiB
0.2.0 2026-04-23 source/ R- TemporalModelR_0.2.0.tar.gz 0 B

Dependencies (latest)

Imports

Suggests