Crandore Hub

survextrap

Bayesian Flexible Parametric Survival Modelling and Extrapolation

Survival analysis using a flexible Bayesian model for individual-level right-censored data, optionally combined with aggregate data on counts of survivors in different periods of time. An M-spline is used to describe the hazard function, with a prior on the coefficients that controls over-fitting. Proportional hazards or flexible non-proportional hazards models can be used to relate survival to predictors. Additive hazards (relative survival) models, waning treatment effects, and mixture cure models are also supported. Priors can be customised and calibrated to substantive beliefs. Posterior distributions are estimated using 'Stan', and outputs are arranged in a tidy format. See Jackson (2023) <doi:10.1186/s12874-023-02094-1>.

README

The survextrap R package
================

<!-- README.md is generated from README.Rmd. Please edit that file -->

# survextrap

`survextrap` is an R package for parametric survival modelling with
either or both of:

1.  A standard individual-level, right-censored survival dataset, e.g.

<table>

<tr>

<th>

Survival time
</th>

<th>

Death
</th>

<th>

Predictors…
</th>

</tr>

<tr>

<td>

2 years
</td>

<td>

Yes
</td>

<td>

</td>

</tr>

<tr>

<td>

5 years
</td>

<td>

No
</td>

<td>

</td>

</tr>

<tr>

<td>

etc…
</td>

<td>

</td>

</tr>

</table>

2.  (optionally) “External” data sources in the following aggregate
    “count” form:

<table>

<tr>

<th colspan="2">

Follow-up period
</th>

<th colspan="2">

Number
</th>

<th>

Predictors…
</th>

</tr>

<tr>

<th>

Start time $t$
</th>

<th>

End time $u$
</th>

<th>

Alive at $t$
</th>

<th>

Still alive at $u$
</th>

<th>

</th>

</tr>

<tr>

<td>

$t_{1}$
</td>

<td>

$u_{1}$
</td>

<td>

$n_{1}$
</td>

<td>

$r_{1}$
</td>

<td>

</td>

</tr>

<tr>

<td>

$t_{2}$
</td>

<td>

$u_{2}$
</td>

<td>

$n_{2}$
</td>

<td>

$r_{2}$
</td>

<td>

</td>

</tr>

<tr>

<td>

etc…
</td>

<td>

</td>

<td>

</td>

<td>

</td>

<td>

</td>

</tr>

</table>

Any number of rows can be supplied for the “external” data, and the time
intervals do not have to be distinct or exhaustive.
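
As a minimal sketch, external data in this form could be held in an ordinary data frame with one row per follow-up interval. The column names `start`, `stop`, `n` and `r` are assumed here to be the ones expected by the `external` argument of `survextrap()` (check `?survextrap` to confirm), and the counts are invented purely for illustration.

```r
# Hypothetical external data: four follow-up intervals, made-up counts.
extdat <- data.frame(
  start = c(5, 10, 15, 20),       # t: start of follow-up period (years)
  stop  = c(10, 15, 20, 25),      # u: end of follow-up period (years)
  n     = c(100, 100, 100, 100),  # number alive at t
  r     = c(50, 40, 30, 20)       # number of those still alive at u
)

# Each row informs the conditional probability of surviving to u,
# given survival to t. Its simple empirical estimate is r / n.
cond_surv <- with(extdat, r / n)
print(cbind(extdat, cond_surv))
```

Rows may overlap in time or leave gaps, since each row is treated as a separate piece of evidence about survival over its own interval.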

Many forms of external data that might be useful for survival
extrapolation (such as population data, registry data or elicited
judgements) can be manipulated into this common “count” form.
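
For example, published mortality rates might be converted into counts along the following lines. This is only a sketch: the helper `rate_to_counts()`, the rates and the nominal sample size `n = 1000` are all hypothetical, and the conversion assumes a constant hazard within each interval.

```r
# Hypothetical helper: convert mortality rates (per person-year) over
# intervals [start, stop] into "alive at start / still alive at stop" counts.
# Assumes a constant hazard within each interval, so that
# P(alive at stop | alive at start) = exp(-rate * (stop - start)).
rate_to_counts <- function(start, stop, rate, n = 1000) {
  p_surv <- exp(-rate * (stop - start))
  data.frame(start = start, stop = stop, n = n, r = round(n * p_surv))
}

# Made-up population mortality rates for three successive 5-year periods
popdat <- rate_to_counts(
  start = c(5, 10, 15),
  stop  = c(10, 15, 20),
  rate  = c(0.02, 0.03, 0.05)
)
print(popdat)
```

The choice of `n` expresses how much weight the external source should carry: a larger `n` with the same proportion `r / n` represents stronger evidence about survival over that interval.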

### Principles

- Extrapolations from short-term individual-level data should be done
  using *explicit data or judgements* about how risk will change over
  time.

- Extrapolations should not rely on standard parametric forms
  (e.g. Weibull, log-normal, gamma…) that are only used out of
  convention and do not have interpretations as plausible *mechanisms*
  for how risk will change over time.

- Instead of selecting (or averaging) traditional parametric models, an
  *arbitrarily flexible* parametric model should be used, that *adapts*
  to give the optimal fit to the short-term and long-term data in
  combination.
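
To illustrate the second principle, the base R sketch below (not part of the package) shows how a standard form constrains extrapolation. A Weibull hazard can only rise or only fall over time, so its long-term behaviour is fixed by its shape parameter rather than by any evidence about the future. The parameter values are arbitrary.

```r
# Hazard of a Weibull distribution: h(t) = f(t) / S(t)
weibull_hazard <- function(t, shape, scale) {
  dweibull(t, shape, scale) / pweibull(t, shape, scale, lower.tail = FALSE)
}

t <- seq(0.5, 30, by = 0.5)
h_inc <- weibull_hazard(t, shape = 1.5, scale = 10)  # shape > 1: always rising
h_dec <- weibull_hazard(t, shape = 0.7, scale = 10)  # shape < 1: always falling

# Whatever happens after the end of follow-up, the extrapolated hazard
# can only continue in the same direction.
print(all(diff(h_inc) > 0))
print(all(diff(h_dec) < 0))
```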

### How it works

- Bayesian multiparameter evidence synthesis is used to jointly model
  all sources of data and judgements.

- An M-spline is used to represent how the hazard changes through time
  (as in [rstanarm](https://arxiv.org/abs/2002.09633)). The Bayesian
  fitting method automatically chooses the optimal level of smoothness
  and flexibility. Spline “knots” should span the period covered by the
  data, and any future period where there is a chance that the hazard
  may vary. Then if there is no data in the future period, the
  uncertainty will be acknowledged and the predicted hazards will have
  wide credible intervals.

- A proportional hazards model or a flexible non-proportional hazards
  model can be used to describe the relation of survival to predictors.

- Mixture cure, relative survival and treatment effect waning models are
  supported.

- It has an R interface, designed to be friendly to those familiar with
  standard R modelling functions.

- [Stan](https://mc-stan.org/) is used under the surface to do MCMC
  (Hamiltonian Monte Carlo) sampling from the posterior distribution, in
  a similar fashion to [rstanarm](https://mc-stan.org/rstanarm/) and
  [survHE](https://CRAN.R-project.org/package=survHE).

- Estimates, posterior summaries and posterior samples of outputs such
  as survival, hazard and (restricted) mean survival can easily be
  extracted.
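
The workflow described above might look roughly like the following sketch. It assumes the `colons` example dataset shipped with the package, invented external counts, and knot positions chosen only for illustration. The argument names (`external`, `mspline`, `add_knots`, `fit_method`) should be checked against `?survextrap`. `fit_method = "opt"` is used here as a fast approximation to keep the example quick; full MCMC sampling is the usual choice for real analyses.

```r
# Sketch only: runs if survextrap (and its Stan backend) is installed.
have_survextrap <- requireNamespace("survextrap", quietly = TRUE)

if (have_survextrap) {
  library(survextrap)
  library(survival)  # for Surv()

  # Made-up external counts covering years 5 to 25
  extdat <- data.frame(start = c(5, 10, 15, 20), stop = c(10, 15, 20, 25),
                       n = c(100, 100, 100, 100), r = c(50, 40, 30, 20))

  # Individual-level data only
  mod <- survextrap(Surv(years, status) ~ 1, data = colons,
                    fit_method = "opt")

  # Individual-level plus external data, with extra spline knots so that
  # the hazard is allowed to vary over the extrapolation period
  mod_ext <- survextrap(Surv(years, status) ~ 1, data = colons,
                        external = extdat,
                        mspline = list(add_knots = c(4, 10, 25)),
                        fit_method = "opt")

  # Posterior summaries of survival, hazard and restricted mean survival
  print(survival(mod_ext, t = c(5, 10, 20)))
  print(hazard(mod_ext, t = c(5, 10, 20)))
  print(rmst(mod_ext, t = 20))
}
```

Predictors go on the right-hand side of the formula in the usual way, for example `Surv(years, status) ~ rx` for a proportional hazards model of the treatment group in `colons`.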

### Technical details of the methods

The model is fully described in a paper: [Jackson, BMC Medical Research
Methodology (2023)](https://doi.org/10.1186/s12874-023-02094-1). See
also `vignette("methods")`.

`vignette("priors")` goes into detail on how prior distributions and
judgements can be specified in `survextrap` - an important but
often-neglected part of Bayesian analysis.

### Evaluation of the methods

Two papers by Timmins et al. describe simulation studies that show good
performance of the methods for (a) [short-term estimation from
individual-level data](https://arxiv.org/abs/2503.21388) and (b)
[extrapolation including external
data](https://arxiv.org/abs/2505.16835).

### Examples of how to use it

`vignette("examples")` gives a rapid tour of each feature, using simple
textbook examples and simulated data.

The [cetuximab case
study](https://chjackson.github.io/survextrap/articles/cetuximab.html)
is a more in-depth demonstration of how `survextrap` could be used in a
typical health technology evaluation, based on clinical trial, disease
registry, general population and elicited data. This vignette
accompanies Section 4 of the preprint
[paper](https://arxiv.org/abs/2306.03957).

### Slides from presentations about survextrap

- [PSI, June
  2025](https://chjackson.github.io/survextrap/cjackson_survextrap_psi.pdf)

- [Belfast (RSS NI), December
  2023](https://chjackson.github.io/survextrap/cjackson_survextrap_belfast.pdf)

- [Royal Statistical Society, September
  2023](https://chjackson.github.io/survextrap/cjackson_survextrap_rss23.pdf)

- [R-HTA, York, June
  2023](https://chjackson.github.io/survextrap/cjackson_survextrap_rhta.pdf)

- [Exeter, October
  2022](https://chjackson.github.io/survextrap/cjackson_survextrap_exeter.pdf)

## Installation

The package can be installed in the usual way from CRAN, as:

    install.packages("survextrap")

The latest development version on GitHub can be installed with

    remotes::install_github("chjackson/survextrap")

or, more easily, from R-universe (which may be a day behind the code on
GitHub):

    install.packages("survextrap", repos=c('https://chjackson.r-universe.dev',
                                           'https://cloud.r-project.org'))

If you use it, I would be very happy to know!

Feedback, suggestions or problem reports are welcome. Or just let me
know what you are using it for, and how well it worked for your
application.

[GitHub issues](https://github.com/chjackson/survextrap/issues) or
[email](mailto:chris.jackson@mrc-bsu.cam.ac.uk) are both fine.

<!-- badges: start -->

[![lifecycle](lifecycle-stable.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![R-CMD-check](https://github.com/chjackson/survextrap/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/chjackson/survextrap/actions/workflows/R-CMD-check.yaml)
[![test-coverage](https://codecov.io/gh/chjackson/survextrap/branch/master/graph/badge.svg)](https://app.codecov.io/gh/chjackson/survextrap)
<!-- badges: end -->

Versions across snapshots

| Version | Repository | File | Size |
|---|---|---|---|
| 1.0.1 | rolling linux/jammy R-4.5 | survextrap_1.0.1.tar.gz | 2.1 MiB |
| 1.0.1 | rolling linux/noble R-4.5 | survextrap_1.0.1.tar.gz | 2.1 MiB |
| 1.0.1 | rolling source/ R- | survextrap_1.0.1.tar.gz | 2.1 MiB |
| 1.0.1 | latest linux/jammy R-4.5 | survextrap_1.0.1.tar.gz | 2.1 MiB |
| 1.0.1 | latest linux/noble R-4.5 | survextrap_1.0.1.tar.gz | 2.1 MiB |
| 1.0.1 | latest source/ R- | survextrap_1.0.1.tar.gz | 2.1 MiB |
| 1.0.1 | 2026-04-26 source/ R- | survextrap_1.0.1.tar.gz | 2.1 MiB |
| 1.0.1 | 2026-04-23 source/ R- | survextrap_1.0.1.tar.gz | 2.1 MiB |
