Crandore Hub

Canek

Batch Correction of Single Cell Transcriptome Data

Non-linear/linear hybrid method for batch-effect correction that uses Mutual Nearest Neighbors (MNNs) to identify similar cells between datasets. Reference: Loza M. et al. (NAR Genomics and Bioinformatics, 2022) <doi:10.1093/nargab/lqac022>.

README

<!-- README.md is generated from README.Rmd. Please edit that file -->
<p align="center">
<img src="man/figures/README-logo.png" width="50%"  />
</p>

# Canek

<!-- badges: start -->

[![R-CMD-check](https://github.com/MartinLoza/Canek/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/MartinLoza/Canek/actions/workflows/R-CMD-check.yaml)
[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/Canek)](https://cran.r-project.org/package=Canek)
[![CRAN
Downloads](https://cranlogs.r-pkg.org/badges/Canek)](https://cran.r-project.org/package=Canek)
[![CRAN
Downloads](https://cranlogs.r-pkg.org/badges/grand-total/Canek)](https://cran.r-project.org/package=Canek)
<!-- badges: end -->

*Canek is an R package to correct batch effects from single-cell RNA-seq
biological replicates*.

### Motivation to develop Canek

As single-cell genomics technologies become mainstream, more
laboratories will perform experiments under different conditions with
biological replicates obtained using a common technology. In this
scenario, integration of datasets with minimal impact on cell phenotype
is essential.

### The workflow

Canek leverages information from mutual nearest neighbors to combine
local linear corrections with cell-specific non-linear corrections
within a fuzzy logic framework.

<p align="center">
<img src="man/figures/README-workflow.png" width="100%"/>
</p>
<style type="text/css">
    ol { list-style-type: upper-alpha; }
</style>

<font size="2">

> A. Canek starts with a reference batch and a query batch, assuming a
> predominantly linear batch effect.
>
> B. Cell clusters are defined on the query batch and MNN pairs (arrows)
> are used to define batch effect observations.
>
> C. The MNN pairs from each cluster are used to estimate
> cluster-specific correction vectors. These vectors can be used to
> correct the batch effect directly.
>
> D. Alternatively, a non-linear correction can be applied by
> calculating cell-specific correction vectors using fuzzy logic.

<font size="3">
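As an illustration of steps A to D above, here is a minimal base R sketch on simulated data. It is not Canek's implementation: the toy batches, the helper functions `make_batch`, `cross_dist` and `knn_mask`, the use of `kmeans`, the choice of `k = 5`, and the inverse-squared-distance memberships are all assumptions made for this example, and cells are stored in rows only to keep the code short.

``` r
set.seed(1)

# Toy batches: cells in rows, 3 features. Features 1-2 carry the "biology"
# (two cell populations); feature 3 carries a simulated batch offset.
n <- 100  # cells per population
make_batch <- function(offset1, offset2) {
  pop1 <- cbind(matrix(rnorm(n * 2, mean = 0), ncol = 2),
                rnorm(n, mean = offset1, sd = 0.1))
  pop2 <- cbind(matrix(rnorm(n * 2, mean = 6), ncol = 2),
                rnorm(n, mean = offset2, sd = 0.1))
  rbind(pop1, pop2)
}
ref   <- make_batch(0, 0)  # (A) reference batch
query <- make_batch(2, 3)  # (A) query batch, offset differs per population

# Euclidean distances between the rows of a and the rows of b.
cross_dist <- function(a, b) {
  d2 <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * tcrossprod(a, b)
  sqrt(pmax(d2, 0))
}

# TRUE at [i, j] when column j is among the k nearest columns of row i.
knn_mask <- function(d, k) {
  t(apply(d, 1, function(x) rank(x, ties.method = "first") <= k))
}

# (B) Cluster the query batch and find MNN pairs (query cell, reference cell).
km  <- kmeans(query, centers = 2, nstart = 10)
d   <- cross_dist(query, ref)
k   <- 5
mnn <- which(knn_mask(d, k) & t(knn_mask(t(d), k)), arr.ind = TRUE)

# (C) One correction vector per cluster: mean difference over its MNN pairs.
pair_diff    <- ref[mnn[, 2], , drop = FALSE] - query[mnn[, 1], , drop = FALSE]
pair_cluster <- km$cluster[mnn[, 1]]
corr_vecs <- t(sapply(seq_len(nrow(km$centers)), function(cl) {
  colMeans(pair_diff[pair_cluster == cl, , drop = FALSE])
}))
query_linear <- query + corr_vecs[km$cluster, ]

# (D) Fuzzy memberships from inverse squared distance to each cluster centre,
# then a cell-specific vector as the membership-weighted mix of cluster vectors.
w <- 1 / (cross_dist(query, km$centers)^2 + 1e-8)
membership  <- w / rowSums(w)
query_fuzzy <- query + membership %*% corr_vecs

# Remaining offset in feature 3 before and after each correction.
c(before = mean(query[, 3]) - mean(ref[, 3]),
  linear = mean(query_linear[, 3]) - mean(ref[, 3]),
  fuzzy  = mean(query_fuzzy[, 3]) - mean(ref[, 3]))
```

In this toy setting the two populations are well separated, so the cluster-wise and fuzzy corrections should be nearly identical; the membership weighting only matters for cells that lie between clusters.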

### Results

Canek was the highest-scoring method in tests specifically designed to
assess **over-correction**: it corrected batch effects without
distorting the structure of the cells, as judged against a gold standard.

For more information about Canek, see our manuscript in [NAR
Genomics and Bioinformatics](https://doi.org/10.1093/nargab/lqac022).

## Usage

You can use Canek directly with *normalized-count matrices*, *Seurat*
objects or *SingleCellExperiment* objects. For more details, check out
our GitHub page and vignettes:

- [Canek website](https://martinloza.github.io/Canek/index.html)
- [Run Canek on a toy example
  vignette](https://martinloza.github.io/Canek/articles/toy_example.html)
- [Run Canek on Seurat objects
  vignette](https://martinloza.github.io/Canek/articles/seurat.html)
- [Run Canek on SingleCellExperiment objects
  vignette](https://martinloza.github.io/Canek/articles/SingleCellExperiment.html)
- [Best practices on batch effects
  correction](https://martinloza.github.io/Canek/articles/Best_practices_thymus.html)

## Installation

You can install the release version of Canek from
[CRAN](https://CRAN.R-project.org) with:

    install.packages("Canek")

You can install the development version from
[GitHub](https://github.com/) with:

``` r
# install.packages("remotes")
remotes::install_github("MartinLoza/Canek")
```

## Citation

If you use Canek in your research, please cite our work using:

Loza M, Teraguchi S, Standley D, Diez D (2022). “Unbiased integration of
single cell transcriptome replicates.” *NAR Genomics and
Bioinformatics*, *4*(1), lqac022. <doi:10.1093/nargab/lqac022>
<https://doi.org/10.1093/nargab/lqac022>,
<https://martinloza.github.io/Canek/>.

Versions across snapshots

| Version | Repository | File | Size |
|---|---|---|---|
| 0.2.5 | rolling linux/jammy R-4.5 | Canek_0.2.5.tar.gz | 2.4 MiB |
| 0.2.5 | rolling linux/noble R-4.5 | Canek_0.2.5.tar.gz | 2.4 MiB |
| 0.2.5 | rolling source/ R- | Canek_0.2.5.tar.gz | 2.4 MiB |
| 0.2.5 | latest linux/jammy R-4.5 | Canek_0.2.5.tar.gz | 2.4 MiB |
| 0.2.5 | latest linux/noble R-4.5 | Canek_0.2.5.tar.gz | 2.4 MiB |
| 0.2.5 | latest source/ R- | Canek_0.2.5.tar.gz | 2.4 MiB |
| 0.2.5 | 2026-04-26 source/ R- | Canek_0.2.5.tar.gz | 2.4 MiB |
| 0.2.5 | 2026-04-23 source/ R- | Canek_0.2.5.tar.gz | 2.4 MiB |
| 0.2.5 | 2025-04-20 source/ R- | Canek_0.2.5.tar.gz | 2.4 MiB |
