Canek
Batch Correction of Single Cell Transcriptome Data
Non-linear/linear hybrid method for batch-effect correction that uses Mutual Nearest Neighbors (MNNs) to identify similar cells between datasets. Reference: Loza M. et al. (NAR Genomics and Bioinformatics, 2022) <doi:10.1093/nargab/lqac022>.
README
<!-- README.md is generated from README.Rmd. Please edit that file -->
<p align="center">
<img src="man/figures/README-logo.png" width="50%" />
</p>
# Canek
<!-- badges: start -->
<!-- badges: end -->
*Canek is an R package to correct batch effects from single-cell RNA-seq
biological replicates*.
### Motivation to develop Canek
As single-cell genomics technologies become mainstream, more
laboratories will perform experiments under different conditions with
biological replicates obtained using a common technology. In this
scenario, integration of datasets with minimal impact on cell phenotype
is essential.
### The workflow
Canek leverages information from mutual nearest neighbors to combine
local linear corrections with cell-specific non-linear corrections
within a fuzzy logic framework.
<p align="center">
<img src="man/figures/README-workflow.png" width="100%"/>
</p>
<style type="text/css">
ol { list-style-type: upper-alpha; }
</style>
<font size="2">
> A. Canek starts with a reference batch and query batch, assuming a
> predominantly linear batch effect.
>
> B. Cell clusters are defined on the query batch and MNN pairs (arrows)
> are used to define batch effect observations.
>
> C. The MNN pairs from each cluster are used to estimate
> cluster-specific correction vectors. These vectors can be used
> directly to correct the batch effect, or (D) a non-linear correction
> can be applied by calculating cell-specific correction vectors using
> fuzzy logic.
<font size="3">
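The steps A to D can be illustrated with a minimal base-R sketch on simulated data. This is not Canek's implementation: the toy data, the helper names (`cross_dist`, `find_mnn`, `centroid_gap`), the use of known cluster labels instead of clustering the query batch, and the inverse-squared-distance membership weights standing in for the fuzzy logic step are all assumptions made for illustration.

``` r
set.seed(1)
n <- 50   # cells per cluster
p <- 10   # genes (rows are cells here, for brevity)

# A. Reference and query batches; the query carries a cluster-dependent shift.
cluster_ref <- rep(1:2, each = n)
cluster_qry <- rep(1:2, each = n)
centers <- rbind(rep(0, p), rep(4, p))
shift <- rbind(rep(c(1, -1), length.out = p),
               rep(c(1.5, -0.5), length.out = p))
noise <- function() matrix(rnorm(2 * n * p, sd = 0.3), ncol = p)
ref <- centers[cluster_ref, ] + noise()
qry <- centers[cluster_qry, ] + shift[cluster_qry, ] + noise()

# Euclidean distances between every row of a and every row of b.
cross_dist <- function(a, b) {
  sqrt(pmax(outer(rowSums(a^2), rowSums(b^2), "+") - 2 * a %*% t(b), 0))
}

# B. Mutual nearest neighbours: query i and reference j are paired when each
# is among the other's k nearest cells in the opposite batch (k >= 2 here).
find_mnn <- function(qry, ref, k = 5) {
  d <- cross_dist(qry, ref)
  nn_q <- t(apply(d, 1, function(x) order(x)[seq_len(k)]))  # per query cell
  nn_r <- t(apply(d, 2, function(x) order(x)[seq_len(k)]))  # per reference cell
  do.call(rbind, lapply(seq_len(nrow(qry)), function(i) {
    mutual <- Filter(function(j) i %in% nn_r[j, ], nn_q[i, ])
    if (length(mutual)) cbind(query = i, ref = mutual) else NULL
  }))
}
pairs <- find_mnn(qry, ref, k = 5)

# C. Cluster-specific correction vectors: mean reference-minus-query
# difference over the MNN pairs of each query cluster.
diffs <- ref[pairs[, "ref"], , drop = FALSE] -
  qry[pairs[, "query"], , drop = FALSE]
pair_cluster <- cluster_qry[pairs[, "query"]]
corr_vec <- rowsum(diffs, pair_cluster) / as.vector(table(pair_cluster))
qry_linear <- qry + corr_vec[as.character(cluster_qry), ]

# D. Cell-specific vectors: blend the cluster vectors with soft memberships
# (assumed weighting, not the membership functions used by Canek).
centroids <- rowsum(qry, cluster_qry) / as.vector(table(cluster_qry))
d2 <- cross_dist(qry, centroids)^2 + 1e-12
membership <- (1 / d2) / rowSums(1 / d2)
qry_fuzzy <- qry + membership %*% corr_vec

# Mean distance between matching cluster centroids of query and reference.
centroid_gap <- function(x) {
  mean(sapply(1:2, function(cl) {
    sqrt(sum((colMeans(x[cluster_qry == cl, ]) -
              colMeans(ref[cluster_ref == cl, ]))^2))
  }))
}
print(c(before = centroid_gap(qry),
        linear = centroid_gap(qry_linear),
        fuzzy  = centroid_gap(qry_fuzzy)))
```

In this toy setting the two clusters are far apart, so each cell's membership is close to 1 for its own cluster and the fuzzy correction nearly coincides with the cluster-specific one. The two corrections differ more for cells that lie between clusters.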
### Results
Canek was the highest-scoring method in tests specifically designed to
assess **over-correction**: it corrected batch effects without
distorting the structure of the cells, as judged against a gold standard.
For more information about Canek, check out our manuscript in [NAR
Genomics and Bioinformatics](https://doi.org/10.1093/nargab/lqac022).
## Usage
You can use Canek directly with *normalized-count matrices*, *Seurat*
objects or *SingleCellExperiment* objects. For more details, check out
our GitHub page and vignettes:
- [Canek website](https://martinloza.github.io/Canek/index.html)
- [Run Canek on a toy example
vignette](https://martinloza.github.io/Canek/articles/toy_example.html)
- [Run Canek on Seurat objects
vignette](https://martinloza.github.io/Canek/articles/seurat.html)
- [Run Canek on SingleCellExperiment objects
vignette](https://martinloza.github.io/Canek/articles/SingleCellExperiment.html)
- [Best practices on batch effects
correction](https://martinloza.github.io/Canek/articles/Best_practices_thymus.html)
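As a quick orientation before reading the vignettes, the following sketch runs Canek on a list of normalized-count matrices. It assumes the `RunCanek()` entry point and the bundled `SimBatches` toy data as used in the toy example vignette; check the package reference for the exact arguments and return value in your installed version. The `requireNamespace()` guard is only there so that the snippet does nothing when Canek is not installed.

``` r
# Assumption: Canek is installed and exposes RunCanek() and SimBatches.
if (requireNamespace("Canek", quietly = TRUE)) {
  library(Canek)

  # Two simulated batches shipped with the package, passed as a named list.
  batches <- list(B1 = SimBatches$batches[[1]],
                  B2 = SimBatches$batches[[2]])

  # Integrate the batches with the default settings.
  corrected <- RunCanek(batches)

  # Inspect the result; it is expected to hold the integrated expression values.
  str(corrected, max.level = 1)
}
```

For *Seurat* and *SingleCellExperiment* objects, the corresponding vignettes show how to indicate which metadata column identifies the batches.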
## Installation
You can install the release version of Canek from
[CRAN](https://CRAN.R-project.org) with:
``` r
install.packages("Canek")
```
You can install the development version from
[GitHub](https://github.com/) with:
``` r
# install.packages("remotes")
remotes::install_github("MartinLoza/Canek")
```
## Citation
If you use Canek in your research, please cite our work using:
Loza M, Teraguchi S, Standley D, Diez D (2022). “Unbiased integration of
single cell transcriptome replicates.” *NAR Genomics and
Bioinformatics*, *4*(1), lqac022. <doi:10.1093/nargab/lqac022>
<https://doi.org/10.1093/nargab/lqac022>,
<https://martinloza.github.io/Canek/>.
Versions across snapshots
| Version | Repository | File | Size |
|---|---|---|---|
| 0.2.5 | rolling linux/jammy R-4.5 | Canek_0.2.5.tar.gz | 2.4 MiB |
| 0.2.5 | rolling linux/noble R-4.5 | Canek_0.2.5.tar.gz | 2.4 MiB |
| 0.2.5 | rolling source/ R- | Canek_0.2.5.tar.gz | 2.4 MiB |
| 0.2.5 | latest linux/jammy R-4.5 | Canek_0.2.5.tar.gz | 2.4 MiB |
| 0.2.5 | latest linux/noble R-4.5 | Canek_0.2.5.tar.gz | 2.4 MiB |
| 0.2.5 | latest source/ R- | Canek_0.2.5.tar.gz | 2.4 MiB |
| 0.2.5 | 2026-04-26 source/ R- | Canek_0.2.5.tar.gz | 2.4 MiB |
| 0.2.5 | 2026-04-23 source/ R- | Canek_0.2.5.tar.gz | 2.4 MiB |
| 0.2.5 | 2025-04-20 source/ R- | Canek_0.2.5.tar.gz | 2.4 MiB |