
title: "UniverSC: Single-cell processing across technologies"
author: "S. Thomas Kelly^1†^, Kai Battenberg^1,2†^, Makoto Hayashi^2^, Aki Minoda^1^
^1^ RIKEN Center for Integrative Medical Sciences, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama
^2^ RIKEN Center for Sustainable Resource Sciences, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama
† These authors contributed equally to this work"
affiliations:
Single-cell processing across technologies
Summary
Single-cell RNA-sequencing analysis to quantify RNA molecules in individual cells has become popular owing to the large amount of information one can obtain from each experiment. UniverSC is a universal single-cell processing tool that supports any UMI-based platform. Our command-line tool enables consistent and comprehensive integration, comparison, and evaluation across data generated from a wide range of platforms. Here we provide a guide to install and use this tool to process single-cell RNA-Seq data from FASTQ format.
Package
UniverSC version 1.1.3
Maintainers
Tom Kelly^†^ (RIKEN IMS) and Kai Battenberg^†^ (RIKEN CSRS/IMS)
† These authors contributed equally to this work
Contact: <first name>.<family name>[at]riken.jp
Disclaimer: we are third-party developers not affiliated with 10x Genomics or any other vendor of single-cell technologies. We are releasing this code under an open-source license; it calls Cell Ranger™ as an external dependency.
If you have cellranger already installed, then all you need to do is clone or download this git repository. You can then run the script in this directory or add it to your PATH. See the Quick Start guide below.
If you wish to install cellranger and configure this script to run on a Linux environment, we provide details on installation below. Note that launch_universc.sh requires write access to a Cell Ranger installation, so Cell Ranger needs to be installed in a user's "home" directory on a server. No admin powers needed!
Note that cellranger installations that are pre-compiled on Linux will not run on Mac or Windows. Note that Mac OS and some Linux distributions also have different versions of sed and rename. It is possible to compile an open-source version of Cell Ranger, but it is tricky to install the dependencies, so we recommend using our docker image if you wish to do this.
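The write-access requirement above can be checked before running anything. The following is a minimal sketch, not part of UniverSC: the function name `cellranger_dir_writable` is hypothetical, and it only tests the directory that holds the `cellranger` executable, which is an assumption about where write access matters (the files UniverSC modifies may sit deeper in the install tree, and symbolic links are not resolved).

```bash
# Hypothetical helper (not part of UniverSC): succeeds when the directory
# containing the cellranger executable is writable by the current user.
cellranger_dir_writable() {
    # $1: path to the cellranger executable (defaults to the one on the PATH)
    local exe="${1:-$(command -v cellranger)}"
    # Return 2 when no executable was given or found
    [ -n "$exe" ] || return 2
    local dir
    dir=$(dirname "$exe")
    [ -w "$dir" ]
}

if cellranger_dir_writable; then
    echo "Cell Ranger directory is writable"
else
    echo "Cell Ranger was not found or its directory is not writable" >&2
fi
```

If the check fails on a shared server, installing Cell Ranger under your home directory as described above is the simplest remedy.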
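Because sed differs between systems, it can help to find out which one is installed before running sed commands written for GNU sed. This is a rough sketch with a hypothetical function name (`sed_flavour`); it relies on GNU sed identifying itself as "(GNU sed)" in its `--version` output, while BSD sed on Mac OS rejects that flag.

```bash
# Hypothetical helper: prints "gnu" when the sed on the PATH is GNU sed,
# otherwise "other" (for example BSD sed on Mac OS).
sed_flavour() {
    if sed --version 2>/dev/null | grep -q "(GNU sed)"; then
        echo "gnu"
    else
        echo "other"
    fi
}

echo "sed flavour: $(sed_flavour)"
```

When the result is "other", the docker image is the safer route, since it bundles compatible versions of these tools.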
If you are a beginner bioinformatician or wish to run this on a local computer (Mac or Windows), no problem! We provide a "docker" image containing everything needed to run it without installing the software needed. All you need to do is install docker and follow our guide to use the image. This comes bundled with all the compatible versions needed to run it.
Note that you need to run the shell commands given in a unix-like command-line interface (the "Terminal" application on Mac or Linux systems). Many shells are supported but we recommend the "bash" shell for beginners (this is the default on most systems). Windows 10 includes a subsystem to run bash. If this is too complicated, you can open a Linux environment (Ubuntu) in docker by following our instructions. Then you can enter bash commands into the terminal opened by docker.
If you run into problems installing or running launch_universc.sh please don't hesitate to contact us via email or GitHub.
We've developed a bash script that will run Cell Ranger on FASTQ files for these technologies. See below for details on how to use it.
If you use this tool, please cite it to acknowledge the efforts of the authors. You can report problems and request new features to the maintainers with an issue on GitHub. Details on how to install and run are provided below. Please see the help and examples to try to solve your problem before submitting an issue.
Details on the Docker image are given below. We recommend using Docker unless you have a server environment with Cell Ranger installed already.
In principle, any technology with a cell barcode and unique molecular identifier (UMI) can be supported.
The following technologies have been tested to ensure that they give the expected results: 10x Genomics, Nadia (DropSeq), ICELL8 version 3
We provide the following preset configurations for convenience based on published data and configurations used by other pipelines (e.g., DropSeqPipe and Kallisto/Bustools). To add further support for other technologies or troubleshoot problems, please submit an Issue to the GitHub repository: https://github.com/minoda-lab/universc/issues as described in Bug Reports below.
Some changes to the Cell Ranger install are required to run other technologies. Therefore we provide settings for 10x Genomics which restore the settings for the Chromium instrument. We recommend using UniverSC to process all data from different technologies, as the tool manages these changes. Please note that on a single install of Cell Ranger, multiple technologies or multiple samples of the same technology with different whitelist barcodes cannot be run simultaneously (the tool will also check for this to avoid causing problems with existing runs). Multiple samples of the same technology with the same barcode whitelist can be run simultaneously.
If you are using UniverSC for other technologies, you should also use it to run 10x Genomics data. If you wish to restore Cell Ranger to
default settings, see the installation or troubleshooting sections below.
Pre-set configurations
Chemistry settings available
All technologies support 3′ single-cell RNA-Seq. Barcode adjustments and
whitelists are changed automatically. For 5′ single-cell RNA-Seq, this
is only supported for 10x Genomics version 2 chemistry, ICELL8,
Smart-Seq, and STRT-Seq.
For 10x Genomics, this is detected automatically but can be
configured with the --chemistry argument.
For other technologies, the template switching oligonucleotide
is automatically converted to match the 10x sequence.
Support for UMI-based and non-UMI technologies
By default, UMIs are used where available, with the following exceptions for non-UMI technologies: ICELL8 v2, RamDA-Seq, Quartz-Seq, Smart-Seq, Smart-Seq2. While using UMIs is recommended, we provide a mock UMI for counting reads for these technologies (and data from previous versions).
Other techniques can be forced to replace the UMI with a mock sequence
for counting reads only with --non-umi or --read-only arguments.
Forcing non-UMI techniques is not recommended unless you are
integrating non-UMI and UMI-based technologies. It is not necessary
to specify --non-umi for non-UMI techniques as the mock UMI is applied
automatically when applicable. For ICELL8 and Smart-Seq where both
non-UMI (icell8-v2, smartseq2) and UMI-based (icell8-v3, smartseq3)
techniques are available it is possible to specify which to use.
Single and dual indexed technologies
Where needed the cell barcode can be detected in the index I1 or I2 file. Single indexes are supported for STRT-Seq and Quartz-Seq. Dual indexes are supported for Fluidigm C1, ICELL8 full-length, inDrops-v3, RamDA-Seq, SCI-RNA-Seq, scifi-seq, and Smart-Seq. Combinatorial indexing technologies have linkers between barcodes removed automatically to match the barcode whitelist.
Demultiplexing for dual-indexing
For dual-indexed technologies such as Fluidigm C1, inDrops-v3, Sci-Seq, SmartSeq3 it is advised to use "bcl2fastq" before calling UniverSC:
    /usr/local/bin/bcl2fastq -v --runfolder-dir "/path/to/illumina/bcls" --output-dir "./Data/Intensities/BaseCalls" \
        --sample-sheet "/path/to/SampleSheet.csv" --create-fastq-for-index-reads \
        --use-bases-mask Y26n,I8n,I8n,Y50n --mask-short-adapter-reads 0 \
        --minimum-trimmed-read-length 0
Please adjust the lengths for --use-bases-mask accordingly for read 1, index 1 (i7), index 2 (i5), and read 2.
Ensure that --create-fastq-for-index-reads is used where possible.
Using --no-lane-splitting is optional as UniverSC can process an arbitrary number of lanes.
There is no need to specify index sequences in the sample sheet for cell barcodes: using "NNNNNNNN" will match all
samples and the cell barcodes will be distinguished by the single-cell processing pipeline. Index sequences should
only be used to demultiplex samples and replicates (not cells).
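The --use-bases-mask string described above can be assembled from the four read lengths. This is a small sketch with a hypothetical helper name (`build_bases_mask`); it simply follows the pattern of the example command, where each read keeps the stated number of cycles and the trailing "n" masks one further cycle. Check the result against the cycle counts of your own run before using it.

```bash
# Hypothetical helper: builds a --use-bases-mask value in the same pattern
# as the example above (Y26n,I8n,I8n,Y50n).
build_bases_mask() {
    # $1: cycles kept for read 1, $2: index 1 (i7), $3: index 2 (i5), $4: read 2
    local n
    for n in "$1" "$2" "$3" "$4"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "lengths must be whole numbers" >&2
                return 1
                ;;
        esac
    done
    printf 'Y%sn,I%sn,I%sn,Y%sn\n' "$1" "$2" "$3" "$4"
}

build_bases_mask 26 8 8 50   # prints Y26n,I8n,I8n,Y50n
```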
Missing index sequences
If a sequencing facility has demultiplexed the samples for you without this, UniverSC will attempt to extract index sequences from FASTQ headers in read 1. If index sequences are not stored in the file headers and samples have already been demultiplexed, a dummy index file with the same number of reads as R1 and R2 will be required. As a workaround, you can generate this by copying the R1 and R2 files and replacing the sequences with the first barcode in the relevant whitelist. For example:
    index1="TAAGGCGA"
    index2="AAGGAGTA"
    # create new files
    cp R1_file.fastq I1_file.fastq
    cp R2_file.fastq I2_file.fastq
    # replace sequences
    sed -i "2~4s/^.*$/${index1}/g" I1_file.fastq
    sed -i "2~4s/^.*$/${index2}/g" I2_file.fastq
    # replace quality scores
    sed -i "4~4s/^.*$/IIIIIIII/g" I1_file.fastq I2_file.fastq
This results in a new "sample index" for each demultiplexed sample. To combine demultiplexed samples for dual indexed techniques use the following:
    # for fastq files
    cat Sample1_R1_file.fastq Sample2_R1_file.fastq Sample3_R1_file.fastq > Combined_R1_file.fastq
    cat Sample1_R2_file.fastq Sample2_R2_file.fastq Sample3_R2_file.fastq > Combined_R2_file.fastq
    cat Sample1_I1_file.fastq Sample2_I1_file.fastq Sample3_I1_file.fastq > Combined_I1_file.fastq
    cat Sample1_I2_file.fastq Sample2_I2_file.fastq Sample3_I2_file.fastq > Combined_I2_file.fastq
    # for compressed files (no need to uncompress)
    cat Sample1_R1_file.fastq.gz Sample2_R1_file.fastq.gz Sample3_R1_file.fastq.gz > Combined_R1_file.fastq.gz
    cat Sample1_R2_file.fastq.gz Sample2_R2_file.fastq.gz Sample3_R2_file.fastq.gz > Combined_R2_file.fastq.gz
    cat Sample1_I1_file.fastq.gz Sample2_I1_file.fastq.gz Sample3_I1_file.fastq.gz > Combined_I1_file.fastq.gz
    cat Sample1_I2_file.fastq.gz Sample2_I2_file.fastq.gz Sample3_I2_file.fastq.gz > Combined_I2_file.fastq.gz
As this needs to be done on a case-by-case basis it has not been implemented by the UniverSC core functions. We provide this workaround for using published data and data already processed by sequencing facilities. Please contact the maintainers or file an issue on GitHub if you are having problems with this case.
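The sed commands in the workaround use the `2~4` address form, which is a GNU sed extension and may not work with the sed shipped on Mac OS. The sketch below shows one possible alternative using awk, plus a read count to confirm that the dummy index file matches R1. Both function names (`make_mock_index`, `count_reads`) are hypothetical and not part of UniverSC, and the sketch assumes uncompressed FASTQ files with four lines per read.

```bash
# Hypothetical helper: counts reads in an uncompressed FASTQ file
# (four lines per record).
count_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Hypothetical helper: copies a FASTQ file, replacing every sequence with
# the given barcode and every quality string with "I" of the same length.
make_mock_index() {
    # $1: input FASTQ, $2: output FASTQ, $3: barcode sequence
    awk -v bc="$3" '
        BEGIN { qual = bc; gsub(/./, "I", qual) }
        NR % 4 == 2 { print bc; next }     # sequence line
        NR % 4 == 0 { print qual; next }   # quality line
        { print }                          # header and "+" lines are kept
    ' "$1" > "$2"
}

# Example usage with the file names from the workaround above:
#   make_mock_index R1_file.fastq I1_file.fastq "TAAGGCGA"
#   make_mock_index R2_file.fastq I2_file.fastq "AAGGAGTA"
#   [ "$(count_reads R1_file.fastq)" -eq "$(count_reads I1_file.fastq)" ] \
#       && echo "read counts match"
```

Deriving the quality string from the barcode keeps the two the same length if a whitelist with longer barcodes is used.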
Custom inputs
Custom inputs are also supported by giving the name "custom" and length of barcode and UMI separated by a "_" character.
e.g. Custom (16bp barcode, 10bp UMI): custom_16_10
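To make the naming scheme concrete, here is a minimal sketch of how such a name splits into its two lengths. The function `parse_custom_technology` is hypothetical and only illustrates the "custom", barcode length, UMI length pattern described above; it is not how UniverSC itself parses the argument.

```bash
# Hypothetical helper: prints "<barcode length> <UMI length>" for names
# of the form custom_<barcode length>_<UMI length>.
parse_custom_technology() {
    local name="$1"
    if printf '%s\n' "$name" | grep -Eq '^custom_[0-9]+_[0-9]+$'; then
        local lengths="${name#custom_}"            # e.g. 16_10
        echo "${lengths%_*} ${lengths#*_}"
    else
        echo "not a custom technology name: $name" >&2
        return 1
    fi
}

lengths=$(parse_custom_technology "custom_16_10")
echo "barcode length: ${lengths% *}, UMI length: ${lengths#* }"
# prints: barcode length: 16, UMI length: 10
```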
Custom barcode files are also supported for preset technologies. These are particularly useful for well-based technologies to demultiplex based on the wells.
Note that custom inputs do not remove linker or adapter sequences for combinatorial indexing technologies. These must be removed from the Read 1 file before running UniverSC. To request a preset technology setting instead, please submit a feature request on GitHub as described below.
This tool will be released open-source (see legal stuff below). We welcome any feedback on it and any contributions to improve it. Hopefully it will save people time by making it easier to compare technologies.
We have tested it on several technologies but we need users like you to let us know how we can improve it. We hope that it will save you time by handling tedious parts of data formatting so that you can focus on the results.
Please cite our publication when you use our software as follows:
Battenberg, K., Kelly, S.T., Ras, R.A., Hetherington, N.A., Hayashi, M., and Minoda, A. (2022) A flexible cross-platform single-cell data processing pipeline. Nat Commun 13(1): 1-7. [***]
    @Article{pmid36369450,
        author = "Battenberg, K. and Kelly, S. T. and Ras, R. A. and Hetherington, N. A. and Hayashi, M. and Minoda, A.",
        title = "{{A} flexible cross-platform single-cell data processing pipeline}",
        journal = "Nat Commun",
        year = "2022",
        volume = "13",
        number = "1",
        pages = "1-7",
        month = "Nov",
        note = {https://github.com/minoda-lab/universc package version 1.2.4},
        URL = {https://doi.org/10.1038/s41467-022-34681-z}
    }
The preprint can also be found here:
Kelly, S.T., Battenberg, K., Hetherington, N.A., Hayashi, M., and Minoda, A. (2021) UniverSC: a flexible cross-platform single-cell data processing pipeline. bioRxiv 2021.01.19.427209; doi: [***] package version 1.1.3. https://github.com/minoda-lab/universc
    @article {Kelly2021.01.19.427209,
        author = {Kelly, S. Thomas and Battenberg, Kai and Hetherington, Nicola A. and Hayashi, Makoto and Minoda, Aki},
        title = {{UniverSC}: a flexible cross-platform single-cell data processing pipeline},
        elocation-id = {2021.01.19.427209},
        year = {2021},
        doi = {10.1101/2021.01.19.427209},
        publisher = {Cold Spring Harbor Laboratory},
        abstract = {Single-cell RNA-sequencing analysis to quantify RNA molecules in individual cells has become popular owing to the large amount of information one can obtain from each experiment. We have developed UniverSC (https://github.com/minoda-lab/universc), a universal single-cell processing tool that supports any UMI-based platform. Our command-line tool enables consistent and comprehensive integration, comparison, and evaluation across data generated from a wide range of platforms. Competing Interest Statement: The authors have declared no competing interest.},
        eprint = {https://www.biorxiv.org/content/early/2021/01/19/2021.01.19.427209.full.pdf},
        journal = {{bioRxiv}},
        note = {package version 1.1.3},
        URL = {https://github.com/minoda-lab/universc},
    }
The software can also be directly cited as a manual:
    @Manual{,
        title = {{UniverSC}: a flexible cross-platform single-cell data processing pipeline},
        author = {S. Thomas Kelly, Kai Battenberg, Nicola A. Hetherington, Makoto Hayashi, and Aki Minoda},
        year = {2021},
        note = {package version 1.1.3},
        url = {https://github.com/minoda-lab/universc},
    }
Reporting issues
To add further support for other technologies or troubleshoot problems, please submit an Issue to the GitHub repository: https://github.com/minoda-lab/universc/issues
Issues (https://github.com/minoda-lab/universc/issues) are the place to report
problems or suggest features. Pull requests (https://github.com/minoda-lab/universc/pulls)
to the dev branch on GitHub are also welcome to add features or correct problems. Please see
the contributor guide for more details.
Where possible, please provide a minimal example of the first few lines of each FASTQ file for testing purposes.
It is also helpful to describe the technology, such as:
Technologies that may be difficult to support are those with:
Please bear this in mind when submitting requests. We will *** to add further technologies but it could take significant resources to add support for techniques with these designs. Note that updates to the tool have added support for several examples of these.
This script requires Cell Ranger to be installed and exported to the PATH (version 3.0.0 or higher recommended). The script itself is executable and does not require installation to run, but you can put it in your PATH or the bin of your Cell Ranger install if you wish to do so. We provide scripts to do this for your convenience.
See the details below on how to set up Cell Ranger and launch_universc.sh.
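Before launching, it can be worth confirming that Cell Ranger is on the PATH and meets the recommended version. The following is a rough sketch, not part of UniverSC: all three function names are hypothetical, and it assumes that `cellranger --version` prints a version of the form X.Y.Z somewhere in its output, which may not hold for every release.

```bash
# Hypothetical helper: prints the first X.Y.Z pattern found in the text $1.
extract_version() {
    printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Hypothetical helper: succeeds when dotted version $1 is >= version $2
# (numeric components only, compared over the first three fields).
version_at_least() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        split(have, h, "."); split(want, w, ".")
        for (i = 1; i <= 3; i++) {
            if (h[i] + 0 > w[i] + 0) exit 0
            if (h[i] + 0 < w[i] + 0) exit 1
        }
        exit 0
    }'
}

# Hypothetical helper: reports whether a suitable cellranger is on the PATH.
check_cellranger() {
    local min_version="3.0.0"
    if ! command -v cellranger >/dev/null 2>&1; then
        echo "cellranger was not found on the PATH" >&2
        return 1
    fi
    local found
    found=$(extract_version "$(cellranger --version 2>&1)")
    if [ -z "$found" ]; then
        echo "could not determine the cellranger version" >&2
        return 1
    fi
    if version_at_least "$found" "$min_version"; then
        echo "cellranger $found found"
    else
        echo "cellranger $found is older than the recommended $min_version" >&2
        return 1
    fi
}

check_cellranger || echo "see the installation details before running launch_universc.sh" >&2
```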
Download UniverSC
To download UniverSC open a terminal prompt and enter the following commands.
    cd $HOME/Downloads
    git clone https://github.com/minoda-lab/universc.git
    cd universc
If you already have Cell Ranger installed, then you can run the script without installing it.
bash launch_universc.sh
You can call it in another directory by giving the path to the script.
    cd $HOME/my_project
    bash $HOME/Downloads/universc/launch_universc.sh
See the details below on how to install Ce