aludi/dentist

Close assembly gaps using long-reads with focus on correctness.

Downloads: 0 · Status: community mirror · Maintainer: aludi · Repository type: mirror · Last updated: 5 years ago

DENTIST

Many genome sequencing projects have been conducted using second-generation sequencers, which produce short reads. The resulting assemblies contain many gaps. DENTIST closes these gaps using a (small) set of long reads. It can also scaffold contigs freely using a set of long reads, which can be used to fix known scaffolding errors or to further scaffold the output of a long-read assembly pipeline.

Install
Usage
Configuration
Citation
Maintainer
Contributing
License

Install

Use a Singularity Container (recommended)

Make sure Singularity is installed on your system. You can then use the container like so:

```sh
# launch an interactive shell
singularity shell docker://aludi/dentist:latest

# execute a single command inside the container
singularity exec docker://aludi/dentist:latest dentist --version

# run the whole workflow on a cluster using Singularity
snakemake --configfile=snakemake.yml --use-singularity --profile=slurm
```

The last command is explained in more detail below in the usage section.

Use Pre-Built Binaries

Download the latest pre-built binaries from the releases page (https://github.com/a-ludi/dentist/releases) and extract the contents. The tarball contains a dentist binary as well as the Snakemake workflow, example config files and this README. In short, everything you need to run DENTIST.

Build from Source

Be sure to install the D package manager DUB. Install using either

```sh
dub install dentist
```

or

```sh
git clone https://github.com/a-ludi/dentist.git
cd dentist
dub build
```

Runtime Dependencies

The following software packages are required to run dentist:

https://github.com/thegenemyers/DAZZ_DB (>=2020-07-27): Manage sequences (reads and assemblies) in 4-bit encoding alongside auxiliary information such as masks or QV tracks.

https://github.com/thegenemyers/DALIGNER (=2020-01-15): Find significant local alignments.

https://github.com/thegenemyers/DAMAPPER (>=2020-03-10): Find alignment chains, i.e. sequences of significant local alignments possibly with unaligned gaps.

https://github.com/thegenemyers/DAMASKER (>=2020-01-15): Discover tandem repeats.

https://github.com/thegenemyers/DASCRUBBER (>=2020-07-26): Estimate coverage and compute QVs.

daccord (>=v0.0.17): Compute reference-based consensus sequence for gap filling.

Please see their own documentation for installation instructions. Note, the available packages on Bioconda are outdated and should not be used at the moment.

If you run into trouble, please use the following versions of the dependencies:

snakemake@5.11.2
https://github.com/thegenemyers/DAZZ_DB/tree/d22ae58d32a663d09325699f***ccf8c6f93a0
https://github.com/thegenemyers/DALIGNER/tree/c2b47da6b3c94ed248a6be395c5b96a4e63b3f63
https://github.com/thegenemyers/DAMAPPER/tree/b2c9d7fd64bb4dd2dde7c69ff3cc8a04cbeeebbc
https://github.com/thegenemyers/DAMASKER/tree/22139ff1c2b2c0ff2589fbc9cc948370be799827
https://github.com/thegenemyers/DASCRUBBER/tree/a53dbe879a716e7b08338f397de5a0403637641e
daccord@0.0.17
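Before starting a run it can help to confirm that the runtime dependencies are actually on the PATH. The following is a minimal sketch, not part of DENTIST: the helper missing_tools is hypothetical, and the executable names are an assumption about what each package installs (one or two representative commands per package).

```sh
#!/bin/sh
# Print every given command that cannot be found on PATH, one per line.
# Prints nothing if all commands are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed representative executables:
#   DAZZ_DB: fasta2DAM, DBsplit   DALIGNER: daligner   DAMAPPER: damapper
#   DAMASKER: TANmask             DASCRUBBER: DASqv    daccord: daccord
missing_tools fasta2DAM DBsplit daligner damapper TANmask DASqv daccord dentist snakemake
```

An empty output means every listed command was found. The check says nothing about versions, so the version requirements above still have to be verified separately.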

Usage

Suppose we have the genome assembly reference.fasta that is to be updated and a set of reads reads.fasta with 25× coverage.

Quick execution with Snakemake (and Singularity)

TL;DR

```sh
snakemake --configfile=snakemake.yml --use-singularity --profile=slurm
```

Install Snakemake version >=5.11.2 and copy these files into your working directory:

./snakemake/Snakefile
./snakemake/workflow_helper.py
./snakemake/snakemake.example.yml → ./snakemake/snakemake.yml

Next, edit snakemake.yml to fit your needs and optionally test your configuration with

```sh
snakemake --configfile=snakemake.yml -- extend_dentist_config
```

If no errors occurred, the whole workflow can be executed using

```sh
snakemake --configfile=snakemake.yml
```

For small genomes of a few hundred Mbp this should run on a regular workstation. You may use Snakemake's --jobs option to run independent jobs in parallel. Larger data sets may require a cluster, in which case you can use Snakemake's cloud or cluster facilities.
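As a sketch of a local parallel run, the hypothetical helper below only prints the Snakemake command so that it can be inspected first. Choosing one job per online CPU is an assumption made for this example, not a DENTIST recommendation.

```sh
#!/bin/sh
# Print the Snakemake command for a local run.
# With no argument, use one job per online CPU; otherwise use the given job count.
local_run_command() {
    cpus="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
    printf 'snakemake --configfile=snakemake.yml --jobs=%s\n' "${1:-$cpus}"
}

local_run_command      # one job per CPU
local_run_command 4    # at most four parallel jobs
```

Copy the printed line into the shell to start the run. A lower job count may be the safer choice on machines with little memory.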

Executing on a Cluster

To make execution on a cluster easy, DENTIST comes with example files that make Snakemake use SLURM via DRMAA. Please read the Snakemake documentation if this does not suit your needs. Another good starting point is https://github.com/snakemake-profiles/doc.

Start by copying these files to your working directory:

./snakemake/profile-slurm.yml → ~/.config/snakemake/<profile>/config.yaml
./snakemake/cluster.example.yml → ./snakemake/cluster.yml

Next, adjust the profile according to your cluster. This should enable Snakemake to submit and track jobs on your cluster. You may use the configuration values specified in cluster.yml to configure job names and resource allocation for each step of the pipeline. Now, submit the workflow to your cluster with

```sh
snakemake --configfile=snakemake.yml --profile=<profile>
```

Note that parameters specified in the profile provide default values and can be overridden by specifying a different value on the CLI.

Manual execution

Please inspect the Snakemake workflow to get all the details. It might be useful to execute Snakemake with the -p switch which causes Snakemake to print the shell commands. If you plan to write your own workflow management for DENTIST please feel free to contact the maintainer!

Configuration

DENTIST comprises a complex pipeline with many options for tweaking. This section points out some important parameters and their effect on the result.

How to Choose DENTIST Parameters

The following list comprises the important/influential parameters for DENTIST itself. Please keep in mind that the alignments generated by daligner/damapper have immense influence on the performance of DENTIST.

--max-insertion-error: Strong influence on quality and sensitivity. Lower values lead to lower sensitivity but higher quality. The maximum recommended value is 0.05.
--min-anchor-length: Higher values result in higher accuracy but lower sensitivity. In particular, large gaps cannot be closed if the value is too high. Usually the value should be at least 500 and up to 10_000.
--reference-error, --reads-error: Determine the -e parameter for daligner/damapper. Use dentist generate-dazzler-options to see the effect of these parameters or consult commandline.d in the source code.
--min-reads-per-pile-up: Choosing higher values for the minimum number of reads drastically reduces sensitivity but has little effect on the quality. Small values may be chosen to get the maximum sensitivity in de novo assemblies. Make sure to thoroughly validate the results though.
--min-spanning-reads: Higher values give more confidence in the correctness of closed gaps but reduce sensitivity. The value must be well below the expected coverage.
--allow-single-reads: May be used under careful consideration. This is intended for one of the following scenarios:
1. DENTIST is meant to close as many gaps as possible in a de novo assembly. Then the closed gaps must be validated by other means afterwards.
2. DENTIST is used not with real reads but with an independent assembly.
--existing-gap-bonus: If DENTIST finds evidence to join two contigs that are already consecutive in the input assembly (i.e. joined by Ns), then this join is preferred over conflicting joins (if present) by this bonus. The default value is rather conservative, i.e. the preferred join almost always wins over other joins in case of a conflict.
--join-policy: Choose according to your needs:
- scaffoldGaps: Closes only gaps that are marked by Ns in the assembly. This is the default mode of operation. Use this if you do not want to alter the scaffolding of the assembly. See also --existing-gap-bonus.
- scaffolds: Allows whole scaffolds to be joined in addition to the effects of scaffoldGaps. Use this if you have (many) scaffolds that are not yet full chromosome-scale.
- contigs: Allows contigs to be rearranged freely. This is especially useful in de novo assemblies before applying any other scaffolding methods as it increases the contiguity thus increasing the chance that large-scale scaffolding (e.g. Bionano or Hi-C) finds proper joins.
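The numeric guidance in the list above can be turned into a quick sanity check. The following is a minimal sketch with a hypothetical helper, check_dentist_params. The example values are illustrative and are not DENTIST defaults. The coverage test only checks "below", because "well below" is not quantified above.

```sh
#!/bin/sh
# Warn about parameter values that fall outside the guidance above.
# Prints one line per questionable value and nothing if all checks pass.
# Usage: check_dentist_params MAX_INSERTION_ERROR MIN_ANCHOR_LENGTH MIN_SPANNING_READS COVERAGE
check_dentist_params() {
    awk -v err="$1" -v anchor="$2" -v spanning="$3" -v coverage="$4" 'BEGIN {
        if (err + 0 > 0.05)
            print "--max-insertion-error above the recommended maximum of 0.05"
        if (anchor + 0 < 500 || anchor + 0 > 10000)
            print "--min-anchor-length outside the usual range 500..10000"
        if (spanning + 0 >= coverage + 0)
            print "--min-spanning-reads is not below the expected coverage"
    }'
}

# Illustrative values for the 25x read set from the usage section.
check_dentist_params 0.02 1000 3 25
```

The check is deliberately coarse: it catches values that contradict the recommendations but cannot tell whether a setting suits a particular data set.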

Choosing the Read Type

The examples assume PacBio long reads, but DENTIST can be run using any kind of long reads. Currently, this is either PacBio or Oxford Nanopore reads. To use non-PacBio reads, reads_type in snakemake.yml must be set to anything other than PACBIO_SMRT. The recommendation is to use OXFORD_NANOPORE for Oxford Nanopore. These names are borrowed from the NCBI. Further details on the rationale can be found in https://github.com/a-ludi/dentist/issues/1#issuecomment-610764625.
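To double-check which read type a configuration selects, a small lookup like the one below may help. It is a sketch with hypothetical helper names, and it assumes that reads_type appears in snakemake.yml as a single "reads_type: VALUE" line, optionally quoted. Where the key sits within the file is not assumed.

```sh
#!/bin/sh
# Print the value of the first `reads_type:` entry in a YAML file.
# Assumption: the entry is on one line and the value is optionally quoted.
reads_type_of() {
    sed -n "s/^[[:space:]]*reads_type:[[:space:]]*[\"']\{0,1\}\([A-Za-z_]*\).*/\1/p" "$1" | head -n 1
}

# Translate the configured value into the read type it selects:
# PACBIO_SMRT means PacBio, any other value means non-PacBio.
describe_reads_type() {
    case "$(reads_type_of "$1")" in
        PACBIO_SMRT) echo "PacBio" ;;
        "")          echo "reads_type not set" ;;
        *)           echo "non-PacBio" ;;
    esac
}
```

Calling describe_reads_type snakemake.yml in the working directory prints the interpretation of the current setting.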

Citation

Arne Ludwig, Martin Pippel, Gene Myers, Michael Hiller. DENTIST – close assembly gaps with high confidence. In preparation.

Maintainer

DENTIST is being developed by Arne Ludwig <<***>> at the Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany.

Contributing

Contributions are warmly welcome. Just create an issue (https://github.com/a-ludi/dentist/issues) or a pull request (https://github.com/a-ludi/dentist/pulls) on GitHub. If you submit a pull request, please make sure that:

the code compiles on Linux using the current release of dmd,
your code is covered with unit tests (if feasible) and
dub test runs successfully.

It is recommended to install the Git hooks included in the repository to avoid premature pull requests. You can enable all shipped hooks with this command:

```sh
git config --local core.hooksPath .githooks/
```

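If you want to enable just a subset of the hooks, use ln -s ../../.githooks/{hook} .git/hooks (a relative link target is resolved from the directory that contains the link, here .git/hooks). If you want to audit code changes before they get executed on your machine, you can use cp .githooks/{hook} .git/hooks instead.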

License

This project is licensed under MIT License (see LICENSE).
