
This repository describes steps to prepare and annotate VCF files for loading into HPDS as in https://github.com/hms-dbmi/pic-sure-hpds-copdgene.
The input VCF file needs to be normalized for variant representation: multi-allelic variants must be split into separate lines in the VCF. There are various ways to do this; one is to use bcftools (http://samtools.github.io/bcftools/).
bcftools norm -m -any -f [fasta file for reference genome] [input VCF file] | bgzip -c > [normalized VCF file]
Also index the normalized VCF file with tabix to enable fast access during the annotation steps.
tabix -p vcf [normalized VCF file]
(tabix is part of htslib. http://www.htslib.org/download/)
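To make the effect of splitting multi-allelic variants concrete, here is a minimal Python sketch. It is only an illustration on a sites-only record (the eight fixed VCF columns); the function name `split_multiallelic` and the sample record are hypothetical. Unlike bcftools, it does not adjust per-allele INFO or FORMAT values, does not touch genotypes, and does not left-align against a reference, so it should not be used in place of the command above.

```python
def split_multiallelic(record: str) -> list[str]:
    """Split one sites-only VCF data line into one line per ALT allele.

    Simplified illustration only: INFO values are copied unchanged,
    which bcftools norm would handle per allele.
    """
    fields = record.rstrip("\n").split("\t")
    alts = fields[4].split(",")  # column 5 (index 4) is ALT
    lines = []
    for alt in alts:
        new_fields = list(fields)
        new_fields[4] = alt
        lines.append("\t".join(new_fields))
    return lines


# Hypothetical record with two alternate alleles (C and T).
record = "chr1\t1000\t.\tA\tC,T\t50\tPASS\t."
for line in split_multiallelic(record):
    print(line)
```

A record that already has a single ALT allele comes back as a one-element list, so the function can be applied to every data line.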
Here, we run VEP as a Docker container, without installing it directly on the local system. Before using VEP, we need to:
docker run --rm -it -v [local VEP cache directory]:/cache -v [directory for other files as needed]:/data -v [directory for input VCF file]:/work ensemblorg/ensembl-vep /opt/vep/vep \
    --cache --offline --merged \
    --species homo_sapiens \
    --compress_output bgzip \
    --input_file /work/[input VCF path/filename] \
    --output_file /work/[annotated VCF path/filename] \
    --no_stats \
    --force_overwrite \
    --assembly [GRCh38 or GRCh37] \
    --dir_cache /cache/ \
    --fasta /cache/[path to reference genome fasta file] \
    --everything \
    --total_length \
    --allele_number \
    --hgvsg \
    --shift_hgvs 1 \
    --transcript_version \
    --canonical \
    --vcf \
    --flag_pick
For more detail, refer to the Ensembl VEP documentation: https://ensembl.org/info/docs/tools/vep/script/vep_download.html#installer.
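When the VEP command is run repeatedly for different inputs, it can help to assemble it programmatically. The sketch below only builds and prints the argument list; it uses the same image name and VEP flags as the command in this README, while the function name `vep_docker_command` and all example paths are hypothetical placeholders.

```python
import shlex


def vep_docker_command(cache_dir, data_dir, work_dir,
                       input_name, output_name, fasta_name,
                       assembly="GRCh38"):
    """Build the docker argument list for the VEP run described above."""
    if assembly not in ("GRCh38", "GRCh37"):
        raise ValueError("assembly must be GRCh38 or GRCh37")
    return [
        "docker", "run", "--rm", "-it",
        "-v", f"{cache_dir}:/cache",
        "-v", f"{data_dir}:/data",
        "-v", f"{work_dir}:/work",
        "ensemblorg/ensembl-vep", "/opt/vep/vep",
        "--cache", "--offline", "--merged",
        "--species", "homo_sapiens",
        "--compress_output", "bgzip",
        "--input_file", f"/work/{input_name}",
        "--output_file", f"/work/{output_name}",
        "--no_stats", "--force_overwrite",
        "--assembly", assembly,
        "--dir_cache", "/cache/",
        "--fasta", f"/cache/{fasta_name}",
        "--everything", "--total_length", "--allele_number",
        "--hgvsg", "--shift_hgvs", "1",
        "--transcript_version", "--canonical", "--vcf", "--flag_pick",
    ]


# Hypothetical paths, for illustration only.
cmd = vep_docker_command(
    cache_dir="/path/to/vep_cache",
    data_dir="/path/to/data",
    work_dir="/path/to/vcf_dir",
    input_name="input.norm.vcf.gz",
    output_name="input.vep.vcf.gz",
    fasta_name="reference.fa.gz",
)
print(shlex.join(cmd))  # shell-quoted command line, ready to review before running
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; the printed line can be pasted into a terminal once the paths are checked.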
The Python script "transform_csq.v2.py" removes the complex and bulky VEP annotation from the VCF file and leaves only the following information, reformatted for loading into HPDS.
The VEP annotation field in VCF can vary by exact options used in VEP annotation. The script can detect VEP annotation format from the header line in the VCF file, if it follows the style "##INFO=<ID=CSQ... Format: ...>."
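The header-based detection can be pictured with a short sketch. This is not the code in transform_csq.v2.py; it is a minimal illustration, assuming the usual VEP header style where the CSQ description ends in "Format: " followed by pipe-separated field names. The function names, the abbreviated header, and the gene symbols are made up for the example.

```python
import re


def csq_fields_from_header(header_line: str) -> list[str]:
    """Extract the CSQ sub-field names from a '##INFO=<ID=CSQ...' header line."""
    if not header_line.startswith("##INFO=<ID=CSQ"):
        raise ValueError("not a CSQ INFO header line")
    match = re.search(r'Format: ([^">]+)', header_line)
    if match is None:
        raise ValueError("no 'Format: ...' description found in CSQ header")
    return match.group(1).strip().split("|")


def parse_csq(csq_value: str, fields: list[str]) -> list[dict[str, str]]:
    """Turn the text after 'CSQ=' into one dict per comma-separated entry."""
    return [dict(zip(fields, entry.split("|"))) for entry in csq_value.split(",")]


# Abbreviated, hypothetical header; real VEP headers list many more fields.
header = ('##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations '
          'from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL|PICK">')
fields = csq_fields_from_header(header)
entries = parse_csq("T|missense_variant|MODERATE|GENE1|1,T|intron_variant|MODIFIER|GENE2|", fields)
for entry in entries:
    print(entry["SYMBOL"], entry["Consequence"], entry["IMPACT"])
```

Because the field order is read from the header rather than hard-coded, the same parsing works when different VEP options add or remove sub-fields.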
python transform_csq.v2.py [options] [VEP annotated VCF path/filename] [new filename]
Or use the Docker image:
docker run --rm -it -v [directory for input VCF file]:/work ikarus97/hpds_annotation:latest \
    python /transform_csq.v2.py [options] /work/[input VCF path/filename] /work/[output VCF path/filename]
Docker image for the main script transform_csq.v2.py is available from https://hub.docker.com/r/ikarus97/hpds_annotation.
The image tagged latest contains the most up-to-date version.
Previous versions are archived with their creation dates as tags.
--pick
If present, use only the most severe consequence from the VEP annotation (flagged as 'PICK' by the VEP option --flag_pick).
--cds
If present, use only the variants in coding sequence (CDS). Specifically, this option keeps only variants whose impact rating by VEP (https://ensembl.org/info/genome/variation/prediction/predicted_data.html) is not "MODIFIER."
--vep-gnomad-af <string>
Specify which field in the VEP annotation will be extracted for gnomAD allele frequency. Use this if a custom file (e.g., a gnomAD genomes file) is used for gnomAD allele frequency.
Default value: gnomAD_AF
For example, if you want to use gnomAD genome allele frequency from the following VEP argument:
--custom /path/to/custom/file.vcf.gz,CUSTOM_TAG,vcf,exact,0,My_Field
then add --vep-gnomad-af CUSTOM_TAG_My_Field to the options to use the value of "My_Field" as gnomAD allele frequency.
--allow-modifier
If present, output "Variant_severity" for variants that are "MODIFIER". As of 2021-04-23, by default, such variants do not have "Variant_severity" in the INFO column to reduce overhead in HPDS.
Use --allow-modifier to output tags for "MODIFIER" variants.
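The effect of --pick, --cds, and --vep-gnomad-af can be sketched on already-parsed CSQ entries. This is a rough illustration, not the script's actual implementation: it assumes each entry is a dict keyed by VEP sub-field name, that VEP marks the picked consequence with PICK set to "1", and that IMPACT holds the impact rating. The function names and sample values are hypothetical.

```python
def select_entries(entries, pick=False, cds=False):
    """Keep the CSQ entries that pass the --pick and --cds style filters."""
    selected = []
    for entry in entries:
        if pick and entry.get("PICK") != "1":
            continue  # --pick: keep only the consequence flagged by --flag_pick
        if cds and entry.get("IMPACT") == "MODIFIER":
            continue  # --cds: drop entries whose impact rating is MODIFIER
        selected.append(entry)
    return selected


def gnomad_value(entry, field="gnomAD_AF"):
    """Return the raw allele-frequency string from the chosen field, or None."""
    return entry.get(field) or None


# Hypothetical parsed entries for one variant.
entries = [
    {"Consequence": "missense_variant", "IMPACT": "MODERATE", "PICK": "1",
     "gnomAD_AF": "0.0012", "CUSTOM_TAG_My_Field": "0.0034"},
    {"Consequence": "intron_variant", "IMPACT": "MODIFIER", "PICK": "",
     "gnomAD_AF": "0.0012", "CUSTOM_TAG_My_Field": "0.0034"},
]

picked = select_entries(entries, pick=True)
coding = select_entries(entries, cds=True)
print(len(picked), len(coding))
print(gnomad_value(picked[0]))                         # default field
print(gnomad_value(picked[0], "CUSTOM_TAG_My_Field"))  # as with --vep-gnomad-af
```

The frequency is returned as a string on purpose, since the sketch makes no assumption about how the annotated value is formatted.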