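BBTools is a suite of tools for working with DNA and RNA sequencing reads, including the BBMap aligner. Most of the tools are written in Java and launched through shell scripts. The suite can process data from Illumina, 454, Sanger, Ion Torrent, PacBio, and Nanopore platforms, and it supports fastq, fasta, sam, scarf, fasta+qual, compressed, and raw file formats, so it covers most of the operations we use day to day. BBTools is also open source and free to use without restriction. All in all, anyone who regularly processes sequencing data should find it worth having.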
The BBTools suite includes the following tools:
- BBDuk
- BBMap
- BBMask
- BBMerge
- BBNorm
- CalcUniqueness
- Clumpify
- Dedupe
- Reformat
- Repair
- Seal
- Split Nextera
- Statistics
- Tadpole
- Taxonomy
Installing BBTools:
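# Download: https://sourceforge.net/projects/bbmap/
# After downloading, change to the folder where BBTools should be installed
cd (installation parent folder)
# Unpack the archive
tar -xvzf BBMap_(version).tar.gz
# All of the scripts are under (installation directory)/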
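After unpacking, a quick sanity check can confirm that the scripts are reachable before running anything. The sketch below is only an illustration: the function name `bbtools_ready` and the variable `BBTOOLS_DIR` are hypothetical (BBTools does not define them), and the default location `$HOME/bbmap` is an assumption about where the archive was unpacked. It checks for `java` because the tools run on Java.

```shell
#!/bin/sh
# Hypothetical helper: check that a BBTools install directory looks usable.
# BBTOOLS_DIR and the $HOME/bbmap default are assumptions for this sketch.

bbtools_ready() {
    # Directory to check: first argument, else $BBTOOLS_DIR, else $HOME/bbmap.
    bb_dir="${1:-${BBTOOLS_DIR:-$HOME/bbmap}}"

    # The wrapper scripts launch Java programs, so java must be on PATH.
    if ! command -v java >/dev/null 2>&1; then
        echo "java not found on PATH"
        return 1
    fi

    # bbmap.sh is one of the scripts the archive is expected to contain.
    if [ ! -f "$bb_dir/bbmap.sh" ]; then
        echo "bbmap.sh not found in $bb_dir"
        return 2
    fi

    echo "BBTools scripts found in $bb_dir"
    return 0
}
```

For example, `bbtools_ready /path/to/bbmap && PATH="/path/to/bbmap:$PATH"` would put the scripts on the PATH only after the check passes.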
Examples (parameters are written as key=value, with no spaces around the equals sign):
1) Build an index: bbmap.sh ref=contigs.fa
2) Map reads to the index in the current directory: bbmap.sh in=reads.fq out=mapped.sam
3) Build the index and map in a single command: bbmap.sh in=reads.fq out=mapped.sam ref=ref.fa
4) Build the index in memory without writing it to disk: bbmap.sh in=reads.fq out=mapped.sam ref=ref.fa nodisk
5) Split a fq file into mapped and unmapped fq files: bbmap.sh in=reads.fq outm=mapped.fq outu=unmapped.fq
6) Calculate coverage: bbmap.sh in=reads.fq covstats=covstats.txt covhist=covhist.txt basecov=basecov.txt bincov=bincov.txt
7) Output a bam file (requires samtools to be installed): bbmap.sh in=reads.fq out=mapped.bam
8) Generate a sorted bam file: bbmap.sh in=reads.fq out=mapped.sam bamscript=bs.sh; sh bs.sh
9) Map raw RNA-seq data to a genome: bbmap.sh in=reads.fq out=mapped.sam maxindel=200k ambig=random intronlen=20 xstag=us
10) Fast mapping: bbmap.sh in=reads.fq out=mapped.sam fast
11) Summarize read-length statistics: readlength.sh in=reads.fq
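The individual commands above can be chained into a small script. The following is a minimal sketch rather than an official workflow: `run_step`, `map_pipeline`, and the `DRY_RUN` switch are hypothetical names introduced here, and combining `out=` with `covstats=` in one run is an assumption of this example. With `DRY_RUN=1` it only prints the commands, which is a handy way to review them before mapping a large dataset.

```shell
#!/bin/sh
# Sketch: index a reference, then map reads and write coverage statistics.
# run_step, map_pipeline and DRY_RUN are hypothetical names for this example.

# Print a command, and execute it unless DRY_RUN is set to 1.
run_step() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

map_pipeline() {
    ref="$1"      # reference FASTA, e.g. ref.fa
    reads="$2"    # input reads, e.g. reads.fq
    prefix="$3"   # prefix for the output file names

    # Parameters are key=value tokens with no spaces around "=".
    run_step bbmap.sh ref="$ref" || return 1

    # Assumption: out= and covstats= are requested in the same run.
    run_step bbmap.sh in="$reads" out="$prefix.sam" \
        covstats="$prefix.covstats.txt" || return 1
}
```

Setting `DRY_RUN=1` and then calling `map_pipeline ref.fa reads.fq sample` prints the two bbmap.sh commands without running them. Leaving `DRY_RUN` unset runs them for real, which requires bbmap.sh to be on the PATH.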
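BBTools/BBMap includes nearly a hundred commonly used sequencing tools. For details, see the official user guide listed in the references below.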
References:
1. https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/