Updated 2025-04-05. Source: SLTechnology News&Howtos (Shulou.com)
This article explains how to use tabix to work with VCF files: installing the tool, compressing a VCF file with bgzip, building an index, and retrieving records by region.
The installation process is as follows:
wget https://sourceforge.net/projects/samtools/files/tabix/tabix-0.2.6.tar.bz2
tar xjvf tabix-0.2.6.tar.bz2
cd tabix-0.2.6/
make
These commands download the source code, extract it, and compile it. After a successful build, the directory contains two executables, tabix and bgzip.
Because a data set usually contains a large number of SNP loci, the corresponding VCF file is also very large. Compression is the most common way to save storage space, and bgzip can compress a VCF file as follows:
bgzip view.vcf
After compression, the original view.vcf file is replaced by view.vcf.gz, with the .gz suffix. There are two ways to decompress it:
bgzip -d view.vcf.gz
gunzip view.vcf.gz
The bgzip format is compatible with gzip, so a file compressed with bgzip can be decompressed with gunzip as well as with bgzip itself.
Note that this compatibility only goes one way. A file compressed with plain gzip cannot be indexed by tabix, so always use bgzip, not gzip, when compressing a VCF file that you intend to index.
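gunzip can read bgzip output because a bgzip file is a series of small gzip blocks written one after another, and standard gzip tools decompress such concatenated blocks back to back. The sketch below imitates that layout with plain gzip only. It does not produce a real bgzip file, and the file name blocks.gz is just an illustration:

```shell
# Write two separately compressed gzip members into one file.
tmpdir=$(mktemp -d)
printf 'line one\n' | gzip -c  > "$tmpdir/blocks.gz"
printf 'line two\n' | gzip -c >> "$tmpdir/blocks.gz"

# gunzip reads the members in sequence and returns both lines.
combined=$(gunzip -c "$tmpdir/blocks.gz")
printf '%s\n' "$combined"
```

The reverse does not hold: an ordinary gzip file is one long stream without the per-block bookkeeping that tabix relies on for random access.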
For a large VCF file, quickly accessing individual records is also a challenge. tabix can index the file, and once the index is built, lookups are much faster. Index a VCF file as follows:
tabix -p vcf view.vcf.gz
Note that the input must be a VCF file compressed with bgzip. The generated index file is view.vcf.gz.tbi, with the .tbi suffix.
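The compress-then-index sequence can be tried end to end on a tiny file. This is a minimal sketch, assuming bgzip and tabix are on your PATH. The file name demo.vcf, its VCFv4.2 header, and its single record are made up for illustration:

```shell
# Sketch only: demo.vcf and its single record are invented test data.
workdir=$(mktemp -d)
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n11\t2343545\t.\tA\tG\t.\t.\t.\n' > "$workdir/demo.vcf"

if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
    bgzip "$workdir/demo.vcf"              # replaces demo.vcf with demo.vcf.gz
    tabix -p vcf "$workdir/demo.vcf.gz"    # writes demo.vcf.gz.tbi next to it
    ls "$workdir"
    status=indexed
else
    echo "bgzip/tabix not found on PATH; skipping" >&2
    status=skipped
fi
```

If both tools are installed, the directory should end up holding demo.vcf.gz and demo.vcf.gz.tbi.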
After building the index, you can quickly retrieve the records in a specified region. Examples:
1. Get the SNP loci on chromosome 11
tabix view.vcf.gz 11
2. Get the SNP loci on chromosome 11 at positions greater than or equal to 2343545
tabix view.vcf.gz 11:2343545
3. Get the SNP loci on chromosome 11 at positions between 2343540 and 2343596
tabix view.vcf.gz 11:2343540-2343596
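The three query forms share one pattern: a chromosome name, optionally followed by a start position, optionally followed by an end position. As a small sketch, the hypothetical helper make_region below (not part of tabix) assembles such region strings in a script:

```shell
# Hypothetical helper: build a tabix region string from its parts.
# Usage: make_region CHROM [START [END]]
make_region() {
    if [ -n "${3:-}" ]; then
        printf '%s:%s-%s\n' "$1" "$2" "$3"   # CHROM:START-END
    elif [ -n "${2:-}" ]; then
        printf '%s:%s\n' "$1" "$2"           # CHROM:START
    else
        printf '%s\n' "$1"                   # whole chromosome
    fi
}

whole_chrom=$(make_region 11)
from_pos=$(make_region 11 2343545)
window=$(make_region 11 2343540 2343596)
printf '%s\n' "$whole_chrom" "$from_pos" "$window"
```

Each string can then be passed as the region argument, for example tabix view.vcf.gz "$window".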