2025-04-11 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article looks at four reasons why uBAM has never become popular. Many readers may not be familiar with the format, so the reasons are summarized below.
uBAM is an unaligned BAM file. FASTQ files can be converted to this format with the Picard toolkit.
It has several advantages over FASTQ: all the data for a read sits in a single record (one line in SAM terms); it is highly extensible, so rich metadata can be attached; and it is easier to maintain, since all the sequencing data for one sample can even be stored in a single uBAM file.
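To make "one record per read, plus metadata" concrete, here is a minimal sketch of what a FASTQ-to-unaligned conversion produces. It emits SAM text rather than binary BAM so the result stays readable, and it is not how Picard is implemented. The function names `parse_fastq` and `fastq_to_unaligned_sam` and the default `sample1` / `rg1` values are hypothetical, chosen only for illustration.

```python
from typing import Iterator, Optional, Tuple

# SAM flag bits used for unaligned reads
PAIRED, UNMAPPED, MATE_UNMAPPED, FIRST, SECOND = 0x1, 0x4, 0x8, 0x40, 0x80


def parse_fastq(text: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_name, sequence, quality) from four-line FASTQ records."""
    lines = [ln for ln in text.splitlines() if ln]
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record at line {i + 1}")
        # Keep only the read ID; drop any description after the first space.
        yield header[1:].split()[0], seq, qual


def fastq_to_unaligned_sam(r1_fastq: str, r2_fastq: Optional[str] = None,
                           sample: str = "sample1",
                           read_group: str = "rg1") -> str:
    """Return SAM text in which every read is flagged as unmapped."""
    out = ["@HD\tVN:1.6\tSO:unsorted",
           f"@RG\tID:{read_group}\tSM:{sample}"]  # metadata lives in the header

    def record(name: str, flag: int, seq: str, qual: str) -> str:
        # RNAME '*', POS 0, MAPQ 0, CIGAR '*', RNEXT '*', PNEXT 0, TLEN 0
        return "\t".join([name, str(flag), "*", "0", "0", "*", "*", "0", "0",
                          seq, qual, f"RG:Z:{read_group}"])

    if r2_fastq is None:
        for name, seq, qual in parse_fastq(r1_fastq):
            out.append(record(name, UNMAPPED, seq, qual))
    else:
        mates1 = list(parse_fastq(r1_fastq))
        mates2 = list(parse_fastq(r2_fastq))
        if len(mates1) != len(mates2):
            raise ValueError("R1 and R2 contain different numbers of reads")
        base = PAIRED | UNMAPPED | MATE_UNMAPPED
        for (n1, s1, q1), (n2, s2, q2) in zip(mates1, mates2):
            if n1 != n2:
                raise ValueError(f"mate names differ: {n1} vs {n2}")
            out.append(record(n1, base | FIRST, s1, q1))   # flag 77
            out.append(record(n2, base | SECOND, s2, q2))  # flag 141
    return "\n".join(out) + "\n"
```

Both mates of a pair land in the same file under the same read name, and the `@RG` header line and `RG:Z:` tag are where sample and run metadata would be attached.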
I first came across uBAM more than four years ago. I was very optimistic about it and thought it was bound to become the standard for storing raw sequencer output. Yet years later, despite its merits (and despite GATK having supported the format all along), it still has not caught on. Why?
As far as I know, the uBAM format is currently used to store raw sequencer output only by a few large research institutions, such as the Broad Institute in the United States and Sanger in the United Kingdom.
Having thought about this for a while, I believe the reasons may be as follows:
First, BAM is "heavy". It is not a text file, so you cannot open it with a text tool and inspect its contents. It can only be handled through third-party tools or dedicated SAM/BAM packages (or APIs). That is a real obstacle for researchers unfamiliar with this way of working. It raises the barrier to working with the file directly, and in this respect the user experience is far worse than with FASTQ.
Second, mainstream tool support is incomplete. Apart from samtools and a small number of related tools, few tools can operate on BAM directly from the command line.
Third, a BAM file does not take up much less space than a compressed FASTQ file, so its advantage here is limited.
Fourth, in terms of low-level I/O efficiency, text-format FASTQ (or gzip-compressed FASTQ) is actually more efficient than BAM.
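The first point can be illustrated with a small sketch that uses only the Python standard library. It packs one unaligned read into the BAM binary record layout and then decodes it again, following the field order of the SAM/BAM specification as I understand it. The function names are hypothetical, and plain gzip stands in for the BGZF block compression that real BAM files use, so the output is not a valid BAM file for samtools.

```python
import gzip
import struct

SEQ_CODES = "=ACMGRSVTWYHKDBN"   # 4-bit base codes used by BAM
FIXED = "<iiBBHHHiiii"           # 32-byte fixed part of an alignment record


def encode_unaligned_bam(name: str, seq: str, qual: str) -> bytes:
    """Build a header plus one unmapped record, then gzip it (not real BGZF)."""
    text = b"@HD\tVN:1.6\tSO:unsorted\n"
    header = b"BAM\x01" + struct.pack("<i", len(text)) + text
    header += struct.pack("<i", 0)                 # n_ref = 0, no references

    read_name = name.encode("ascii") + b"\x00"     # NUL-terminated
    codes = [SEQ_CODES.index(b) for b in seq] + [0]  # pad for odd lengths
    packed = bytes((codes[i] << 4) | codes[i + 1]
                   for i in range(0, len(seq), 2))   # two bases per byte
    quals = bytes(ord(c) - 33 for c in qual)       # raw Phred, no +33 offset
    fixed = struct.pack(
        FIXED,
        -1, -1,          # refID, pos: -1 means unplaced
        len(read_name),  # l_read_name, including the NUL
        0,               # mapq
        4680,            # bin value conventionally used for unplaced reads
        0,               # n_cigar_op
        0x4,             # flag: read unmapped
        len(seq),        # l_seq
        -1, -1, 0,       # next_refID, next_pos, tlen
    )
    body = fixed + read_name + packed + quals
    return gzip.compress(header + struct.pack("<i", len(body)) + body)


def decode_first_record(blob: bytes):
    """Return (name, flag, seq, qual) for the first record in the payload."""
    data = gzip.decompress(blob)
    if data[:4] != b"BAM\x01":
        raise ValueError("not a BAM payload")
    (l_text,) = struct.unpack_from("<i", data, 4)
    offset = 8 + l_text
    (n_ref,) = struct.unpack_from("<i", data, offset)
    offset += 4
    for _ in range(n_ref):                         # skip reference entries
        (l_name,) = struct.unpack_from("<i", data, offset)
        offset += 4 + l_name + 4
    offset += 4                                    # skip block_size
    fields = struct.unpack_from(FIXED, data, offset)
    l_read_name, n_cigar, flag, l_seq = fields[2], fields[5], fields[6], fields[7]
    offset += struct.calcsize(FIXED)
    name = data[offset:offset + l_read_name - 1].decode("ascii")
    offset += l_read_name + 4 * n_cigar
    n_bytes = (l_seq + 1) // 2
    seq = "".join(SEQ_CODES[b >> 4] + SEQ_CODES[b & 0xF]
                  for b in data[offset:offset + n_bytes])[:l_seq]
    offset += n_bytes
    qual = "".join(chr(q + 33) for q in data[offset:offset + l_seq])
    return name, flag, seq, qual


# The FASTQ equivalent needs no format knowledge beyond "four lines per read".
fastq_name, fastq_seq, _, fastq_qual = "@r1\nACGTN\n+\nIIII!\n".splitlines()
```

Reading one read back takes a decompression step, fixed-width integer unpacking, and a nibble lookup table, whereas the FASTQ version is a single `splitlines()` call. That gap is the higher barrier described above.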
The uBAM story may also say something about product design (or solution design) in general. I see three lessons here, and criticism is welcome:
First, experience. For a product or solution to become popular, meeting the requirements is not enough: the user experience deserves more attention than how advanced the technology is or how complete the product is.
Second, first-mover advantage. Once you arrive late (FASTQ predates uBAM by many years), changing users' habits requires a complete set of supporting tools that lowers the cost of switching, ideally to the point where switching is painless. Only then can the advantages of the new product be fully realized.
Third, the simpler something looks, the harder it is to displace. FASTQ is a very simple and concise format for storing sequencing data. Its goal is clear and it contains only what must be there: the sequence ID, the sequence itself, and the quality values. All of these are indispensable and nothing is superfluous, so it is about as minimal as a format can get.