2025-04-11 Update From: SLTechnology News&Howtos shulou
Shulou(Shulou.com)06/02 Report--
This article explains the differences and connections among three peak file formats: narrow peaks, broad peaks, and gapped peaks.
When performing peak calling analysis, the following three peak formats are often encountered:
narrow peaks format
broad peaks format
gapped peaks format
A peak is a region of the genome that is enriched in reads. Its core information is the start and end positions on a chromosome. In addition, the peak calling software reports scores for each peak region, such as the common pvalue, qvalue, and fold_enrichment.
Just as the BAM format standardizes how genome alignments are stored, these three formats were developed to standardize the output of different peak calling software. All three are essentially BED files; they differ in the number of columns.
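Because the three formats differ in column count, a quick check of the number of tab-separated fields is often enough to tell them apart. The following is a minimal sketch that uses the column counts given in the sections of this article (10, 9, and 15); the function name and the example record are made up for illustration, and a real file may also contain header or track lines that this sketch does not handle.

```python
# Column counts for the three peak formats, as described in this article.
PEAK_FORMATS = {10: "narrowPeak", 9: "broadPeak", 15: "gappedPeak"}


def guess_peak_format(line):
    """Guess the peak format of one tab-separated record from its column count."""
    fields = line.rstrip("\n").split("\t")
    return PEAK_FORMATS.get(len(fields), "unknown")


# Hypothetical 10-column record, invented for this example.
example = "chr1\t100\t500\tpeak_1\t250\t.\t8.5\t30.1\t25.0\t120"
print(guess_peak_format(example))  # narrowPeak
```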
1. Narrow Peaks Format
This format is also known as the point-source peaks format, and it is the default output format of macs2. It is a BED6+4 format with 10 columns.
The first four columns are chrom, chromStart, chromEnd, and name, which describe the peak interval and its name. Note that BED coordinates are 0-based: chromStart counts from 0, and chromEnd is not included in the interval.
The fifth column is score, which is int(-10*log10(qvalue)) in the output of macs2. The sixth column is strand, which is "." when no strand applies. The seventh column is signalValue, usually the value of fold_enrichment. The eighth column is pValue, which is -log10(pvalue) in the output of macs2. The ninth column is qValue, which is -log10(qvalue) in the output of macs2. The tenth column is peak, which gives the position of the peak summit as the distance between the summit and the start of the peak.
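The ten columns described above can be read into a small record type. The sketch below is an illustration only: the class and function names are hypothetical, the example line is invented rather than taken from real macs2 output, and it assumes a plain tab-separated line with no header.

```python
from typing import NamedTuple


class NarrowPeak(NamedTuple):
    chrom: str
    chrom_start: int     # 0-based start of the peak
    chrom_end: int       # end of the peak (not included in the interval)
    name: str
    score: int
    strand: str          # "." when no strand applies
    signal_value: float  # usually fold_enrichment
    p_value: float       # -log10(pvalue)
    q_value: float       # -log10(qvalue)
    peak: int            # summit offset from chrom_start


def parse_narrow_peak(line):
    """Parse one tab-separated 10-column narrow peaks record."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 10:
        raise ValueError(f"expected 10 columns, got {len(f)}")
    return NarrowPeak(f[0], int(f[1]), int(f[2]), f[3], int(f[4]), f[5],
                      float(f[6]), float(f[7]), float(f[8]), int(f[9]))


def summit_position(record):
    """Return the 0-based genomic coordinate of the summit."""
    return record.chrom_start + record.peak


# Hypothetical record, invented for this example.
record = parse_narrow_peak("chr1\t100\t500\tpeak_1\t250\t.\t8.5\t30.1\t25.0\t120")
print(summit_position(record))  # 220
```

Since the tenth column is an offset rather than an absolute coordinate, adding it to chromStart gives the summit position on the chromosome.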
2. Broad Peaks Format
This format is based on the narrow peaks format but omits the last column (peak, the summit offset). It is a BED6+3 format with 9 columns.
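At the file-format level, the relationship between the two formats can be sketched as dropping the tenth column. The helper below is hypothetical and only illustrates the column layout; it does not turn narrow peak calls into biologically broad regions, which requires running the peak caller in its broad mode.

```python
def narrow_to_broad_columns(line):
    """Drop the 10th column (summit offset) from a narrow peaks record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 10:
        raise ValueError(f"expected 10 columns, got {len(fields)}")
    return "\t".join(fields[:9])


# Hypothetical record, invented for this example.
narrow = "chr1\t100\t500\tpeak_1\t250\t.\t8.5\t30.1\t25.0\t120"
broad = narrow_to_broad_columns(narrow)
print(len(broad.split("\t")))  # 9
```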
3. Gapped Peaks Format
The first two formats describe continuous peak intervals and are suitable for storing enriched regions at the DNA level, such as the peak intervals identified by ChIP-seq and ATAC-seq. The gapped peaks format describes discontinuous peak intervals. Here, discontinuity usually means that the peak interval contains multiple exon regions, so the format is suitable for storing enriched regions at the RNA level, such as the peak intervals identified by m6A-seq.
This format extends BED12 into a BED12+3 format with 15 columns. The meaning of each column is as follows.
The first six columns have the same meaning as in the two peak formats described above, and the last three columns (signalValue, pValue, qValue) have the same meaning as in the broad peaks format. To express the exon information contained in the peak interval, the following six columns are introduced as columns 7 to 12, following the BED12 format used for transcripts:
thickStart
thickEnd
itemRgb
blockCount
blockSizes
blockStarts
thickStart and thickEnd are analogous to the start and end positions of the CDS in a transcript record. When storing peak information, it is common practice to set these two columns to the same values as chromStart and chromEnd. itemRgb is an RGB color value, such as 255,0,0. If there is no color information, it is set to 0.
blockCount is the number of exons (blocks) contained in the peak interval. blockSizes gives the length of each exon interval, with multiple values separated by commas. blockStarts gives the start position of each exon interval, with multiple values separated by commas; as in BED12, these start positions are offsets relative to chromStart rather than absolute genomic coordinates.
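The block columns can be turned into genomic intervals as in the sketch below. It assumes the BED12 convention that blockStarts are offsets relative to chromStart, and it assumes the 15-column order described above, with blockCount, blockSizes, and blockStarts as columns 10 to 12. The function name and the example record are invented for illustration.

```python
def block_intervals(line):
    """Return the (start, end) genomic interval of each block in a gapped peaks record."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 15:
        raise ValueError(f"expected 15 columns, got {len(f)}")
    chrom_start = int(f[1])
    block_count = int(f[9])
    # A trailing comma is tolerated, so empty items are skipped.
    sizes = [int(x) for x in f[10].split(",") if x]
    starts = [int(x) for x in f[11].split(",") if x]
    if not (len(sizes) == len(starts) == block_count):
        raise ValueError("blockCount does not match blockSizes/blockStarts")
    # Assumption: blockStarts are offsets from chromStart (BED12 convention).
    return [(chrom_start + s, chrom_start + s + size)
            for s, size in zip(starts, sizes)]


# Hypothetical record with two blocks, invented for this example.
gapped = ("chr1\t1000\t2000\tpeak_1\t300\t+\t1000\t2000\t0\t"
          "2\t200,300\t0,700\t5.2\t12.3\t10.1")
print(block_intervals(gapped))  # [(1000, 1200), (1700, 2000)]
```

In this example the first block starts at chromStart and the last block ends at chromEnd, so the blocks span the whole peak interval with a gap between them.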
For more information on these three formats, refer to the following link:
https://genome.ucsc.edu/FAQ/FAQformat.html#format13
This concludes the overview of the differences and connections among the narrow, broad, and gapped peak formats. The best way to consolidate it is to inspect the output files of a real peak calling run.