2025-03-31 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article explains how to use SICER for peak calling, with a detailed walkthrough intended to give readers facing this task a simple and workable approach.
Peak lengths in ChIP-seq data vary widely, from peaks of a few hundred bp that cover several nucleosomes to domains of thousands of kb that span multiple genes. For example, peaks for the histone modifications H3K4me2 and H3K4me3 are typically a few hundred bp long, while H3K27me3 peaks range from tens to hundreds of kb. Histone-modification peaks therefore span a wide range of lengths and often carry weak, diffuse signal, which makes peak callers designed for transcription factor (TF) binding sites less accurate on this kind of data.
SICER is a peak caller designed for histone-modification ChIP-seq data. Its core idea is to identify enriched regions using sliding windows and a local Poisson distribution. The following figure shows the H3K27me3 peak regions identified by the software with default parameters.
The black regions are the peaks from the ENCODE analysis, and the red regions are the peaks called by SICER. The software's official website is:
https://home.gwu.edu/~wpeng/Software.htm
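The sliding-window and Poisson idea can be illustrated with a small Python sketch. This is not SICER's implementation: the function names, the use of a single genome-wide average as the Poisson rate, and the p-value cutoff are all assumptions chosen to keep the example short.

```python
import math

def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i        # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)

def enriched_windows(read_starts, genome_length, window=200, p_cutoff=1e-3):
    """Return indices of windows whose read count is unlikely under the background.

    Assumption: the background rate is the average read count per window
    over the whole (toy) genome, which is a simplification.
    """
    n_windows = math.ceil(genome_length / window)
    counts = [0] * n_windows
    for pos in read_starts:
        counts[pos // window] += 1
    lam = len(read_starts) / n_windows
    return [i for i, c in enumerate(counts) if poisson_sf(c, lam) <= p_cutoff]
```

In this sketch, a window is flagged only when its read count is far above the background rate; a real caller would go on to join flagged windows into larger regions and score them.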
To make the software easier to use, it has been repackaged by a third party. The source code is hosted on GitHub at:
https://github.com/dariober/SICERpy
The basic usage is as follows:
python SICER.py \
    -c input.bam \
    -w 200 \
    -g 3 \
    -t ip.bam \
    > peaks.bed
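When the command has to be run for many samples, it can help to build it from a script. The following is a minimal sketch: `build_sicer_command` and `run_sicer` are hypothetical helper names, the path to the SICERpy script is passed in by the caller, and only the four options shown above are used.

```python
import subprocess

def build_sicer_command(script, treatment, control, window=200, gap=3):
    """Assemble the argument list for one SICERpy run (hypothetical helper)."""
    return [
        "python", script,
        "-t", treatment,     # ChIP (treatment) BAM file
        "-c", control,       # input (control) BAM file
        "-w", str(window),   # sliding window size
        "-g", str(gap),      # gap size
    ]

def run_sicer(script, treatment, control, out_bed, window=200, gap=3):
    """Run the command and write the peaks printed on stdout to out_bed."""
    cmd = build_sicer_command(script, treatment, control, window, gap)
    with open(out_bed, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)
```

Passing the arguments as a list avoids shell quoting problems with unusual file names, and `check=True` raises an error if the run fails instead of silently leaving an incomplete BED file.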
The -w parameter sets the size of the sliding window; the default is 200. Smaller values yield shorter, more fragmented peak intervals; larger values lead to overfitting, so the identified peak intervals become too long and real information is lost, as shown below.
For transcription factors, the official recommendation is a sliding window of 50-100 bp; for histone modifications, 200 bp.
The -g parameter sets the gap size; the default is 3. Like the window size, this parameter directly affects how peak intervals are defined, as shown below.
For transcription factors, the official recommendation is to keep the gap equal to the sliding window size; for histone modifications, the recommended value is 3.
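To see why the window and gap settings change the peak intervals, consider this simplified Python sketch of joining enriched windows into regions. It assumes the gap is counted in windows (so 3 means up to three non-enriched windows may sit between two enriched ones); the function name and this merging rule are illustrative and are not taken from the SICER source.

```python
def merge_into_islands(eligible, window=200, gap_windows=3):
    """Join enriched window indices into (start, end) intervals in bp.

    Two enriched windows end up in the same interval when at most
    `gap_windows` non-enriched windows lie between them (assumed rule).
    """
    islands = []
    for idx in sorted(eligible):
        if islands and idx - islands[-1][1] - 1 <= gap_windows:
            islands[-1][1] = idx          # extend the current interval
        else:
            islands.append([idx, idx])    # start a new interval
    return [(first * window, (last + 1) * window) for first, last in islands]
```

With the same enriched windows, a larger gap joins neighbouring regions into one long interval, while a gap of 0 only joins windows that are directly adjacent, which matches the behaviour described above.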
The output file is in BED format with 8 columns, in the following order:
chrom
start
end
ChIP read count
input read count
p-value
fold enrichment
FDR
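For downstream work in Python, the eight columns can be read into named records. This is a minimal sketch that assumes the file is tab-separated with no header line; the `Peak` type and its field names are made up for illustration and follow the column order listed above.

```python
from typing import Iterable, List, NamedTuple

class Peak(NamedTuple):
    chrom: str
    start: int
    end: int
    chip_reads: float
    input_reads: float
    pvalue: float
    fold_enrichment: float
    fdr: float

def parse_peaks(lines: Iterable[str]) -> List[Peak]:
    """Parse tab-separated peak lines, skipping blank and '#' comment lines."""
    peaks = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        # Columns 4-8 are numeric; read counts are parsed as floats to be safe.
        peaks.append(Peak(f[0], int(f[1]), int(f[2]), *(float(x) for x in f[3:8])))
    return peaks
```

A typical use would be `parse_peaks(open("peaks.bed"))`, after which each record exposes fields such as `peak.fdr` or `peak.end - peak.start`.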
You can keep only the high-confidence peaks by filtering on the FDR value in the last column, for example:
awk '$8 < 0.01' peaks.bed > peaks.01.bed
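The same filter can be written in Python when the peaks are already being handled in a script. The sketch below assumes a tab-separated file with the FDR in column 8; `filter_by_fdr` is a hypothetical helper name.

```python
def filter_by_fdr(in_path, out_path, max_fdr=0.01):
    """Copy lines whose 8th column (FDR) is below max_fdr; return how many were kept."""
    kept = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 8 and float(fields[7]) < max_fdr:
                dst.write(line)
                kept += 1
    return kept
```

Unlike awk, which splits on any whitespace by default, this version splits on tabs only, so it expects a strictly tab-delimited file.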