What is HiC-Pro used for?

2025-01-18 Update From: SLTechnology News&Howtos

This article explains what HiC-Pro is used for and walks through the main steps of its pipeline.

HiC-Pro is an efficient Hi-C data analysis tool that covers the complete workflow from raw sequencing data to normalized Hi-C contact maps. It is fast and easy to use. The paper describing the software is available at:

https://genomebiology.biomedcentral.com/track/pdf/10.1186/s13059-015-0831-x

The complete pipeline is shown in the following figure

The red box marks the data preprocessing part, whose function is similar to that of HiCUP: read alignment and filtering of valid pairs. Preprocessing is followed by binning, which builds the raw contact map at different resolutions, and finally by normalization of the raw contact map to obtain the corrected contact map.

One of HiC-Pro's most powerful features is that it can build haplotype-level (allele-specific) Hi-C maps. Such maps give a finer view of the three-dimensional structure of the genome and support more detailed study of gene regulation and other functions.

The whole process is divided into the following steps

1. Sequence alignment

HiC-Pro adopts a two-step alignment strategy, as shown below

Whether a read spans the ligation junction depends on where the junction falls within the insert relative to the read length. In the first step, the R1 and R2 ends are each aligned to the genome independently. Reads that fail to align may be chimeric reads that contain the ligation junction, or they may simply be unmappable. In the second step, part of the sequence is trimmed from the 3' end so that the remaining part of a chimeric read can also be aligned to the genome. This two-step strategy makes the fullest use of the data.
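The second alignment step can be illustrated with a minimal Python sketch. It is not HiC-Pro's own code: the function names, the minimum length of 20 bases, and the ligation motif `AAGCTAGCTT` (what a HindIII library would typically produce) are assumptions made for illustration. The sketch only prepares the trimmed reads; the realignment itself would be done by an aligner.

```python
# Assumed ligation-junction motif for a HindIII library; other enzymes
# produce other motifs. This value is an illustrative assumption.
LIGATION_SITE = "AAGCTAGCTT"


def trim_at_ligation_site(read, site=LIGATION_SITE, min_len=20):
    """Return the part of the read 5' of the ligation junction, or None.

    None means the read contains no junction motif, or the part left
    after trimming the 3' side is too short to be worth realigning.
    """
    pos = read.find(site)
    if pos == -1:
        return None
    five_prime = read[:pos]  # drop the junction and everything 3' of it
    return five_prime if len(five_prime) >= min_len else None


def second_step_candidates(unaligned_reads):
    """Map read name -> trimmed sequence for reads worth realigning.

    unaligned_reads: dict of read name -> sequence for reads that failed
    the first, full-length alignment step.
    """
    candidates = {}
    for name, seq in unaligned_reads.items():
        trimmed = trim_at_ligation_site(seq)
        if trimmed is not None:
            candidates[name] = trimmed
    return candidates
```

Reads that contain no junction motif are left out, which matches the idea that they are probably unmappable rather than chimeric.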

2. Filter valid pairs

R1 and R2 are aligned separately, but they actually come from the same fragment. This step selects the fragments that can genuinely represent a chromatin interaction. Such a fragment must be chimeric, consisting of sequences from two interacting chromatin regions, as shown in the following figure.

Only read pairs that come from a chimeric fragment are defined as valid pairs and carried forward into subsequent analysis.
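As a rough illustration of this filter, the sketch below treats a pair as valid when its two ends fall on different restriction fragments. This is a simplification rather than HiC-Pro's actual implementation, which considers more than fragment membership (for example read orientation). The data layout and function names are hypothetical.

```python
from bisect import bisect_right


def fragment_index(cut_sites, pos):
    """Index of the restriction fragment that contains pos.

    cut_sites is a sorted list of cut positions on one chromosome.
    Fragment 0 lies before the first cut, fragment 1 between the first
    and second cut, and so on.
    """
    return bisect_right(cut_sites, pos)


def is_valid_pair(r1, r2, cut_sites_by_chrom):
    """Decide whether a read pair looks like a chimeric (valid) pair.

    r1 and r2 are (chrom, pos) tuples, or None for an unaligned end.
    """
    if r1 is None or r2 is None:
        return False  # both ends must be aligned
    (chrom1, pos1), (chrom2, pos2) = r1, r2
    if chrom1 != chrom2:
        return True  # ends on different chromosomes
    cuts = cut_sites_by_chrom[chrom1]
    # Same chromosome: the ends must come from two different fragments.
    return fragment_index(cuts, pos1) != fragment_index(cuts, pos2)
```

For example, with cut sites at 1000, 2000 and 3000 on chr1, a pair at positions 100 and 900 falls on a single fragment and is rejected, while a pair at 100 and 2500 is kept.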

3. Build the raw Hi-C map

At the specified resolution, count the number of valid pairs that link each pair of bins, remove PCR duplicates, and build the raw contact matrix.
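The binning step can be sketched in a few lines of Python. This is a minimal illustration, not HiC-Pro's code: the tuple layout of a valid pair is assumed, and a PCR duplicate is assumed to be any pair whose two ends have exactly the same coordinates as an earlier pair.

```python
from collections import Counter


def build_raw_contact_map(valid_pairs, bin_size):
    """Count valid pairs per bin pair after removing duplicates.

    valid_pairs: iterable of (chrom1, pos1, chrom2, pos2).
    Returns a Counter keyed by ((chrom, bin), (chrom, bin)), with the
    two bins always stored in sorted order (upper triangle only).
    """
    seen = set()
    counts = Counter()
    for chrom1, pos1, chrom2, pos2 in valid_pairs:
        # Order the two ends so that (A, B) and (B, A) are the same contact.
        end_a, end_b = sorted([(chrom1, pos1), (chrom2, pos2)])
        if (end_a, end_b) in seen:
            continue  # identical coordinates: treated as a PCR duplicate
        seen.add((end_a, end_b))
        bin_a = (end_a[0], end_a[1] // bin_size)
        bin_b = (end_b[0], end_b[1] // bin_size)
        counts[(bin_a, bin_b)] += 1
    return counts
```

Calling the function again with a different `bin_size` gives the contact map at another resolution from the same list of valid pairs.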

4. Normalization

Systematic biases such as GC content and mappability differ between regions, so the raw contact matrix cannot faithfully represent chromatin interactions and needs to be normalized. An iterative correction algorithm is used to normalize the raw contact matrix and remove these systematic biases.

HiC-Pro also provides a series of quality control metrics, as shown in the following figure

In a high-quality library, most reads align to the genome; as shown in figure A, the alignment rates of R1 and R2 are both very high. Aligned reads should be mainly uniquely mapped, as shown in the second panel of figure A; the proportions of multiple hits and low-quality alignments are also indicators of library quality.
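The idea behind iterative correction can be shown with a small pure-Python sketch. It is a simplified illustration under stated assumptions, not HiC-Pro's implementation: the matrix is a dense symmetric list of lists, the number of iterations is fixed, and empty bins are left untouched.

```python
def iterative_correction(matrix, n_iter=50):
    """Return a balanced copy of a symmetric contact matrix.

    Each round estimates a per-bin bias as the bin's row sum relative to
    the mean row sum, then divides every entry by the bias of its row
    bin and of its column bin. Repeating this drives all non-empty bins
    towards the same total coverage.
    """
    n = len(matrix)
    m = [[float(v) for v in row] for row in matrix]
    for _ in range(n_iter):
        row_sums = [sum(row) for row in m]
        nonzero = [s for s in row_sums if s > 0]
        if not nonzero:
            break  # nothing to balance
        mean = sum(nonzero) / len(nonzero)
        # Bins with no contacts get a bias of 1 so they stay unchanged.
        bias = [s / mean if s > 0 else 1.0 for s in row_sums]
        for i in range(n):
            for j in range(n):
                m[i][j] /= bias[i] * bias[j]
    return m
```

After enough rounds on a matrix with contacts between all bins, every row of the result sums to roughly the same value, which is what "corrected" means here: no bin looks more interactive just because it was easier to sequence or align.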

The proportion of valid pairs is the most direct indicator of library quality, and it should be above 50%.

Chromatin interactions are further divided into inter-chromosomal interactions, which correspond to trans contacts in figure B, and intra-chromosomal interactions, which correspond to cis contacts. Cis contacts are in turn divided into short-range and long-range according to a distance threshold.

First, in a high-quality library the proportion of intra-chromosomal (cis) interactions is above 40%. Second, because regions that are close together on the linear genome are more likely to join at random and introduce systematic bias, the proportion of long-range cis contacts in a high-quality library should also be above 40%.

All HiC-Pro parameters are set in a configuration file. The pipeline can be run end to end in one go, or split up so that individual steps are run separately, which makes it very flexible; this will be described in detail later.
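The contact-level metrics above can be computed with a short Python sketch. Several details are assumptions for illustration: the tuple layout of a valid pair, the 20 kb cutoff between short-range and long-range cis contacts, and the choice to express the cis and cis long-range proportions as fractions of all valid pairs.

```python
def contact_qc(valid_pairs, total_pairs, long_range_threshold=20_000):
    """Summarize library quality from a list of valid pairs.

    valid_pairs: list of (chrom1, pos1, chrom2, pos2).
    total_pairs: number of sequenced read pairs before filtering.
    long_range_threshold: assumed cutoff (in bp) between short-range
    and long-range cis contacts.
    """
    n_valid = len(valid_pairs)
    cis_distances = [abs(p2 - p1) for c1, p1, c2, p2 in valid_pairs if c1 == c2]
    n_cis = len(cis_distances)
    n_cis_long = sum(1 for d in cis_distances if d > long_range_threshold)

    def fraction(count, denominator):
        return count / denominator if denominator else 0.0

    return {
        "valid_fraction": fraction(n_valid, total_pairs),
        "cis_fraction": fraction(n_cis, n_valid),
        "trans_fraction": fraction(n_valid - n_cis, n_valid),
        "cis_long_fraction": fraction(n_cis_long, n_valid),
    }
```

The returned fractions can then be compared with the rules of thumb given above: more than 50% valid pairs, more than 40% cis contacts, and more than 40% long-range cis contacts.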
