How to carry out CNV Analysis of WES

2025-03-26 Update From: SLTechnology News&Howtos

Many people who are new to the field are unsure how to carry out CNV analysis of WES data. This article summarizes the difficulties involved and the available approaches, and should help you get started.

Detecting copy number variation (CNV) from whole-genome sequencing (WGS) data is very effective, but WGS is still expensive. Whole-exome sequencing (WES) is already relatively mature for SNP detection. Because variants in exons are more likely to be pathogenic, researchers also hope to detect CNVs efficiently and economically by calling them on exons, and a lot of software has been developed for CNV analysis of WES data.

A CNV region may span multiple exons or genes, with its breakpoints located outside the exons, so the read-pair and split-read strategies used in whole-genome analysis cannot be applied to WES. CNV analysis of WES has to rely on the read-depth strategy.

However, unlike WGS, WES captures only the targeted exonic regions of the genome. Because of systematic biases such as GC content and capture efficiency, the relationship between sequencing depth and copy number is more complex and harder to model, so many CNV callers designed for WGS cannot be used on WES data.
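As an illustration of what correcting a GC-related depth bias can look like, here is a minimal sketch of a median-based correction. It is a generic approach, not the method of any tool named in this article; the function name gc_correct, the 10% bin width, and the input layout (one depth and one integer GC percentage per exon) are assumptions made for the example.

```python
from statistics import median

def gc_correct(depths, gc_percents, bin_width=10):
    """Scale each exon's depth by (overall median / median of its GC bin).

    depths: mean read depth per exon.
    gc_percents: integer GC percentage (0-100) per exon, same order.
    bin_width: hypothetical bin size in percentage points.
    """
    bins = {}
    for depth, gc in zip(depths, gc_percents):
        bins.setdefault(gc // bin_width, []).append(depth)
    overall = median(depths)
    bin_medians = {b: median(values) for b, values in bins.items()}
    corrected = []
    for depth, gc in zip(depths, gc_percents):
        bin_median = bin_medians[gc // bin_width]
        # Leave the depth unchanged if the whole bin has zero median depth.
        corrected.append(depth * overall / bin_median if bin_median > 0 else depth)
    return corrected

# Made-up example: high-GC exons are captured at about half the depth.
print(gc_correct([100, 120, 50, 60], [42, 45, 68, 61]))
```

After correction, exons from the low-depth GC bin are on the same scale as the others, so remaining depth differences are less likely to be a GC artifact.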

To reduce the impact of systematic errors and improve accuracy, many WES CNV tools require a control sample, which is compared with the test sample to identify the differences between them. Samples processed with the same protocol share the same systematic errors, so any remaining difference between the two is attributable to the samples themselves; that is the role of the control. The classic application of WES CNV detection is therefore calling somatic CNVs, also known as somatic copy number alterations (SCNAs), from paired tumor and adjacent normal samples.
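The paired read-depth idea can be sketched as a per-exon log2 ratio between the test and control samples. This is a simplified illustration, not the model of any specific tool; the function names, the pseudocount, and the 0.4 threshold are assumptions chosen for the example.

```python
import math

def log2_ratios(test_counts, control_counts, pseudocount=0.5):
    """Per-exon log2(test / control) after scaling by each sample's total reads."""
    test_total = sum(test_counts)
    control_total = sum(control_counts)
    ratios = []
    for t, c in zip(test_counts, control_counts):
        t_norm = (t + pseudocount) / test_total
        c_norm = (c + pseudocount) / control_total
        ratios.append(math.log2(t_norm / c_norm))
    return ratios

def classify(ratio, threshold=0.4):
    """Label an exon using a hypothetical symmetric log2-ratio threshold."""
    if ratio >= threshold:
        return "gain"
    if ratio <= -threshold:
        return "loss"
    return "neutral"

# Made-up read counts for five exons in a tumor and its matched normal.
tumor = [100, 100, 50, 100, 150]
normal = [100, 100, 100, 100, 100]
print([classify(r) for r in log2_ratios(tumor, normal)])
```

Because both samples go through the same capture protocol, exon-specific biases largely cancel in the ratio, which is the point of using a control.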

The following review describes several tools for exome CNV detection in detail.

https://academic.oup.com/bib/article/16/3/380/245577

According to whether they need control samples, the tools fall into the following three categories

Paired data: a matched control sample is required

Pooled data: no matched control sample is required

Paired and pooled data: either strategy works

1. Paired data

The tools are as follows

ExomeCNV

VarScan2

Control-FREEC

exome2cnv

PropSeg

2. Pooled data

The tools are as follows

CONDEX

exomeCopy

cn.MOPS

CoNIFER

ExomeDepth

XHMM

ExoCNVTest

EXCAVATOR

3. Paired and pooled data

The tools are as follows

CONTRA

ADTEx

FishingCNV

The review was published in 2014, and many new tools have been released since then, such as EXCAVATOR2. An article published in Nucleic Acids Research in 2016 presents EXCAVATOR2 and its performance in CNV analysis. The link is as follows

https://academic.oup.com/nar/article/44/20/e154/2607979

The tools use different algorithmic models, and each has its own advantages and disadvantages. An article published in 2014 evaluated several of them, with the following title

The article lists many tools for CNV analysis, as follows

In the end, the following four tools were selected for evaluation.

XHMM

CoNIFER

ExomeDepth

CONTRA

The tools were evaluated on the following aspects

1. CNV length and distribution

The CNV length distributions detected by the different tools differ. The summary statistics are as follows

CNV length ranges from dozens of bp to several Mb. It is generally believed that CNVs shorter than 300 bp and CNVs of around 6 kb are the most numerous. WES CNV callers are based on the read-depth algorithm and use a sliding window; the larger the window, the more reliable the CNVs finally identified, but the poorer the ability to detect small CNVs.

As the statistics show, CoNIFER did not identify any CNVs shorter than 1 kb, because it requires a CNV to cover at least three exons, whereas XHMM and ExomeDepth detect both small and large CNVs. CONTRA reports a very large number of CNVs because its read-depth correction is too sensitive, so it over-calls; it is therefore better suited to detecting small CNVs shorter than 1 kb.
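To see why a minimum-exon requirement removes small CNVs, consider the following sketch. It merges consecutive exons with the same non-neutral call into segments and keeps only segments spanning at least min_exons exons. This is an illustration of the effect only, not CoNIFER's actual implementation; the function name and the input format (an ordered list of per-exon calls on one chromosome) are assumptions.

```python
from itertools import groupby

def call_segments(states, min_exons=3):
    """Merge runs of identical non-neutral exon calls; keep runs of >= min_exons.

    states: per-exon calls ("gain", "loss" or "neutral") in genomic order.
    Returns (first_exon_index, last_exon_index, state) tuples.
    """
    kept = []
    index = 0
    for state, run in groupby(states):
        length = len(list(run))
        if state != "neutral" and length >= min_exons:
            kept.append((index, index + length - 1, state))
        index += length
    return kept

# Made-up per-exon calls.
calls = ["loss", "neutral", "gain", "gain", "gain", "neutral", "loss", "loss"]
print(call_segments(calls))               # only the three-exon gain survives
print(call_segments(calls, min_exons=1))  # single-exon calls are kept too
```

With min_exons=3 the one-exon and two-exon deletions are discarded, which mirrors the trade-off described above: fewer spurious small calls, but no sensitivity to short CNVs.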

The number and types of CNVs identified by each tool are shown below.

2. Consistency with WGS

Two tools, CNVnator and ERDS, were used to call CNVs from WGS data, and the consistency with the WES results was then evaluated at the exon level. An exon is counted as inside a CNV when more than 50% of its length falls within the CNV region, and the exons called from WES by each tool are compared with the exons called from WGS. The results are as follows.

Although the overlap rates are all low, ExomeDepth clearly has the highest, followed by XHMM.
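The exon-level comparison described above can be sketched as follows. The 50% rule comes from the text; everything else (function names, half-open (start, end) intervals on a single chromosome, and defining the overlap rate as the share of WES-called exons also called from WGS) is an assumption made for this example.

```python
def overlap_fraction(exon, cnv):
    """Fraction of the exon covered by the CNV; both are (start, end), half-open."""
    start = max(exon[0], cnv[0])
    end = min(exon[1], cnv[1])
    return max(0, end - start) / (exon[1] - exon[0])

def exons_in_cnvs(exons, cnvs, min_fraction=0.5):
    """Exons with more than min_fraction of their length inside a single CNV."""
    return {
        exon
        for exon in exons
        if any(overlap_fraction(exon, cnv) > min_fraction for cnv in cnvs)
    }

def overlap_rate(wes_exons, wgs_exons):
    """Share of WES-called exons that are also called from WGS."""
    if not wes_exons:
        return 0.0
    return len(wes_exons & wgs_exons) / len(wes_exons)

# Made-up coordinates on one chromosome.
exons = [(100, 200), (300, 400), (500, 600)]
wes = exons_in_cnvs(exons, [(90, 410)])
wgs = exons_in_cnvs(exons, [(340, 700)])
print(overlap_rate(wes, wgs))
```

Working at the exon level sidesteps the fact that WES and WGS callers place breakpoints very differently, since only exon membership is compared.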

3. Consistency with Common CNV

CNVs with a frequency above 5% in the 1000 Genomes Project are taken as common CNVs, and the same method is used to evaluate the consistency of each tool with them. The ranking matches the WGS comparison: ExomeDepth is the highest, followed by XHMM.

4. Mendelian Error Rate assessment

In general, de novo CNVs are very rare. Apparent de novo CNVs are therefore used as an indicator of the Mendelian error rate: CNV analysis is carried out on individuals and their parents at the same time, and the frequency of de novo CNVs is evaluated. The results are as follows.

For every tool, the proportion of CNVs that do not conform to Mendelian inheritance is very high; CoNIFER is the highest and CONTRA the lowest.
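A minimal sketch of this trio-based check, assuming each sample's calls are represented as a set of (exon_id, cnv_type) pairs and that a child call counts as inherited when the same pair appears in either parent. The function name and the data layout are hypothetical.

```python
def mendelian_error_rate(child, father, mother):
    """Fraction of the child's CNV calls that are absent from both parents."""
    if not child:
        return 0.0
    inherited = father | mother
    de_novo = [call for call in child if call not in inherited]
    return len(de_novo) / len(child)

# Made-up calls: (exon identifier, CNV type).
child = {("EX1", "del"), ("EX2", "dup"), ("EX3", "del"), ("EX4", "del")}
father = {("EX1", "del")}
mother = {("EX2", "dup"), ("EX3", "dup")}
print(mendelian_error_rate(child, father, mother))
```

Since true de novo CNVs are rare, a high value mostly reflects false positives in the child or false negatives in the parents rather than real new events.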

5. False positive detection of deletion CNV

For a deletion CNV, only one copy of the chromosomal region remains, so SNVs in the region must appear homozygous; a deletion region containing heterozygous SNVs is therefore treated as a false positive. To allow for errors in SNP genotyping, a deletion is defined as a false positive only when it meets both of the following conditions.

It contains more than 2 heterozygous SNPs

More than 20% of its SNP loci are heterozygous.
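The two conditions above translate directly into a small check. The thresholds are the ones stated in the text; the function name and the representation of genotypes as "het" or "hom" strings are assumptions for this sketch.

```python
def is_false_positive_deletion(genotypes, min_het=2, min_het_fraction=0.2):
    """Flag a deletion call as a false positive based on SNPs inside it.

    genotypes: one "het" or "hom" string per SNP locus in the deleted region.
    Both conditions must hold: more than min_het heterozygous SNPs, and more
    than min_het_fraction of all SNP loci heterozygous.
    """
    if not genotypes:
        return False
    het = sum(1 for g in genotypes if g == "het")
    return het > min_het and het / len(genotypes) > min_het_fraction

# Made-up genotypes for three deletion calls.
print(is_false_positive_deletion(["het"] * 3 + ["hom"] * 7))
print(is_false_positive_deletion(["het"] * 2 + ["hom"] * 2))
print(is_false_positive_deletion(["het"] * 3 + ["hom"] * 17))
```

Requiring both conditions keeps a handful of genotyping errors in a long deletion from marking a real call as a false positive.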

The false positive statistics for copy number deletions are as follows

6. Consistency between different software

Consistency between the tools is calculated at the exon level, and the results are as follows
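The text does not say which measure was used for the exon-level consistency. One plausible choice, shown here purely as an assumption, is the Jaccard index between the sets of exons called by each pair of tools. The exon identifiers and call sets below are made up and are not results from the evaluation.

```python
from itertools import combinations

def pairwise_jaccard(calls):
    """Jaccard index of called exons for every pair of tools.

    calls: dict mapping tool name to the set of exon identifiers it called.
    Returns a dict keyed by (tool_a, tool_b) with tool names in sorted order.
    """
    result = {}
    for a, b in combinations(sorted(calls), 2):
        union = calls[a] | calls[b]
        shared = calls[a] & calls[b]
        result[(a, b)] = len(shared) / len(union) if union else 0.0
    return result

# Made-up exon sets, for illustration only.
example = {"XHMM": {1, 2, 3}, "ExomeDepth": {2, 3, 4}, "CoNIFER": {3}}
for pair, score in pairwise_jaccard(example).items():
    print(pair, round(score, 2))
```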

Across the six indicators above, no single tool outperforms the others; each has its own strengths and weaknesses on different indicators.

In CNV detection from WES, it is difficult to balance sensitivity and specificity using the results of a single tool, and the best approach is to combine the results of multiple tools.
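One simple way to combine callers, sketched here as an assumption rather than a recommendation from the evaluation, is exon-level voting: keep an exon only if at least min_tools callers report it. The function name, the vote threshold, and the example sets are hypothetical.

```python
from collections import Counter

def consensus_exons(calls, min_tools=2):
    """Exons reported by at least min_tools of the callers.

    calls: dict mapping tool name to the set of exon identifiers it called.
    """
    votes = Counter(exon for exons in calls.values() for exon in exons)
    return {exon for exon, n in votes.items() if n >= min_tools}

# Made-up exon sets, for illustration only.
example = {"XHMM": {1, 2, 3}, "ExomeDepth": {2, 3, 4}, "CoNIFER": {3}}
print(sorted(consensus_exons(example)))
print(sorted(consensus_exons(example, min_tools=3)))
```

Raising min_tools trades sensitivity for specificity, so the threshold is the knob for the balance the article describes.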
