
How to Detect Fusion Genes with FusionMap

2025-01-16 Update From: SLTechnology News&Howtos


Many newcomers are unsure how FusionMap detects fusion genes. This article explains the detection approach and how to read the key columns of the output.

Fusion genes are detected in two ways:

For reads that do not align to the genome (unmapped reads), the fusion gene is found by identifying fusion junction-spanning reads. These reads cover the junction of the fusion gene, and the sequences on either side of the junction align to the different genes that make up the fusion.

For reads that do align to the genome, the fusion gene is found by identifying inter-transcript read pairs. These reads do not cover the junction directly, but the R1 and R2 mates of the pair align to different genes.
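The two kinds of evidence above can be illustrated with a toy classifier. This is only a sketch under simplifying assumptions, not FusionMap's implementation: the function name `evidence_type` and the convention that `None` marks a read that failed to align as a whole are hypothetical.

```python
from typing import Optional


def evidence_type(r1_gene: Optional[str], r2_gene: Optional[str]) -> str:
    """Classify a read pair by the gene each mate aligns to.

    None means the mate did not align to the genome as a whole
    (a hypothetical convention used only for this sketch).
    """
    if r1_gene is None or r2_gene is None:
        # An unmapped read may cover a fusion junction, so it is a
        # candidate for the junction-spanning search.
        return "candidate junction-spanning read"
    if r1_gene != r2_gene:
        # Both mates align, but to different genes.
        return "inter-transcript read pair"
    return "no fusion evidence"


print(evidence_type("GENEA", "GENEB"))
print(evidence_type(None, "GENEB"))
print(evidence_type("GENEA", "GENEA"))
```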

[Figure: schematic of fusion junction-spanning reads and inter-transcript read pairs]

FusionMap assumes that a fusion gene consists of two genes. Fusion junction-spanning reads that fail to align to the genome are divided into two categories using a threshold on alignment length. If the read's aligned length in both genes is greater than the threshold, it is a seed read. If its aligned length in either gene is less than the threshold, it is a rescued read. The original article illustrates this with a figure.
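The seed/rescued rule can be written as a small function. This is a minimal sketch of the rule as described here, not FusionMap's code: the name `classify_junction_read` and the threshold value 25 are assumptions for illustration, and a length exactly equal to the threshold is treated as rescued because the text does not say how ties are handled.

```python
def classify_junction_read(len_in_gene1: int, len_in_gene2: int,
                           threshold: int = 25) -> str:
    """Label a junction-spanning read as 'seed' or 'rescued'.

    len_in_gene1 / len_in_gene2: aligned length of the read on each side
    of the junction. threshold: hypothetical minimum alignment length.
    """
    if len_in_gene1 <= 0 or len_in_gene2 <= 0:
        raise ValueError("a junction-spanning read must align to both genes")
    if len_in_gene1 > threshold and len_in_gene2 > threshold:
        return "seed"
    # Too short on at least one side (ties counted here by assumption).
    return "rescued"


print(classify_junction_read(40, 35))  # long enough on both sides
print(classify_junction_read(65, 10))  # only 10 bases in gene2
```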

FusionMap outputs the table shown below.

The key columns are explained as follows:

FusionID: the ID of the identified fusion gene. It is prefixed with FUS; the first number is the start position of the fusion and the second number is the end position. These are cumulative positions: all chromosomes are concatenated in alphabetical order into a single reference sequence, so every gene has one position on that sequence. As a result, the end position is always larger than the start position. The content in parentheses gives the strand orientation of the two genes that form the fusion.

Strand: the strand orientation of the two genes that form the fusion gene. There are four combinations: ++, +-, -+, and --.

Position1: the start position of the detected fusion (the junction position in gene1)

Chromosome1: the chromosome on which gene1 is located

Chromosome2: the chromosome on which gene2 is located

Position2: the end position of the detected fusion (the junction position in gene2)

KnownGene1: the symbol of gene1

KnownTranscriptStrand1: the strand of the transcripts of gene1. If there are multiple transcripts, multiple strands are listed.

KnownGene2: the symbol of gene2

KnownTranscriptStrand2: the strand of the transcripts of gene2. If there are multiple transcripts, multiple strands are listed.

FusionGene: the name of the fusion gene, written as gene1->gene2
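The cumulative positions used in FusionID can be reproduced with a short calculation. This is a sketch of the idea only: the chromosome names and lengths are made-up toy values, the helper names are hypothetical, and whether FusionMap counts positions from 0 or 1 is not stated in this article.

```python
from typing import Dict


def cumulative_offsets(chrom_lengths: Dict[str, int]) -> Dict[str, int]:
    """Offset of each chromosome when all are joined in alphabetical order."""
    offsets = {}
    running_total = 0
    for name in sorted(chrom_lengths):  # plain alphabetical (string) order
        offsets[name] = running_total
        running_total += chrom_lengths[name]
    return offsets


def to_cumulative(chrom: str, position: int, offsets: Dict[str, int]) -> int:
    """Convert a per-chromosome position into a cumulative position."""
    return offsets[chrom] + position


# Toy lengths, not real chromosome sizes.
toy_lengths = {"1": 1000, "10": 500, "2": 800}
offsets = cumulative_offsets(toy_lengths)
print(offsets)
print(to_cumulative("2", 100, offsets))
```

Because the order is alphabetical, "10" sorts before "2" in this toy example, which is why gene order along the concatenated reference need not follow numeric chromosome order.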

In addition, a few fields are harder to interpret.

1. The number of reads supporting the fusion gene

This is reported in the following three columns:

Accepted_hits.UniqueCuttingPositionCount

Accepted_hits.SeedCount

Accepted_hits.RescuedCount

SeedCount and RescuedCount are the numbers of seed reads and rescued reads described above, and their sum is the number of fusion junction-spanning reads. There are also inter-transcript reads. The total of these two kinds of reads is the number of reads supporting the fusion gene, and more supporting reads is better. However, library preparation introduces PCR duplicates, so to give a more reliable read count the redundancy must be removed so that duplicated reads are not counted more than once. The count after removing redundancy is UniqueCuttingPositionCount. The schematic diagram is as follows

The black line is the transcript produced by the real fusion gene, the gray fragments are sequences generated by randomly fragmenting the transcript, and the red mark is the breakpoint of the fusion gene. There are four reads in the picture, but the middle two are at the same position and may be PCR duplicates, so only three reads actually support the fusion gene. When counting reads, FusionMap only checks whether the end position in the second gene is the same. For the fusion gene in this example, the final UniqueCuttingPositionCount is 3. The higher this value, the more reliable the fusion gene.
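The de-duplication rule just described amounts to counting distinct end positions in the second gene. The sketch below assumes each supporting read is represented only by that end position; the function name and the coordinates are made up for illustration.

```python
from typing import Iterable


def unique_cutting_position_count(end_positions_in_gene2: Iterable[int]) -> int:
    """Count supporting reads, treating reads that end at the same
    position in the second gene as possible PCR duplicates."""
    return len(set(end_positions_in_gene2))


# Four reads as in the example; the middle two end at the same position.
toy_end_positions = [1180, 1200, 1200, 1235]
print(unique_cutting_position_count(toy_end_positions))  # 3
```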

2. Codon type

The fusion transcript is also translated. Compared with the codons of the two original genes, the reading frame of the fusion transcript may be shifted. The frameshift column of the result reports this, as illustrated in the diagram below.

These are four common codon types for fusion transcripts. In the FrameshiftClass column, these four common types are labeled In-Frame, and all other types are labeled Frame-Shift.
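One common follow-up is to keep only the In-Frame candidates. The sketch below assumes the report is tab-delimited and that the column is spelled `FrameshiftClass` as in this article; both are assumptions, so check the header of your own file. The gene names are placeholders.

```python
import csv
import io
from typing import Dict, List


def in_frame_fusions(table_text: str,
                     frame_column: str = "FrameshiftClass") -> List[Dict[str, str]]:
    """Return the rows whose frame class is exactly 'In-Frame'.

    table_text: contents of a tab-delimited report (assumed format).
    frame_column: column name as spelled in this article (assumed).
    """
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    return [row for row in reader if row[frame_column] == "In-Frame"]


# Toy table with placeholder gene names, not real FusionMap output.
toy_table = (
    "FusionGene\tFrameshiftClass\n"
    "GENEA->GENEB\tIn-Frame\n"
    "GENEC->GENED\tFrame-Shift\n"
)
for row in in_frame_fusions(toy_table):
    print(row["FusionGene"])
```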

3. Bases on both sides of the junction

It is generally believed that a fusion transcript is formed by joining exons of the two genes, and the sequences flanking exon boundaries are relatively conserved. Based on this, FusionMap defines SplicePattern, the pattern of the bases on either side of the junction. GT-AG is the most common splicing pattern and is classed as CanonicalPattern[Major]. GC-AG and AT-AC come next and are classed as CanonicalPattern[Minor]. All other patterns are uncommon and are classed as NonCanonicalPattern. The more common the splicing pattern at the breakpoint, the more likely it is that the detected fusion gene is real.
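The mapping from splice pattern to class can be sketched as a lookup. This is an illustration rather than FusionMap's code: it uses GT-AG as the major canonical pattern (the standard donor-acceptor dinucleotides), GC-AG and AT-AC as the minor ones, and the label spellings `CanonicalPattern[Major]`, `CanonicalPattern[Minor]`, and `NonCanonicalPattern` are assumed.

```python
MAJOR_PATTERNS = {"GT-AG"}
MINOR_PATTERNS = {"GC-AG", "AT-AC"}


def splice_pattern_class(pattern: str) -> str:
    """Classify a junction pattern such as 'GT-AG' (case-insensitive)."""
    normalized = pattern.strip().upper()
    if normalized in MAJOR_PATTERNS:
        return "CanonicalPattern[Major]"
    if normalized in MINOR_PATTERNS:
        return "CanonicalPattern[Minor]"
    return "NonCanonicalPattern"


for p in ("GT-AG", "gc-ag", "AT-AC", "CT-GG"):
    print(p, splice_pattern_class(p))
```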
