2025-02-22 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article shows how to use MutationalPatterns for tumor mutation spectrum analysis.
MutationalPatterns is an R package on Bioconductor for analyzing tumor mutation spectra. The mutation spectrum is defined for point mutations. Each of the four bases A, T, C and G can mutate to any of the other three, which gives 4 x 3 = 12 possible substitutions. Because of base pairing between the two strands, an A > C mutation on the forward strand is a T > G mutation on the reverse strand. Complementary substitutions are therefore counted together, and the mutation at a given site falls into one of the following six classes.
C > A, which covers C > A and G > T
C > G, which covers C > G and G > C
C > T, which covers C > T and G > A
T > A, which covers T > A and A > T
T > C, which covers T > C and A > G
T > G, which covers T > G and A > C
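The collapse from 12 substitutions to 6 classes can be sketched in base R. This is only an illustration of the strand-symmetry argument above, not code from MutationalPatterns; the helper name `collapse_substitution` is hypothetical.

```r
# Complement of each base on the opposite strand
complement <- c(A = "T", C = "G", G = "C", T = "A")

# Hypothetical helper: return the class label (e.g. "C>A") for a ref/alt pair.
collapse_substitution <- function(ref, alt) {
  stopifnot(ref %in% names(complement), alt %in% names(complement), ref != alt)
  if (ref %in% c("A", "G")) {
    # Purine reference: describe the change as seen on the opposite strand
    ref <- complement[[ref]]
    alt <- complement[[alt]]
  }
  paste0(ref, ">", alt)
}

# Enumerate all 4 x 3 = 12 substitutions and assign each one to a class
bases <- names(complement)
pairs <- expand.grid(ref = bases, alt = bases, stringsAsFactors = FALSE)
pairs <- pairs[pairs$ref != pairs$alt, ]
pairs$class <- mapply(collapse_substitution, pairs$ref, pairs$alt)

# Each of the six classes is hit by exactly two of the twelve substitutions
table(pairs$class)
```

For example, `collapse_substitution("G", "T")` returns `"C>A"`, matching the first entry of the list.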
Next, consider the sequence context of the mutated site: take one base upstream and one base downstream, together with the mutated base, to form a 3-base motif. This gives 4 x 6 x 4 = 96 categories, and the frequency distribution over these categories is the mutation spectrum. The mutation spectrum can serve as a feature of a tumor sample and be compared between samples. With the MutationalPatterns package, the mutation spectrum can be extracted from the VCF file of each sample. First, read the files. The code is as follows
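# load R package
> library(MutationalPatterns)
# list the paths to the VCF files ("path/to/vcf_dir" is a placeholder directory)
> vcf_files <- list.files("path/to/vcf_dir", pattern = "\\.vcf$", full.names = TRUE)
> sample_names <- sub("\\.vcf$", "", basename(vcf_files))  # placeholder: names taken from the file names
> library(BSgenome.Hsapiens.UCSC.hg19)
> ref_genome <- "BSgenome.Hsapiens.UCSC.hg19"
> vcfs <- read_vcfs_as_granges(vcf_files, sample_names, ref_genome)
> type_occurrences <- mut_type_occurrences(vcfs, ref_genome)
> plot_spectrum(type_occurrences)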
(Figure: output of plot_spectrum, the relative contribution of the six substitution types.)
Typical use cases of the package are as follows
1. Calculate the mutation spectrum of each sample
From the VCF files, the frequency of each of the 96 motifs in each sample is calculated and visualized. The code is as follows
> mut_mat <- mut_matrix(vcf_list = vcfs, ref_genome = ref_genome)
> plot_96_profile(mut_mat[, c(1, 2)], condensed = TRUE)
(Figure: output of plot_96_profile, the 96-category profiles of the first two samples.)
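To make the 96 categories concrete, the following base R sketch enumerates them and counts a few made-up mutations. The `up[class]down` label format and the toy data are illustrative assumptions, and the sketch does not call MutationalPatterns.

```r
# 6 substitution classes x 4 upstream bases x 4 downstream bases = 96 categories
bases   <- c("A", "C", "G", "T")
classes <- c("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")
grid <- expand.grid(down = bases, up = bases, class = classes,
                    stringsAsFactors = FALSE)
labels_96 <- paste0(grid$up, "[", grid$class, "]", grid$down)
length(labels_96)

# Toy mutations (made-up data): pyrimidine-centred trinucleotide context
# plus the alternate base observed at the middle position
toy <- data.frame(context = c("ACA", "ACA", "TCG", "ATG"),
                  alt     = c("A",   "A",   "T",   "C"),
                  stringsAsFactors = FALSE)
toy_labels <- paste0(substr(toy$context, 1, 1), "[",
                     substr(toy$context, 2, 2), ">", toy$alt, "]",
                     substr(toy$context, 3, 3))

# Count every category, including the ones with zero mutations
counts <- table(factor(toy_labels, levels = labels_96))
counts[counts > 0]
```

Counting against a fixed set of 96 levels keeps the vectors of different samples aligned, which is what allows them to be stacked into one matrix and compared.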
2. Compare the mutation spectra of two samples
The code is as follows
> plot_compare_profiles(mut_mat[, 1], mut_mat[, 2], condensed = TRUE)
(Figure: output of plot_compare_profiles for the first two samples.)
The upper left corner of the plot shows the cosine similarity between the two spectra. The first two panels are the two spectra being compared, and the third panel is their difference, obtained by subtracting the frequencies category by category.
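The two quantities described here, cosine similarity and the per-category difference, are simple to compute by hand. The following base R sketch uses made-up counts over four categories instead of 96; `cosine_similarity` is a hypothetical helper, not a package function.

```r
# Cosine similarity: dot product divided by the product of the vector lengths
cosine_similarity <- function(x, y) {
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

# Made-up mutation counts for three samples
sample_a <- c(10, 5, 0, 5)
sample_b <- c(20, 10, 0, 10)  # same shape as sample_a, twice as many mutations
sample_c <- c(0, 0, 8, 0)     # mutations only where sample_a has none

# Convert counts to relative frequencies before comparing
freq_a <- sample_a / sum(sample_a)
freq_b <- sample_b / sum(sample_b)
freq_c <- sample_c / sum(sample_c)

cosine_similarity(freq_a, freq_b)  # same shape: 1, up to rounding
cosine_similarity(freq_a, freq_c)  # no shared categories: 0
freq_a - freq_c                    # per-category difference
```

Cosine similarity depends only on the shape of the profile, so samples with very different total mutation counts can still be compared directly.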
3. Extract mutational signatures with NMF
The non-negative matrix factorization (NMF) algorithm extracts features from the original mutation spectra. These features are called mutational signatures. The code is as follows.
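> library(NMF)
# rank values and signature names below are illustrative assumptions
> estimate <- nmf(mut_mat, rank = 2:5, method = "brunet", nrun = 10)
> nmf_res <- extract_signatures(mut_mat, rank = 2, nrun = 10)
> colnames(nmf_res$signatures) <- c("Signature A", "Signature B")
> rownames(nmf_res$contribution) <- c("Signature A", "Signature B")
> plot_96_profile(nmf_res$signatures, condensed = TRUE)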
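To show what the factorization does, here is a minimal base R sketch of NMF using Lee-Seung multiplicative updates for the squared-error objective. This is not the algorithm used by the NMF package (its "brunet" method optimizes a different objective). The function name `nmf_sketch`, the iteration count and the toy data are all assumptions made for illustration.

```r
# Factor a non-negative matrix V (categories x samples) into
# W (categories x signatures) and H (signatures x samples), so that V ~ W %*% H.
nmf_sketch <- function(V, rank, n_iter = 500, seed = 1) {
  set.seed(seed)
  eps <- 1e-9  # guards against division by zero
  W <- matrix(runif(nrow(V) * rank), nrow(V), rank)
  H <- matrix(runif(rank * ncol(V)), rank, ncol(V))
  for (i in seq_len(n_iter)) {
    # Multiplicative updates keep every entry non-negative
    H <- H * (t(W) %*% V) / (t(W) %*% W %*% H + eps)
    W <- W * (V %*% t(H)) / (W %*% H %*% t(H) + eps)
  }
  list(signatures = W, contribution = H)
}

# Toy data: 6 categories, 5 samples, built from two made-up signatures
true_sig <- cbind(c(8, 1, 1, 0, 0, 0), c(0, 0, 1, 1, 8, 0))
true_exp <- rbind(c(10, 0, 5, 2, 8), c(0, 10, 5, 8, 2))
V <- true_sig %*% true_exp

res <- nmf_sketch(V, rank = 2)

# Reconstruction error (Frobenius norm) of the fitted factorization
sqrt(sum((V - res$signatures %*% res$contribution)^2))
```

The two returned matrices play the same roles as `nmf_res$signatures` and `nmf_res$contribution`: one holds the signature profiles, the other how much each signature contributes to each sample.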
4. Mutational signature contribution
The mutation spectrum of each sample is a combination of different mutational signatures. The contribution of each signature to each sample is visualized with the following code
> plot_contribution(nmf_res$contribution, nmf_res$signatures, mode = "relative")
(Figure: output of plot_contribution, the relative contribution of each signature per sample.)
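A relative contribution can be read as each signature's share of a sample's total. The base R sketch below shows that normalization on a made-up contribution matrix; it illustrates the idea only and is not the plotting code of the package.

```r
# Made-up absolute contributions: rows are signatures, columns are samples
contribution <- matrix(c(30, 10, 5, 15), nrow = 2,
                       dimnames = list(c("Signature A", "Signature B"),
                                       c("sample1", "sample2")))

# Divide each column by its total so that every sample sums to 1
relative <- sweep(contribution, 2, colSums(contribution), "/")
relative
```

Here sample1 is 75% Signature A and 25% Signature B, while sample2 is the reverse.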
5. Compare the similarity between multiple mutation spectra or mutational signatures
Calculate the pairwise cosine similarity between the spectra and show the result as a heat map. The code is as follows.
> cos_sim_samples_signatures = cos_sim_matrix(mut_mat, mut_mat)
> plot_cosine_heatmap(cos_sim_samples_signatures)
(Figure: output of plot_cosine_heatmap, a heat map of pairwise cosine similarities.)
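The pairwise matrix behind such a heat map can be sketched in base R as follows. `cosine_matrix` is a hypothetical helper and the three toy profiles are made up; the sketch only shows the kind of calculation involved when every column is compared with every other column.

```r
# Pairwise cosine similarity between the columns of a matrix
cosine_matrix <- function(m) {
  norms <- sqrt(colSums(m^2))         # length of each column vector
  crossprod(m) / outer(norms, norms)  # all dot products, scaled by the lengths
}

# Made-up profiles: s2 has the same shape as s1, s3 shares no category with them
toy <- cbind(s1 = c(1, 0, 2), s2 = c(2, 0, 4), s3 = c(0, 3, 0))
round(cosine_matrix(toy), 3)
```

The result is symmetric with ones on the diagonal, since every profile is identical to itself.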
With this R package, the common analyses of mutation spectra can be carried out easily.