How to use Site Models in CODEML for positive selection gene analysis

Updated 2025-01-28. From: SLTechnology News&Howtos (shulou)

In this article the editor shares how to use Site Models in CODEML to analyze genes for positive selection. I hope you find it useful.

Introduction to Site Models

Site Models are a family of models for detecting positive selection in the CODEML program of the PAML package. Their central assumption is that ω differs among sites of the same sequence, where ω = dN/dS is the ratio of the nonsynonymous to the synonymous substitution rate. The ω ratio measures selective pressure: ω < 1, ω = 1 and ω > 1 indicate negative (purifying) selection, neutral evolution and positive selection, respectively. However, the ω ratio averaged over all sites and all lineages is almost never greater than 1, because positive selection is unlikely to act on every site over a long period of time. What we really want to detect, therefore, is positive selection acting on particular lineages or particular sites.
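The interpretation of ω described above can be written as a small helper. This is only an illustrative sketch: the function name and the tolerance used to decide that ω equals 1 are assumptions, and real analyses rely on a statistical test rather than on a point estimate.

```python
def classify_omega(omega, tol=1e-9):
    """Label the selection regime implied by omega = dN/dS."""
    if omega < 0:
        raise ValueError("omega must be non-negative")
    if abs(omega - 1.0) <= tol:
        return "neutral evolution"
    if omega > 1.0:
        return "positive selection"
    return "negative (purifying) selection"


# Hypothetical omega values, one per regime
for w in (0.2, 1.0, 2.5):
    print(w, classify_omega(w))
```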

To run a Site Models analysis, set model = 0 in the control file. The individual site models are selected with the NSsites variable, and several of them can be run in one go by listing their numbers, for example NSsites = 0 1 2 3 7 8.
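As a rough sketch, the two settings can be generated from a list of model numbers. The helper name `site_model_lines` is hypothetical, and the check only accepts the six models discussed in this article.

```python
# Site models covered in this article (NSsites number -> label)
SITE_MODELS = {0: "M0", 1: "M1", 2: "M2", 3: "M3", 7: "M7", 8: "M8"}


def site_model_lines(nssites):
    """Return the two control-file lines that select the site models."""
    unknown = [n for n in nssites if n not in SITE_MODELS]
    if unknown:
        raise ValueError(f"models not covered in this article: {unknown}")
    numbers = " ".join(str(n) for n in nssites)
    return ["model = 0", "NSsites = " + numbers]


print("\n".join(site_model_lines([0, 1, 2, 3, 7, 8])))
```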

What do the different Site Models mean?

M0 is the one-ratio model; it assumes that ω is the same for all sites.

M1 (nearly neutral) assumes two classes of sites: one with ω between 0 and 1 and one with ω = 1. In the original formulation of this model the first class was fixed at ω = 0.

M2 (positive selection) adds a third class of sites to M1, whose ω is estimated from the data and may be greater than 1.

M3 (discrete) assumes that the ω values of the sites follow a general discrete distribution.

M7 (beta) assumes that the ω of every site lies in the interval (0, 1) and follows a beta distribution.

M8 (beta&ω) adds one more class of sites to M7, whose ω is estimated from the data and may be greater than 1.
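To see how site classes combine, the sketch below averages ω over three classes in the style of M2. The proportions and ω values are made up for illustration and are not CODEML output. It also shows why the average over all sites can stay well below 1 even when a small class of sites has ω > 1.

```python
def mean_omega(proportions, omegas):
    """Average omega over site classes, weighted by class proportion."""
    if len(proportions) != len(omegas):
        raise ValueError("need one omega per site class")
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("class proportions must sum to 1")
    return sum(p * w for p, w in zip(proportions, omegas))


# Hypothetical M2-style classes: constrained, neutral, positively selected
p = [0.80, 0.15, 0.05]
w = [0.05, 1.00, 4.00]
print(round(mean_omega(p, w), 3))  # about 0.39
```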

What can be learned by comparing different models?

In Site Models, M0 assumes one ω ratio for all sites and M3 assumes that the ω values of the sites follow a discrete distribution. Comparing these two models does not test for positive selection; it tests whether ω varies among sites.

The M1-M2 and M7-M8 comparisons are used to detect positive selection. The PAML authors recommend using these two pairs in likelihood ratio tests (LRT) to test for positive selection. However, Prof. Yang considers the M1-M2 comparison more robust than the M7-M8 comparison. M7 and M8 also take longer to compute, so if many genes have to be analyzed, the M7-M8 comparison can be skipped.
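The comparisons described above can be kept in a small lookup table, sketched here. The degrees of freedom are the conventional ones (2 for M1-M2 and M7-M8, and 4 for M0-M3 assuming M3 is run with three site classes); treat them as assumptions to verify against your own runs. The function name is hypothetical.

```python
# (null model, alternative model, df, what the comparison tests)
LRT_PAIRS = [
    (0, 3, 4, "variation of omega among sites"),
    (1, 2, 2, "positive selection"),
    (7, 8, 2, "positive selection"),
]


def available_comparisons(nssites):
    """Return the LRT pairs for which both models were actually run."""
    chosen = set(nssites)
    return [pair for pair in LRT_PAIRS
            if pair[0] in chosen and pair[1] in chosen]


# Skipping M7 and M8 leaves two comparisons
for null, alt, df, purpose in available_comparisons([0, 1, 2, 3]):
    print(f"M{null} vs M{alt}: df = {df}, tests {purpose}")
```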

How to detect positively selected sites?

Running CODEML: set the options in the control file, then run the CODEML program. Three files are needed to run CODEML: a sequence file (PHYLIP format), a tree file and a control file. Example of a control file:

      seqfile = Fungi.fasta   * sequence data file name
     treefile = Fungi.tree    * tree structure file name
      outfile = mlc           * main result file name

        noisy = 3   * 0,1,2,3,9: how much rubbish on the screen
      verbose = 0   * 1: detailed output, 0: concise output
      runmode = 0   * 0: user tree; 1: semi-automatic; 2: automatic
                    * 3: StepwiseAddition; -2: pairwise

      seqtype = 1   * 1: codons
                    * CodonFreq options: 1:F1X4, 2:F3X4, 3:codon table
        clock = 0   * 0: no clock, unrooted tree; 1: clock, rooted tree
       aaDist = 0   * 0:equal, +:geometric; -:linear, 1-5:G1974,Miyata,c,p,v

        model = 0
      NSsites = 0 3 1 2 7 8
                    * 0:one w; 1:NearlyNeutral; 2:PositiveSelection; 3:discrete;
                    * 4:freqs; 5:gamma; 6:2gamma; 7:beta; 8:beta&w; 9:beta&gamma;
                    * 10:beta&gamma+1; 11:beta&normal>1; 12:0&2normal>1; 13:3normal>0
        icode = 0   * 0:universal code; 1:mammalian mt; 2-10:see below
        Mgene = 0   * 0:rates, 1:separate; 2:pi, 3:kappa, 4:all

    fix_kappa = 0   * 1: kappa fixed, 0: kappa to be estimated
        kappa = .3  * initial or fixed kappa
    fix_omega = 0   * 1: omega or omega_1 fixed, 0: estimate
        omega = 1.3 * initial or fixed omega, for codons or codon-based AAs
        ncatG = 10  * # of categories in the dG or AdG models of rates

        getSE = 0   * 0: don't want them, 1: want S.E.s of estimates
 RateAncestor = 0   * (0,1,2): rates (alpha>0) or ancestral states (1 or 2)
   Small_Diff = .45e-6
    cleandata = 1   * remove sites with ambiguity data (1:yes, 0:no)?
  fix_blength = 0   * 0: ignore, -1: random, 1: initial, 2: fixed, 3: proportional
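Before launching a run it can help to check the control file programmatically. The following is a minimal sketch of a reader for the "key = value * comment" layout shown above. `parse_ctl` is a hypothetical helper, not part of PAML, and it assumes that values never contain a "*" character.

```python
def parse_ctl(text):
    """Parse 'key = value * comment' lines into a dict of strings."""
    options = {}
    for line in text.splitlines():
        line = line.split("*", 1)[0].strip()  # drop the trailing comment
        if "=" not in line:
            continue  # blank line or comment-only continuation line
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


example = """
seqfile = Fungi.fasta * sequence data file name
model = 0
NSsites = 0 3 1 2 7 8
            * 0:one w; 1:NearlyNeutral; 2:PositiveSelection
"""
opts = parse_ctl(example)
nssites = [int(n) for n in opts["NSsites"].split()]
print(opts["seqfile"], opts["model"], nssites)
```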

Likelihood ratio test (LRT): this compares the fit of two nested models, and its significance can be computed with the chi2 program that ships with PAML. First take the difference between the log-likelihoods (lnL) of the two models and multiply its absolute value by 2, that is, 2ΔlnL = 2 × |lnL1 − lnL2|. Then use chi2 to calculate the P value. The command is: chi2 2 2.03 (the first 2 is the degrees of freedom, df, and 2.03 is 2ΔlnL; the M1-M2 and M7-M8 comparisons use df = 2).

The output value prob is the P value of the test.
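The same calculation can be sketched in Python without the chi2 program. The snippet pulls lnL values out of result text and converts 2ΔlnL into a P value with the closed-form chi-square tail that holds for even degrees of freedom. The sample text, the model header lines and the lnL numbers are made up for illustration, and the assumed layout of the "lnL(...)" lines should be checked against your own output file.

```python
import math
import re

# Assumed layout of the log-likelihood lines in the main result file
LNL_RE = re.compile(r"lnL\(.*?\):\s*(-?\d+\.\d+)")


def extract_lnl(text):
    """Return every lnL value found in the text, in order of appearance."""
    return [float(m) for m in LNL_RE.findall(text)]


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term = total = 1.0
    for k in range(1, df // 2):
        term *= half / k  # (x/2)^k / k!
        total += term
    return math.exp(-half) * total


def lrt(lnl_null, lnl_alt, df):
    """Return (2*delta lnL, P value) for two nested models."""
    stat = 2.0 * abs(lnl_alt - lnl_null)
    return stat, chi2_sf_even_df(stat, df)


# Hypothetical excerpt with made-up numbers
sample = """
Model 1: NearlyNeutral
lnL(ntime: 11  np: 13):  -1000.000000      +0.000000
Model 2: PositiveSelection
lnL(ntime: 11  np: 15):   -998.985000      +0.000000
"""
lnls = extract_lnl(sample)
stat, p = lrt(lnls[0], lnls[1], df=2)
print(f"2*delta lnL = {stat:.2f}, P = {p:.4f}")
```

With these made-up values 2ΔlnL is 2.03 and the P value is about 0.36, so the null model would not be rejected at the 5% level.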
