How to understand Anosim Analysis in R language

2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

Many newcomers are unsure how to interpret Anosim analysis in R. This article explains what the test does and how to read its results.

Whether samples come from the field or from laboratory experiments, we usually set up replicate (parallel) samples to improve the reliability of the analysis, and use a block design where necessary, so data analysis has to compare and test the differences between groups. For microbial community data, however, there are many species and each is sensitive to different environmental factors, so parametric tests based on the normal distribution rarely meet the needs of the analysis. Instead, non-parametric multivariate statistical tests are used to calculate significance. The R package vegan provides several such methods, including Anosim, Adonis, and MRPP. The methods differ in their choice of test statistic and null model.
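As a rough illustration of how the three tests named above are called, the sketch below runs each of them on the same grouping. It uses the `dune` and `dune.env` example data shipped with vegan, which is an assumption made for illustration and not the data used in this article.

```r
library(vegan)

# Example data shipped with vegan (illustrative only, not the article's data)
data(dune)      # community table: 20 sites x 30 species
data(dune.env)  # site descriptors, including the factor Management

set.seed(1)  # permutation tests are random; fix the seed for repeatability

# Three permutation tests of the same grouping, all on Bray-Curtis distances
fit_anosim <- anosim(dune, dune.env$Management, permutations = 999, distance = "bray")
fit_adonis <- adonis2(dune ~ Management, data = dune.env, permutations = 999, method = "bray")
fit_mrpp   <- mrpp(dune, dune.env$Management, permutations = 999, distance = "bray")

# Each test reports a different statistic
fit_anosim$statistic  # ANOSIM R, based on ranks of distances
fit_adonis$F[1]       # pseudo-F, based on partitioning sums of squares
fit_mrpp$delta        # weighted mean within-group distance
```

The statistics are not comparable with each other; only the p-values answer the same broad question of whether the grouping matters.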

Anosim (Analysis of Similarities) is a non-parametric test based on permutations and on the ranks of pairwise distances. It tests whether the differences between groups are significantly greater than the differences within groups, and so whether the grouping is meaningful. Anosim works on distances; when given a community table, anosim() computes Bray-Curtis distances by default (distance = "bray"). You can choose any other distance supported by the vegdist() function, or pass a distance matrix directly. In R, the analysis is done with the anosim() function in the vegan package. Here we analyze microbial community data as an example:

    # read the extracted OTU table and the environmental factor information
    data = read.csv("otu_table.csv", header = TRUE, row.names = 1)
    envir = read.table("environment.txt", header = TRUE)
    rownames(envir) = envir[, 1]
    env = envir[, -1]

    # keep high-abundance OTUs (mean abundance > 10) and transpose so that samples are rows
    means = apply(data, 1, mean)
    otu = data[names(means[means > 10]), ]
    otu = t(otu)

    # clustering based on geographical distance
    kms = kmeans(env, centers = 3, nstart = 22)
    Position = factor(kms$cluster)

    # Anosim analysis
    library(vegan)
    anosim = anosim(otu, Position, permutations = 999)
    summary(anosim)
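The two ways of supplying distances mentioned above can be sketched as follows. The example again assumes vegan's bundled `dune` data, not this article's OTU table.

```r
library(vegan)

data(dune)
data(dune.env)
grouping <- dune.env$Management

set.seed(1)

# (a) Community table in, anosim() computes Bray-Curtis itself (the default)
fit_default <- anosim(dune, grouping, permutations = 999)

# (b) Precomputed distance matrix in; the `distance` argument is then not needed
d_bray  <- vegdist(dune, method = "bray")
fit_pre <- anosim(d_bray, grouping, permutations = 999)

# The R statistic does not depend on the permutations, so (a) and (b) agree
all.equal(fit_default$statistic, fit_pre$statistic)

# (c) A different distance, chosen by name
fit_euc <- anosim(dune, grouping, permutations = 999, distance = "euclidean")
fit_euc$statistic
```

Passing a precomputed matrix is convenient when the same distances are reused for ordination or for other tests.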

In the summary output, "ANOSIM statistic R" is the test statistic of the Anosim test, whose significance is judged against the distribution of R under the null model, and "Upper quantiles of permutations" gives the quantiles of the statistic obtained from the 999 permutations. The principle of Anosim is as follows: first compute the distance between every pair of samples, then sort all pairwise distances from smallest to largest and assign ranks (r). Each distance is classified as a between-group or a within-group distance, and the difference between the mean rank of the between-group distances, $\bar{r}_B$, and the mean rank of the within-group distances, $\bar{r}_W$, divided by a scaling constant, is taken as the statistic:

$$R = \frac{\bar{r}_B - \bar{r}_W}{N(N-1)/4}$$

where N is the number of samples.
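The ranking procedure described above can be traced by hand in base R. The sketch below uses made-up coordinates for six samples and Euclidean distance purely for illustration, and it divides the rank difference by N(N-1)/4, the usual ANOSIM scaling; the helper `anosim_r` is a hypothetical name, not a vegan function.

```r
# Toy data: 6 samples in 2 groups (made-up numbers, for illustration only)
x <- matrix(c(1.0, 2.0,
              2.0, 1.0,
              1.5, 1.5,
              8.0, 9.0,
              9.0, 8.0,
              8.5, 9.5), ncol = 2, byrow = TRUE)
group <- factor(rep(c("A", "B"), each = 3))
n <- nrow(x)

d <- dist(x)   # all pairwise distances (Euclidean here)
r <- rank(d)   # rank 1 = smallest distance

# Row and column index of each pair, in the same order that dist() stores them
idx <- which(lower.tri(matrix(0, n, n)), arr.ind = TRUE)

# TRUE where both samples of a pair belong to the same group
within <- group[idx[, "row"]] == group[idx[, "col"]]

# Hypothetical helper: scaled difference of mean between and within ranks
anosim_r <- function(r, within, n) {
  (mean(r[!within]) - mean(r[within])) / (n * (n - 1) / 4)
}

R_obs <- anosim_r(r, within, n)
R_obs  # 1: every within-group distance is smaller than every between-group distance
```

Here the six within-group distances take ranks 1 to 6 (mean 3.5) and the nine between-group distances take ranks 7 to 15 (mean 11), so R = (11 - 3.5) / 7.5 = 1, the value for perfectly separated groups.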

If R > 0, the distances within groups are smaller than the distances between groups, that is, the grouping is effective. This is similar in principle to comparing within-group variance with between-group variance in analysis of variance. In the results above, R is 0.4613, which is greater than the 99% quantile of the null model (0.290). The p-value is 0.001, meaning that none of the 999 permuted values reached the observed R, so the result is significant. The individual components of the analysis can be extracted from the result object; the ranks of the distances are stored in anosim$dis.rank.

Since there are 22 samples, there are C(22, 2) = 231 pairwise distances. The class each distance belongs to (between groups, or within a particular group) is stored in anosim$class.vec.
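The components discussed above can be pulled out of a fitted object as sketched below. The example assumes vegan's bundled `dune` data (20 samples, so 190 distances rather than 231), and the p-value formula mirrors what I understand vegan to compute, so treat it as an approximation of the internals.

```r
library(vegan)

data(dune)
data(dune.env)

set.seed(1)
fit <- anosim(dune, dune.env$Management, permutations = 999)

# Null distribution: one permuted R per permutation
length(fit$perm)
quantile(fit$perm, c(0.90, 0.95, 0.975, 0.99))

# p-value: share of permuted R values at least as large as the observed R
eps <- sqrt(.Machine$double.eps)  # small tolerance for floating-point ties
p_manual <- (sum(fit$perm >= fit$statistic - eps) + 1) / (length(fit$perm) + 1)
all.equal(p_manual, fit$signif)

# One rank and one class label per pairwise distance
n <- nrow(dune)
length(fit$dis.rank) == choose(n, 2)
table(fit$class.vec)  # "Between" plus one level per group
```

With 999 permutations the smallest attainable p-value is 1/1000 = 0.001, which is why a strongly significant result is reported as exactly 0.001.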

Now let's visualize the ranks by class:

    mycol = c(52, 619, 453, 71)  # the remaining colour indices are garbled in the source; four cover "Between" plus three groups
    mycol = colors()[mycol]
    par(mar = c(5, 4, 4, 2) + 0.1)  # margin values are garbled in the source; these are R's defaults
    result = paste("R =", anosim$statistic, "p =", anosim$signif)
    boxplot(anosim$dis.rank ~ anosim$class.vec, pch = "+", col = mycol, range = 1, boxwex = 0.5, notch = TRUE, ylab = "Bray-Curtis Rank", main = "Bray-Curtis Anosim", sub = result)

With notch=TRUE, a notch is drawn on each side of every box to show a confidence interval for the median, which makes the medians easier to compare. The plot shows that the second group is separated less well than the others, but on the whole the grouping is effective.
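The notch limits can also be inspected numerically. The sketch below uses base R's boxplot.stats() on simulated values (an assumption for illustration) and checks them against the formula given in its documentation: the median plus or minus 1.58 times the hinge spread divided by sqrt(n).

```r
set.seed(1)
x <- rnorm(50)  # simulated values, for illustration only

s <- boxplot.stats(x)

# s$stats holds: lower whisker, lower hinge, median, upper hinge, upper whisker
hinge_spread <- s$stats[4] - s$stats[2]
notch_manual <- s$stats[3] + c(-1.58, 1.58) * hinge_spread / sqrt(s$n)

s$conf                             # notch limits as used by boxplot(notch = TRUE)
all.equal(s$conf, notch_manual)
```

If the notches of two boxes do not overlap, this is commonly read as rough evidence that the two medians differ.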

