The Principles of CCA and Its Application in Python

2025-04-21 Update. From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/02 Report

This article explains how canonical correlation analysis (CCA) works and how to apply it in Python. It aims to be concise and easy to follow.

CCA is widely used for feature extraction from EEG and similar data, so it is worth being familiar with its principles.

CCA: canonical correlation analysis

Canonical correlation analysis (CCA) is a multivariate statistical method that uses the correlation between pairs of composite variables to summarize the overall correlation between two groups of variables. The basic idea is this: to capture the correlation between the two groups as a whole, extract one representative composite variable from each group, U1 and V1, each a linear combination of the variables in its own group, and use the correlation between U1 and V1 to reflect the overall correlation between the two groups.

Hotelling proposed canonical correlation analysis in 1936. Consider linear combinations u and v of the two groups of variables and study the correlation coefficient ρ(u, v) between them. Among all linear combinations, find the pair u1, v1 with the largest correlation coefficient. These two linear combinations are called the first pair of canonical variables, and ρ(u1, v1) is the first canonical correlation coefficient. For two multivariate groups, a single pair is usually not enough to describe the relationship, so further pairs are extracted. Next, among the linear combinations that are uncorrelated with u1 and v1, find the pair with the largest correlation coefficient; this is the second pair of canonical variables, and ρ(u2, v2) is the second canonical correlation coefficient. Continuing in this way yields successive pairs of canonical variables (at most as many pairs as there are variables in the smaller group), which together capture the linear correlation between the two groups.
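The procedure above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming data matrices with more samples than variables and full column rank; the function name canonical_correlations and the synthetic data are hypothetical and do not come from the original article. It uses one common numerical route (QR decomposition followed by an SVD) rather than any particular library's implementation.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Return canonical correlations and weights for X (n x p) and Y (n x q)."""
    # Center each column so that covariances reduce to inner products.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered data.
    Qx, Rx = np.linalg.qr(Xc)
    Qy, Ry = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the canonical correlations, largest first.
    U, s, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    # Map the singular vectors back to weights on the original variables.
    A = np.linalg.solve(Rx, U)
    B = np.linalg.solve(Ry, Vt.T)
    return np.clip(s, 0.0, 1.0), A, B

# Synthetic example (an assumption for illustration): one shared latent signal z.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=(n, 1))
X = np.hstack([z + 0.5 * rng.normal(size=(n, 1)), rng.normal(size=(n, 2))])
Y = np.hstack([rng.normal(size=(n, 1)), z + 0.5 * rng.normal(size=(n, 1))])

rho, A, B = canonical_correlations(X, Y)
U_var = (X - X.mean(axis=0)) @ A   # canonical variables u1, u2
V_var = (Y - Y.mean(axis=0)) @ B   # canonical variables v1, v2
print("canonical correlations:", rho)
```

The first entry of rho corresponds to ρ(u1, v1) and the second to ρ(u2, v2). Columns of U_var are mutually uncorrelated, which matches the requirement that later pairs be uncorrelated with earlier ones.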

In essence, canonical correlation analysis selects a few representative composite indices (linear combinations of variables) from each of two groups of random variables and uses the correlations between these indices to describe the correlation between the original two groups. When a canonical correlation coefficient is large enough, the values of one group of variables can be used to predict the corresponding linear combination of the other group, much as in regression analysis.

Principle Description
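The standard formulation of CCA can be written as follows. The notation is assumed here for illustration and is not taken from the original article: x and y are the two groups of variables, a and b are weight vectors, and Σxx, Σyy, Σxy are the within-group and between-group covariance matrices.

```latex
\[
\begin{aligned}
u &= a^{\top} x, \qquad v = b^{\top} y, \\
\rho(u, v) &= \frac{a^{\top} \Sigma_{xy}\, b}
                  {\sqrt{a^{\top} \Sigma_{xx}\, a}\,\sqrt{b^{\top} \Sigma_{yy}\, b}}, \\
(a_1, b_1) &= \arg\max_{a,\, b}\; a^{\top} \Sigma_{xy}\, b
\quad \text{subject to} \quad
a^{\top} \Sigma_{xx}\, a = 1, \;\; b^{\top} \Sigma_{yy}\, b = 1 .
\end{aligned}
\]

% Setting the gradient of the Lagrangian to zero gives an eigenvalue problem:
\[
\Sigma_{xx}^{-1} \Sigma_{xy} \Sigma_{yy}^{-1} \Sigma_{yx}\, a = \rho^{2} a,
\qquad
b = \frac{1}{\rho}\, \Sigma_{yy}^{-1} \Sigma_{yx}\, a .
\]
```

The largest eigenvalue gives the square of the first canonical correlation coefficient. Later pairs (a_k, b_k) come from the remaining eigenvectors, which satisfy the additional constraint that each new canonical variable is uncorrelated with all earlier ones.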

Case implementation

# Import the toolkits
import h5py
import rcca
import sys
import numpy as np
import cortex

# Standardize each column to zero mean and unit variance
zscore = lambda d: (d - d.mean(0)) / d.std(0)
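A small check of what the zscore helper does, using a hypothetical toy array (not the real data). It also shows one likely reason the loading step wraps zscore in np.nan_to_num: a column with zero variance produces 0/0, which is NaN.

```python
import numpy as np

zscore = lambda d: (d - d.mean(0)) / d.std(0)

# Hypothetical toy data: 4 time points (rows) by 3 voxels (columns).
toy = np.array([[1.0, 10.0, 5.0],
                [2.0, 20.0, 5.0],
                [3.0, 30.0, 5.0],
                [4.0, 40.0, 5.0]])

# The third column is constant, so its standard deviation is 0 and 0/0 gives NaN.
with np.errstate(invalid="ignore", divide="ignore"):
    z = zscore(toy)

# Replace NaN with 0, as the loading code does after standardizing.
z_clean = np.nan_to_num(z)
print(z_clean)
```

After standardization the first two columns have mean 0 and standard deviation 1 regardless of their original scale, and the constant column becomes all zeros once the NaN values are replaced.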

Step 1: Load the data

Please download the data from CRCNS: http://crcns.org/data-sets/vc/vim-2

The following analysis assumes that the data is located in a directory named "data" in the current directory.

data = []
vdata = []
numSubjects = 3
# subjects is a list of the 3 subject names.
subjects = ['S1', 'S2', 'S3']
# xfms is a list of Pycortex transform names used to align functional and anatomical data for each subject.
xfms = ['S1_xfm', 'S2_xfm', 'S3_xfm']
dataPath = "./data/VoxelResponses_subject%d.mat"
for subj in range(numSubjects):
    # Open the data file
    f = h5py.File(dataPath % (subj + 1), 'r')
    # Get the data size in reversed axis order; [()] replaces the deprecated .value accessor
    dims = f["ei"]["datasize"][()].ravel()
    datasize = (int(dims[2]), int(dims[1]), int(dims[0]))
    # Get the cortical mask from Pycortex
    mask = cortex.db.get_mask(subjects[subj], xfms[subj], type='thick')
    # Get the training data for this subject
    data_subj = np.nan_to_num(zscore(np.nan_to_num(f["rt"][()].T)))
    data.append(data_subj.reshape((data_subj.shape[0],) + datasize)[:, mask])
    # Get the validation data for this subject
    vdata_subj = np.nan_to_num(zscore(np.nan_to_num(f["rv"][()].T)))
    vdata.append(vdata_subj.reshape((vdata_subj.shape[0],) + datasize)[:, mask])
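The reshape-then-mask indexing in the loop above can be hard to read at first. The sketch below reproduces it on a tiny hypothetical volume (the shapes, the mask, and all variable names are made up for illustration): each row of flattened voxel responses is reshaped into a 3D volume, and a boolean mask over the three spatial axes keeps only the selected voxels.

```python
import numpy as np

# Hypothetical sizes: 5 time points and a 2 x 3 x 4 volume (24 voxels).
n_time, vol_shape = 5, (2, 3, 4)
n_vox = int(np.prod(vol_shape))
flat = np.arange(n_time * n_vox, dtype=float).reshape(n_time, n_vox)

# Hypothetical boolean mask that marks two voxels as "cortical".
mask = np.zeros(vol_shape, dtype=bool)
mask[0, 1, 2] = True
mask[1, 0, 3] = True

# Same pattern as in the loading loop: (time, voxels) -> (time, z, y, x) -> (time, masked voxels).
volumes = flat.reshape((n_time,) + vol_shape)
selected = volumes[:, mask]
print(selected.shape)
```

The result has one row per time point and one column per True entry in the mask, which is the samples-by-features layout that the CCA step takes as input.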

Step 2: Define CCA parameters

# Set a series of regularization values between 1e-4 and 1e2
regs = np.array(np.logspace(-4, 2, 10))
# Consider numbers of components between 3 and 10
numCCs = np.arange(3, 11)
# Initialize the CCA model
cca = rcca.CCACrossValidate(numCCs=numCCs, regs=regs)
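To see the size of the search implied by these two arrays, they can be inspected directly. This sketch only uses NumPy; the count of combinations assumes that cross-validation tries every pairing of a regularization value with a number of components, which is an assumption about the search rather than something stated in the article.

```python
import numpy as np

regs = np.logspace(-4, 2, 10)   # 10 values, evenly spaced on a log scale
numCCs = np.arange(3, 11)       # 3, 4, ..., 10 (the upper bound 11 is excluded)

# Ratio between consecutive regularization values is constant on a log grid.
step_ratio = regs[1] / regs[0]

# Number of (regularization, components) pairs if every pairing is evaluated.
n_combinations = len(regs) * len(numCCs)

print(regs)
print(numCCs)
print(step_ratio, n_combinations)
```

A log-spaced grid is a common choice for regularization strength because plausible values span several orders of magnitude.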

Step 3: Train the model, analyze, and save the results

"""
Note: This analysis is computationally intensive because of the large amount of data.
Running it in a notebook takes a long time, so it is recommended to parallelize it
and/or run it on a computing cluster and then load the results for visualization.
"""
# Train the CCA model on the training data
cca.train(data)
# Validate on the held-out data
cca.validate(vdata)
# Compute the variance explained in the validation responses for each voxel
cca.compute_ev(vdata)
# Save the analysis results
cca.save("./data/CCA_results.hdf5")

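The idea behind training with a regularization value and then checking performance on held-out data can be sketched with plain NumPy. This is not the implementation used by the rcca package; it is a simplified two-dataset illustration on synthetic data, and the names fit_rcca and heldout_corrs are hypothetical.

```python
import numpy as np

def fit_rcca(X, Y, reg, n_components):
    """Ridge-regularized CCA on centered data; returns weight matrices A and B."""
    n = X.shape[0]
    # Regularized within-set covariances and the cross-covariance.
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    # Whiten with Cholesky factors, then take the SVD of the whitened cross-covariance.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    M = np.linalg.solve(Lx, Cxy) @ np.linalg.inv(Ly).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Undo the whitening to get weights on the original variables.
    A = np.linalg.solve(Lx.T, U[:, :n_components])
    B = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return A, B

def heldout_corrs(X, Y, A, B):
    """Correlation of each canonical pair on data that was not used for fitting."""
    U, V = X @ A, Y @ B
    return np.array([np.corrcoef(U[:, k], V[:, k])[0, 1] for k in range(A.shape[1])])

# Synthetic data (an assumption): two shared latent signals plus independent noise.
rng = np.random.default_rng(0)
n, p, q = 600, 20, 20
Z = rng.normal(size=(n, 2))
X = Z @ rng.normal(size=(2, p)) + rng.normal(size=(n, p))
Y = Z @ rng.normal(size=(2, q)) + rng.normal(size=(n, q))
X_train, X_test = X[:300], X[300:]
Y_train, Y_test = Y[:300], Y[300:]
mx, my = X_train.mean(axis=0), Y_train.mean(axis=0)

# Fit on the training half for each regularization value, score on the test half.
scores = {}
for reg in np.logspace(-4, 2, 10):
    A, B = fit_rcca(X_train - mx, Y_train - my, reg, n_components=2)
    scores[reg] = heldout_corrs(X_test - mx, Y_test - my, A, B).mean()
best_reg = max(scores, key=scores.get)
print("best regularization:", best_reg, "held-out correlation:", scores[best_reg])
```

Scoring on data that was not used for fitting guards against canonical correlations that look high only because the model overfit the training set, which becomes a real risk when the number of voxels is large relative to the number of time points.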
Step 4: Visualize the analysis results

# Import the visualization toolkit
%matplotlib inline
import matplotlib.pyplot as plt
# Import Brewer color maps for visualization
from brewer2mpl import qualitative

nSubj = len(cca.corrs)
nBins = 30
bmap = qualitative.Set1[nSubj]
f = plt.figure(figsize=(8, 6))
ax = f.add_subplot(111)
for s in range(nSubj):
    # Plot a histogram of the prediction correlations across all voxels for each of the three subjects
    ax.hist(cca.corrs[s], bins=nBins, color=bmap.mpl_colors[s], histtype="stepfilled", alpha=0.6)
plt.legend(['Subject 1', 'Subject 2', 'Subject 3'], fontsize=16)
ax.set_xlabel('Prediction correlation', fontsize=20)
ax.set_ylabel('Number of voxels', fontsize=20)
ax.set_title("Prediction performance across voxels", fontsize=20)
# p
