Source: SLTechnology News&Howtos, Shulou (Shulou.com). Reported 06/03; updated 2025-01-17.
I. A brief introduction to Mahout
Mahout is a data mining tool and a collection of distributed machine learning algorithms. It includes Taste, an implementation of distributed collaborative filtering, along with classification and clustering algorithms. Mahout's biggest advantage is that it is built on Hadoop: many algorithms that previously ran on a single machine are converted to MapReduce form, which greatly increases the volume of data they can handle and improves their performance.
Machine learning algorithms implemented in Mahout:

| Algorithm class | Algorithm name | Description |
|---|---|---|
| Classification | Logistic Regression | Logistic regression |
| | Bayesian | Bayes |
| | SVM | Support vector machine |
| | Perceptron | Perceptron algorithm |
| | Neural Network | Neural network |
| | Random Forests | Random forest |
| | Restricted Boltzmann Machines | Restricted Boltzmann machine |
| Clustering | Canopy Clustering | Canopy clustering |
| | K-means Clustering | K-means algorithm |
| | Fuzzy K-means | Fuzzy K-means |
| | Expectation Maximization | EM clustering (expectation maximization clustering) |
| | Mean Shift Clustering | Mean shift clustering |
| | Hierarchical Clustering | Hierarchical clustering |
| | Dirichlet Process Clustering | Dirichlet process clustering |
| | Latent Dirichlet Allocation | LDA clustering |
| | Spectral Clustering | Spectral clustering |
| Association rule mining | Parallel FP Growth Algorithm | Parallel FP Growth algorithm |
| Regression | Locally Weighted Linear Regression | Locally weighted linear regression |
| Dimensionality reduction | Singular Value Decomposition | Singular value decomposition |
| | Principal Components Analysis | Principal component analysis |
| | Independent Component Analysis | Independent component analysis |
| | Gaussian Discriminative Analysis | Gaussian discriminant analysis |
| Evolutionary algorithm | | Parallelizes the Watchmaker framework |
| Recommendation / collaborative filtering | Non-distributed recommenders | Taste (UserCF, ItemCF, SlopeOne) |
| | Distributed Recommenders | ItemCF |
| Vector similarity calculation | RowSimilarityJob | Calculates the similarity between columns |
| | VectorDistanceJob | Calculates the distance between vectors |
| Non-MapReduce algorithm | Hidden Markov Models | Hidden Markov model |
| Collection method extensions | Collections | Extends the Collections class of Java |
II. Installation and configuration of Mahout
1. Download Mahout
http://archive.apache.org/dist/mahout/
2. Decompress
tar -zxvf mahout-distribution-0.9.tar.gz
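The download and decompression steps can be combined into a small script. This is a minimal sketch under a few assumptions: that the archive sits in a `0.9/` subdirectory of the URL above, that `wget` is available, and that `/usr/local` is the install location (it matches the MAHOUT_HOME value used in this article). The function name `fetch_mahout` is hypothetical.

```shell
#!/bin/sh
# Sketch only: the 0.9/ subdirectory, the install directory and the
# function name are assumptions, not something the article specifies.
MAHOUT_VERSION="0.9"
ARCHIVE="mahout-distribution-${MAHOUT_VERSION}.tar.gz"
URL="http://archive.apache.org/dist/mahout/${MAHOUT_VERSION}/${ARCHIVE}"
INSTALL_DIR="/usr/local"

fetch_mahout() {
    # Download the archive only if it is not already present locally.
    [ -f "$ARCHIVE" ] || wget "$URL" || return 1
    # Unpack into the install directory (-C switches directory before extracting).
    tar -zxvf "$ARCHIVE" -C "$INSTALL_DIR"
}

# Nothing is downloaded until the function is called explicitly:
# fetch_mahout
```

Writing to `/usr/local` usually requires root privileges, so the call may need to run under `sudo` or with a different `INSTALL_DIR`.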
3. Configure environment variables
Configure the Mahout environment variables:
# set mahout environment
export MAHOUT_HOME=/usr/local/mahout-distribution-0.9
export MAHOUT_CONF_DIR=$MAHOUT_HOME/conf
export PATH=$MAHOUT_HOME/conf:$MAHOUT_HOME/bin:$PATH
4. Verify that Mahout is installed successfully
Run the command mahout. If it prints a list of the available algorithms, the installation succeeded.
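Before running mahout itself, it can help to confirm that the environment variables took effect in the current shell. The following is a minimal sketch; the helper name `check_mahout_env` is hypothetical and the checks are only the obvious ones implied by the settings above.

```shell
#!/bin/sh
# Sketch: sanity-check the Mahout environment configured above.
# check_mahout_env is a hypothetical helper name.
check_mahout_env() {
    # MAHOUT_HOME must be set in the current shell.
    if [ -z "$MAHOUT_HOME" ]; then
        echo "MAHOUT_HOME is not set"
        return 1
    fi
    # The launcher script must exist and be executable.
    if [ ! -x "$MAHOUT_HOME/bin/mahout" ]; then
        echo "no executable found at $MAHOUT_HOME/bin/mahout"
        return 1
    fi
    # The launcher must be reachable through PATH.
    if ! command -v mahout >/dev/null 2>&1; then
        echo "mahout is not on PATH"
        return 1
    fi
    echo "Mahout environment looks OK"
}

check_mahout_env || echo "fix the settings above and re-run"
```

If the exports were added to a profile file, they only apply to new shells or after the file is sourced again, which is a common reason for the first check to fail.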
5. Entry-level use of Mahout
5.1. Start Hadoop
5.2. Download test data
Download the file synthetic_control.data from http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data and place it in the $MAHOUT_HOME directory.
5.3. Upload test data
Create the test directory testdata in HDFS and upload the data into it. The directory must be named testdata, because the example job reads its input from that path by default.
hadoop fs -mkdir -p /user/root/testdata
hadoop fs -put synthetic_control.data /user/root/testdata
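A truncated or HTML-wrapped download is easy to miss, so a quick local check before uploading can save a failed job. This is a minimal sketch using awk; the helper name `check_dataset` is hypothetical, and it only checks that every row has the same number of whitespace-separated fields as the first.

```shell
#!/bin/sh
# Sketch: report row count, column count, and rows whose field count
# differs from the first row. check_dataset is a hypothetical helper name.
check_dataset() {
    awk '
        NR == 1    { cols = NF }   # remember the width of the first row
        NF != cols { bad++ }       # count rows with a different width
        END { printf "rows=%d cols=%d inconsistent_rows=%d\n", NR, cols, bad }
    ' "$1"
}

# Only run the check if the file is in the current directory.
if [ -f synthetic_control.data ]; then
    check_dataset synthetic_control.data
fi
```

The dataset is described as 600 time series of 60 values each, so a complete download is expected to report 600 rows, 60 columns, and no inconsistent rows.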
5.4. Run the k-means clustering algorithm in Mahout with the command:
mahout -core org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
The clustering takes about 5 minutes to complete.
5.5. View the clustering results
Run hadoop fs -ls /user/root/output to view the clustering results.
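The listing only shows directory names; the cluster contents are stored as Hadoop sequence files. One way to read them is Mahout's clusterdump utility. The sketch below assumes that the job wrote a final clusters directory whose name ends in `-final` and a `clusteredPoints` directory under the output path, and that the `--input`, `--pointsDir`, and `--output` options behave as in Mahout 0.9. The helper name `dump_clusters` and the local file name `clusters.txt` are hypothetical.

```shell
#!/bin/sh
# Sketch: dump the final clusters to a local text file.
# Directory layout and helper name are assumptions; check them against
# the actual listing from "hadoop fs -ls".
OUTPUT_DIR="/user/root/output"

dump_clusters() {
    # Pick the directory whose name ends in "-final"; the iteration
    # number in the name varies from run to run.
    final_dir=$(hadoop fs -ls "$OUTPUT_DIR" | awk '/-final$/ { print $NF }')
    if [ -z "$final_dir" ]; then
        echo "no final clusters directory under $OUTPUT_DIR"
        return 1
    fi
    mahout clusterdump \
        --input "$final_dir" \
        --pointsDir "$OUTPUT_DIR/clusteredPoints" \
        --output clusters.txt
}

# Run only when both tools are available in this shell.
if command -v hadoop >/dev/null 2>&1 && command -v mahout >/dev/null 2>&1; then
    dump_clusters
else
    echo "hadoop and mahout must both be on PATH"
fi
```

The resulting `clusters.txt` is written to the local working directory, not to HDFS.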