2025-01-18 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article gives a detailed introduction to "what are the entry points of machine learning". The content is laid out step by step, and I hope it helps resolve your questions. Let's follow the editor's line of thought and learn something new together.
Machine learning (ML) is a multidisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It studies how computers can simulate or implement human learning behavior in order to acquire new knowledge or skills, reorganize existing knowledge structures, and continuously improve their own performance. Machine learning is an important branch of artificial intelligence and a fundamental way to make computers intelligent; its applications span every area of artificial intelligence.
Definition of Machine Learning
Machine learning is defined by analogy with human learning: it is the study of how to use machines to simulate human learning activities.
Put another way, machine learning is the study of how machines acquire new knowledge and skills and recognize existing knowledge. Simon defines learning as follows: a system learns if it can improve its performance by carrying out some process. This definition has three main points:
First, learning is a process.
Second, learning is about a system.
Third, learning can change system performance. In short, the three points are process, system, and performance change.
The first point is self-evident. The "system" in the second point is a broad notion: it usually refers to a computer, but it can also be a computing system, or even a human-computer system that includes people. The third point requires only that system performance improve; it places no restriction on how that improvement is achieved.
Machine learning processes
Generally speaking, a complete machine learning system includes four parts: the environment, a learning unit, a knowledge base, and an execution unit, as shown in Figure 1. The computer perceives and obtains information from the environment through various software and hardware. The learning unit processes that information into useful knowledge and stores it in the knowledge base. The knowledge then guides the execution unit in producing actions, such as making decisions and carrying out tasks. The effect of those actions is observed and fed back to the learning unit.
Figure 1 Machine learning model
Classification of machine learning
Machine learning can be divided into supervised learning, unsupervised learning, semi-supervised learning, and other algorithms according to whether the training data has labels or not, as shown in Figure 2.
Supervised learning is the process of using a set of samples with known classes to train a classifier and adjust its parameters until it reaches the required performance; it is also known as supervised training or learning with a teacher. More precisely, supervised learning is the machine learning task of inferring a function from labeled training data. The training data consists of a set of training examples, each made up of an input object (usually a vector of features) and a desired output value (also called the supervisory signal). A supervised learning algorithm analyzes the training data and produces an inferred function that can be used to map new instances. In the ideal case, this function correctly determines the class labels of instances it has never seen, which requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way. Commonly used supervised learning algorithms include linear regression, logistic regression, decision trees, neural networks, and support vector machines.
If none of the training data is labeled, the task is called unsupervised learning. Clustering and dimensionality reduction algorithms are examples. The goal of clustering is to divide a set of samples into clusters such that the instances within each cluster are as similar as possible and the instances in different clusters are as dissimilar as possible.
If part of the training data is labeled and the rest is unlabeled, with the unlabeled data usually far outnumbering the labeled data, the task is called semi-supervised learning. Semi-supervised learning rests on the assumption that the data distribution is not completely random. By combining the local features of the labeled data with the overall distribution of the much larger unlabeled set, it can achieve acceptable or even very good classification results.
In addition, some learning methods do not fit into any of the three categories above, such as reinforcement learning, recommendation algorithms, and meta-learning. They are not covered in detail here.
Typical Machine Learning Methods
Many machine learning methods have been proposed. These methods can show excellent performance in solving different problems. Common and typical machine learning methods include regression analysis, classification (decision tree, support vector machine, neural network), clustering (K-means), dimensionality reduction, feature extraction, etc., as shown in Figure 3.
Regression algorithm
Regression analysis starts from a large body of observed data and uses mathematical statistics to establish a functional relationship between a dependent variable and one or more independent variables; this function is called the regression equation. When the relationship under study involves one dependent variable and one independent variable, the analysis is called univariate regression; when it involves one dependent variable and two or more independent variables, it is called multiple regression. Regression analysis is further divided into linear and nonlinear regression, according to whether the function relating the independent variables to the dependent variable is linear or nonlinear.
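As an illustration of univariate linear regression, here is a minimal sketch that fits the regression equation y = a + b*x by ordinary least squares. The function name `fit_line` and the sample data are made up for this example; a real project would more likely use a statistics or machine learning library.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical observations that lie exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(xs, ys)
print(f"y = {intercept:.2f} + {slope:.2f} * x")
```

With noisy data the fitted line no longer passes through every point; it minimizes the sum of squared vertical distances instead.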
Decision tree
A decision tree is a tree-shaped decision structure built by recursively analyzing the importance of each attribute, given the known probabilities of the various outcomes. Because the method takes attributes as nodes and attribute values as branches, the resulting diagram looks like the branches of a tree, hence the name. In machine learning, a decision tree is a predictive model that represents a mapping from object attributes to object values. Each internal node in the tree represents an attribute, each branch represents a possible value of that attribute, and each leaf node corresponds to the value of the objects described by the path from the root to that leaf. A decision tree has a single output; if multiple outputs are needed, separate decision trees can be built for each one.
Decision trees use entropy to characterize the disorder of a system. Splitting a sample set on an attribute makes the samples more ordered. The information entropy of the sample set after the split is computed and compared with the entropy before the split; the difference is the information gain, which measures how far the samples have moved from disorder toward order. The attribute with the largest information gain is therefore the most important one. That attribute is chosen as the root, the information gain of the remaining attributes is computed recursively within each resulting subset, and the final result is a tree structure that maps inputs to outputs. Techniques that generate decision trees from data are called decision tree learning, or simply decision trees. The decision tree is a very common classification method; well-known algorithms include ID3, C4.5, and CART.
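The entropy and information gain calculation described above can be sketched as follows. This is only an illustration of attribute selection at a single node, in the style of ID3, not a full tree builder; the toy attributes `outlook` and `windy` and the labels are invented for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy before the split minus the weighted entropy after it."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    after = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - after


# Hypothetical samples: "outlook" fully determines the label, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["yes", "yes", "no", "no"]

gains = {attr: information_gain(rows, labels, attr) for attr in ("outlook", "windy")}
root = max(gains, key=gains.get)  # attribute with the largest information gain
print(gains, "-> root attribute:", root)
```

Building the whole tree would repeat the same selection inside each subset produced by the chosen attribute.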
Neural network
An artificial neural network (ANN) is a complex network system formed by a large number of simple processing units (called neurons) that are widely interconnected. It reflects many basic characteristics of human brain function and is a highly complex nonlinear dynamic learning system. Neural networks are capable of massively parallel processing, distributed storage, self-organization, self-adaptation, and self-learning, which makes them especially suitable for imprecise and fuzzy information processing problems in which many factors and conditions must be considered at once. In theory, neural networks can approximate arbitrarily complex nonlinear relationships.
A neural network consists of a number of interconnected neurons, organized into an input layer, hidden layers, and an output layer. The input layer receives the signal, the hidden layers decompose and process the data, and the final result is assembled in the output layer. Each node in a layer is a processing unit and can be thought of as simulating a neuron. Several processing units form a layer, and several layers form a network. When the units use a sigmoid activation, each one is in effect a logistic regression model: it receives inputs from the previous layer and passes its prediction on as output to the next layer. Through this process, neural networks can perform very complex nonlinear classification.
At present, the main neural network models include the backpropagation (BP) network, the Hopfield network, the ART network, and the Kohonen network. Neural networks are widely used in automatic control, combinatorial optimization, pattern recognition, image processing, signal processing, robot control, health care, economics, and many other fields.
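To make the layer-by-layer flow concrete, here is a minimal sketch of a forward pass through a small fully connected network in which every unit applies a sigmoid to a weighted sum, which is what makes each unit resemble a logistic regression model. The weights and biases are arbitrary illustrative values, not trained parameters, and training (for example by backpropagation) is not shown.

```python
from math import exp


def sigmoid(z):
    """Logistic function, squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + exp(-z))


def layer_forward(inputs, weights, biases):
    """Compute one layer: each row of `weights` belongs to one unit."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Feed the inputs through each (weights, biases) layer in turn."""
    activation = inputs
    for weights, biases in layers:
        activation = layer_forward(activation, weights, biases)
    return activation


# Hypothetical network: 2 inputs -> 2 hidden units -> 1 output unit.
layers = [
    ([[0.5, -0.5], [0.3, 0.8]], [0.0, 0.1]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                   # output layer
]
print(network_forward([1.0, 2.0], layers))
```

The single output lies between 0 and 1 and could be read as a class probability in a two-class problem.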
Support Vector Machine (SVM)
In classification problems, data points are points in n-dimensional real space, and the goal is to separate them with an (n-1)-dimensional hyperplane. Such a classifier is called a linear classifier. Many hyperplanes may satisfy this requirement, but we also want the best one: the hyperplane with the largest margin between the data points of the two classes, called the maximum-margin hyperplane. A classifier built on this hyperplane is called a maximum-margin classifier.
SVM is one such algorithm, and it strives to minimize structural risk. A support vector machine maps the vectors into a higher-dimensional space and constructs a maximum-margin hyperplane there. Two parallel hyperplanes are built, one on each side of the hyperplane that separates the data, and the separating hyperplane is oriented so as to maximize the distance between them. The assumption is that the larger the distance (margin) between the parallel hyperplanes, the smaller the classifier's total error, which is how structural risk is minimized. Support vectors are the training samples that lie on the edge of the margin.
The key to SVM is the kernel function. Sets of vectors in a low-dimensional space are often hard to separate, and the remedy is to map them into a high-dimensional space. The difficulty with this approach is the increase in computational complexity, and the kernel function neatly solves that problem: although the problem is mapped to a higher-dimensional space, the computation is still carried out in the lower-dimensional space. With a suitable kernel function, we obtain the classification function of the high-dimensional space without a corresponding increase in computational cost. In SVM theory, different kernel functions lead to different SVM algorithms. Common kernel functions include the linear kernel, polynomial kernel, Gaussian kernel, Laplacian kernel, and sigmoid kernel.
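The claim that a kernel computes a high-dimensional inner product while working in the low-dimensional space can be checked on a small case. The sketch below uses the degree-2 polynomial kernel k(x, y) = (x . y)^2 on 2-D inputs, for which an explicit feature map into 3-D is known, and also defines a Gaussian kernel. The function names and the `sigma` parameter are choices made for this example.

```python
from math import exp, sqrt


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, y):
    """Degree-2 polynomial kernel, evaluated in the original 2-D space."""
    return dot(x, y) ** 2


def phi(x):
    """Explicit 3-D feature map whose inner product equals poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, sqrt(2) * x1 * x2, x2 * x2)


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel; sigma controls the width."""
    squared_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-squared_dist / (2 * sigma ** 2))


x, y = (1.0, 2.0), (3.0, 4.0)
print(poly2_kernel(x, y))      # computed in 2-D
print(dot(phi(x), phi(y)))     # same value, computed in 3-D
print(gaussian_kernel(x, y))
```

Both polynomial computations agree up to floating-point rounding, yet the kernel version never constructs the 3-D vectors. For the Gaussian kernel the corresponding feature space is infinite-dimensional, so the kernel form is the only practical way to evaluate it.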
Clustering algorithm
If the training data has no class labels, that is, we do not know which class each sample belongs to, the grouping of the data can still be estimated through training. Such algorithms are called unsupervised algorithms, and the most typical one is the clustering algorithm.
In simple terms, a clustering algorithm computes distances between samples and divides the data into groups (clusters) according to how far apart the samples are. The distance between samples within a cluster should be as small as possible, while the distance between samples in different clusters should be as large as possible. K-means is the most typical clustering algorithm.
The basic idea of the K-means clustering algorithm is as follows. Randomly select K objects as the initial cluster centers. Then compute the distance between each object and each cluster center, and assign each object to its nearest center. A cluster center together with the objects assigned to it represents a cluster. Once all objects have been assigned, the center of each cluster is recalculated from the objects currently in it. This process repeats until a termination condition is met. Possible termination conditions are that no (or only a minimal number of) objects are reassigned to different clusters, that no (or only a minimal number of) cluster centers change, or that the sum of squared errors reaches a local minimum.
Other clustering methods include mean-shift clustering, density-based clustering, expectation-maximization (EM) clustering with Gaussian mixture models (GMM), and agglomerative hierarchical clustering.
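The K-means steps above can be sketched in plain Python as follows. This is a minimal illustration rather than a production implementation: the function names, the `seed` parameter, and the sample points are assumptions of this example, and the loop stops when no object changes cluster (one of the termination conditions mentioned above) or when `max_iter` is reached.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iter=100, seed=0):
    """Return (labels, centers) for a list of equal-length tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # randomly pick K objects as initial centers
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # no object was reassigned: stop
            break
        labels = new_labels
        # Update step: recompute each center as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


# Hypothetical data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = k_means(points, k=2)
print(labels)
print(centers)
```

Because the initial centers are chosen at random, K-means can settle in a local minimum on harder data; running it several times with different seeds and keeping the result with the smallest sum of squared errors is a common remedy.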
Dimensionality reduction algorithm
Dimensionality reduction is also a form of unsupervised learning; its main purpose is to reduce data from a high dimension to a low dimension. Here, the dimension is the number of features in the data. The main uses of dimensionality reduction are to compress data and to improve the efficiency of other machine learning algorithms: data with thousands of features can be compressed into a handful of features. Another benefit is visualization. For example, 5-dimensional data can be compressed to 2 dimensions and then plotted on a plane. The best-known dimensionality reduction algorithm is principal component analysis (PCA).
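As a rough illustration of PCA, the sketch below finds only the first principal component, using power iteration on the covariance matrix. Power iteration is a substitute chosen here to keep the example dependency-free; library implementations of PCA typically rely on an eigendecomposition or singular value decomposition instead. The function name and sample data are invented for this example.

```python
def first_principal_component(data, iterations=200):
    """Return (direction, projections) for the first principal component."""
    n, d = len(data), len(data[0])
    # Center the data so each feature has zero mean.
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeated multiplication converges to the
    # eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Project each centered sample onto the component: d features become 1.
    projections = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, projections


# Hypothetical 2-D data lying on the line y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, projections = first_principal_component(data)
print(direction)
print(projections)
```

For this data the direction is parallel to (1, 2), so a single number per sample preserves all of the variation in the original two features.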
Deep learning
Deep learning is a newer area of machine learning research. Its motivation is to build neural networks that simulate the way the human brain analyzes and learns, mimicking the brain's mechanisms to interpret data such as images, sound, and text. The concept of deep learning stems from research on artificial neural networks: a multilayer perceptron with multiple hidden layers is a deep learning structure. Deep learning combines low-level features into more abstract high-level representations of attribute categories or features, in order to discover distributed feature representations of the data.
In 2006, Hinton et al. proposed the concept of deep learning, which offered hope for solving the optimization problems associated with deep structures, and later proposed the multilayer autoencoder deep structure. In addition, LeCun et al. proposed convolutional neural networks, the first true multilayer structure learning algorithm, which exploits spatial relationships to reduce the number of parameters and thereby improve training performance.
Like other machine learning methods, deep learning methods can be divided into supervised and unsupervised learning, and the models built under different learning frameworks differ considerably. For example, convolutional neural networks (CNNs) are deep models trained with supervised learning, while deep belief nets (DBNs) are deep models trained with unsupervised learning.
In general, a machine learning project can be divided into four steps: analyzing and defining the problem; data preprocessing; algorithm (model) selection, model training, evaluation, and optimization; and model deployment and application, as shown in Figure 4.
Analyze and define problems
Analyzing and defining problems means analyzing the objectives, properties, and type of the problem at hand, and making clear whether it is a classification problem, a clustering problem, a regression problem, or some other type of problem.
Data preprocessing
All machine learning algorithms are built on data, so data preprocessing must be carried out before model training.
The first step is collecting the data, for example by reading databases, data warehouses, or data files, or by crawling data with web crawlers.
This is followed by data cleansing, which includes: format conversion, so that the data is in a form the algorithm can handle; handling noisy data and missing values; sampling the data (the full dataset may not be needed); and equivalent transformations of the data, such as unifying units of measure (important in distance calculations), zero-centering, normalization, and splitting or merging attributes.
A preliminary analysis is sometimes required to gain some initial insight into the data. If the data is labeled, you can work out the class distribution, which gives a lower bound on the accuracy a useful classifier should reach, since always predicting the most frequent class already achieves that share. You can also examine whether attributes are associated with one another and, if so, how strongly. This helps to remove redundant attributes, reduce the dimensionality of the data, and identify which attributes have a greater impact on the result, which is useful when choosing weights. It is also often necessary to visualize the data with histograms, scatter plots, box plots, and similar charts in order to make a preliminary judgment about its characteristics, distribution, and correlations. Histograms describe the relationship between the values of each dimension and the class labels, and they also show what distribution each dimension follows. Scatter plots are drawn for each pair of attributes so that the associations between them can be seen easily.
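A few of the cleansing steps above can be sketched in plain Python: filling missing values, min-max normalization, and zero-mean standardization. Filling with the mean is only one of several possible strategies and is an assumption of this sketch, as are the function names and the sample column.

```python
def fill_missing(values):
    """Replace None entries with the mean of the known values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: nothing to scale
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical feature column (heights in cm) with one missing value.
heights = [150.0, None, 170.0, 190.0]
filled = fill_missing(heights)
print(filled)
print(min_max_normalize(filled))
print(standardize(filled))
```

Putting all features on a comparable scale matters most for distance-based methods such as K-means, where a feature measured in large units would otherwise dominate the distance.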
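Two of the preliminary checks mentioned above lend themselves to a short sketch: the class distribution, which yields the accuracy of always predicting the most frequent class, and the strength of association between two numeric attributes, measured here with the Pearson correlation coefficient. Pearson correlation is one possible measure chosen for this example; the labels and attribute values are invented.

```python
from collections import Counter


def majority_baseline(labels):
    """Return the most frequent class and the accuracy of always predicting it."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


def pearson(xs, ys):
    """Pearson correlation coefficient between two numeric attributes."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical labels: 7 of 10 samples are "ham".
labels = ["spam"] * 3 + ["ham"] * 7
print(majority_baseline(labels))

# Hypothetical attributes: the second is exactly twice the first,
# so one of them is redundant.
print(pearson([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
```

A model that cannot beat the majority baseline has learned nothing useful, and a pair of attributes with a correlation close to 1 or -1 is a candidate for removing one of the two.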
Algorithm selection, model training, evaluation and optimization
For a particular problem there are often many algorithms that could be used, so do you need to try every one of them in full? Not necessarily: that would take too much time, and not every algorithm will work well. A spot check is a quick validation of several algorithms to decide which one to train further.
When spot-checking algorithms, it is not necessary to train on the entire dataset; a small part of it is enough. Once an algorithm has been selected, all of the data is used for further training. Cross-validation can be used for this process.
When spot-checking, the more types of algorithms in the candidate set the better, so that you can test which type of algorithm learns the structure in the data best. After an algorithm has been selected, it does not have to be used as it is for further learning; an improved variant based on it may be used instead.
Other very important topics in this step are how to divide the data into a training set and a test set, which metrics to use to measure the results, and how reliable the results are.
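A spot check with cross-validation might look like the following sketch, which compares two deliberately simple candidates (a majority-class predictor and a 1-nearest-neighbor classifier) by their average accuracy over k folds. Both candidates, the helper names, and the tiny dataset are assumptions made for illustration; in practice the candidates would be real learning algorithms from a library.

```python
from collections import Counter


def k_fold_splits(n, k):
    """Yield (train_indices, test_indices) for k roughly equal folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def majority_classifier(train_x, train_y):
    """Candidate 1: always predict the most frequent training label."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return lambda x: most_common


def nearest_neighbor_classifier(train_x, train_y):
    """Candidate 2: predict the label of the closest training sample."""
    def predict(x):
        best = min(
            range(len(train_x)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
        )
        return train_y[best]
    return predict


def cross_val_accuracy(make_classifier, xs, ys, k=3):
    """Average accuracy of a candidate over k folds."""
    scores = []
    for train, test in k_fold_splits(len(xs), k):
        predict = make_classifier([xs[i] for i in train], [ys[i] for i in train])
        correct = sum(predict(xs[i]) == ys[i] for i in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


# Hypothetical one-feature dataset with two classes, interleaved so that
# every fold contains both classes.
xs = [(0,), (10,), (1,), (11,), (2,), (12,)]
ys = ["a", "b", "a", "b", "a", "b"]
for name, candidate in [("majority", majority_classifier),
                        ("1-NN", nearest_neighbor_classifier)]:
    print(name, cross_val_accuracy(candidate, xs, ys, k=3))
```

The candidate with the clearly better cross-validated score is the one worth training further on the full dataset.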
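As a small illustration of dividing the data and measuring results, the sketch below shuffles a dataset into training and test sets and computes accuracy, precision, and recall for a binary problem. The split ratio, the seed, the function names, and the example predictions are all assumptions made for this example.

```python
import random


def train_test_split(samples, test_ratio=0.3, seed=0):
    """Shuffle a copy of the samples and split it into (train, test)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


train, test = train_test_split(range(10), test_ratio=0.3)
print(len(train), len(test))

# Hypothetical true labels and model predictions on a test set.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
print(binary_metrics(y_true, y_pred))
```

Accuracy alone can be misleading when the classes are imbalanced, which is why precision and recall are usually reported alongside it. Evaluating only on samples the model did not see during training is what makes the measured result a credible estimate of real-world performance.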
Model deployment application
Once the trained model solves the problem well, it is integrated into an actual system or product, where it is used to make predictions and guide decisions on practical problems in production and daily life.
That concludes this introduction to "What are the introductory knowledge points of machine learning". To truly master the points covered here, you still need to practice them yourself.