IBM's In-Depth Guide to Artificial Intelligence, Machine Learning, and Cognitive Computing
The development of artificial intelligence has been through several ups and downs, and it is now riding an unprecedented new wave driven by deep learning. IBM's website recently published an overview article that briefly traces the development of AI and introduces the concepts and principles behind perceptrons, clustering algorithms, rule-based systems, machine learning, deep learning, neural networks, and related techniques.
Humans have never stopped thinking about how to create intelligent machines. Along the way, the development of artificial intelligence has had its ups and downs: successes, failures, and potential still waiting to be realized. Today, reports of machine learning applications are everywhere; from cancer detection to image understanding to natural language processing, artificial intelligence is empowering and changing the world.
The history of modern artificial intelligence has all the elements of a great drama. In the early 1950s, AI enjoyed its first spring, with thinking machines and iconic figures such as Alan Turing and John von Neumann. After decades of boom and bust, and impossibly high expectations, artificial intelligence and its pioneers have entered a new phase. AI is now showing its true potential: deep learning, cognitive computing, and other new technologies keep emerging, and applications abound.
This article examines some of the important facets of artificial intelligence and its subfields. Let's start with a timeline of AI's development and then look at each of its elements in turn.
The timeline of modern artificial intelligence
In the early 1950s, artificial intelligence focused on what is called strong AI: the hope that a machine could perform any intellectual task a human can. When progress on strong AI stalled, weak AI emerged, applying AI techniques to narrower problems. Until the 1980s, AI research was split between these two paradigms, with the two camps at odds. Around 1980, however, machine learning became mainstream; its aim is to give computers the ability to learn and to build models, so that they can make predictions in specific domains.
Figure 1: timeline of the development of modern artificial intelligence
Building on artificial intelligence and machine learning research, deep learning emerged around 2000. Computer scientists applied new topologies and learning methods to multilayer neural networks, and this evolution of the neural network has solved difficult problems across many fields.
Over the past decade, cognitive computing has also emerged, with the goal of building systems that can learn and that interact naturally with humans. IBM Watson demonstrated the value of cognitive computing by defeating world-class players at Jeopardy!.
In this article, I explore each of these areas in turn and explain some of their key algorithms.
Basic artificial intelligence
Research before 1950 introduced the idea that the brain consists of electrical impulses, and that it is the interplay of those impulses that produces human thought and consciousness. Alan Turing showed that computation of any kind can be carried out digitally, which suggested that building a machine able to simulate the human brain was not out of reach.
As mentioned above, much of this early work aimed at strong artificial intelligence, but it also introduced basic concepts that machine learning and deep learning still build on today.
Figure 2: timeline of artificial intelligence methods, 1950-1980
Search in artificial intelligence
Many problems in artificial intelligence can be attacked with brute-force search, but basic search quickly breaks down once the search space is even moderately large. One of the earliest examples of AI search was the development of a checkers-playing program. Arthur Samuel built the first checkers program on the IBM 701 Electronic Data Processing Machine, adding an optimization of the search tree known as alpha-beta pruning. His program also recorded the reward for specific moves, allowing the application to learn from every game it played (making it the first self-learning program). To raise the rate at which the program learned, Samuel had it play against itself, improving both its play and its ability to learn.
Although search can be applied successfully to many simple problems, the approach fails quickly as the number of choices grows. Take tic-tac-toe as an example: at the start of the game there are nine possible moves, each of which has eight possible replies, and so on. The complete tic-tac-toe game tree contains 362,880 nodes. Extend the same idea to chess or Go, and the drawbacks of search soon become clear.
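To make the idea concrete, here is a minimal sketch of minimax search with alpha-beta pruning, the kind of search-tree optimization Samuel applied. The `game` object and its methods (`is_terminal`, `evaluate`, `legal_moves`, `apply`) are hypothetical placeholders for a game implementation, not anything described in the original article.

```python
def alphabeta(state, depth, alpha, beta, maximizing, game):
    """Return the best score reachable from `state`, pruning branches
    that cannot change the final decision."""
    if depth == 0 or game.is_terminal(state):
        return game.evaluate(state)
    if maximizing:
        best = float("-inf")
        for move in game.legal_moves(state):
            best = max(best, alphabeta(game.apply(state, move),
                                       depth - 1, alpha, beta, False, game))
            alpha = max(alpha, best)
            if alpha >= beta:   # the opponent will never allow this line,
                break           # so the remaining siblings can be pruned
        return best
    best = float("inf")
    for move in game.legal_moves(state):
        best = min(best, alphabeta(game.apply(state, move),
                                   depth - 1, alpha, beta, True, game))
        beta = min(beta, best)
        if alpha >= beta:
            break
    return best
```

Pruning does not change the result of the search; it only skips subtrees that provably cannot affect the chosen move, which is what makes deeper game trees tractable.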
Perceptron
The perceptron is an early supervised learning algorithm for single-layer neural networks. Given an input feature vector, the perceptron classifies the input. Using a training set, the network's weights and bias are updated so that it performs linear classification. The perceptron was first implemented on the IBM 704 and was later used for image recognition on custom hardware.
Figure 3: perceptrons and linear classification
As a linear classifier, the perceptron can solve problems that are linearly separable. The classic illustration of its limits is its inability to learn the exclusive-OR (XOR) function. Multilayer perceptrons overcame this limitation and laid the groundwork for more complex algorithms, network topologies, and deep learning.
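As an illustration, here is a minimal sketch of the perceptron learning rule on the logical AND function, which, unlike XOR, is linearly separable. The data, learning rate, and epoch count are illustrative choices, not from the article.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])          # AND is linearly separable (XOR is not)

w = np.zeros(2)                     # weights
b = 0.0                             # bias
lr = 0.1                            # learning rate

for epoch in range(20):
    for xi, target in zip(X, y):
        pred = 1 if xi @ w + b > 0 else 0
        error = target - pred       # -1, 0, or +1
        w += lr * error * xi        # nudge the decision boundary
        b += lr * error             # toward each misclassified sample

print([1 if xi @ w + b > 0 else 0 for xi in X])  # -> [0, 0, 0, 1]
```

Run the same loop on XOR labels and the weights never settle, which is exactly the limitation described above.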
Clustering algorithm
Perceptrons are used in a supervised manner: users supply data to train the network, then test the network on new data. Clustering algorithms, by contrast, are an unsupervised learning method: the algorithm organizes a set of feature vectors into clusters according to one or more attributes of the data.
Figure 4: clustering in a two-dimensional feature space
One of the simplest clustering algorithms, implementable in a small amount of code, is k-means, where k is the number of clusters into which you assign the samples. You can initialize each cluster with a random feature vector, then add every other sample to the cluster whose centroid is nearest (each sample is itself a feature vector, and Euclidean distance can serve as the measure of "distance"). As samples are added to a cluster, its centroid (the center of the cluster) is recalculated. The algorithm then re-checks all samples to ensure each one sits in its nearest cluster, repeating until no sample needs to change clusters.
Although k-means clustering is relatively effective, you must choose k in advance. Depending on the data, other methods may be more effective, such as hierarchical clustering or distribution-based clustering.
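Below is a minimal sketch of the k-means loop just described: assign each sample to its nearest centroid by Euclidean distance, recompute the centroids, and repeat until the assignments stop changing. The random initialization and iteration cap are illustrative choices.

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """Cluster the rows of X into k clusters; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]  # random init
    for _ in range(max_iter):
        # distance from every sample to every centroid (Euclidean)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)                  # nearest-centroid rule
        new_centroids = np.array([
            X[labels == i].mean(axis=0) if np.any(labels == i) else centroids[i]
            for i in range(k)])                        # recompute each centroid
        if np.allclose(new_centroids, centroids):      # no sample moved
            break
        centroids = new_centroids
    return centroids, labels

centroids, labels = kmeans(np.random.rand(100, 2), k=3)
```

The empty-cluster guard keeps the old centroid if no sample lands in a cluster, a common practical tweak to the basic algorithm.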
Decision tree
Decision trees are closely related to clustering. A decision tree is a predictive model over observations from which conclusions can be drawn. Conclusions are represented as leaves on the tree, while nodes are decision points where an observation branches. Decision trees are produced by decision tree learning algorithms, which split the dataset into subsets based on attribute value tests, a process known as recursive partitioning.
Consider the example in the figure below. In this dataset, I can observe whether someone was productive based on three factors. Using a decision tree learning algorithm, I can rank the attributes with a metric (one example is information gain). In this example, mood is the dominant factor in productivity, so I split the dataset according to whether Good Mood is Yes or No. The Yes side, however, requires the dataset to be split again on the two remaining attributes. The colors in the table correspond to the leaf nodes of the same color on the right.
Figure 5: a simple dataset and its resulting decision tree
An important property of decision trees is their inherent organization, which makes it easy to explain, even graphically, how an item was classified. Popular decision tree learning algorithms include C4.5 and CART (Classification and Regression Trees).
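To show how an attribute is chosen for a split, here is a minimal sketch of the information-gain metric mentioned above, applied to a tiny mood/productivity dataset. The rows below are invented for illustration; they are not the dataset in the figure.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, attr):
    """Entropy reduction from splitting (attributes, label) rows on attr."""
    labels = [label for _, label in rows]
    base = entropy(labels)
    remainder = 0.0
    for value in {attrs[attr] for attrs, _ in rows}:
        subset = [label for attrs, label in rows if attrs[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder          # higher gain -> better split

rows = [
    ({"good_mood": "yes", "caffeinated": "yes"}, "productive"),
    ({"good_mood": "yes", "caffeinated": "no"},  "productive"),
    ({"good_mood": "no",  "caffeinated": "yes"}, "unproductive"),
    ({"good_mood": "no",  "caffeinated": "no"},  "unproductive"),
]
print(information_gain(rows, "good_mood"))    # 1.0: a perfect split
print(information_gain(rows, "caffeinated"))  # 0.0: tells us nothing
```

A tree learner such as C4.5 applies this calculation at every node, splitting on the highest-gain attribute and recursing into each subset.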
Rule-based system
The earliest system built on rules and inference, Dendral, was developed in 1965, but it was not until the 1970s that these so-called expert systems became popular. A rule-based system stores both the rules and the required knowledge, and uses a reasoning system to draw conclusions.
A rule-based system typically consists of a rule set, a knowledge base, an inference engine (using forward or backward rule chaining), and a user interface. In the figure below, I use the piece of knowledge "Socrates is a man", the rule "if a man, then mortal", and the interaction "Who is mortal?" (a small code sketch of this inference loop appears at the end of this section).
Figure 6: rule-based system
Rule-based systems have been applied to speech recognition, planning and control, and disease diagnosis. One such system, Kaleidos, developed in the 1990s to monitor and diagnose dam stability, remains in use today.
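To make the inference loop concrete, here is a minimal sketch of forward chaining over the Socrates example above. The fact and rule representations are simplified illustrations, not the architecture of any particular expert system.

```python
facts = {("man", "Socrates")}                 # the knowledge base

# Each rule maps a matching fact to a new fact: "if X is a man, X is mortal".
def mortal_rule(fact):
    kind, subject = fact
    if kind == "man":
        return ("mortal", subject)
    return None

rules = [mortal_rule]

# Forward chaining: keep applying rules until the fact base stops growing.
changed = True
while changed:
    changed = False
    for rule in rules:
        for fact in list(facts):
            derived = rule(fact)
            if derived and derived not in facts:
                facts.add(derived)
                changed = True

# Answer the query "Who is mortal?" by listing all mortal subjects.
print([s for kind, s in facts if kind == "mortal"])   # ['Socrates']
```

Backward chaining runs the same rules in reverse: start from the query ("Is Socrates mortal?") and search for rules and facts that would establish it.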
Machine learning
Machine learning is a subfield of artificial intelligence and computer science with roots in statistics and mathematical optimization. It covers supervised and unsupervised learning techniques and can be applied to prediction, analytics, and data mining. Machine learning is not limited to deep learning, but in this section I introduce several of the algorithms that make deep learning so effective.
Figure 7: timeline of machine learning methods
Backpropagation
The power of neural networks comes from their multilayer structure. Training a single-layer perceptron is straightforward, but the resulting network is not very powerful. The question then becomes: how do we train networks that have multiple layers? This is where backpropagation shines.
Backpropagation is an algorithm for training multilayer neural networks. It works in two stages. In the first stage, an input is propagated through the whole network to the final layer (the feedforward pass). In the second stage, the algorithm computes an error and then propagates that error backward from the last layer to the first, adjusting the weights.
Figure 8: schematic diagram of backpropagation
During training, the intermediate layers of the network organize themselves to map parts of the input space onto the output space. Through supervised learning, backpropagation identifies the error in the input-to-output mapping and then adjusts the weights (scaled by a learning rate) to correct it. Backpropagation remains an important part of neural network learning; as computing resources become faster and cheaper, it will keep being applied to larger and denser networks.
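Here is a minimal sketch of the two stages on a toy problem: a feedforward pass through one hidden layer, then backpropagation of the output error to adjust the weights. The network learns XOR, which a single-layer perceptron cannot. Layer sizes, learning rate, and step count are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)        # XOR targets

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)         # input -> hidden
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)         # hidden -> output
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 1.0

for step in range(5000):
    # stage 1: feedforward
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # stage 2: propagate the error backward, layer by layer
    d_out = (out - y) * out * (1 - out)     # output-layer error signal
    d_h = (d_out @ W2.T) * h * (1 - h)      # hidden-layer error signal
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(out.round(3).ravel())   # should approach [0, 1, 1, 0]
```

The `out * (1 - out)` and `h * (1 - h)` factors are the derivative of the sigmoid, which is how the error signal is attributed to each layer's pre-activation.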
Convolutional neural network
A convolutional neural network (CNN) is a multilayer neural network inspired by the animal visual cortex. The architecture is useful in many applications, including image processing. The first CNN was created by Yann LeCun; at the time, the architecture was used mainly for handwritten-character recognition tasks such as reading postal codes.
The LeNet CNN is made up of several layers that implement feature extraction followed by classification. The image is divided into receptive fields that feed into a convolutional layer, which extracts features from the input image. The next step is pooling, which reduces the dimensionality of the extracted features (through downsampling) while retaining the most important information (usually via max pooling). The algorithm then performs another round of convolution and pooling, which feeds into a fully connected multilayer perceptron. The final output of the network is a set of nodes that identify features of the image (here, one node per recognizable digit). The network is trained using backpropagation.
Figure 9: the LeNet convolutional neural network architecture
This combination of deep processing, convolution, pooling, and a fully connected classification layer opened the door to a variety of new applications for neural networks. Beyond image processing, CNNs have been applied successfully to video recognition and natural language processing, among other tasks. They have also been implemented efficiently on GPUs, which has greatly improved their performance.
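As a sketch of the two core operations, the code below slides a small filter over an image (a naive convolution) and then downsamples the feature map with 2x2 max pooling. The hand-written edge filter and random image are illustrative; a real CNN learns its filters through backpropagation.

```python
import numpy as np

def conv2d(image, kernel):
    """Naive valid convolution: slide the kernel over every position."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Downsample by keeping the maximum of each size x size block."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h*size, :w*size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.random.rand(28, 28)            # a LeNet-sized grayscale input
edge_filter = np.array([[1., 0., -1.],    # responds to vertical edges
                        [1., 0., -1.],
                        [1., 0., -1.]])
features = max_pool(conv2d(image, edge_filter))
print(features.shape)                     # (13, 13): 26x26 map pooled 2x2
```

Stacking several such convolution-and-pool stages, then flattening into a fully connected classifier, yields the LeNet-style pipeline shown in Figure 9.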
Long short-term memory (LSTM)
Recall from the discussion of backpropagation that the network being trained was feedforward: inputs are fed into the network and propagated forward through the hidden layers to the output layer. But other topologies exist. The architecture examined here allows cycles to form between nodes. These networks, called recurrent neural networks (RNNs), can feed values back to earlier layers or to subsequent nodes within the same layer, which makes them well suited to time-series data.
In 1997, a special kind of recurrent network called long short-term memory (LSTM) was invented. An LSTM contains memory cells that can remember a value for either a long or a short time.
Figure 10: long short-term memory networks and memory cells
A memory cell contains gates that control the flow of information into and out of the cell. The input gate controls when new information may flow into the cell. The forget gate controls how long a piece of information is retained. Finally, the output gate controls when the information held in the cell is used in the output. The cell also contains weights that govern each gate, and a training algorithm (commonly backpropagation through time, a variant of backpropagation) optimizes these weights based on the resulting error.
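Below is a minimal sketch of a single LSTM time step, showing how the three gates mediate the memory cell. The weight shapes and initialization are illustrative; in practice the weights are learned by backpropagation through time.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One time step. x: input vector; h_prev, c_prev: previous hidden and
    cell state. W maps [h_prev, x] to the four gate pre-activations."""
    z = W @ np.concatenate([h_prev, x]) + b
    n = len(h_prev)
    i = sigmoid(z[0*n:1*n])        # input gate: admit new information?
    f = sigmoid(z[1*n:2*n])        # forget gate: how long to keep old state?
    o = sigmoid(z[2*n:3*n])        # output gate: expose the cell now?
    g = np.tanh(z[3*n:4*n])        # candidate values to write
    c = f * c_prev + i * g         # update the memory cell
    h = o * np.tanh(c)             # gated output
    return h, c

n, d = 8, 4                                  # hidden size, input size
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4*n, n+d))   # illustrative random weights
b = np.zeros(4*n)
h, c = np.zeros(n), np.zeros(n)
for x in rng.normal(size=(5, d)):            # run over a 5-step sequence
    h, c = lstm_step(x, h, c, W, b)
print(h.shape)                               # (8,)
```

Because the cell state `c` is carried across steps and only modified through the gates, gradients can survive over many time steps, which is what lets LSTMs remember for "long or short" periods.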
LSTMs have been used in speech recognition, handwriting recognition, speech synthesis, image captioning, and other tasks. LSTM comes up again below.
Deep learning
Deep learning is a relatively new set of methods that is fundamentally changing machine learning. Deep learning is not itself an algorithm but rather a family of algorithms that implement deep networks, often with the help of unsupervised learning. These networks are so deep that new methods of computation, such as GPUs, are needed to build them, in addition to clusters of compute nodes.
This article has so far introduced two deep learning algorithms: the convolutional neural network and long short-term memory. These algorithms have been combined to accomplish surprisingly intelligent tasks: as shown in the figure below, CNNs and LSTMs have been used together to identify objects in pictures or video and then describe them in natural language.
Figure 11: image captioning with a convolutional neural network and long short-term memory
Deep learning algorithms have also been applied to face recognition, have identified tuberculosis with 96% accuracy, and are used in self-driving vehicles and other hard problems.
Despite these results, however, deep learning still has problems to solve. A recent application of deep learning to skin cancer detection found the algorithm to be more accurate than a board-certified dermatologist. But whereas a dermatologist can enumerate the factors that led to a diagnosis, there is no way to know which factors a deep learning program used in its classification. This is known as deep learning's black-box problem.
Another application, called Deep Patient, can successfully predict disease when given a patient's medical records. The application proved considerably better than physicians at predicting disease, even schizophrenia, which is notoriously difficult to predict. So even though the models work well, no one can reach into these massive neural networks to find out why.
Cognitive computing
Artificial intelligence and machine learning are full of examples of biological inspiration, and while early AI focused on the grand goal of building machines that mimic the human brain, cognitive computing is now working toward that goal.
Cognitive computing builds on neural networks and deep learning and applies knowledge from cognitive science to construct systems that simulate human thought processes. Rather than focusing on a single technology, however, cognitive computing spans several disciplines, including machine learning, natural language processing, vision, and human-computer interaction.
An example of cognitive computing is IBM Watson, which demonstrated state-of-the-art question-and-answer interaction on Jeopardy! and which IBM has since extended into a suite of web services. These services expose programming interfaces that applications can use to build powerful virtual agents, covering visual recognition, speech-to-text (speech recognition), text-to-speech (speech synthesis), language understanding and translation, and a conversation engine.
Going further
This article has covered only a small slice of the history of artificial intelligence and of the latest approaches to neural networks and deep learning. Although AI and machine learning have been through many ups and downs, new approaches such as deep learning and cognitive computing have raised the bar in these disciplines significantly. A conscious machine may remain out of reach, but systems that improve people's lives already exist today.
(Source: Artificial Intelligence Industry Chain Alliance)
Reprinted from: https://mp.weixin.qq.com/s/WSiNxdLoGlyibdMFJqfOFQ