Shulou (Shulou.com), SLTechnology News&Howtos, 2025-03-28 Update
This article introduces "what is a tensor". Many people have questions about what a tensor actually is, so the editor has gathered material from various sources and organized it into a simple, practical walkthrough. I hope it helps resolve your doubts. Now, please follow along and study!
Maybe you've downloaded TensorFlow and are ready to start working on deep learning. But you wonder: what on earth is the Tensor in TensorFlow, that is, the tensor? Maybe you checked Wikipedia and are now even more confused. Maybe you found it in a NASA tutorial and still don't know what it's talking about? The problem is that most guides to tensors assume you have already mastered all the mathematical terminology they use.
Don't worry!
I hated math as a child, so if I can understand this, so can you! We just need to explain it all in simple terms. So, what is a tensor (Tensor), and why does it Flow?
A quick preview of what's ahead:
0-dimensional tensor / scalar: a scalar is a single number
1-dimensional tensor / vector: a 1-dimensional tensor is called a "vector"
2-dimensional tensor: a 2-dimensional tensor is called a "matrix"
3-dimensional tensors and higher: common data stored in tensors includes time series, stock prices, text data, and color pictures (RGB)
Let's first take a look at what the tensor is.
Tensor = container
Tensors are the foundation of modern machine learning. At its core, a tensor is a data container. In most cases it contains numbers; occasionally it contains strings, but that is rare. So think of it as a bucket of numbers.
Tensors come in many forms. Let's start with the most basic ones, the forms you will encounter in deep learning, between 0 and 5 dimensions. We can picture the various types of tensors like this (were you drawn in by the cat in the title? Don't worry, the cat will appear later!):
0-dimensional tensor / scalar. Each number in a tensor (the container bucket) is called a "scalar". A scalar is a single number. You may ask, why not just call it a number? I don't know; maybe mathematicians just like to sound cool? "Scalar" does sound cooler than "number".
In fact, you can have a tensor that holds a single number. We call it a 0-dimensional tensor, that is, a tensor with 0 axes. It's just a bucket with one number in it. Imagine a bucket containing only a single drop of water; that is a 0-dimensional tensor.
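As a minimal sketch of the bucket-with-one-drop idea, NumPy lets us make a 0-dimensional tensor directly (the value 42 here is just an arbitrary example):

```python
import numpy as np

# A 0-dimensional tensor: a single number in the "bucket"
x = np.array(42)
print(x.ndim)   # 0 -- no axes at all
print(x.shape)  # () -- an empty shape
```

Note that this is different from `np.array([42])`, which is a 1-dimensional tensor containing one element.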
In this tutorial I will use Python, Keras, TensorFlow, and the Python library NumPy. In Python, tensors are usually stored as NumPy arrays; NumPy is a scientific computing package used very heavily in most AI frameworks.
On Kaggle you'll often see Jupyter Notebooks (see the reading link at the end of the article, "Learn AI: build an economy trial version of AI") that turn data into NumPy arrays. A Jupyter notebook is essentially working code embedded in markup; think of it as explanation and program combined.
Why do we want to convert the data to Numpy arrays?
It's simple: we need to convert all input data, whether string text, images, stock prices, or video, into a uniform standard so it can be processed easily.
In this way, we convert the data into digital buckets, so we can process them with TensorFlow.
It simply organizes the data into a usable format. In a web application you might represent data as XML so you can define its features and operate on it quickly. Similarly, in deep learning we use tensor buckets as our basic Lego blocks.
1-dimensional tensor / vector. If you are a programmer, you already know something very similar to a 1-dimensional tensor: an array.
Every programming language has arrays; an array is just a set of data blocks in a single column or row. In deep learning this is called a 1-dimensional tensor. Tensors are defined by their total number of axes, and a 1-dimensional tensor has exactly one axis. A 1-dimensional tensor is called a vector. We can think of a vector as a single column or row of numbers.
In NumPy, you can check how many axes a tensor has with the ndim attribute. Let's try a one-dimensional tensor.
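A short sketch (the five values are arbitrary):

```python
import numpy as np

# A 1-dimensional tensor (vector): one axis holding five values
x = np.array([5, 10, 15, 30, 25])
print(x.ndim)   # 1
print(x.shape)  # (5,)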
2-dimensional tensor. You may already know another form of tensor: the matrix. A 2-dimensional tensor is called a matrix. No, not the Keanu Reeves movie "The Matrix"; imagine an Excel table instead. We can picture it as a grid of numbers with rows and columns, and the rows and columns are its two axes. A matrix is a 2-dimensional tensor, meaning it has two dimensions, that is, two axes.
In Numpy, we can say as follows:
X = np.array([[5, 10, 15, 30, 25],
              [20, 30, 65, 70, 90],
              [7, 80, 95, 20, 30]])
We can store human characteristics in a two-dimensional tensor. A typical example is a mailing list.
For example, suppose we have 10000 people, and we have the following features for each of them:
First Name, Last Name, Street Address, City, State, Country, Zip (Postal Code)
This means that we have seven characteristics of 10000 people.
A tensor has a "shape". The shape is the bucket: it defines the maximum size of the tensor along each axis, as determined by our data. We can put everyone's data into a 2-dimensional tensor of shape (10000, 7).
You might want to say it has 10000 rows and 7 columns. No, no. Tensors can be transposed and manipulated so that columns become rows and rows become columns.
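A quick sketch of that transposition, using zeros to stand in for the real mailing-list data:

```python
import numpy as np

# 10000 people x 7 features (zeros stand in for the encoded data)
x = np.zeros((10000, 7))
print(x.shape)    # (10000, 7)
print(x.T.shape)  # (7, 10000) -- transposing swaps rows and columns
```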
3D tensor
At this point tensors become really useful. We often need to store a series of 2-dimensional tensors in a bucket, and that forms a 3-dimensional tensor.
In NumPy, we can express as follows:
X = np.array([[[5, 10, 15, 30, 25],
               [20, 30, 65, 70, 90],
               [7, 80, 95, 20, 30]],
              [[3, 0, 5, 0, 45],
               [12, -2, 6, 7, 90],
               [18, -9, 95, 120, 30]],
              [[17, 13, 25, 30, 15],
               [23, 36, 9, 7, 80],
               [1, -7, -5, 22, 3]]])
As you've probably guessed, a 3-dimensional tensor has three axes. You can check it like this:
The output of x.ndim is: 3
Let's go back to the mailing list above. Suppose we now have 10 mailing lists. We store the 2-dimensional tensors in another bucket, creating a 3-dimensional tensor with the following shape:
(number_of_mailing_lists, number_of_people, number_of_characteristics_per_person)
(10, 10000, 7)
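The stacking step can be sketched like this, again with zeros standing in for the real data:

```python
import numpy as np

# Stack 10 mailing-list tensors (10000 people x 7 features each)
# into a single 3D tensor
lists_2d = [np.zeros((10000, 7)) for _ in range(10)]
x = np.stack(lists_2d)
print(x.shape)  # (10, 10000, 7)
print(x.ndim)   # 3
```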
You may have guessed it, but a three-dimensional tensor is a cube of numbers.
We can keep stacking cubes to create larger and larger tensors encoding different types of data: 4-dimensional tensors, 5-dimensional tensors, and so on, up to N dimensions, where N can be 5, or 10, or as large as you like.
In fact, a three-dimensional tensor is best seen as a cube with such things as length, width and height.
Common data stored in tensors
Here are some common dataset types stored in various types of tensors:
3D = time series
4D = images
5D = videos
What almost all of these tensors have in common is sample size: the number of elements in the collection, which might be images, videos, documents, or tweets.
Usually, real-world data comes as a collection of samples, so the sample size is almost always one of the dimensions.
Think of each dimension in the shape as a field; we want the minimum number of fields needed to describe the data.
So although a 4-dimensional tensor usually stores images, that is only because the sample size occupies the fourth field of the tensor.
For example, an image can be represented by three fields:
(width, height, color_depth) = 3D
However, in machine learning we rarely deal with a single picture or document; we deal with a collection. We might have 10000 pictures of tulips, which means we'll use a 4D tensor, like this:
(sample_size, width, height, color_depth) = 4D
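A small sketch of such an image collection. The sizes here are made up (100 images of 64 x 64 pixels rather than 10000 tulip photos) purely to keep the example light:

```python
import numpy as np

# A hypothetical image collection: 100 RGB images of 64 x 64 pixels,
# zeros standing in for real pixel values
images = np.zeros((100, 64, 64, 3), dtype=np.uint8)
print(images.shape)  # (100, 64, 64, 3)
print(images.ndim)   # 4
```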
Let's take a look at some examples of multi-dimensional tensor storage models:
Time series data
Simulating time series with a 3D tensor works very well!
Medical scans: we can encode an EEG signal as a 3D tensor, because it can be described by three parameters:
(time, frequency, channel)
If we have brainwave scans of multiple patients, it forms a 4D tensor:
(sample_size, time, frequency, channel)
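The EEG layout above can be sketched as follows. The specific sizes (20 patients, 60 time steps, 40 frequency bins, 8 channels) are made up for illustration:

```python
import numpy as np

# Hypothetical EEG collection: (sample_size, time, frequency, channel)
eeg = np.zeros((20, 60, 40, 8))
print(eeg.shape)  # (20, 60, 40, 8)
print(eeg.ndim)   # 4
```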
Stock Prices
In trading, each stock has a high, low, and final price for every minute, often drawn as a candlestick chart.
The New York Stock Exchange is open from 9:30 a.m. to 4:00 p.m., that is, 6.5 hours, for a total of 6.5 x 60 = 390 minutes. So we can store each minute's high, low, and final prices in a 2D tensor of shape (390, 3). If we track a week of trading (5 days), we get a 3D tensor:
(week_of_data, minutes, high_low_price)
That is: (5, 390, 3)
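The shapes above can be sketched with placeholder zeros:

```python
import numpy as np

# A week of per-minute prices: 5 days x 390 minutes x 3 prices (high, low, final)
week = np.zeros((5, 390, 3))
print(week.ndim)  # 3
# Tracking 10 stocks for a week stacks these into a 4D tensor
stocks = np.stack([week for _ in range(10)])
print(stocks.shape)  # (10, 5, 390, 3)
```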
By the same token, if we look at 10 different stocks for a week, we will get a 4D tensor.
(10, 5, 390, 3)
Suppose each such 4D tensor represents a mutual fund made up of 10 stocks, and we have a collection of 25 mutual funds. Then this collection can be represented by a 5D tensor:
(25, 10, 5, 390, 3)
Text data
We can also use 3D tensors to store text data. Let's take a look at the example of Twitter.
First, a tweet is limited to 140 characters. Second, Twitter uses the UTF-8 encoding standard, which can represent millions of characters, but in practice we only care about the first 128, since they coincide with ASCII. A single tweet can therefore be packed into a 2D tensor of shape:
(140, 128)
If we download 1 million of Trump's tweets (about a week's worth, at his rate), we can store them in a 3D tensor:
(number_of_tweets_captured, tweet, character)
This means that our collection of Trump tweets will look like this:
(1000000, 140, 128)
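One-hot encoding a single tweet into the (140, 128) layout described above can be sketched like this (the tweet text is just a placeholder):

```python
import numpy as np

# Encode each character of a tweet as a one-hot row over 128 ASCII codes
tweet = "what is a tensor"
x = np.zeros((140, 128))
for i, ch in enumerate(tweet):
    x[i, ord(ch)] = 1  # mark the ASCII code of character i
print(x.shape)  # (140, 128)
```

A million such tweets stacked together would then form the (1000000, 140, 128) tensor.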
Picture
A 4D tensor is very suitable for storing picture files such as JPEGs. As we mentioned earlier, a picture has three parameters: height, width, and color depth. A single picture is a 3D tensor; a picture set is 4D, with the fourth dimension being the sample size.
The famous MNIST data set is a handwritten digital sequence. As an image recognition problem, it has perplexed many data scientists for decades. Now, computers can solve this problem with 99% or more accuracy. Even so, this data set can still be used as an excellent benchmark for testing new machine learning algorithm applications or for doing experiments on your own.
Keras can even help us import MNIST datasets automatically with the following statements:
from keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
This data set is divided into two parts: the training set and the test set. Each picture in the dataset has a label. This label has the correct readings, such as 3, 7 or 9, which are judged and filled in manually.
The training set is used to train the neural network learning algorithm, and the test set is used to verify the learning algorithm.
MNIST images are black and white, which means they can be encoded in 2D tensors, but we are used to encoding all images in 3D tensors, and the extra third dimension represents the color depth of the image.
The MNIST dataset has 60000 images, all 28 x 28 pixels, with a color depth of 1, that is, only grayscale.
TensorFlow stores image data like this:
(sample_size, height, width, color_depth)
So we can assume that the 4D tensor of the MNIST dataset is as follows:
(60000, 28, 28, 1)
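The step from the raw (60000, 28, 28) images to that 4D layout can be sketched as follows. Zeros stand in for the real pixels here so the example runs without downloading the dataset:

```python
import numpy as np

# Placeholder MNIST images: 60000 grayscale pictures of 28 x 28 pixels
train_images = np.zeros((60000, 28, 28), dtype=np.uint8)
# Add the color-depth axis to get (sample_size, height, width, color_depth)
train_images = train_images.reshape((60000, 28, 28, 1))
print(train_images.shape)  # (60000, 28, 28, 1)
```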
Color picture
Color pictures have different color depths depending on their color encoding (note: this has nothing to do with resolution). A typical JPG image uses RGB encoding, so its color depth is 3, one channel each for red, green, and blue.
This is a picture of my beautiful cat (Dove) at 750 x 750 pixels, which means we can represent it in a 3D tensor:
(750, 750, 3)
My beautiful cat Dove (750x750 pixels)
In this way, my lovely Dove will be reduced to a cold string of numbers, as if it had deformed or flowed.
Then suppose we have a large number of pictures of different cats (though none as beautiful as Dove), say 100000 non-Dove pictures at 750 x 750 pixels. We can define the collection in Keras as a 4D tensor:
(100000, 750, 750, 3)
5D tensor
5D tensor can be used to store video data. In TensorFlow, video data will be encoded as follows:
(sample_size, frames, width, height, color_depth)
Suppose we have a 5-minute (300-second) 1080p HD (1920 x 1080 pixel) video at 15 frames per second (4500 frames in total) with a color depth of 3. We can store it in a 4D tensor:
(4500, 1920, 1080, 3)
When we have multiple videos, the fifth dimension of the tensor will be used. If we had 10 such videos, we would get a 5D tensor:
(10, 4500, 1920, 1080, 3)
In fact, this example is crazy!
The size of this tensor is ridiculous: more than 1 TB. Let's work through it to illustrate a point: in the real world, we sometimes need to shrink the sample data as much as possible to make processing and computation feasible, unless you have unlimited time.
The total number of values in this 5D tensor is:
10 x 4500 x 1920 x 1080 x 3 = 279,936,000,000
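The arithmetic checks out, and we can also estimate the storage cost directly (assuming 32-bit, i.e. 4-byte, values):

```python
# Total values in the (10, 4500, 1920, 1080, 3) tensor
values = 10 * 4500 * 1920 * 1080 * 3
print(values)  # 279936000000

# At 4 bytes per 32-bit value, that's roughly 1.12 TB
print(values * 4 / 1e12)  # ~1.12
```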
In Keras, we can use the dtype attribute to store values as 32-bit or 64-bit floating-point numbers.
Each value in our 5D tensor is stored as 32 bits. In bits, that is:
279,936,000,000 x 32 = 8,957,952,000,000 bits, or roughly 1.12 TB
And that is a conservative estimate; 32-bit storage might not even be precise enough (imagine what happens if you store everything in 64-bit). So reduce your sample size, my friend.
In fact, I chose this last crazy example for a reason: you've just had a lesson in data preprocessing and data reduction. You can't throw piles of raw data at your AI model without doing anything; you have to clean and reduce that data first so the downstream work is simpler and more efficient.
Reducing the resolution, removing unnecessary data (that is, deduplication), limiting the number of frames, and so on: this too is the data scientist's job. If you can't preprocess the data well, you can hardly do anything meaningful.
At this point, our study of "what is a tensor" is complete. I hope it has resolved your doubts. Pairing theory with practice is the best way to learn, so go and try it! If you want to keep learning, stay tuned to this site; the editor will keep working hard to bring you more practical articles!