What is big data's concept?

2025-04-04 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

This article gives a detailed introduction to the concept of big data. The content is laid out step by step, with the details handled carefully; I hope it helps resolve your doubts as we work through it together.

With the arrival of the era of big data, "big data" has become a buzzword in the Internet and information technology industry. On the question of what big data is, the industry has largely converged on the "4V" characterization. The four Vs are big data's four defining characteristics: large volume of data (Volume), wide variety of data types (Variety), fast processing speed (Velocity), and low value density (Value).

1. Large volume of data (Volume)

If words and graphics printed on paper are also regarded as data, then the first data explosion in human history occurred with the invention of papermaking and printing. In the roughly two and a half decades from 1986 to 2010, the amount of data generated worldwide increased 100-fold.

As time goes on, data is being generated ever faster; we are living in an era of "data explosion".

Today, about 30% of the world's devices are connected to the Internet, and in the near future more users will come online, while cars, televisions, household appliances, production machinery, and other devices will also be fully networked. With the spread of the Internet of Things, sensors and cameras of every kind will appear in every corner of our work and life, and these devices automatically generate large amounts of data around the clock.

According to estimates by the well-known consulting firm IDC (International Data Corporation), the data generated by human society has been growing at a rate of about 50% a year, that is, doubling roughly every two years; this is sometimes called big data's "Moore's Law".

This means that the amount of data generated in the last two years equals the sum of all data generated before that. It was estimated that by 2020 the world would hold a total of 35 ZB of data, a nearly 30-fold increase over 2010.
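The two growth figures cited above are consistent with each other, which a quick back-of-the-envelope check makes clear (illustrative arithmetic only, using the article's own numbers):

```python
# 50% annual growth is approximately "doubling every two years",
# since 1.5 squared is 2.25 -- slightly more than 2.
annual_growth = 1.5  # 50% per year, per the IDC estimate cited above
two_year_factor = annual_growth ** 2
print(two_year_factor)  # 2.25

# Doubling every two years over the decade 2010-2020 means five doublings,
# which matches the "nearly 30-fold" increase the article mentions.
decade_factor = 2 ** 5
print(decade_factor)  # 32
```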

Unit conversion relations:

Byte: 1 Byte = 8 bit
KB (Kilobyte): 1 KB = 1024 Byte
MB (Megabyte): 1 MB = 1024 KB
GB (Gigabyte): 1 GB = 1024 MB
TB (Terabyte): 1 TB = 1024 GB
PB (Petabyte): 1 PB = 1024 TB
EB (Exabyte): 1 EB = 1024 PB
ZB (Zettabyte): 1 ZB = 1024 EB
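The conversion chain above is just repeated multiplication by 1024, so it can be captured in a few lines (a minimal sketch; the helper name `to_bytes` is my own, not from the article):

```python
# Binary byte-unit conversions from the table above: each unit
# is 1024 times the previous one.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]

def to_bytes(value, unit):
    """Convert a value in the given unit to bytes (1 KB = 1024 B, etc.)."""
    return value * 1024 ** UNITS.index(unit)

print(to_bytes(1, "KB"))   # 1024
print(to_bytes(35, "ZB"))  # the projected 2020 global data volume, in bytes
```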

2. Various data types (Variety)

Big data comes from many sources: scientific research, enterprise applications, and Web applications are all constantly generating new data. Biological, traffic, medical, telecom, electric-power, and financial big data have all shown explosive growth, with the volumes involved jumping from the TB level to the PB level.

Big data also spans a wide range of data types, both structured and unstructured. Structured data, which accounts for roughly 10%, mainly refers to data stored in relational databases. Unstructured data, roughly 90%, mainly includes email, audio, video, WeChat and Weibo messages, location information, link information, mobile call records, web logs, and so on.
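The practical difference between the two categories can be sketched with hypothetical examples (the record fields and log line below are my own illustrations, not from the article):

```python
# Structured: a fixed schema, fits naturally into a relational table.
structured_record = {"user_id": 42, "name": "Alice", "signup_date": "2020-01-15"}

# Unstructured: no fixed schema -- raw text, audio, video, logs, etc.
unstructured_blob = "2020-01-15 10:32:01 GET /index.html 200"

# Structured data can be queried directly by named field; unstructured
# data must first be parsed or analyzed to extract fields of interest.
print(structured_record["signup_date"])  # direct field access
print(unstructured_blob.split()[0])      # crude parse of a log line
```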

Such a wide variety of heterogeneous data poses new challenges to data processing and analysis technology, as well as new opportunities.

3. Fast processing speed (Velocity)

Data is generated very quickly in the era of big data. Among Web 2.0 applications, in one minute Sina Weibo can generate 20,000 posts, Twitter 100,000 tweets, the Apple App Store 47,000 app downloads, Taobao 60,000 item sales, Renren 300,000 visits, Baidu 900,000 search queries, and Facebook 6 million page views. The Large Hadron Collider (LHC) produces about 600 million particle collisions per second, generating roughly 700 MB of data per second, which thousands of computers analyze.

Many applications in the big data era need real-time analysis of rapidly generated data to guide production and daily life, so data processing and analysis usually must respond within seconds. This is fundamentally different from traditional data mining, which usually does not require real-time results.

To analyze massive data quickly, emerging big data analysis technologies usually rely on cluster processing and purpose-built internal designs. Take Google's Dremel as an example: it is a scalable, interactive real-time query system for analyzing read-only nested data. By combining a multi-level tree execution model with a columnar data layout, it can complete aggregate queries over trillion-row tables in seconds. The system scales to thousands of CPUs, serving tens of thousands of Google users operating on PB-scale data, with queries over PB-level data completing in 2 to 3 seconds.
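The intuition behind the columnar layout mentioned above can be shown with a toy sketch (this illustrates the general idea, not Dremel itself; the data is invented): an aggregate over one column only has to touch that column's contiguous values, not every field of every row.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"user": "a", "bytes": 120},
    {"user": "b", "bytes": 340},
    {"user": "a", "bytes": 75},
]

# Column-oriented layout: one array per field.
columns = {
    "user": ["a", "b", "a"],
    "bytes": [120, 340, 75],
}

# The same aggregate query, two ways:
total_from_rows = sum(r["bytes"] for r in rows)  # must walk whole records
total_from_cols = sum(columns["bytes"])          # walks one compact array
print(total_from_rows, total_from_cols)          # 535 535
```

At scale, scanning one compact column instead of entire records is a large part of why such systems can aggregate over huge tables in seconds.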

4. Low value density (Value)

Although big data looks attractive, its value density is far lower than that of the data already stored in traditional relational databases. In the era of big data, valuable information is scattered thinly through huge volumes of data. Take community surveillance video as an example: if nothing happens, the continuously generated footage has no value, and when something unexpected such as a theft occurs, only the short clip recording the event is valuable. Yet to obtain that one valuable clip, we must invest heavily in cameras, network equipment, and storage, consuming considerable power and storage space to preserve the camera's continuous footage.
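The surveillance example can be made concrete with some hypothetical numbers (the one-month duration and two-minute clip are my own illustration, not figures from the article):

```python
# A camera records continuously for a month, but only a 2-minute clip
# of an incident turns out to matter. "Value density" here is simply
# the valuable fraction of everything recorded.
recorded_minutes = 30 * 24 * 60  # one month of continuous footage
valuable_minutes = 2             # the clip capturing the theft

value_density = valuable_minutes / recorded_minutes
print(f"{value_density:.6f}")    # a vanishingly small fraction
```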

If this example is not typical enough, imagine a larger scenario. Suppose an e-commerce site wants to carry out targeted marketing based on Weibo data. To achieve this, it must build a big data platform that can store and analyze Sina Weibo data and forecast demand trends for targeted goods from users' posts. The vision is attractive, but the reality is costly: building the team and platform may cost millions of yuan, while the resulting increase in sales profit may fall far short of the investment. This is what it means for big data's value density to be low.

That concludes this introduction to the concept of big data. To truly master these points, you still need to practice and apply them yourself. If you want to read more related articles, you are welcome to follow the industry information channel.
