Basic concepts of big data

2025-02-24 Update From: SLTechnology News&Howtos (shulou)

The concept of big data is probably familiar to everyone; it has been one of the hottest topics in recent years. Now that computers and the Internet are so widespread, each of us generates a great deal of data online every day, for example when browsing products on Taobao or chatting in instant messaging and social apps. The daily rise and fall of the stock market and its trading volume are data too. The amount of data generated on the Internet every day is enormous, and data is everywhere.

However, sheer volume is only one characteristic of big data. Big data has four characteristics, known as the 4Vs:

In 2001, analyst Doug Laney (then at META Group, which Gartner later acquired) pointed out in a talk related to his 2001 research that data growth brings challenges and opportunities in three dimensions: Volume, the amount of data; Velocity, the speed at which data flows in and out; and Variety, the diversity of data types and sources.

Building on Laney's model, IBM put forward the 4V characteristics of big data, which have been widely recognized by the industry (IBM's own formulation names Veracity as the fourth V; the variant described here uses Value). First, Volume: the amount of data is huge, jumping from the TB level to the PB level. Second, Variety: there are many data types, including not only traditional structured data but also web logs, videos, pictures, geographic location information from the Internet, and so on. Third, Velocity: data must be processed quickly; if processing is not fast enough, it cannot serve scenarios where data is updated in real time. Fourth, Value: the pursuit of high-quality, valuable data.

The 4V characteristics of big data:

Volume: since it is called big data, the amount of data must be large.

Variety: data comes in many structures and can be structured, semi-structured or unstructured.

Value: valuable information must be mined out of these large amounts of data, because data with no value is just a pile of garbage.

Velocity: data must be processed quickly and in a timely manner, because many scenarios require updating and checking data in real time.
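To make the Variety point concrete, here is a minimal Python sketch that brings one structured record (a CSV row), one semi-structured record (JSON) and one unstructured record (free text) into a common shape. The sample values, field names and the `normalize_*` helpers are hypothetical and chosen only for illustration; real systems would use far richer parsing.

```python
import csv
import io
import json

# Hypothetical sample inputs, one per kind of data.
STRUCTURED_CSV = "user_id,item,price\n42,keyboard,199.0\n"
SEMI_STRUCTURED_JSON = '{"user_id": 42, "event": "click", "extra": {"page": "/cart"}}'
UNSTRUCTURED_TEXT = "Great keyboard, arrived in two days!"


def normalize_structured(text):
    """Structured data: fixed columns, so every row parses the same way."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return [{"kind": "structured", "fields": row} for row in rows]


def normalize_semi_structured(text):
    """Semi-structured data: self-describing, but keys may vary per record."""
    return [{"kind": "semi-structured", "fields": json.loads(text)}]


def normalize_unstructured(text):
    """Unstructured data: no schema; keep the raw text and derive a simple feature."""
    fields = {"text": text, "word_count": len(text.split())}
    return [{"kind": "unstructured", "fields": fields}]


if __name__ == "__main__":
    records = (
        normalize_structured(STRUCTURED_CSV)
        + normalize_semi_structured(SEMI_STRUCTURED_JSON)
        + normalize_unstructured(UNSTRUCTURED_TEXT)
    )
    for record in records:
        print(record["kind"], record["fields"])
```

Each kind of data needs its own parsing step before it can be analyzed together with the others, which is why diverse data is harder to handle than data with a single fixed schema.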

To learn more about the 4V characteristics of big data, see the following articles:

http://www.mahaixiang.cn/sjfx/803.html

https://www.jianshu.com/p/b3281082edb3

https://www.leiphone.com/news/201410/NgTsZw3yDjEbk9on.html

The problem big data aims to solve

Big data is about mining value from data. If data brings no value to the business and no better experience to users, it is useless. Extracting value from data is the problem big data aims to solve, much like panning for gold or mining ore: big data technology is used to pick the useful data out of massive datasets and discard the rest.

The challenges brought by big data

The technologies involved in big data:

1. Data acquisition:

We need to collect and centralize the scattered data before we can analyze the data.

2. Data storage:

Once a large amount of data has been collected, storing it becomes a problem: the storage space must be large enough.

3. Data processing / analysis / mining:

Once the storage problem is solved, we can start processing the data, analyzing it and mining the valuable parts.

4. Visualization:

Finally, the mined data has to be presented to others visually, in charts and graphics; you cannot expect your manager to read through a pile of numbers or strings.
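The four stages above can be sketched end to end on toy data. This is only an illustrative Python sketch: the page-view logs, the function names and the use of an in-memory list as the "store" are all assumptions, standing in for real collection tools, distributed storage and charting libraries.

```python
from collections import Counter

# Hypothetical page-view logs scattered across two sources.
SOURCE_A = ["home", "cart", "home"]
SOURCE_B = ["home", "search", "cart"]


def acquire(*sources):
    """1. Data acquisition: gather scattered records into one collection."""
    collected = []
    for source in sources:
        collected.extend(source)
    return collected


def store(records, storage):
    """2. Data storage: append records to a store (a plain list stands in for a real one)."""
    storage.extend(records)
    return storage


def analyze(storage):
    """3. Processing / analysis: count views per page."""
    return Counter(storage)


def visualize(counts):
    """4. Visualization: render counts as a text bar chart, largest first."""
    lines = []
    for page, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        lines.append(f"{page:<8}{'#' * n} {n}")
    return "\n".join(lines)


if __name__ == "__main__":
    storage = store(acquire(SOURCE_A, SOURCE_B), [])
    print(visualize(analyze(storage)))
```

Even at this scale the order matters: analysis can only start once the scattered records have been collected and stored in one place, and the chart is what a non-technical reader actually looks at.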

Challenges big data poses to technical architecture:

1. Challenges to existing database management technologies:

It is not realistic to store massive data in a traditional relational database. Although databases can be clustered, they generally struggle with analysis of data beyond the TB level, so structured queries and processing alone cannot solve these problems at this stage.

2. Traditional database technologies do not take multiple categories of data into account:

A relational database organizes data as database > table > field, whereas big data is characterized by diverse data types, which makes it difficult to store in that model.

3. Real-time technical challenges:

The value of data decreases over time, so processing and presenting data in real time is a challenge.

4. Network architecture, data center, operation and maintenance challenges:

Because data keeps growing substantially and has to be presented in real time, network transmission is put under pressure. Moreover, the volume of data is so large that it must be stored across multiple servers, which brings challenges for the data center and for operations and maintenance.
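A rough back-of-envelope calculation shows why data at the PB level has to be spread across many servers. The Python sketch below assumes a sequential read rate of 200 MB/s per disk, decimal units (1 TB = 10^12 bytes), and perfectly even parallelism with no network or coordination overhead; all of these are illustrative assumptions, not measurements.

```python
TB = 10**12  # bytes, decimal units (assumption)
PB = 10**15  # bytes, decimal units (assumption)

# Assumed sequential read rate of one disk; real hardware varies widely.
ASSUMED_READ_BYTES_PER_SEC = 200 * 10**6


def scan_seconds(size_bytes, disks=1, throughput=ASSUMED_READ_BYTES_PER_SEC):
    """Time to read size_bytes once when the data is split evenly across `disks`."""
    return size_bytes / (disks * throughput)


if __name__ == "__main__":
    print(f"1 TB on 1 disk:     {scan_seconds(TB) / 3600:.1f} hours")
    print(f"1 PB on 1 disk:     {scan_seconds(PB) / 86400:.1f} days")
    print(f"1 PB on 1000 disks: {scan_seconds(PB, disks=1000) / 3600:.1f} hours")
```

Under these assumptions a single full read of 1 PB takes weeks on one disk but about as long as 1 TB on one disk when spread over a thousand disks, which is the motivation for distributing both storage and computation.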

Other challenges brought by big data:

1. Data privacy:

A huge amount of data will inevitably contain some users' private data, and we have to ensure that this data is not leaked.

2. Data sources are complex and diverse:

As mentioned earlier, one of the characteristics of big data is the diversity of data. How to handle such diverse data is a problem.

How to deal with the challenges brought by big data

For the challenges mentioned above, Google already has the technology to address them:

MapReduce addresses the problem of computing efficiency.

Bigtable addresses the problem of read and write speed.

GFS addresses the problem of storage capacity.

However, Google only published papers on these technologies and did not open-source them, so we cannot use them directly. Fortunately, the Hadoop ecosystem, developed under the Apache Software Foundation, was modeled on Google's big data technologies, and Hadoop is an essential framework for learning big data technology.

Hadoop has its own MapReduce; HBase corresponds to Bigtable, and HDFS corresponds to GFS.
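The MapReduce programming model can be illustrated with the classic word count. The Python sketch below runs in a single process and only mimics the three phases (map, shuffle, reduce); it does not use Hadoop's API, and the function names are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values for each key into a single result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run map, shuffle and reduce over a list of documents."""
    pairs = []
    # On a cluster, each document (input split) would be mapped on a different node.
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    print(word_count(["big data", "Big value"]))
```

Because each map call looks at only one document and each reduce call at only one key, both phases can be spread across many machines, which is how the model improves computing efficiency on large datasets.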

How to learn big data well

1. The best way to learn a framework is to check its official website, because the documentation on the official website is the most authoritative and detailed.

2. Consolidate and integrate what you learn through hands-on projects.

3. Take part in community activities such as meetups, open source community conferences and offline salons; talking with others helps broaden your horizons.

4. Remember: get hands-on, practice often and keep at it.

5. It is best to learn English well, because many good technical papers and articles are in English, and official documentation is usually in English as well.
