2025-01-14 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
This article explains the difference and the relationship between big data and massive data. The editor finds the topic practical and shares it here in the hope that readers will take something away from it.
"Big data" includes the meaning of "massive data" but goes beyond it in content. In short, "big data" is "massive data" plus data of complex types. Big data covers all data sets, including transactional and interactive data sets, whose size or complexity exceeds the ability of common technologies to capture, manage and process them at a reasonable cost and within a reasonable time frame.
If the problem were only a huge amount of structured data, the solution would be relatively simple: users could buy more storage devices and improve their efficiency. However, once the data turns out to fall into three types (structured, unstructured and semi-structured), the problem is no longer that simple.
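As a rough illustration of the three data types named above, the following Python sketch loads one tiny sample of each. The sample records, field names and variable names are invented for illustration and do not come from the article.

```python
import csv
import io
import json

# Structured: fixed columns, so it maps directly onto a relational table.
structured = io.StringIO("order_id,amount\n1001,25.50\n1002,13.00\n")
rows = list(csv.DictReader(structured))

# Semi-structured: self-describing, but the fields may vary from record to record.
semi_structured = json.loads('{"user": "alice", "tags": ["sale", "mobile"]}')

# Unstructured: free text with no schema; any meaning has to be extracted.
unstructured = "Loved the new phone, but delivery took too long!"
word_count = len(unstructured.split())
```

The structured rows can be queried by column name straight away, while the free text needs some extra processing step (here just a word count) before it yields anything comparable.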
Big data arrives in a surge
When the volume of complex data types surges, the impact on the user's IT systems has to be handled in a different way. Drawing on market research data, many industry experts and third-party research institutions have concluded that the era of big data is coming. One survey found that 85% of this complex data is unstructured data, which is widespread in social networks, the Internet of Things, e-commerce and so on. The emergence of this unstructured data usually accompanies the emergence and application of new channels and technologies such as social networks, mobile computing and sensors.
There is still a lot of hype and uncertainty around the concept of big data. For this reason, the editor asked several industry experts in detail about what big data is and is not, and how to deal with it, and presents their answers to readers as a series of articles.
Some people also call multi-TB data sets "big data". According to market research firm IDC, data usage is expected to increase 44-fold, with global data usage reaching about 35.2 ZB (1 ZB = 1 billion TB). The size of individual data sets will also increase, so more processing power will be needed to analyze and understand them.
EMC has said that more than 1,000 of its customers store more than 1 PB (petabyte) of data in its arrays, and that this number will grow to 100,000 by 2020. Within a year or two, some customers will start using thousands of times more data: 1 EB (1 exabyte = 1 billion GB) or more.
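The unit relationships quoted above can be checked with a few lines of Python. This is a minimal sketch that assumes decimal (SI) units, where each step is a factor of 1,000; the `UNITS` table and the `convert` helper are illustrative names, not part of any library.

```python
# Decimal (SI) storage units expressed in bytes.
UNITS = {"GB": 10**9, "TB": 10**12, "PB": 10**15, "EB": 10**18, "ZB": 10**21}

def convert(value, from_unit, to_unit):
    """Convert a quantity between decimal storage units."""
    return value * (UNITS[from_unit] / UNITS[to_unit])

zb_in_tb = convert(1, "ZB", "TB")        # 1 ZB expressed in TB
eb_in_gb = convert(1, "EB", "GB")        # 1 EB expressed in GB
eb_in_pb = convert(1, "EB", "PB")        # 1 EB expressed in PB
forecast_tb = convert(35.2, "ZB", "TB")  # the 35.2 ZB figure in TB
```

Under these assumptions, 1 ZB and 1 EB each come out to one billion of the smaller unit, 1 EB is a thousand PB, and the 35.2 ZB figure corresponds to roughly 35.2 billion TB.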
For large companies, the rise of big data is partly due to computing power becoming available at lower cost and to systems that can now multitask. In addition, the cost of memory is plummeting, so enterprises can process more data in memory than ever before, and it is becoming ever easier to aggregate computers into server clusters. IDC believes that the combination of these three factors gave birth to big data. IDC also says that for a technology to count as a big data technology, it must first be affordable and, second, meet two of the three "V" criteria described by IBM: variety, volume and velocity.
The difference between big data and massive data
Variety means that the data should include both structured and unstructured data.
Volume means that the amount of data aggregated for analysis must be very large.
Velocity means that the data must be processed very quickly.
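The IDC rule described above (affordable, plus at least two of the three "V" criteria) can be written as a simple predicate. This is only a sketch of the rule as stated in this article; the function name and the boolean inputs are assumptions, and deciding whether a given workload really satisfies each criterion is left to the reader.

```python
def qualifies_as_big_data(affordable, variety, volume, velocity):
    """Return True if a technology is affordable and meets at least two of the three Vs."""
    criteria_met = sum([variety, volume, velocity])  # True counts as 1
    return affordable and criteria_met >= 2

# A few hundred GB that is varied and must be processed quickly: two Vs, no volume.
fast_mixed_data = qualifies_as_big_data(True, variety=True, volume=False, velocity=True)

# A very large but uniform, slowly processed archive: only one V.
large_archive = qualifies_as_big_data(True, variety=False, volume=True, velocity=False)

# Meets all three Vs but is not affordable.
too_expensive = qualifies_as_big_data(False, variety=True, volume=True, velocity=True)
```

Under this reading, volume alone is neither necessary nor sufficient: a workload can qualify without large volume, and a large archive on its own does not.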
"Big data" does not always mean hundreds of TB. Depending on actual use, a few hundred GB of data can sometimes be called big data; this depends mainly on the third dimension, that is, velocity or the time dimension.
According to Gartner, the volume of global information is growing at an annual rate of more than 59 percent. Volume is a significant challenge in managing data and business, but IT leaders must focus on the volume, variety and velocity of information.
Volume: the growth in the amount of data within enterprise systems is driven by transaction volume, other traditional data types and new data types. Too much data is a storage problem, but it is also a problem for large-scale analysis.
Variety: IT leaders have long had trouble translating large amounts of transactional information into decisions, and there are now more types of information to analyze, mainly from social media and mobile (context-aware) sources. Types include tabular data (databases), hierarchical data, files, e-mail, measurement data, video, still images, audio, stock market data, financial transactions and more.
Velocity: this involves data streams, the creation of structured records, and the availability of access and delivery. Velocity means both how fast data is generated and how quickly it must be processed to meet demand.
Although big data is a major issue, Gartner analysts say the real challenge is making big data more meaningful: finding patterns in it that help organizations make better business decisions.
Experts debate how to define "big data"
Although the English term "Big Data" can be translated as either "big data" or "massive data", the two are not the same.
Definition 1: big data = massive data + complex type data
Dan Bin, chief product consultant of Informatica China, believes that "big data" includes the meaning of "massive data" but goes beyond it in content. In short, "big data" is "massive data" plus data of complex types.
Dan Bin further points out that big data covers all data sets, including transactional and interactive data sets, whose size or complexity exceeds the ability of commonly used technologies to capture, manage and process them at a reasonable cost and within a reasonable time frame.
Big data is the convergence of three major technology trends:
Massive transaction data: traditional relational data, together with unstructured and semi-structured information, continues to grow in online transaction processing (OLTP) and analysis systems, from ERP applications to data warehouse applications. The situation becomes more complex as enterprises move more data and business processes to public and private clouds.
Massive interaction data: this new force is made up of social media data from Facebook, Twitter, LinkedIn and other sources. It includes call detail records (CDR), device and sensor information, GPS and geolocation mapping data, massive image files transferred through Managed File Transfer protocols, Web text and clickstream data, scientific information, e-mail and so on.
Massive data processing: the emergence of big data has spawned architectures designed for data-intensive processing, such as the open source Apache Hadoop, which runs on clusters of commodity hardware. The challenge for enterprises is to access data from Hadoop quickly, reliably and cost-effectively.
Definition 2: big data includes three elements A, B and C.
How should big data be understood? Chen Wen, general manager of NetApp Greater China, believes that big data means achieving breakthroughs by accessing information faster and thereby doing things differently. Big data is defined as a large amount of data (usually unstructured) that requires us to rethink how data is stored, managed and recovered. So how big is "big"? One way to think about it is that the data is so large that none of the tools we use today can handle it, so the key is how to digest the data and transform it into valuable insights and information.
Based on the workload requirements learned from customers, big data as understood by NetApp includes three elements A, B and C: Analytics, Bandwidth and Content.
1. Big Analytics, to help gain insights: the requirement to analyze large data sets in real time, which can lead to new business models, better customer service and better results.
2. Big Bandwidth, to help you move faster: the requirement to process critical data at extremely high speed. It supports fast and efficient ingestion and processing of large data sets.
3. Big Content, without losing any information: highly scalable data storage that is extremely secure and easy to recover. It supports manageable repositories of information content, not just data kept for a long time, and can span different continents.
Big data is a breakthrough economic and technological force that introduces a new infrastructure for IT support. Big data solutions remove the limitations of traditional computing and storage. With the help of growing private and public data, an epoch-making new business model is emerging, which is expected to bring big data customers substantial new sources of revenue growth and competitive advantage.
The above covers the difference and the connection between big data and massive data. The editor believes that some of these points may come up in daily work and hopes that readers can learn more from this article. For more details, please follow the industry information channel.
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
© 2024 shulou.com SLNews company. All rights reserved.