2025-03-30 Update. From: SLTechnology News&Howtos (shulou)
Shulou (Shulou.com) 05/31 Report
Many newcomers are unsure how to carry out big data analysis with mdrill. This article introduces the project, its main characteristics, and the growth of its data volume at Alibaba, as a starting point for working with it.
Project profile
mdrill is data analysis software open-sourced by Alibaba (Ali). On TB-level data it can respond within seconds using only 10 machines, data can be imported in real time, and any dimensions can be combined and filtered.
As online analytical processing software, mdrill can analyze tens of billions of records across any combination of dimensions in a few seconds to tens of seconds.
At Alibaba, 10 machines store 3 billion new records per day, of which 1 billion are imported in real time and 2 billion are imported offline. At present the cluster stores more than 100 billion records in total, each with 80 to 400 dimensions.
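To put these figures in perspective, the following back-of-the-envelope sketch works out the load per machine. The row counts and machine count come from the paragraph above; the even split across machines and the helper names `per_machine` and `rows_per_second` are assumptions made for illustration, not something the mdrill project states.

```python
# Figures quoted in the text above.
MACHINES = 10
DAILY_ROWS = 3_000_000_000      # total rows stored per day
REALTIME_ROWS = 1_000_000_000   # imported in real time
OFFLINE_ROWS = 2_000_000_000    # imported offline
SECONDS_PER_DAY = 24 * 60 * 60


def per_machine(rows: int, machines: int = MACHINES) -> float:
    """Rows each machine handles, ASSUMING an even split across machines."""
    return rows / machines


def rows_per_second(rows: int, seconds: int = SECONDS_PER_DAY) -> float:
    """Average arrival rate if the rows were spread evenly over the period."""
    return rows / seconds


if __name__ == "__main__":
    print(f"rows per machine per day: {per_machine(DAILY_ROWS):,.0f}")
    print(f"average real-time rows/s (cluster): {rows_per_second(REALTIME_ROWS):,.0f}")
    print(f"average total rows/s (cluster): {rows_per_second(DAILY_ROWS):,.0f}")
```

Under the even-split assumption each machine takes in about 300 million rows per day. Real traffic is not uniform, so these averages are only a lower bound on what the cluster must sustain at peak.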
Characteristics of mdrill
1. Meets big data query needs: the adhoc project takes in 3 billion records per day, and the data keeps growing over time. mdrill uses column storage, indexing, distributed processing, and suitable partitioning to support real-time online analysis.
2. Supports incremental updates: offline data in mdrill can be updated incrementally, partition by partition.
3. Supports real-time data import: with only 10 machines, it supports real-time import of about 1 billion records per day (peaking at 200 million per hour).
4. Fast response time: column storage, inverted indexes, efficient data compression, in-memory computing, multiple caches, partitioning, and distributed processing let mdrill analyze tens of billions of records in a few seconds to tens of seconds.
5. Low cost: Alibaba's adhoc project currently uses only 10 PCs with 48 GB of memory, yet it stores more than 100 billion records.
6. Full-text search mode: powerful condition settings in any combination, with results previewed in seconds no matter how simple or complex the query; 16 billion records per day can be filtered freely.
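The list above names column storage, inverted indexes, and partitions as the reasons mdrill answers filter-and-group queries quickly. The toy sketch below shows how those three ideas fit together in memory. It is not mdrill's implementation or API: the classes `ColumnPartition` and `Table`, the column names, and the date-style partition keys are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


class ColumnPartition:
    """One partition (for example, one day) stored column by column."""

    def __init__(self, column_names):
        self.columns = {name: [] for name in column_names}
        # Inverted index: column -> value -> set of row ids holding that value.
        self.index = {name: defaultdict(set) for name in column_names}
        self.size = 0

    def append(self, row):
        row_id = self.size
        for name in self.columns:
            value = row[name]
            self.columns[name].append(value)
            self.index[name][value].add(row_id)
        self.size += 1

    def filter(self, conditions):
        """Return ids of rows matching every `column == value` condition."""
        ids = None
        for name, value in conditions.items():
            matched = self.index[name].get(value, set())
            ids = set(matched) if ids is None else ids & matched
            if not ids:
                return set()
        return set(range(self.size)) if ids is None else ids

    def count_by(self, group_column, conditions):
        """Count matching rows per value of `group_column`."""
        values = self.columns[group_column]  # only this column is read
        return Counter(values[i] for i in self.filter(conditions))


class Table:
    """Rows grouped into partitions so a query can skip irrelevant ones."""

    def __init__(self, column_names):
        self.column_names = list(column_names)
        self.partitions = {}

    def insert(self, partition_key, row):
        if partition_key not in self.partitions:
            self.partitions[partition_key] = ColumnPartition(self.column_names)
        self.partitions[partition_key].append(row)

    def count_by(self, group_column, conditions, partition_keys):
        total = Counter()
        for key in partition_keys:  # partition pruning: visit only these
            part = self.partitions.get(key)
            if part is not None:
                total.update(part.count_by(group_column, conditions))
        return total


def demo_table():
    """Build a tiny table with made-up rows for the usage example."""
    table = Table(["city", "device", "ad_id"])
    table.insert("20131101", {"city": "hangzhou", "device": "mobile", "ad_id": "a1"})
    table.insert("20131101", {"city": "hangzhou", "device": "pc", "ad_id": "a1"})
    table.insert("20131101", {"city": "beijing", "device": "mobile", "ad_id": "a2"})
    table.insert("20131102", {"city": "hangzhou", "device": "mobile", "ad_id": "a2"})
    table.insert("20131102", {"city": "beijing", "device": "mobile", "ad_id": "a1"})
    return table


if __name__ == "__main__":
    table = demo_table()
    days = ["20131101", "20131102"]
    print(table.count_by("ad_id", {"city": "hangzhou", "device": "mobile"}, days))
```

The filter step touches only the inverted indexes of the columns named in the conditions, and the grouping step reads a single column, which is why adding more dimensions to a table does not slow down queries that ignore them. A production system would add compression, caching, and distribution across machines on top of this.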
The growth of mdrill data
Time point | Amount of data | Event
December 2012 | Less than 200 million | Adhoc goes online for the first time
January 2013 | 2 to 3 billion | Capacity expanded from 2 machines to 10
May 2, 2013 | 10 billion | Exceeds ten billion for the first time
July 24, 2013 | 40 billion | Open-sourced for the first time
November 2013 | 100 billion | Full-text retrieval mode (ods_allpv_ad_d) launched
December 2013 | 150 billion | Real-time data and wireless data connected
February 2014 | 320 billion | Currently only 11 machines, with hard disk utilization at 30%