2025-02-24 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains how to analyze very large, TB-level datasets in big data work. It is shared here for reference, and I hope you come away with a better understanding of the topic.
Data analysis often runs into the problem of data volume. With R or Python, for example, a job can run out of memory, and even using all of a single machine's memory does not help when the dataset is, say, 10 TB. At this scale we need not only to get a result out of the data at all, but also to train a model iteratively until it is good. Both are hard.
There are two problems:
1. Large data volume
2. Accuracy of model training
For the first problem, no matter how much memory a single machine has, it cannot keep up with unpredictable future data growth. This calls for distributed processing: divide and conquer, backed by parallel computing power.
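The divide-and-conquer idea can be illustrated on a single machine before any cluster is involved. The following is a minimal sketch, not a distributed implementation: it assumes the data arrives as an iterable of numbers, and the helper names (`iter_chunks`, `partial_stats`, `merge_stats`, `chunked_mean`) are hypothetical, chosen only for this example. Each chunk is reduced to a small summary, and the summaries are then merged.

```python
from functools import reduce
from itertools import islice


def iter_chunks(values, chunk_size):
    """Yield successive lists of at most chunk_size items from any iterable."""
    iterator = iter(values)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def partial_stats(chunk):
    """Map step: reduce one chunk to a small (count, total) summary."""
    return len(chunk), sum(chunk)


def merge_stats(a, b):
    """Reduce step: combine two (count, total) summaries into one."""
    return a[0] + b[0], a[1] + b[1]


def chunked_mean(values, chunk_size=1000):
    """Compute the mean while holding only one chunk in memory at a time."""
    # map() is lazy, so chunks are produced and summarized one by one.
    partials = map(partial_stats, iter_chunks(values, chunk_size))
    count, total = reduce(merge_stats, partials, (0, 0))
    if count == 0:
        raise ValueError("cannot compute the mean of an empty dataset")
    return total / count
```

Because each `partial_stats` call depends only on its own chunk, and `merge_stats` gives the same result however the summaries are grouped, the map step could in principle be handed to separate processes or machines. That is the property a distributed engine relies on.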
For the second problem, a good model usually requires a lot of training, and that training typically runs as large, complex iterations over the data, which is demanding on both CPU and memory (RAM). Here we need a good training tool to help.
Solution
PySpark
This is where PySpark, a distributed solution, comes in. Python has a rich ecosystem of third-party libraries for data analysis and machine learning, and writing Hadoop and Spark jobs in Python is common in industry. PySpark mainly addresses Python data analysis and model training in big data scenarios.
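As a rough illustration of what such a job might look like, here is a minimal PySpark sketch that computes a per-key mean. It is a sketch under stated assumptions, not the article's own code: the input is assumed to be text files with one `key,value` pair per line, and the names `parse_line`, `mean_by_key`, and the default application name are hypothetical. PySpark is imported inside the function so that the parsing helper can be checked without a Spark installation.

```python
def parse_line(line):
    """Parse one 'key,value' line into (key, float); return None if malformed."""
    parts = line.strip().split(",")
    if len(parts) != 2:
        return None
    try:
        return parts[0], float(parts[1])
    except ValueError:
        return None


def mean_by_key(input_path, app_name="tb-scale-sketch"):
    """Compute the mean value per key with Spark (assumes pyspark is installed)."""
    # Lazy import: only needed when the job actually runs on Spark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName(app_name).getOrCreate()
    try:
        # Spark splits the input into partitions and parses them in parallel.
        pairs = (
            spark.sparkContext.textFile(input_path)
            .map(parse_line)
            .filter(lambda kv: kv is not None)
        )
        # Carry (sum, count) per key so partial results can be merged.
        sums = pairs.mapValues(lambda v: (v, 1)).reduceByKey(
            lambda a, b: (a[0] + b[0], a[1] + b[1])
        )
        # Only the small per-key result is collected back to the driver.
        return sums.mapValues(lambda s: s[0] / s[1]).collectAsMap()
    finally:
        spark.stop()
```

A call such as `mean_by_key("hdfs:///path/to/data/*.csv")` (a placeholder path) would run the same code whether the input is a few megabytes or many terabytes. Only the cluster behind the `SparkSession` changes.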
That concludes this overview of how to handle TB-level data in big data analysis. I hope it has been of some help and that you have learned something from it.
© 2024 shulou.com SLNews company. All rights reserved.