2025-04-13 Update From: SLTechnology News&Howtos shulou
Shulou(Shulou.com)05/31 Report--
This article shares a few ways to optimize Hadoop. Many people are not familiar with these settings, so they are collected here for reference.
I was testing some programs with Hadoop, but the input data was small and processing was very slow. Each file was only a few KB, and there were hundreds of files. About 20 MB of data took more than 40 minutes to run, and even with 4 machines it took 14 minutes, which is a complete waste of Hadoop. It turned out that each machine was running only two map tasks at a time, which made the map phase very long. After some reading, I found two settings that I can use for now.
1. Set the number of map tasks:
Add the mapred.tasktracker.map.tasks.maximum property to the mapred-site.xml file in the conf folder.
The format is as follows:
    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>10</value>
    </property>
I set it to only 10 here. According to what I read online, a node can run 30 to 300 maps, so set the value according to your own situation.
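To see why the number of map slots matters so much with many small files, here is a rough back-of-the-envelope sketch. It assumes each small file becomes one map task (the usual behaviour of the default file input formats) and that every slot is busy in every round. The function name map_waves and the figure of 300 files are illustrative assumptions, not measurements from the test above.

```python
import math


def map_waves(num_files: int, nodes: int, slots_per_node: int) -> int:
    """Rounds of map tasks needed if every small file is one map task."""
    total_slots = nodes * slots_per_node
    return math.ceil(num_files / total_slots)


# Hypothetical workload: 300 small files on a single node.
print(map_waves(300, nodes=1, slots_per_node=2))   # 150 rounds with 2 slots
print(map_waves(300, nodes=1, slots_per_node=10))  # 30 rounds with 10 slots
```

With only a few KB per file, most of each round is task startup rather than real work, so fewer rounds translates fairly directly into a shorter job.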
2. Set JVM reuse
The mapred.job.reuse.jvm.num.tasks property defaults to 1, which means that a JVM runs at most 1 task (belonging to the same job) before it exits. In other words, every task starts its own JVM.
Starting a new JVM for every task is obviously a waste of time, so you can raise the reuse count. Here I set it to -1, which means no limit: a JVM keeps being reused for tasks of the same job until the job finishes.
It is set the same way as the number of map tasks, also in mapred-site.xml, as follows:
    <property>
      <name>mapred.job.reuse.jvm.num.tasks</name>
      <value>-1</value>
    </property>
You do not have to format HDFS again; you can run the job directly. Now, even in pseudo-distributed mode, the program takes only 11 minutes and 30 seconds to run.
I will keep looking for more reasonable approaches. The data can also be compressed after the program runs, but I have not yet figured out how to use that method.
Reference website:
http://blog.csdn.net/wangqiaoshi/article/details/18142885
http://jerrylead.iteye.com/blog/1195335
3. Fixing the "Java heap space" error:
This error means that the JVM does not have enough heap memory. There are two ways to solve it:
① Modify the hadoop script in the hadoop/bin folder:
Find the line "export HADOOP_OPTS=" and change it as follows:
    export HADOOP_OPTS="-Xmx4096m"   # set the JVM heap to 4 GB, or whatever size you need
② (Recommended) Alternatively, modify the mapred-site.xml file:
Add the following:
    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx4096m</value>
    </property>
This limit applies to each task JVM separately, so make sure the node has enough memory for all of its task slots.
Save, exit, and restart Hadoop.
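The mapred-site.xml settings from sections 1 to 3 all use the same name/value property layout inside a configuration element. As a minimal sketch, the script below builds that layout with Python's standard library. The helper name build_configuration is hypothetical, and the values are simply the ones used in this article, not recommendations.

```python
import xml.etree.ElementTree as ET


def build_configuration(settings: dict) -> str:
    """Render Hadoop-style <property> entries inside a <configuration> root."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    # ET.indent only exists on Python 3.9+, so pretty-print when available.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Values taken from sections 1 to 3 of this article.
mapred_settings = {
    "mapred.tasktracker.map.tasks.maximum": 10,
    "mapred.job.reuse.jvm.num.tasks": -1,
    "mapred.child.java.opts": "-Xmx4096m",
}
print(build_configuration(mapred_settings))
```

The printed text can be compared against conf/mapred-site.xml by eye; the sketch deliberately does not write to the file, so an existing configuration is never overwritten by accident.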
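4. Modify the block size of HDFS:
The default is 64 MB, but 128 MB is commonly used, and in some special cases a smaller block is wanted. In short, you will often need to change the block size, as follows:
Modify hdfs-site.xml in the conf folder by adding the following:
    <property>
      <name>dfs.block.size</name>
      <value>(the size you want, in bytes)</value>
    </property>
Save and exit. The new block size applies to files written after the change; files already in HDFS keep their existing block size. Formatting HDFS is not required for this setting, and it erases all data stored in HDFS, so only do it on a test cluster that you want to start from scratch.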
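Because dfs.block.size is given in bytes, it is easy to mistype the value. A small sketch of the conversion, where the helper name mb_to_bytes is just for illustration:

```python
def mb_to_bytes(megabytes: int) -> int:
    """Convert a size in MB (1 MB = 1024 * 1024 bytes) to bytes."""
    return megabytes * 1024 * 1024


print(mb_to_bytes(64))   # 67108864, the 64 MB default
print(mb_to_bytes(128))  # 134217728, the commonly used 128 MB
```

So a 128 MB block size would be written as 134217728 in the value element.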