2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article shares how to use jvm-profiler to monitor Spark memory usage. It is quite practical, and I hope you get something out of it.
jvm-profiler
Generally speaking, there are two ways to monitor Spark memory:
Through the Spark ListenerBus, you can access the executor's internal memory usage, although the information available this way is limited. After https://github.com/apache/spark/pull/21221 was merged, the usage of each logical region of executor memory can be collected.
Through Spark Metrics, JVM information is sent to a specified sink; users can also implement a custom Sink, for example one that sends metrics to Kafka or Redis.
Uber recently open-sourced jvm-profiler, which collects information from distributed JVM applications and can be used to debug CPU, memory, and I/O usage or to time method calls. For example, it can help tune Spark JVM memory sizes, monitor HDFS NameNode RPC latency, and analyze data lineage.
It is easy to apply to a Spark job. In the setup below, JVM information is collected every 5 seconds and sent to the Kafka topic profiler_CpuAndMemory:
hdfs dfs -put jvm-profiler-0.0.9.jar hdfs://hdfs_url/lib/jvm-profiler-0.0.9.jar

Then attach the agent to the executors through spark-submit options:

--conf spark.jars=hdfs://hdfs_url/lib/jvm-profiler-0.0.9.jar
--conf spark.executor.extraJavaOptions=-javaagent:jvm-profiler-0.0.9.jar=reporter=com.uber.profiling.reporters.KafkaOutputReporter,metricInterval=5000,brokerList=brokerhost:9092,topicPrefix=profiler_
The messages are then consumed from Kafka and stored in HDFS for analysis.
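Each record published to profiler_CpuAndMemory is a JSON document. The sketch below parses one sample and computes heap utilization; the field names follow jvm-profiler's CpuAndMemory reporter, but the exact schema should be verified against the version you deploy, and the values here are made up:

```python
import json

# Hypothetical profiler_CpuAndMemory message (illustrative values; check the
# actual schema emitted by your jvm-profiler version).
raw = json.dumps({
    "epochMillis": 1584057600000,
    "appId": "application_1234_0001",
    "processUuid": "uuid-1",
    "heapMemoryMax": 4 * 1024**3,        # configured -Xmx, in bytes
    "heapMemoryTotalUsed": 3 * 1024**3,  # heap currently in use, in bytes
})

def heap_utilization(message: str) -> float:
    """Return the used/max heap ratio for one CpuAndMemory sample."""
    m = json.loads(message)
    return m["heapMemoryTotalUsed"] / m["heapMemoryMax"]

print(round(heap_utilization(raw), 2))  # 0.75
```

Aggregating the maximum of this ratio per application over time gives the peak utilization figures used in the analysis below.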
Analysis
Hive table structure
Analyzing tasks with user-defined memory

For scheduled tasks with user-defined memory, 75% of tasks have memory utilization below 80%, so they can be optimized.

For development tasks with user-defined memory, 45% of tasks use less than 20% of their allocated memory, which reflects poor sizing habits.
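The analysis above boils down to comparing each task's peak usage with its configured memory. A minimal sketch, with illustrative task names and numbers and an assumed 80% threshold:

```python
# Peak heap used vs. configured executor memory, in GB (illustrative data,
# not taken from the article's dataset).
tasks = {
    "etl_daily":  {"configured": 8.0,  "peak_used": 2.5},
    "report_job": {"configured": 16.0, "peak_used": 14.0},
    "adhoc_dev":  {"configured": 10.0, "peak_used": 1.5},
}

def utilization(t: dict) -> float:
    """Peak utilization: highest observed usage over the configured limit."""
    return t["peak_used"] / t["configured"]

# Tasks whose peak utilization stays below 80% are candidates for downsizing.
candidates = sorted(name for name, t in tasks.items() if utilization(t) < 0.8)
print(candidates)  # ['adhoc_dev', 'etl_daily']
```

Running the same classification over the collected metrics yields the percentages quoted above.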
Summary
By collecting each JVM's peak memory usage alongside its configured limit, the following problems can be addressed:
Memory over-allocation
Monitoring application memory usage trends, to prevent out-of-memory failures as data volume grows
Unreasonable default memory settings for Spark executors
Projecting memory reductions based on actual application usage
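One way to turn the collected peaks into new settings is to add a safety headroom over the observed maximum and round up to a whole GB. The 20% headroom and 1 GB floor below are illustrative policy choices, not values prescribed by jvm-profiler:

```python
import math

def recommend_memory_gb(peak_used_gb: float, headroom: float = 0.2,
                        minimum_gb: int = 1) -> int:
    """Suggest an executor memory setting from observed peak usage.

    Adds a safety headroom on top of the observed peak and rounds up to a
    whole GB, never going below a configured minimum.
    """
    return max(minimum_gb, math.ceil(peak_used_gb * (1 + headroom)))

print(recommend_memory_gb(2.5))   # 3
print(recommend_memory_gb(14.0))  # 17
```

A task that peaked at 2.5 GB but was configured with 8 GB would be resized to 3 GB, reclaiming 5 GB; summed across a fleet, this is where the savings below come from.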
Reducing the executor default memory by 10% freed an average of 60 GB of memory per task.
Raising the utilization of scheduled tasks with user-defined memory to 70% freed an average of 450 GB per task.
Raising the utilization of development tasks with user-defined memory to 70% freed an average of 550 GB per task.