How to solve the problem of JVM stack memory overflow in yarn-cluster mode

Updated 2025-02-23. From: SLTechnology News&Howtos (Servers)

Shulou (Shulou.com), 05/31

Many newcomers are unsure how to deal with JVM memory overflows in yarn-cluster mode. This article explains where the problems come from and how to fix them.

In a company, the machine you submit jobs from is usually a virtual machine, so yarn-client mode suffers from a surge in network traffic. There is a second problem as well: a job may run fine in yarn-client mode but fail in yarn-cluster mode, reporting a JVM permanent generation (PermGen) overflow.

The submission flow in yarn-cluster mode is as follows.

The spark-submit script submits the Spark application to the ResourceManager.

The submission asks the ResourceManager to start an ApplicationMaster.

The ResourceManager notifies a NodeManager to start the ApplicationMaster (the Driver process).

The ApplicationMaster asks the ResourceManager for containers in which to run Executors.

The ResourceManager allocates the containers. Each container represents the resources (memory + CPU) of one Executor you can start. The ResourceManager returns the addresses of the NodeManagers that host the containers.

The ApplicationMaster asks those NodeManagers to start an Executor in each container.

Each Executor process in turn registers with the Driver.

Once the Driver has received the Executor resources, it can execute our Spark code.

When an action is executed, a job is triggered.

The DAGScheduler divides the job into stages.

Each stage becomes a set of tasks, which the TaskScheduler schedules.

The Driver dispatches the tasks to the Executors for execution.

At this point the ApplicationMaster (Driver) knows which resources (Executors) it has available. It then runs the job: it splits the job into stages, submits the tasks of each stage, schedules them, and assigns them to the Executors.

To summarize the differences between yarn-client and yarn-cluster mode:

In yarn-client mode, the driver runs on the local machine.

In yarn-cluster mode, the driver runs on a NodeManager node in the YARN cluster.

In yarn-client mode the local machine is responsible for scheduling the Spark job, so its NIC traffic surges.

yarn-cluster mode does not have this problem.

In yarn-client mode the driver runs locally, and the local machine is generally not in the same server room as the YARN cluster, so performance may be poor.

In yarn-cluster mode the driver runs in the same server room as the rest of the YARN cluster, so performance is better.
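As a minimal sketch of how the mode is chosen at submission time, the Python helper below assembles a spark-submit argument list. `--master yarn`, `--deploy-mode` and `--class` are standard spark-submit options; the helper name, the class name and the jar path are hypothetical placeholders, not something from the original job.

```python
import shlex

def build_submit_args(deploy_mode, main_class, app_jar):
    """Return the spark-submit argv for a YARN submission in the given deploy mode."""
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    return [
        "spark-submit",
        "--master", "yarn",
        # client: the driver runs on the submitting machine
        # cluster: the driver runs inside the ApplicationMaster on a NodeManager
        "--deploy-mode", deploy_mode,
        "--class", main_class,
        app_jar,
    ]

# Hypothetical application class and jar path, for illustration only.
args = build_submit_args("cluster", "com.example.MyApp", "/path/to/my-app.jar")
print(shlex.join(args))
```

Switching the first argument to "client" is the only change needed to move the driver back to the local machine.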

Practical experience: problems encountered in yarn-cluster mode.

Sometimes a Spark job that uses Spark SQL runs normally in yarn-client mode but cannot be submitted and run in yarn-cluster mode; it reports an out-of-memory (OOM) error in the JVM's PermGen (permanent generation). PermGen is the JVM area that holds class-level data, such as class metadata and the constants defined in classes.

In yarn-client mode the driver runs on the local machine, and its JVM takes its PermGen setting from the local spark-class file (configured by default in the Spark client). That file sets the permanent generation to 128 MB, which is enough. In yarn-cluster mode, however, the driver runs on a node of the YARN cluster and uses the JVM's unconfigured default PermGen size, 82 MB.

Internally, Spark SQL performs complex SQL semantic analysis, syntax tree transformations and so on. If your SQL is itself particularly complex, this work can consume a lot of CPU time and memory, and the PermGen footprint can become relatively large.

So if the job needs more than 82 MB of permanent generation but less than 128 MB, you get exactly the problem described above: yarn-client mode, with its 128 MB, works, while yarn-cluster mode, with its 82 MB default, fails and writes a PermGen out-of-memory error to the log.
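The comparison can be written out as a small Python sketch. The two limits are the figures quoted in the text; the function name and the sample demands are illustrative assumptions, not measurements.

```python
CLIENT_PERMGEN_MB = 128   # limit quoted in the text for yarn-client
CLUSTER_PERMGEN_MB = 82   # limit quoted in the text for yarn-cluster

def permgen_outcome(demand_mb):
    """Report, per mode, whether a given PermGen demand fits under the limit."""
    return {
        "yarn-client": demand_mb <= CLIENT_PERMGEN_MB,
        "yarn-cluster": demand_mb <= CLUSTER_PERMGEN_MB,
    }

# Illustrative demands in MB: fits both, fits only yarn-client, fits neither.
for demand in (60, 100, 150):
    print(demand, permgen_outcome(demand))
```

Only the middle case, a demand between the two limits, produces the "works in yarn-client, fails in yarn-cluster" symptom.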

How to solve this problem?

Since the error is a PermGen overflow, the JVM simply does not have enough permanent generation memory, so give the driver more PermGen in yarn-cluster mode. (PermGen exists only in Java 7 and earlier; Java 8 replaced it with Metaspace.)

In the spark-submit script, add the following configuration:

--conf spark.driver.extraJavaOptions="-XX:PermSize=128M -XX:MaxPermSize=256M"

This sets the driver's permanent generation to an initial size of 128 MB and a maximum of 256 MB. With this in place, your Spark job should no longer hit the PermGen overflow described above in yarn-cluster mode.
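If the submit command is assembled by a script, the option can be built programmatically. The following is a minimal Python sketch under that assumption; the helper name `permgen_conf` is hypothetical, while `spark.driver.extraJavaOptions`, `-XX:PermSize` and `-XX:MaxPermSize` are the names used above.

```python
def permgen_conf(initial_mb=128, max_mb=256):
    """Build the --conf argument pair that sizes the driver's PermGen."""
    if initial_mb > max_mb:
        raise ValueError("initial size must not exceed the maximum")
    # The two JVM flags are separated by a space inside a single conf value.
    jvm_opts = f"-XX:PermSize={initial_mb}M -XX:MaxPermSize={max_mb}M"
    return ["--conf", f"spark.driver.extraJavaOptions={jvm_opts}"]

print(permgen_conf())
```

Returning the pair as list elements means the space between the two flags needs no shell quoting when the list is passed to something like `subprocess.run`.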

One more point: there is another problem to watch for with Spark SQL.

It concerns SQL with a large number of or clauses, for example where keywords='' or keywords='' or keywords=''. When a statement contains hundreds of or clauses, the driver may fail with a JVM stack overflow.

A JVM stack overflow (StackOverflowError) basically means that method calls are nested too deeply, usually because of very deep recursion that exceeds the JVM's stack depth limit. Our guess is that when Spark SQL parses a statement, for example when converting it to a syntax tree or generating an execution plan, its internal source code handles or recursively. If there are many or clauses, there is a lot of recursion.
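To illustrate why a long or chain leads to deep recursion, here is a toy Python model. It is not Spark SQL's actual parser; the `Or` class and both functions are assumptions made for the sketch. The chain is folded into a left-deep binary tree, and a recursive walk over it needs one nested call per or level.

```python
class Or:
    """Binary OR node: left and right are either Or nodes or leaf strings."""
    def __init__(self, left, right):
        self.left = left
        self.right = right

def build_or_chain(predicates):
    """Fold predicates left to right: ((p0 OR p1) OR p2) OR ..."""
    tree = predicates[0]
    for p in predicates[1:]:
        tree = Or(tree, p)
    return tree

def depth(node):
    """Recursive walk; each nested Or costs one more stack frame."""
    if not isinstance(node, Or):
        return 1
    return 1 + max(depth(node.left), depth(node.right))

for n in (3, 100, 300):
    preds = [f"keywords = 'k{i}'" for i in range(n)]
    print(n, depth(build_or_chain(preds)))
```

In this model the recursion depth grows linearly with the number of predicates, which is consistent with the guess above that hundreds of or clauses can exhaust the stack.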

In such cases, avoid writing Spark SQL statements this complex. The alternative is to split one SQL statement into several, each with fewer than 100 or clauses, and execute them one at a time. In our production testing, a single statement with fewer than 100 or clauses normally does not trigger the stack overflow.
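The split can be sketched in Python as follows. The function names, the table name `search_log` and the sample values are hypothetical, and the cap of 99 predicates per statement is simply one way to stay under the 100-clause guideline above.

```python
def quote(value):
    """Wrap a value in single quotes, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"

def split_or_query(table, column, values, max_predicates=99):
    """Yield SQL statements, each OR-ing at most max_predicates equality tests."""
    for start in range(0, len(values), max_predicates):
        chunk = values[start:start + max_predicates]
        tests = [f"{column} = {quote(v)}" for v in chunk]
        yield f"SELECT * FROM {table} WHERE " + " OR ".join(tests)

# 250 hypothetical keywords are split into statements of 99, 99 and 52 predicates.
keywords = [f"kw{i}" for i in range(250)]
statements = list(split_or_query("search_log", "keywords", keywords))
print(len(statements))
```

Each statement is then executed separately, and the caller is responsible for combining the results.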
