CDH: unable to create new native thread

2025-04-11 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/02

The problem

The CDH-4.7.1 NameNode is down.

Starting the NameNode fails with the error below: a new native thread cannot be created. The number of threads in use may have exceeded a limit such as max user processes.

2018-08-26 08 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50070
2018-08-26 08 INFO org.mortbay.log: jetty-6.1.26.cloudera.4
2018-08-26 08 WARN org.apache.hadoop.security.authentication.server.AuthenticationFilter: 'signature.secret' configuration not set, using a random value as secret
2018-08-26 08 INFO org.mortbay.log: Started SelectChannelConnector@alish2-dataservice-01.mypna.cn:50070
2018-08-26 08 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at: alish2-dataservice-01.mypna.cn:50070
2018-08-26 08 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2018-08-26 08 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8020: starting
2018-08-26 08 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2018-08-26 08 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8022: starting
2018-08-26 08 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:714)
        at org.apache.hadoop.ipc.Server.start(Server.java:2057)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.start(NameNodeRpcServer.java:303)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:497)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:459)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:621)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:606)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1177)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1241)
2018-08-26 08 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1

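The Cloudera Manager agent log is shown below. It records that the NameNode process exited unexpectedly, followed by a failure to collect DNS names. DNS was checked and found to be working, so this log offers little to go on.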

# cat /var/log/cloudera-scm-agent/cloudera-scm-agent.log
[26/Aug/2018 07:30:23 +0000] 4589 MainThread agent INFO PID '19586' associated with process '1724-hdfs-NAMENODE' with payload 'processname:1724-hdfs-NAMENODE groupname:1724-hdfs-NAMENODE from_state:RUNNING expected:0 pid:19586' exited unexpectedly
[26/Aug/2018 07:45:06 +0000] 4589 Monitor-HostMonitor throttling_logger ERROR (29 skipped) Failed to collect java-based DNS names
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/src/cmf/monitor/host/dns_names.py", line 53, in collect
    result, stdout, stderr = self._subprocess_with_timeout(args, self._poll_timeout)
  File "/usr/lib64/cmf/agent/src/cmf/monitor/host/dns_names.py", line 42, in _subprocess_with_timeout
    return subprocess_with_timeout(args, timeout)
  File "/usr/lib64/cmf/agent/src/cmf/monitor/host/subprocess_timeout.py", line 40, in subprocess_with_timeout
    close_fds=True)
  File "/usr/lib64/python2.6/subprocess.py", line 642, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.6/subprocess.py", line 1234, in _execute_child
    child_exception = pickle.loads(data)
OSError: [Errno 2] No such file or directory

Troubleshooting

max user processes is set to 65535 here, which is already very large. Under normal conditions this limit would not be reached.

# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 127452
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65535
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 65535
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

The system currently has only a little over 100 processes in total, so the next step is to check how many threads each process has.

# ps -ef | wc -l

169

This server mainly runs Java processes, so focus on the thread count of each Java process. Process 30315 alone has about 32110 threads. Together with the threads of the other processes, the system-wide total is about 32600 (see the Tasks line of the top -H output further down). That is well short of the max user processes value of 65535, but it is right up against the default kernel.pid_max of 32768, and every thread consumes an ID from that range. The NameNode therefore cannot obtain additional threads and reports the error.

# pgrep java

1680

5482

19662

28770

30315

35902

# for i in `pgrep java`; do ps -T -p $i | wc -l; done

15

49

30

53

32110

114

# ps -T -p 30315 | wc -l

32110
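The per-process loop above can also be written with the NLWP (number of lightweight processes, i.e. threads) column of ps, which avoids counting the header line that `ps -T | wc -l` includes. The following is a minimal sketch, assuming a procps-style ps as found on most Linux distributions; the function names `total_threads` and `top_thread_users` are hypothetical and not part of the original article.

```shell
#!/bin/sh
# Sum a column of thread counts read from stdin (one number per line).
total_threads() {
    awk '{ sum += $1 } END { print sum + 0 }'
}

# Print "threads pid command" for the processes with the most threads.
# Assumes a procps-style ps (Linux); the trailing "=" suppresses headers.
# The first argument is the number of rows to show (default 5).
top_thread_users() {
    ps -eo nlwp=,pid=,comm= | sort -rn | head -n "${1:-5}"
}

# Example (run manually on the affected host):
#   top_thread_users 5
#   ps -eo nlwp= | total_threads
```

Sorting by thread count puts a runaway process such as 30315 at the top of the list without having to know in advance that it is a Java process.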

Alternatively, check with the top -H command:

# top -H

top - 10:44:58 up 779 days, 19:34, 3 users, load average: 0.01, 0.05, 0.05

Tasks: 32621 total, 1 running, 32620 sleeping, 0 stopped, 0 zombie

Cpu(s): 2.8%us, 4.1%sy, 0.0%ni, 93.1%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Mem: 16334284k total, 15879392k used, 454892k free, 381132k buffers

Swap: 4194296k total, 0k used, 4194296k free, 8304400k cached
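To see how close the Tasks total is to the kernel-wide limits, the thread count can be compared with kernel.pid_max and kernel.threads-max. This is a minimal sketch, assuming a Linux host with /proc mounted and a procps-style ps; `limit_usage` and `report_thread_limits` are hypothetical helper names.

```shell
#!/bin/sh
# Print usage of a limit as "name: used/limit (pct%)".
# Integer arithmetic only, so the percentage is rounded down.
limit_usage() {
    name=$1
    used=$2
    limit=$3
    pct=$(( used * 100 / limit ))
    echo "$name: $used/$limit (${pct}%)"
}

# Compare the system-wide thread count with the kernel limits (Linux only).
report_thread_limits() {
    total=$(ps -eo nlwp= | awk '{ s += $1 } END { print s + 0 }')
    limit_usage pid_max     "$total" "$(cat /proc/sys/kernel/pid_max)"
    limit_usage threads-max "$total" "$(cat /proc/sys/kernel/threads-max)"
}

# Example with the figures from this article (32621 tasks, pid_max 32768):
#   limit_usage pid_max 32621 32768
```

With the numbers shown in the top output, the pid_max range is almost completely used, while the max user processes value of 65535 is not.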

Solution

With the cause identified, raise the kernel limits on threads and PIDs and set max user processes to 100000. The NameNode then starts successfully.

# echo "100000" > /proc/sys/kernel/threads-max

# echo "100000" > /proc/sys/kernel/pid_max        (default 32768)

# echo "200000" > /proc/sys/vm/max_map_count      (default 65530)
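Values written directly under /proc/sys take effect immediately but are lost at the next reboot. One way to keep them is a sysctl configuration file. The sketch below prints the equivalent settings; the function name `render_sysctl_conf` and the target file name are hypothetical choices, and on older distributions without /etc/sysctl.d the lines would go into /etc/sysctl.conf instead.

```shell
#!/bin/sh
# Print sysctl settings equivalent to the three echo commands above.
render_sysctl_conf() {
    cat <<'EOF'
kernel.threads-max = 100000
kernel.pid_max = 100000
vm.max_map_count = 200000
EOF
}

# Example (run as root; the file name is a hypothetical choice):
#   render_sysctl_conf > /etc/sysctl.d/90-hadoop-threads.conf
#   sysctl -p /etc/sysctl.d/90-hadoop-threads.conf
```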

# vim /etc/security/limits.d/90-nproc.conf

* soft nproc unlimited

root soft nproc unlimited

# vim /etc/security/limits.conf

* soft nofile 65535

* hard nofile 65535

* hard nproc 100000

* soft nproc 100000

# ulimit -u

100000
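`ulimit -u` reports the limit of the current shell only. A process that was started before the change, or that was launched by a service manager rather than from a login shell, may still run with different limits. On Linux the limits actually in effect for a process can be read from /proc/<pid>/limits. The following is a minimal sketch; `max_processes_of` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the soft and hard "Max processes" limits from a limits file
# in the format of /proc/<pid>/limits.
max_processes_of() {
    awk '/^Max processes/ { print $3, $4 }' "$1"
}

# Example (Linux; substitute the PID of the NameNode process):
#   max_processes_of /proc/"$(pgrep -f NameNode | head -n 1)"/limits
```

If the values printed for the NameNode do not match the new settings, the process has to be restarted from an environment that picks up the updated limits.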
