Both MySQL and PostgreSQL communicate with clients by listening on an IP address/port or on a Unix domain socket.
This raises a question: how long is the listen queue (the backlog)?
MySQL handles this itself: the my.cnf option back_log sets the length of its listen queue.
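For reference, a minimal my.cnf sketch of that option (the value 512 is only an illustrative choice, not something taken from this test):
[code]
# my.cnf - illustrative value only
[mysqld]
# length of the pending-connection (listen) queue requested by mysqld;
# on Linux the kernel still caps it at net.core.somaxconn
back_log = 512
[/code]
The value in effect can be checked at runtime with SHOW VARIABLES LIKE 'back_log';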
For PostgreSQL, by contrast, there seems to be no setting that controls the length of the listen queue.
The problem surfaced while running a high-concurrency stress test with pgbench.
Command:
pgbench -n -r -c 250 -j 250 -T 2 -f update_smallrange.sql
Error message:
Connection to database "" failed:
could not connect to server: Resource temporarily unavailable
        Is the server running locally and accepting
        connections on Unix domain socket "/tmp/.s.PGSQL.5432"?
But the "Resource temporarily unavailable" message above does not, by itself, say which resource has run out.
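One hint is in the message text itself: "Resource temporarily unavailable" is the strerror() string for EAGAIN/EWOULDBLOCK. A quick way to confirm the mapping, assuming the errno(1) helper from moreutils is available:
[code]
# look up the error by symbolic name; prints roughly: EAGAIN 11 Resource temporarily unavailable
errno EAGAIN
[/code]
So the client is hitting EAGAIN on the connection attempt, which is exactly what the thread below ends up diagnosing.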
After investigation, I found the following link
http://www.postgresql.org/message-id/20130617141622.GH5875@alap2.anarazel.de
[code]
From: Andres Freund
To: pgsql-hackers(at)postgresql(dot)org
Subject: PQConnectPoll, connect(2), EWOULDBLOCK and somaxconn
Date: 2013-06-17 14:16:22
Message-ID: 20130617141622.GH5875@alap2.anarazel.de
Lists: pgsql-hackers

Hi,
When postgres on linux receives connections at a high rate, client connections sometimes error out with:
could not send data to server: Transport endpoint is not connected
could not send startup packet: Transport endpoint is not connected
To reproduce, start something like this on a server with sufficiently high max_connections:

pgbench -h /tmp -p 5440 -T 10 -c 400 -j 400 -n -f /tmp/simplequery.sql
Now that's strange since that error should happen at connect(2) time, not when sending the startup packet. Some investigation led me to fe-connect.c's PQconnectPoll:
if (connect(conn->sock, addr_cur->ai_addr,
            addr_cur->ai_addrlen) < 0)
{
    if (SOCK_ERRNO == EINPROGRESS ||
        SOCK_ERRNO == EWOULDBLOCK ||
        SOCK_ERRNO == EINTR ||
        SOCK_ERRNO == 0)
    {
        /*
         * This is fine - we're in non-blocking mode, and
         * the connection is in progress.  Tell caller to
         * wait for write-ready on socket.
         */
        conn->status = CONNECTION_STARTED;
        return PGRES_POLLING_WRITING;
    }
    /* otherwise, trouble */
}
So, we're accepting EWOULDBLOCK as a valid return value for connect(2). Which it isn't. EAGAIN, in contrast, is on some BSDs and on Linux. Unfortunately POSIX allows those two to share the same value...
My manpage tells me:
EAGAIN No more free local ports or insufficient entries in the routing cache.
       For AF_INET see the description of /proc/sys/net/ipv4/ip_local_port_range
       ip(7) for information on how to increase the number of local ports.
So, the problem is that we took a failed connection as having been initially successful but in progress.
Not accepting EWOULDBLOCK in the above if () results in:

could not connect to server: Resource temporarily unavailable
        Is the server running locally and accepting
        connections on Unix domain socket "/tmp/.s.PGSQL.5440"?

Which makes more sense.
Trivial patch attached.
Now, the question is why we cannot complete connections on unix sockets?
Some code reading shows that net/unix/af_unix.c:unix_stream_connect() contains:
if (unix_recvq_full(other)) {
        err = -EAGAIN;
        if (!timeo)
                goto out_unlock;
So, if we're in nonblocking mode - which we are - and the receive queue is full, we return EAGAIN. The receive queue for unix sockets is defined as
static inline int unix_recvq_full(struct sock const *sk)
{
        return skb_queue_len(&sk->sk_receive_queue) > sk->sk_max_ack_backlog;
}
where sk_max_ack_backlog is whatever has been passed to listen(backlog) on the listening side.
Question: But postgres does listen(fd, MaxBackends * 2), how can that be a problem?
Answer:

If the backlog argument is greater than the value in /proc/sys/net/core/somaxconn,
then it is silently truncated to that value; the default value in this file is 128.
In kernels before 2.4.25, this limit was a hard coded value, SOMAXCONN, with the
value 128.
Setting somaxconn to something higher indeed makes the problem go away. I'd guess that pretty much the same holds true for tcp connections, although I didn't verify that, which would explain some previous reports on the lists.
TLDR: Increase /proc/sys/net/core/somaxconn
Greetings
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[/code]
It turns out that the listen backlog on the PG server (capped by the kernel parameter somaxconn) was too small. The default value of somaxconn is 128; after raising it and restarting PG, the test runs cleanly.
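To see the cap on a given machine, the current ceiling can be read directly from the kernel (a quick check; both commands refer to the same setting):
[code]
# ceiling applied to every listen() backlog on this host (default 128 on older kernels)
cat /proc/sys/net/core/somaxconn
# same value via sysctl
sysctl net.core.somaxconn
[/code]
If it still reports 128, whatever backlog PG requests, e.g. listen(fd, MaxBackends * 2), is silently truncated to 128.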
/proc/sys/net/core/somaxconn
This file defines a ceiling value for the backlog argument of listen(2); see the listen(2) manual page for details.
So the solution is clear:
echo 256 > /proc/sys/net/core/somaxconn
Then restart PG and the stress test runs without errors.
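Note that writing to /proc only lasts until the next reboot. To keep the setting across reboots, the usual approach is a sysctl configuration entry (256 simply mirrors the value used above; size it to the workload):
[code]
# /etc/sysctl.conf
net.core.somaxconn = 256
[/code]
Apply it with sysctl -p, then restart PostgreSQL so that its listen() call is issued under the new ceiling.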