No live upstreams while connecting to upstream

2025-07-19 Update. From: SLTechnology News&Howtos (Shulou.com)

Nginx, acting as a reverse proxy in front of Tomcat, reported an error.

Check the nginx error log:

2017/06/07 15:11:32 [error] 29011#0: *376979 no live upstreams while connecting to upstream, client: 11.12.13.14, server: ccc, request: "GET /sdlcrd HTTP/1.1", upstream: "http://sdlcrdBackend/sdlcrd", host: "test-reg.ckl.com:88", referrer: "http://test-reg.ckl.com:88/sdsso/mvc/uipBaseController/showApplication?userCode=1492133480995"

2017/06/07 15:11:32 [error] 29012#0: send() failed ()

2017/06/07 15:11:36 [error] 29011#0: *376979 no live upstreams while connecting to upstream, client: 11.12.13.14, server: ccc, request: "GET /sdlcrd HTTP/1.1", upstream: "http://sdlcrdBackend/sdlcrd", host: "test-reg.ckl.com:88", referrer: "http://test-reg.ckl.com:88/sdsso/mvc/uipBaseController/showApplication?userCode=1492133480995"

2017/06/07 15:11:37 [error] 29012#0: send() failed (111: Connection refused)

View the nginx upstream configuration:

upstream sdlcrdBackend {
    server 192.168.1.74 weight=1 max_fails=2 fail_timeout=30s;
    sticky name=com.ckl.sdlcrd.UAT.route domain=test-reg.ckl.com;
    check interval=5000 rise=2 fall=3 timeout=1000 type=http;
    check_http_send "HEAD / HTTP/1.0\r\n\r\n";
    check_http_expect_alive http_2xx http_3xx;
}

Description:

max_fails=number

# Sets the number of failed attempts to communicate with the server that must occur within the period defined by fail_timeout for nginx to consider the server unavailable. The server is then not tried again for the next fail_timeout period. The default is 1. Setting max_fails to 0 disables the counting of attempts, so the server is always considered available. What counts as a failed attempt is defined by the proxy_next_upstream, fastcgi_next_upstream, and memcached_next_upstream directives. An http_404 response is not counted as a failed attempt.

fail_timeout=time

# Sets both the period within which the specified number of failed attempts must occur for the server to be considered unavailable, and the period for which the server is then considered unavailable. The default is 10 seconds.
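As an illustration of how these two parameters combine with proxy_next_upstream, here is a minimal sketch. It is not the author's configuration: the second server address (192.168.1.75) is hypothetical, and the listen port and server name are assumed from the log line above. The fragment belongs inside the http context.

```nginx
# Sketch only; 192.168.1.75 is a hypothetical second node.
upstream sdlcrdBackend {
    # 2 failed attempts within 30s mark the server unavailable;
    # it is then skipped for the next 30s.
    server 192.168.1.74 max_fails=2 fail_timeout=30s;
    server 192.168.1.75 max_fails=2 fail_timeout=30s;
}

server {
    listen 88;          # assumed from "host: test-reg.ckl.com:88" in the log
    server_name ccc;    # assumed from "server: ccc" in the log

    location /sdlcrd {
        proxy_pass http://sdlcrdBackend;
        # Conditions that count as a failed attempt and pass the request
        # to the next server. "error timeout" is the default; 502 and 503
        # only count because they are listed here.
        proxy_next_upstream error timeout http_502 http_503;
    }
}
```

With values like these, a node that refuses connections is tried at most twice in any 30-second window before nginx stops sending it traffic for the following 30 seconds.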

In practice, if the backend application restarts quickly (nginx itself, for example), the health checking built into nginx is sufficient. Be aware, however, that when a backend node is unhealthy, the load balancer still forwards the request to that node first and only then passes it to another node, which wastes one forwarding attempt.

However, if the backend application takes a long time to restart, it can drag down the entire load balancer. Because nginx cannot accurately judge the node's health during that window, requests pile up waiting on the unresponsive node, the proxy appears hung, and eventually no node behind the load balancer can respond to requests normally. Our business applications are written in Java, and the backend consists mainly of an nginx cluster and a Tomcat cluster. Every deployment requires a Tomcat restart, and some applications take a long time to initialize, which produces exactly this behavior, so relying only on the built-in checks is not recommended in this setup.

The max_fails parameter of the server directive in ngx_http_upstream_module can also conflict with the proxy_next_upstream directive in ngx_http_proxy_module. For example, setting max_fails to 0 disables failure counting for the backend server, which also makes the fail_timeout parameter ineffective. In that case, unhealthy nodes can still be detected by tuning the proxy_connect_timeout and proxy_read_timeout directives in ngx_http_proxy_module, so that a request that times out is passed on to a healthy node.
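The timeout-based approach described above might look like the following sketch. The timeout values and the second server address (192.168.1.75) are hypothetical choices for illustration and need tuning for the real application. The upstream block goes in the http context and the location block inside a server block.

```nginx
# Sketch only; values and the second node are assumptions.
upstream sdlcrdBackend {
    # max_fails=0: failures are not counted, so fail_timeout has no effect.
    server 192.168.1.74 max_fails=0;
    server 192.168.1.75 max_fails=0;   # hypothetical second node
}

location /sdlcrd {
    proxy_pass http://sdlcrdBackend;
    # Give up quickly on a node that does not accept the connection.
    proxy_connect_timeout 3s;
    # Give up on a node that accepts the connection but does not answer.
    proxy_read_timeout 30s;
    # A connection error or a timeout passes the request to the next server.
    proxy_next_upstream error timeout;
    # Limit the total number of servers tried for one request.
    proxy_next_upstream_tries 2;
}
```

Because nothing is remembered between requests in this mode, every request that lands on the unhealthy node pays the connect or read timeout before it is retried, so the timeouts should be kept as short as the application allows.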

The fix: raise the server weight from 1 to 10 and lengthen fail_timeout from 30s to 60s:

upstream sdlcrdBackend {
    server 192.168.1.74 weight=10 max_fails=2 fail_timeout=60s;
    sticky name=com.ckl.sdlcrd.UAT.route domain=test-reg.ckl.com;
    check interval=5000 rise=2 fall=3 timeout=1000 type=http;
    check_http_send "HEAD / HTTP/1.0\r\n\r\n";
    check_http_expect_alive http_2xx http_3xx;
}
