This article walks through several lock problems encountered in real Go systems. Many people run into these situations in practice, so the following cases show how to recognize and deal with them. I hope you find something useful in it.
Scenarios that rely on sync.Pool under the hood
Some open source libraries use the standard library's sync.Pool to optimize performance, for example https://github.com/valyala/fasttemplate, which we use. Whenever you execute code like this:
Template: = "http://{{host}}/?q={{query}}&foo={{bar}}{{bar}}" t: = fasttemplate.New (template," {{","}} ") s: = t.ExecuteString (map [string] interface {} {" host ":" google.com "," query ": url.QueryEscape (" hello=world ")," bar ":" foobar " }) fmt.Printf ("% s", s)
A fasttemplate.Template object is generated internally with a byteBufferPool field:
type Template struct {
	template string
	startTag string
	endTag   string

	texts          [][]byte
	tags           []string
	byteBufferPool bytebufferpool.Pool // <= this is the field
}
Under the hood, byteBufferPool is a thin wrapper around sync.Pool:
type Pool struct {
	calls       [steps]uint64
	calibrating uint64

	defaultSize uint64
	maxSize     uint64

	pool sync.Pool
}
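For context, sync.Pool is normally declared once at package level and shared by all goroutines, so that pooled objects really are reused across requests. The snippet below is only an illustrative sketch of that usual pattern, not code from this article or from fasttemplate:

package bufpool

import (
	"bytes"
	"fmt"
	"sync"
)

// A single, long-lived pool shared by every goroutine in the process.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

func render(name string) string {
	// Get a buffer from the shared pool, and always Reset+Put it back.
	buf := bufPool.Get().(*bytes.Buffer)
	defer func() {
		buf.Reset()
		bufPool.Put(buf)
	}()
	fmt.Fprintf(buf, "hello, %s", name)
	return buf.String()
}

Because the pool itself lives for the life of the process, Get and Put mostly hit the per-P local caches and rarely touch any global state.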
This design (a pool embedded in each Template) creates a problem if the user creates a new Template object for every request and then evaluates it. For example, in our original usage, every time we received a user request we filled in the template with parameters:
func fromTplToStr(tpl string, params map[string]interface{}) string {
	tplVar := fasttemplate.New(tpl, "{{", "}}")
	res := tplVar.ExecuteString(params)
	return res
}
When evaluating the template:
func (t *Template) ExecuteFuncString(f TagFunc) string {
	bb := t.byteBufferPool.Get()
	if _, err := t.ExecuteFunc(bb, f); err != nil {
		panic(fmt.Sprintf("unexpected error: %s", err))
	}
	s := string(bb.Bytes())
	bb.Reset()
	t.byteBufferPool.Put(bb)
	return s
}
A ByteBuffer is taken from the Template object's byteBufferPool with Get, and after use it is Reset and Put back into the pool. The problem is that our Template object itself is never reused, so this byteBufferPool never actually does anything useful.
Worse, because a new sync.Pool is created for every request, under high concurrency execution gets stuck on the line bb := t.byteBufferPool.Get(). The problem shows up quickly under stress testing: once a certain QPS is reached, a large number of goroutines pile up. Here, for example, 18910 goroutines are stacked on the lock code:
goroutine profile: total 18910
18903 @ 0x102f20b 0x102f2b3 0x103fa4c 0x103f77d 0x10714df 0x1071d8f 0x1071d26 0x1071a5f 0x12feeb8 0x13005f0 0x13007c3 0x130107b 0x105c931
#	0x103f77c	sync.runtime_SemacquireMutex+0x3c	/usr/local/go/src/runtime/sema.go:71
#	0x10714de	sync.(*Mutex).Lock+0xfe	/usr/local/go/src/sync/mutex.go:134
#	0x1071d8e	sync.(*Pool).pinSlow+0x3e	/usr/local/go/src/sync/pool.go:198
#	0x1071d25	sync.(*Pool).pin+0x55	/usr/local/go/src/sync/pool.go:191
#	0x1071a5e	sync.(*Pool).Get+0x2e	/usr/local/go/src/sync/pool.go:128
#	0x12feeb7	github.com/valyala/fasttemplate/vendor/github.com/valyala/bytebufferpool.(*Pool).Get+0x37	/Users/xargin/go/src/github.com/valyala/fasttemplate/vendor/github.com/valyala/bytebufferpool/pool.go:49
#	0x13005ef	github.com/valyala/fasttemplate.(*Template).ExecuteFuncString+0x3f	/Users/xargin/go/src/github.com/valyala/fasttemplate/template.go:278
#	0x13007c2	github.com/valyala/fasttemplate.(*Template).ExecuteString+0x52	/Users/xargin/go/src/github.com/valyala/fasttemplate/template.go:299
#	0x130107a	main.loop.func1+0x3a	/Users/xargin/test/go/http/httptest.go:22
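As an aside, a goroutine profile in this format can be captured with the standard net/http/pprof handler while the stress test is running; the exact setup used here is not shown in the article, so the following is just a minimal sketch:

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
	// While load is applied, fetch the profile with:
	//   curl 'http://localhost:6060/debug/pprof/goroutine?debug=1'
	// The output groups goroutines with identical stacks, as in the dump above.
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}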
A large number of goroutines are blocked trying to acquire a lock. Why? Keep following sync.Pool's Get flow:
func (p *Pool) Get() interface{} {
	if race.Enabled {
		race.Disable()
	}
	l := p.pin()
	x := l.private
	l.private = nil
	runtime_procUnpin()
	// ...
Then there is pin:
func (p *Pool) pin() *poolLocal {
	pid := runtime_procPin()
	s := atomic.LoadUintptr(&p.localSize) // load-acquire
	l := p.local                          // load-consume
	if uintptr(pid) < s {
		return indexLocal(l, pid)
	}
	return p.pinSlow()
}

Because each object's sync.Pool is empty, the pin flow always falls through to p.pinSlow:

func (p *Pool) pinSlow() *poolLocal {
	runtime_procUnpin()
	allPoolsMu.Lock()
	defer allPoolsMu.Unlock()
	pid := runtime_procPin()
	// ...

pinSlow takes the allPoolsMu lock, and allPoolsMu mainly protects the allPools variable:

var (
	allPoolsMu Mutex
	allPools   []*Pool
)

With that lock held, a newly created sync.Pool object is appended to allPools:

	if p.local == nil {
		allPools = append(allPools, p)
	}

It is easy to guess why the standard library's sync.Pool maintains allPools: it is mainly so the pools can be cleaned out at GC time, which is also why objects kept in a sync.Pool-based object pool do not survive a GC cycle. sync.Pool itself exists to relieve the GC pressure caused by allocating large numbers of temporary objects.

With the flow laid out, the problem is obvious: every user request ends up fighting for a single global lock, and a global lock is a cardinal sin under high concurrency. But because this particular global lock is introduced indirectly by an open source library, it is not easy to spot just by reading your own code.

Once the problem is understood, the fixes are not hard to implement. The first is to modify the open source library so that the template's sync.Pool is referenced as a global object; then most pool.Get calls never reach the pinSlow path. The second is to reuse fasttemplate.Template objects, which works for the same reason: far fewer sync.Pool objects are created (a minimal sketch of this option appears at the end of this article). But as mentioned, this is an indirect problem; with a busy development schedule you are unlikely to read through every dependency before using it, so how do you avoid production incidents in that case? Run your stress tests as early as possible.

Metrics reporting and log locks

These two are essentially the same problem, so they are covered together.

The company's metrics reporting clients used to be UDP based, and most were simple and crude: a single client that writes whatever the user passes in, which eventually always reaches:

func (c *UDPConn) WriteToUDP(b []byte, addr *UDPAddr) (int, error) {
	// ---------- irrelevant details omitted
	n, err := c.writeTo(b, addr)
	// ---------- irrelevant details omitted
	return n, err
}

Or:

func (c *UDPConn) WriteTo(b []byte, addr Addr) (int, error) {
	// ---------- irrelevant details omitted
	n, err := c.writeTo(b, a)
	// ---------- irrelevant details omitted
	return n, err
}

Which calls:

func (c *UDPConn) writeTo(b []byte, addr *UDPAddr) (int, error) {
	// ---------- irrelevant details omitted
	return c.fd.writeTo(b, sa)
}

Then:

func (fd *netFD) writeTo(p []byte, sa syscall.Sockaddr) (n int, err error) {
	n, err = fd.pfd.WriteTo(p, sa)
	runtime.KeepAlive(fd)
	return n, wrapSyscallError("sendto", err)
}

And then:

func (fd *FD) WriteTo(p []byte, sa syscall.Sockaddr) (int, error) {
	if err := fd.writeLock(); err != nil { // =========> the key point is here
		return 0, err
	}
	defer fd.writeUnlock()
	for {
		err := syscall.Sendto(fd.Sysfd, p, 0, sa)
		if err == syscall.EAGAIN && fd.pd.pollable() {
			if err = fd.pd.waitWrite(fd.isFile); err == nil {
				continue
			}
		}
		if err != nil {
			return 0, err
		}
		return len(p), nil
	}
}
In essence, a coarse write lock is taken around an expensive network operation, which leads to heavy lock contention under high concurrency, and in turn to goroutine pile-ups and increased interface latency.
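One common way to relieve this kind of contention is to stop letting every goroutine write to the UDP connection directly, and instead funnel all packets through a single writer goroutine behind a buffered channel. The sketch below is only an illustration of that idea; the channel size, drop policy, and connection setup are assumptions, not the original client's code:

package metrics

import "net"

// Buffered channel that application goroutines push packets into.
var metricCh = make(chan []byte, 4096)

// startMetricsWriter starts the single goroutine that owns the UDP connection.
// Because only this goroutine calls WriteToUDP, the fd.writeLock inside it is
// effectively uncontended.
func startMetricsWriter(conn *net.UDPConn, addr *net.UDPAddr) {
	go func() {
		for pkt := range metricCh {
			conn.WriteToUDP(pkt, addr)
		}
	}()
}

// Report never blocks the caller: if the writer falls behind, the packet is
// dropped, which is usually acceptable for metrics.
func Report(pkt []byte) {
	select {
	case metricCh <- pkt:
	default:
	}
}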
Similarly, once the problem is understood the fix is straightforward. Now take a look at logging. Because most of the company's logs are written directly to the filesystem, many goroutines are in essence operating on the same file at the same time, and the writes eventually reach:
func (f *File) Write(b []byte) (n int, err error) {
	// ---------- irrelevant details omitted
	n, err = f.write(b)
	// ---------- irrelevant details omitted
	return n, err
}

func (f *File) write(b []byte) (n int, err error) {
	n, err = f.pfd.Write(b)
	runtime.KeepAlive(f)
	return n, err
}
Then:
func (fd *FD) Write(p []byte) (int, error) {
	if err := fd.writeLock(); err != nil { // =========> writeLock again
		return 0, err
	}
	defer fd.writeUnlock()
	if err := fd.pd.prepareWrite(fd.isFile); err != nil {
		return 0, err
	}
	var nn int
	for {
		// ---------- irrelevant details omitted
		n, err := syscall.Write(fd.Sysfd, p[nn:max])
		// ---------- irrelevant details omitted
	}
}
Just like the UDP netFD, there is a writeLock here. When the system logs heavily, this writeLock causes the same problems as metrics reporting.
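Circling back to the fasttemplate problem from the first section: the second fix mentioned there was to reuse fasttemplate.Template objects instead of calling fasttemplate.New on every request. Below is a minimal sketch of what that might look like, with a small template cache. The cache shape and names are illustrative assumptions, not the original service code; fasttemplate's documentation states that Template's Execute methods are safe for concurrent use:

package tpl

import (
	"sync"

	"github.com/valyala/fasttemplate"
)

// Cache of compiled templates, keyed by the raw template string.
var tplCache sync.Map // string -> *fasttemplate.Template

// fromTplToStr is a reworked version of the helper shown earlier: the Template
// (and therefore its internal byteBufferPool) is created once per distinct
// template string and reused by every request afterwards.
func fromTplToStr(tpl string, params map[string]interface{}) string {
	t, ok := tplCache.Load(tpl)
	if !ok {
		// Two goroutines may race to build the same template; LoadOrStore
		// ensures they end up sharing a single cached instance.
		t, _ = tplCache.LoadOrStore(tpl, fasttemplate.New(tpl, "{{", "}}"))
	}
	return t.(*fasttemplate.Template).ExecuteString(params)
}

With this change most requests never construct a new sync.Pool, so pool.Get stays on the fast per-P path instead of queuing up on allPoolsMu in pinSlow.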
That concludes this look at lock problems encountered in Go systems. Thanks for reading.