Design Methods for a High-Performance Linux Server Architecture

This article explains the design methods behind a high-performance Linux server architecture. The methods introduced here are simple, fast and practical; interested readers are welcome to follow along.
I. Framework
Let's start with the organizational structure of a single service program.
(1) Network communication
Since a server program necessarily involves network communication, what problems should its network communication module solve? In my view, it should address at least the following:
1. How do we detect a new client connection?
2. How do we accept a client connection?
3. How do we detect whether data has arrived from a client?
4. How do we receive the data a client sends?
5. How do we detect a connection anomaly, and what should we do when one is found?
6. How do we send data to a client?
7. How do we close a connection after sending data to a client?
Anyone with a little networking background can answer several of these questions: the accept function of the socket API accepts client connections, recv receives data from a client, send sends data to a client, and IO multiplexing APIs such as select, poll and epoll detect whether clients have new connections or new data. Indeed, these basic socket APIs form the foundation of server network communication; no matter how ingeniously a network communication framework is designed, it is built on top of them. The crux of the problem is how to organize these basic socket APIs cleverly. When we say a server is efficient and supports high concurrency, that is merely a description of the implementation; from a software point of view it is still just a program, and a program is efficient as long as it satisfies the principle of "minimize waiting as much as possible". In other words, efficiency is not "some threads worked to death while others idle to death"; everyone may be idle, but when there is work to do, everyone should pitch in, rather than one part of the program busily grinding through task after task while the other part sits around doing nothing. This may sound a little abstract, so let's illustrate with some examples.
For example, by default the recv function blocks if there is no data.
By default, the send function blocks if the tcp window is not large enough for the data to be sent out.
By default, the connect function blocks while connecting to the other end.
Or, after sending data to the peer, the current thread blocks waiting for the reply if the other side never answers.
None of the above reflects an efficient way of thinking about server development, because none of these examples satisfies the principle of "minimize waiting". Why wait at all? Is there a way for these operations not to wait; better still, not only do we not wait, but we are notified when they complete, so that the cpu time slices that would have been spent waiting can be used to do something else? Yes, there is: the IO multiplexing (IO reuse) technology we discuss below.
(2) Comparison of several IO multiplexing mechanisms
At present, Windows supports select, WSAAsyncSelect, WSAEventSelect and IOCP, while Linux supports select, poll and epoll. Rather than detailing the usage of each function, let's discuss something at a deeper level. The APIs listed above can be divided into two levels:
Level 1: select and poll
Level 2: WSAAsyncSelect, WSAEventSelect, completion ports (IOCP), epoll
Why divide them this way? Consider level 1 first. In essence, select and poll actively query, within some time interval, whether events have occurred on one or more socket handles: readable events, writable events or error events. That is, we must still take the initiative to perform these checks at regular intervals. If some events are detected during the interval, our time was not wasted; but what if there were no events? Then we have done useless work, or to put it bluntly, wasted time: if a server has many connections and cpu time slices are limited, we may spend time checking some sockets only to find they have no events at all, while other work was waiting to be done. Why spend that time on fruitless checks instead of on the work we actually need to do?

So for a server program to be efficient, we should avoid spending time actively querying whether sockets have events, and instead arrange to be told when events occur. That is what the level 2 functions do: they effectively replace active querying with notification, and when an event occurs the system informs us, and we handle it then; "good steel is used on the blade", as the saying goes. The level 2 functions differ only in how they notify us: WSAAsyncSelect uses the Windows message queue mechanism to deliver events to a window procedure we register, IOCP returns completion status through GetQueuedCompletionStatus, and epoll reports events through the epoll_wait function. A minimal sketch of this notification-driven loop appears below.
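To make the level 2 style concrete, here is a minimal epoll-based event loop on Linux. This is a sketch written for this article, not code from the original; it assumes the listening socket listenfd was created, bound and made non-blocking elsewhere, and error handling is abbreviated.

#include <sys/epoll.h>

// Minimal epoll event loop: block until the kernel notifies us of events,
// instead of polling each socket ourselves.
void event_loop(int listenfd)
{
    int epfd = epoll_create1(0);

    struct epoll_event ev;
    ev.events = EPOLLIN;
    ev.data.fd = listenfd;
    epoll_ctl(epfd, EPOLL_CTL_ADD, listenfd, &ev);

    struct epoll_event events[64];
    for (;;)
    {
        // Blocks until something happens; no periodic "useless work" checks.
        int n = epoll_wait(epfd, events, 64, -1);
        for (int i = 0; i < n; ++i)
        {
            if (events[i].data.fd == listenfd)
            {
                // readable on the listening socket: a new connection is pending
            }
            else if (events[i].events & EPOLLIN)
            {
                // readable on a client socket: receive its data
            }
        }
    }
}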
For example, when the connect function connects to the other end, if the socket is non-blocking, connect returns immediately without waiting, even though the connection cannot complete right away. Once the connection completes, WSAAsyncSelect delivers an FD_CONNECT event to tell us the connection succeeded, and epoll generates an EPOLLOUT event, from which we likewise know the connection is complete. Similarly, when a socket has data to read, WSAAsyncSelect generates an FD_READ event and epoll generates an EPOLLIN event, and so on. With the discussion above, we arrive at the correct posture for detecting readable, writable and error events in network communication. This is also the second principle I propose here: minimize the time spent doing useless work. This may not show any advantage while the service has plenty of spare resources, but when there is a large volume of tasks to handle, it makes a real difference.
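On Linux, the non-blocking connect just described looks roughly like this. This is an illustrative sketch under the assumption that the peer address is already filled in; make_nonblocking and async_connect are names invented here, not APIs from the original article.

#include <sys/socket.h>
#include <netinet/in.h>
#include <fcntl.h>
#include <errno.h>

// Illustrative helper: switch a socket to non-blocking mode.
static int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

// Start an asynchronous connect; completion is reported later as EPOLLOUT.
// When EPOLLOUT fires, check getsockopt(fd, SOL_SOCKET, SO_ERROR, ...) to
// confirm the connection actually succeeded.
int async_connect(int fd, const struct sockaddr_in* addr)
{
    make_nonblocking(fd);
    int ret = ::connect(fd, (const struct sockaddr*)addr, sizeof(*addr));
    if (ret == 0)
        return 0;                 // connected immediately (possible on loopback)
    if (errno == EINPROGRESS)
        return 1;                 // in progress: register fd for EPOLLOUT and move on
    return -1;                    // real error
}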
(3) The correct posture for detecting network events
Based on the introduction above — first, to avoid meaningless waiting time, and second, to let the operating system notify us of event states rather than actively querying each socket's events — our sockets should all be set to non-blocking (asynchronous) mode. On that basis, let's return to the seven questions raised in section (1):
1. How do we detect a new client connection?
2. How do we accept a client connection?
By default, the accept function blocks. But if epoll detects an EPOLLIN event on the listening socket, or WSAAsyncSelect detects an FD_ACCEPT event, a new connection has arrived, and calling accept at that moment will not block. Of course, the newly returned socket should also be set to non-blocking mode, so that we can send and receive data on it.
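A sketch of such an accept handler on Linux follows; it is illustrative, assumes the listening socket is non-blocking, and reuses the make_nonblocking helper from the earlier sketch.

#include <sys/socket.h>
#include <errno.h>

// Drain all pending connections on a non-blocking listening socket.
// Called when epoll reports EPOLLIN on listenfd.
void on_new_connections(int listenfd)
{
    for (;;)
    {
        int clientfd = ::accept(listenfd, nullptr, nullptr);
        if (clientfd == -1)
        {
            // EAGAIN/EWOULDBLOCK: no more pending connections right now.
            // Any other error: also stop for this round (log it in real code).
            break;
        }
        make_nonblocking(clientfd);   // helper from the previous sketch
        // register clientfd with epoll for EPOLLIN here
    }
}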
3. How do we detect whether data has arrived from a client?
4. How do we receive the data a client sends?
By the same token, we should only receive data when a readable event occurs on a socket, so that we never wait inside recv or read. How much data should we receive at once? We can decide according to our needs; we can even call recv or read repeatedly in a loop, because for a socket in non-blocking mode, recv or read returns immediately when there is no data, and the error code EWOULDBLOCK indicates that nothing is left to read. Example:
bool CIUSocket::Recv()
{
    int nRet = 0;
    while (true)
    {
        char buff[512];
        nRet = ::recv(m_hSocket, buff, 512, 0);
        if (nRet == SOCKET_ERROR)    // an error occurred
        {
            // WSAEWOULDBLOCK only means there is no more data for now
            if (::WSAGetLastError() == WSAEWOULDBLOCK)
                break;
            else
                return false;        // a real error: the socket should be closed
        }
        else if (nRet < 1)           // 0 means the peer closed the connection
            return false;

        m_strRecvBuf.append(buff, nRet);
        ::Sleep(1);
    }
    return true;
}

5. How do we detect a connection anomaly, and what should we do when one is found?

Likewise, when we receive an exception event such as EPOLLERR, or a close event such as FD_CLOSE, we know an anomaly has occurred, and the usual handling is to close the corresponding socket. In addition, if send/recv or read/write returns 0 when operating on a socket, the peer has already closed its end; the connection no longer needs to exist, and we can close the corresponding socket as well.

6. How do we send data to a client?

Sending data to a client is a little trickier than receiving it, and requires some technique. First, we cannot detect writability the way we detect readability, because as long as the peer is receiving data normally, our socket is almost always writable; if we registered for writable events, they would fire constantly even when we have nothing to send. The correct approach is: when there is data to send, try to send it first; if it cannot be sent, or only part of it goes out, cache the remainder, then register interest in the writable event on that socket; when the writable event fires, continue sending; if it still cannot all go out, keep listening for the writable event, and so on, until all the data has been sent. Once everything has been sent, we must remove the writable-event registration to avoid useless writable notifications. Notice that when only part of the data goes out, the rest must be stored somewhere: the buffer that holds it is called the "send buffer". The send buffer holds not only the data left over from the current send, but also any new data the upper layer hands down while sending is in progress. To preserve ordering, new data is appended after the remaining data, and sending always starts from the head of the send buffer: first in, first sent. A sketch of this pattern follows.
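Here is a minimal sketch of that send-buffer pattern on Linux, written for illustration: the Session structure and the enable_writable_event helper (which would wrap epoll_ctl with EPOLL_CTL_MOD) are assumptions made for this sketch, not code from the original article.

#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>
#include <string>

// Assumed helper: add/remove EPOLLOUT interest for fd via epoll_ctl.
void enable_writable_event(int fd, bool enable);

// Hypothetical per-connection state: the fd plus the send buffer described above.
struct Session
{
    int         fd;
    std::string sendBuf;   // unsent bytes, kept in order
};

// Try to flush the send buffer; register/unregister EPOLLOUT as needed.
void send_data(Session& s, const char* data, size_t len)
{
    s.sendBuf.append(data, len);                 // new data goes after leftovers
    while (!s.sendBuf.empty())
    {
        ssize_t n = ::send(s.fd, s.sendBuf.data(), s.sendBuf.size(), 0);
        if (n > 0)
        {
            s.sendBuf.erase(0, (size_t)n);       // drop the bytes that went out
        }
        else if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        {
            enable_writable_event(s.fd, true);   // resume when EPOLLOUT fires
            return;
        }
        else
        {
            return;                              // real error: close elsewhere
        }
    }
    enable_writable_event(s.fd, false);          // all sent: stop watching writable
}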
7. How do we close a connection after sending data to a client?

This question is harder to handle, because "finished sending" here is not necessarily truly finished. Even when a call to send or write succeeds, it only means the data was successfully written into the operating system's protocol stack; whether and when it actually goes out on the wire is hard to determine, and whether the peer receives it is harder still. So for now we can only take the simple view: once send or write returns the number of bytes we asked to send, we consider the data "sent", and then call close or another socket API to close the connection. The topic of closing connections deserves its own small heading, below.

(4) Passive close and active close

In practice, a passive close happens because we detected an anomalous event on the connection, such as EPOLLERR, or the peer closed the connection (send or recv returned 0); at that point the connection no longer has any reason to exist, and we are forced to close it.

An active close is when we call close/closesocket ourselves, for example when a client sends us illegal data, such as probing packets from a network attack. For security, we close the socket connection.

(5) The send buffer and the receive buffer

The send buffer was introduced above, along with the reason it exists. The receive buffer follows the same logic. When data arrives we could unpack it immediately, but that is not a good idea, for two reasons. First, apart from a few conventional protocol formats such as http, most servers' business protocols differ from one another, so interpreting the data format inside a packet should be the business layer's job, decoupled from the network communication layer; for the network layer to stay generic, it cannot know what the upper-layer protocol looks like, because different protocols have different formats tied to the specific business. Second, even if we knew the protocol format, unpacking and handling the business in the network layer is problematic if the business work is time-consuming, say reading a disk file or connecting to a database to verify an account and password: the network thread would spend large amounts of time on those tasks, and other network events might not be handled in time. For these two reasons, we really do need a receive buffer: received data is placed into it, and a dedicated business thread or business logic takes data out of it, unpacks it and handles the business.

Having said all this, how large should the send and receive buffers be? This is a perennial question, because we often run into the dilemma that pre-allocated memory that is too small is not enough, while too much may be wasted. The answer is to design a buffer that can grow dynamically, like string or vector: allocate on demand, and expand when needed.

Note in particular that the send buffer and receive buffer described here exist once per socket connection. This is the most common design.

(6) Protocol design

Apart from general-purpose protocols such as http and ftp, most server protocols are defined by the business. Once the protocol is designed, the packet format is set according to it. We know tcp/ip carries stream data: like flowing water, there is no visible boundary between one packet and the next. For example, if end A sends end B three packets of 50 bytes each, B may first receive 10 bytes and then 140 bytes; or 20, then 20, then 110; or all 150 at once. Those 150 bytes can arrive at B in any number of chunks of any size. So the first question in protocol design is how to delimit packet boundaries, that is, how the receiver knows the size of each packet. Three methods are in common use:

Fixed size: assume every packet is a fixed number of bytes, for example the 50-byte packets discussed above; every time the receiver has collected 50 bytes, it treats them as one packet.

A designated packet terminator: for example, a packet ends with \r\n (carriage return and line feed); whenever the peer sees the terminator, it has received a complete packet, and whatever follows belongs to the next packet.

A length field: this combines the two methods above. There is usually a fixed-size header containing a field that specifies the size of the body, or of the whole packet; the receiver first parses that field from the header and then uses the size to delimit the data. A sketch of this third method follows below.

The second question in protocol design is to make unpacking as convenient as possible; that is, the protocol's format fields should be clear and simple.

The third question is that packets assembled according to the protocol should be as small as possible, which has the following benefits: first, for mobile devices with limited processing power and bandwidth, small packets are processed faster and save a great deal of data charges; second, if individual packets are small enough, a server that communicates frequently over the network comes under much less bandwidth pressure, and the system it runs on can use less memory. Consider a stock quote server: if one stock's packet is 100 bytes versus 1000 bytes, what a difference that makes across 100 stocks, let alone 10,000!
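As a minimal sketch of the third method, assume a 4-byte header carrying the body length in network byte order; try_decode_packet and this layout are illustrative assumptions for this article, not a protocol from the original.

#include <arpa/inet.h>   // ntohl
#include <cstring>       // memcpy
#include <cstdint>
#include <string>

// Try to extract one complete packet from the receive buffer.
// Returns true and fills 'body' when a whole packet is available;
// returns false when more bytes are still needed.
bool try_decode_packet(std::string& recvBuf, std::string& body)
{
    const size_t kHeaderLen = sizeof(uint32_t);
    if (recvBuf.size() < kHeaderLen)
        return false;                          // header not complete yet

    uint32_t netLen;
    memcpy(&netLen, recvBuf.data(), kHeaderLen);
    uint32_t bodyLen = ntohl(netLen);          // length field, network byte order
    // a real server should also sanity-check bodyLen against a maximum

    if (recvBuf.size() < kHeaderLen + bodyLen)
        return false;                          // body not complete yet

    body.assign(recvBuf, kHeaderLen, bodyLen); // take the body
    recvBuf.erase(0, kHeaderLen + bodyLen);    // drop the consumed bytes
    return true;
}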
Another issue worth discussing in protocol design concerns numeric types: we should explicitly specify the length of numeric fields. Take the long type: on a 32-bit machine it is 32 bits, 4 bytes, but on a 64-bit machine it becomes 64 bits, 8 bytes, so the same long may be decoded with different lengths by the sender and the receiver. For protocols used across platforms, it is therefore best to explicitly specify the length of integer fields, for example int32, int64 and so on. Below is an example of a protocol interface:

class BinaryReadStream
{
private:
    const char* const ptr;
    const size_t      len;
    const char*       cur;
    BinaryReadStream(const BinaryReadStream&);
    BinaryReadStream& operator=(const BinaryReadStream&);

public:
    BinaryReadStream(const char* ptr, size_t len);
    virtual const char* GetData() const;
    virtual size_t GetSize() const;
    bool IsEmpty() const;
    bool ReadString(string* str, size_t maxlen, size_t& outlen);
    bool ReadCString(char* str, size_t strlen, size_t& len);
    bool ReadCCString(const char** str, size_t maxlen, size_t& outlen);
    bool ReadInt32(int32_t& i);
    bool ReadInt64(int64_t& i);
    bool ReadShort(short& i);
    bool ReadChar(char& c);
    size_t ReadAll(char* szBuffer, size_t iLen) const;
    bool IsEnd() const;
    const char* GetCurrent() const { return cur; }

public:
    bool ReadLength(size_t& len);
    bool ReadLengthWithoutOffset(size_t& headlen, size_t& outlen);
};

class BinaryWriteStream
{
public:
    BinaryWriteStream(string* data);
    virtual const char* GetData() const;
    virtual size_t GetSize() const;
    bool WriteCString(const char* str, size_t len);
    bool WriteString(const string& str);
    bool WriteDouble(double value, bool isNULL = false);
    bool WriteInt64(int64_t value, bool isNULL = false);
    bool WriteInt32(int32_t i, bool isNULL = false);
    bool WriteShort(short i, bool isNULL = false);
    bool WriteChar(char c, bool isNULL = false);
    size_t GetCurrentPos() const { return m_data->length(); }
    void Flush();
    void Clear();

private:
    string* m_data;
};
Here BinaryWriteStream is the class that encodes the protocol and BinaryReadStream is the class that decodes it. They can be used as follows.

Encoding:
std::string outbuf;
BinaryWriteStream writeStream(&outbuf);
writeStream.WriteInt32(msg_type_register);
writeStream.WriteInt32(m_seq);
writeStream.WriteString(retData);
writeStream.Flush();
Decoding:

BinaryReadStream readStream(strMsg.c_str(), strMsg.length());
int32_t cmd;
if (!readStream.ReadInt32(cmd))
{
    return false;
}
//int seq;
if (!readStream.ReadInt32(m_seq))
{
    return false;
}
std::string data;
size_t datalength;
if (!readStream.ReadString(&data, 0, datalength))
{
    return false;
}
(7) Organization of server program structure
This topic is too large to cover here; a separate article will introduce it in detail.
II. Architecture
The server side of a project is often composed of many services, and even a single service tuned to extreme performance supports only a limited amount of concurrency. As a simple example, suppose a chat server keeps 1 KB of information per user; then a machine with 8 GB of memory can, ignoring everything else, hold at most about 8.38 million users (8 × 1024 × 1024 KB = 8,388,608 KB), and even that is a highly idealized figure. So we sometimes need to deploy multiple instances of a service, each implemented as described in Part I, Framework. Let's look at an example:
Take the server architecture of Mogujie TeamTalk as an example. MsgServer is the chat service, and multiple instances can be deployed. When each chat server starts, it tells loginServer and routeServer its own ip address and port number, and whenever users come online or go offline, MsgServer also reports its latest user count and user id list to loginServer and routeServer. When a user wants to log in, it first connects to loginServer; based on the user load it has recorded for each MsgServer, loginServer returns the ip address and port number of the least-loaded MsgServer to the client, and the client uses that ip address and port number to log in to the MsgServer. During a chat, when a user on MsgServer A sends a message to a user who is not on the same MsgServer, MsgServer forwards the message to routeServer, and routeServer, based on the user id information it records, finds the MsgServer where the target user is and forwards the message to that MsgServer.
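The load-balancing decision in loginServer might look roughly like this. This is a purely illustrative sketch: TeamTalk's real data structures are not shown in this article, so every name below is hypothetical.

#include <map>
#include <string>
#include <cstdint>

// Hypothetical record that loginServer keeps per MsgServer instance,
// updated whenever a MsgServer reports users going on/offline.
struct MsgServerInfo
{
    std::string ip;
    uint16_t    port;
    int         userCount;
};

// Pick the MsgServer with the fewest users; return nullptr if none registered.
const MsgServerInfo* pick_least_loaded(
    const std::map<std::string, MsgServerInfo>& servers)
{
    const MsgServerInfo* best = nullptr;
    for (const auto& kv : servers)
    {
        if (best == nullptr || kv.second.userCount < best->userCount)
            best = &kv.second;
    }
    return best;   // loginServer returns best->ip and best->port to the client
}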
The above is an example of distributed deployment. Now let's look at another example, this time a strategy within a single service. When a server handles network data, if several sockets have data to process at the same time, it may keep serving the first few sockets and not move on to the later ones until the first few are fully processed. That is like a restaurant where everyone has ordered, but some tables keep getting dish after dish while others are never served at all, which is clearly not good. Let's look at how to avoid this:
int CFtdEngine::HandlePackage(CFTDCPackage* pFTDCPackage, CFTDCSession* pSession)
{
    //NET_IO_LOG0("CFtdEngine::HandlePackage\n");
    FTDC_PACKAGE_DEBUG(pFTDCPackage);

    if (pFTDCPackage->GetTID() != FTD_TID_ReqUserLogin)
    {
        if (!IsSessionLogin(pSession->GetSessionID()))
        {
            SendErrorRsp(pFTDCPackage, pSession, 1, "customer not logged in");
            return 0;
        }
    }

    CalcFlux(pSession, pFTDCPackage->Length());   // traffic statistics

    REPORT_EVENT(LOG_DEBUG, "Front/Fgateway", "login request %0x", pFTDCPackage->GetTID());

    int nRet = 0;
    switch (pFTDCPackage->GetTID())
    {
    case FTD_TID_ReqUserLogin:
        ///huwp: 20070608: logins from API versions that are too high are prohibited
        if (pFTDCPackage->GetVersion() > FTD_VERSION)
        {
            SendErrorRsp(pFTDCPackage, pSession, 1, "Too High FTD Version");
            return 0;
        }
        nRet = OnReqUserLogin(pFTDCPackage, (CFTDCSession*)pSession);
        FTDRequestIndex.incValue();
        break;
    case FTD_TID_ReqCheckUserLogin:
        nRet = OnReqCheckUserLogin(pFTDCPackage, (CFTDCSession*)pSession);
        FTDRequestIndex.incValue();
        break;
    case FTD_TID_ReqSubscribeTopic:
        nRet = OnReqSubscribeTopic(pFTDCPackage, (CFTDCSession*)pSession);
        FTDRequestIndex.incValue();
        break;
    }

    return 0;
}
When there is data to read on a socket, the server receives the data on that socket, unpacks it, and then calls CalcFlux(pSession, pFTDCPackage->Length()) to record traffic statistics:
void CFrontEngine::CalcFlux(CSession* pSession, const int nFlux)
{
    TFrontSessionInfo* pSessionInfo = m_mapSessionInfo.Find(pSession->GetSessionID());
    if (pSessionInfo != NULL)
    {
        // flow control changed to packet counting
        pSessionInfo->nCommFlux++;
        // if the traffic exceeds the limit, suspend the session's read operations
        if (pSessionInfo->nCommFlux >= pSessionInfo->nMaxCommFlux)
        {
            pSession->SuspendRead(true);
        }
    }
}
This function first increments the count of packets handled by a connection session (Session), then checks whether the maximum has been exceeded, and if so sets the read-suspend flag:
void CSession::SuspendRead(bool bSuspend)
{
    m_bSuspendRead = bSuspend;
}
This excludes the socket from the list of sockets checked the next time around:
void CEpollReactor::RegisterIO(CEventHandler* pEventHandler)
{
    int nReadID, nWriteID;
    pEventHandler->GetIds(&nReadID, &nWriteID);
    if (nWriteID != 0 && nReadID == 0)
    {
        nReadID = nWriteID;
    }
    if (nReadID != 0)
    {
        m_mapEventHandler[pEventHandler] = nReadID;
        struct epoll_event ev;
        ev.data.ptr = pEventHandler;
        if (epoll_ctl(m_fdEpoll, EPOLL_CTL_ADD, nReadID, &ev) != 0)
        {
            perror("epoll_ctl EPOLL_CTL_ADD");
        }
    }
}

void CSession::GetIds(int* pReadId, int* pWriteId)
{
    // fetch the ids from the underlying channel object
    // (the receiver's name was garbled in the source text and is reconstructed here)
    m_pChannelProtocol->GetIds(pReadId, pWriteId);
    if (m_bSuspendRead)
    {
        *pReadId = 0;
    }
}
That is, the reactor no longer checks whether there is data to read on this socket. Then a timer resets the flag after 1 second, so that if there is data on the socket, it can be detected again:
const int SESSION_CHECK_TIMER_ID = 9;
const int SESSION_CHECK_INTERVAL = 1000;

SetTimer(SESSION_CHECK_TIMER_ID, SESSION_CHECK_INTERVAL);

void CFrontEngine::OnTimer(int nIDEvent)
{
    if (nIDEvent == SESSION_CHECK_TIMER_ID)
    {
        CSessionMap::iterator itor = m_mapSession.Begin();
        while (!itor.IsEnd())
        {
            TFrontSessionInfo* pFind = m_mapSessionInfo.Find((*itor)->GetSessionID());
            if (pFind != NULL)
            {
                CheckSession(*itor, pFind);
            }
            itor++;
        }
    }
}

void CFrontEngine::CheckSession(CSession* pSession, TFrontSessionInfo* pSessionInfo)
{
    // restart the traffic count
    pSessionInfo->nCommFlux -= pSessionInfo->nMaxCommFlux;
    if (pSessionInfo->nCommFlux < 0)
    {
        pSessionInfo->nCommFlux = 0;
    }
    // suspend the session's reads only if traffic still exceeds the limit
    pSession->SuspendRead(pSessionInfo->nCommFlux >= pSessionInfo->nMaxCommFlux);
}
This is like a restaurant that first serves a few dishes to the guests at one table so they can start eating, then stops serving that table for a while and brings dishes to the tables that have none, and once everyone has food, comes back and continues serving the original table. That is in fact how real restaurants work. The example above is a very good approach to flow control within a single service: it ensures that every client gets balanced service, rather than some clients waiting a long time for a response.
Another strategy for speeding up server processing is caching, which trades space for time. For information that is used repeatedly but rarely changes, if loading it from its original location is time-consuming (say, from disk or from a database), we can cache it; this is why in-memory databases such as redis, leveldb and fastdb are so popular nowadays. In my flamingo project, users' basic information is cached in the chat service, and when the file service starts, it loads all the file names in a specified directory; each file's name is the md5 of that file's contents. When a client requests to upload a new file, if the file's md5 is already in the cache, the file already exists on the server, so the server no longer has to receive the file and simply tells the client the upload succeeded.
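A sketch of that md5-based deduplication check follows; it is illustrative only, and flamingo's actual types and function names may differ.

#include <set>
#include <string>

// Hypothetical cache: the md5 names of all files already on the server,
// loaded once when the file service starts.
std::set<std::string> g_fileMd5Cache;

// Returns true if the file is already on the server, in which case the
// transfer can be skipped and success reported to the client immediately.
bool is_file_cached(const std::string& fileMd5)
{
    return g_fileMd5Cache.find(fileMd5) != g_fileMd5Cache.end();
}

// After a genuinely new file finishes uploading, remember its md5.
void on_upload_finished(const std::string& fileMd5)
{
    g_fileMd5Cache.insert(fileMd5);
}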
At this point, I believe you have a deeper understanding of how to design a high-performance Linux server architecture. You might as well try these techniques out in practice.