How to design QQ friend system

2025-07-12 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article uses the QQ friend system as an example to analyze the design principles and design steps of a friend system. After reading the whole article, you should have a reasonable understanding of how such a system is designed.

What is the friend system?

Simply put, a friend system is the system that maintains users' friend relationships. The example we know best is QQ. QQ is in essence an instant messaging tool, but through its friend system it accumulated an enormous number of friend chains and built a seemingly unshakable business empire on them. The importance of the friend system is evident.

Anyone familiar with Internet products knows that once a product reaches a certain number of users, it often grows a friend system. The main purpose is to increase user stickiness (if you have friends there, you come back more often) or community activity (if you have friends there, you communicate more).

My back-end development career began with exactly such a system.

At that time, the friend system was brand new to most of our team, because most of us were fresh graduates. Naturally, the architecture of the whole system was not something a group of greenhorns could come up with on their own. The original architecture diagram can no longer be found, but from memory and years of experience I can still sketch it.

As shown in the figure, the architecture of the friend system is a common three-tier structure, including the access layer, the logic layer and the data layer.

Let's start with the data layer.

Because we are so familiar with QQ, it is easy to list the data a friend system holds: user profiles, friend chains, messages (chat messages and system messages), online status, and so on.

Internet products often face a large number of concurrent requests, and a traditional relational database has difficulty keeping up with the read and write load. For storage, relational databases such as MySQL are generally used for data that is read often and written rarely, usually with a cache in front to ensure performance; NoSQL (Not Only SQL) is the current mainstream.

In our friend system, user profiles and friend chains are stored in a key-value (kv) store, while messages are stored in the company's self-developed tlist (which can be replaced by a redis list). Online status is covered later.

Next is the logic layer.

The most complex part of this system is probably the message service (which I was not involved in developing).

In the message service, messages are divided by type into chat messages and system messages (system messages include friend-request messages, global notification pushes, etc.), and by delivery status into online messages and offline messages. The implementation maintains three kinds of lists: chat messages, system messages and offline messages. A chat message list is shared by the two users in the conversation, while system message and offline message lists belong to a single user. When the user is online, chat messages and system messages are sent directly; when the user is offline, the message is stored in the offline message list and pulled when the user logs in again.
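The routing rule above can be sketched in a few lines. This is only an in-memory illustration, not the original implementation: the class name `MessageService`, the helper `chat_key`, and the `delivered` list (which stands in for a real network push) are all assumptions made for the example.

```python
from collections import defaultdict, deque


def chat_key(user_a, user_b):
    """Hypothetical helper: one shared chat list per pair of users."""
    return tuple(sorted((user_a, user_b)))


class MessageService:
    """Illustrative in-memory stand-in for the three kinds of lists."""

    def __init__(self):
        self.online = set()
        self.chat = defaultdict(deque)     # shared by the two users of a conversation
        self.system = defaultdict(deque)   # one list per user
        self.offline = defaultdict(deque)  # one list per user
        self.delivered = []                # stands in for pushing to a live connection

    def send_chat(self, sender, receiver, text):
        msg = {"kind": "chat", "from": sender, "to": receiver, "text": text}
        self.chat[chat_key(sender, receiver)].append(msg)
        self._deliver(receiver, msg)

    def send_system(self, receiver, text):
        msg = {"kind": "system", "to": receiver, "text": text}
        self.system[receiver].append(msg)
        self._deliver(receiver, msg)

    def _deliver(self, user, msg):
        if user in self.online:
            self.delivered.append((user, msg))  # online: send directly
        else:
            self.offline[user].append(msg)      # offline: park until next login

    def login(self, user):
        """Mark the user online and hand over everything parked while offline."""
        self.online.add(user)
        pending = list(self.offline.pop(user, ()))
        for msg in pending:
            self.delivered.append((user, msg))
        return pending
```

With this shape, a message sent to an offline user sits in that user's offline list and is returned by the next `login` call.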

Put this way, the message service does not sound complicated, does it? In fact, the normal flow in a system design is often fairly simple. For Internet products, however, abnormal situations are the norm, and once all of them are taken into account the system becomes very complex.

In this example, packet loss during message sending is one such abnormal situation, and keeping the service working despite packet loss is no small problem.

A common solution is for the receiver to reply with an acknowledgement packet, and for the sender to resend if it does not receive the acknowledgement. But the acknowledgement itself may be lost, so one could acknowledge the acknowledgement, and that leads to a never-ending chain of acknowledgements.

The solution can borrow from TCP's retransmission mechanism. So the question is, why not simply use TCP? Because TCP is still comparatively slow. Chat messages do not need the reliability that transaction data requires, and losing a few messages has no serious consequences; but if every message takes a long time to arrive after it is sent, the user experience is very poor.

A compromise is for the receiver to reply with an acknowledgement packet and for the sender to resend if no acknowledgement arrives within a certain period; if the receiver gets two identical packets (with the same custom seq), it simply discards the duplicate.
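A minimal sketch of this compromise follows. It is an assumption-laden illustration rather than the original code: the network is replaced by a scripted `channel` function that returns `None` to model "no acknowledgement within the timeout", and the names `Receiver`, `Sender`, `make_channel` and `max_retries` are invented for the example.

```python
class Receiver:
    """Delivers each seq once, but acknowledges every copy it sees."""

    def __init__(self):
        self.seen = set()
        self.inbox = []

    def on_packet(self, seq, payload):
        if seq not in self.seen:   # first copy: deliver it
            self.seen.add(seq)
            self.inbox.append(payload)
        # A duplicate means an earlier ack was probably lost, so ack again.
        return seq


def make_channel(receiver, script):
    """Hypothetical lossy link driven by a script of 'drop_data', 'drop_ack', 'ok'."""
    events = iter(script)

    def channel(seq, payload):
        event = next(events, "ok")
        if event == "drop_data":
            return None            # packet never reached the receiver
        ack = receiver.on_packet(seq, payload)
        return None if event == "drop_ack" else ack

    return channel


class Sender:
    def __init__(self, channel, max_retries=3):
        self.channel = channel
        self.max_retries = max_retries  # assumed bound; chat tolerates rare loss

    def send(self, seq, payload):
        """Return the number of attempts used, or None if every attempt timed out."""
        for attempt in range(1 + self.max_retries):
            if self.channel(seq, payload) == seq:
                return attempt + 1
        return None
```

The sender stops after a bounded number of retries instead of chasing acknowledgements forever, and the receiver's seq check keeps a retransmitted packet from showing up twice.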

A discussion triggered by an interview question:

During interviews, I often ask candidates a question: in a distributed system, how do you ensure that a user has only one terminal online at a time (when a user logs in to the same account from two places in succession, the later login forces the earlier one offline)? This is a very basic feature in Internet products, and it tests the candidate's basic architectural design skills.

The design should start with the access server (hereinafter referred to as the interface machine). The interface machine is the friend system's window to the outside world. Its main functions are maintaining user connections, login authentication, encrypting and decrypting data, and transparently forwarding data to the back-end services. When a user connects to the friend system, the first step is to connect to an interface machine. After successful authentication, the interface machine keeps the user's session in memory, and all subsequent operations are based on that session.

As shown in the figure, if a user logs in twice, the interface machine can kick the first login offline through its session, thus ensuring that only one terminal is online.

Has the problem been solved?

No. A real system will certainly have more than one interface machine, and with multiple interface machines the method above no longer works. Each interface machine holds the sessions of only some users, so if a user connects to different interface machines in succession, the user ends up logged in from several places at once.

Naturally, the solution is to maintain a global view of the user's state. In our friend system, it is called the presence service.

The presence service, as the name implies, is a service that maintains each user's online status (login time, interface machine IP, etc.). User login and logout trigger status changes here through the interface machine. Because both login and logout packets can be lost, heartbeats are also used to maintain online status (a single heartbeat marks the user online, and n consecutive missed heartbeats mark the user offline).
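The heartbeat rule can be sketched as follows. This is an illustration under assumed values, not the original service: the 30-second interval, the threshold of 3 missed beats, and the names `PresenceTracker`, `heartbeat`, `is_online` and `sweep` are all made up for the example, and time is passed in explicitly so the logic is easy to follow.

```python
class PresenceTracker:
    """One heartbeat marks a user online; max_missed silent intervals mark them offline."""

    def __init__(self, interval=30.0, max_missed=3):
        self.interval = interval      # assumed seconds between heartbeats
        self.max_missed = max_missed  # the "n" in "n missed heartbeats"
        self.last_seen = {}           # user_id -> time of the last heartbeat

    def heartbeat(self, user_id, now):
        self.last_seen[user_id] = now

    def logout(self, user_id):
        self.last_seen.pop(user_id, None)

    def is_online(self, user_id, now):
        last = self.last_seen.get(user_id)
        return last is not None and now - last < self.interval * self.max_missed

    def sweep(self, now):
        """Drop users whose heartbeats have stopped and return their IDs."""
        expired = [u for u in self.last_seen if not self.is_online(u, now)]
        for user_id in expired:
            del self.last_seen[user_id]
        return expired
```

A lost logout packet is then harmless: the user simply ages out at the next sweep once the heartbeats stop.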

A common approach is to store online status in a bitmap, that is, a block of memory allocated for the purpose. A 32-bit unsigned integer can represent 4294967296 distinct values. If one bit represents one user ID (such as a QQ number), with 1 for online and 0 for offline, then covering every possible ID takes only 4294967296 / (8 × 1024 × 1024) = 512 MB of memory (8 bits = 1 byte). In a real implementation, more bits can of course be assigned to each user as needed.
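As a rough illustration of the bitmap idea, here is a small sketch with one bit per user ID. The class name `OnlineBitmap` and its methods are hypothetical, and the example uses a small ID range so it runs without allocating the full 512 MB.

```python
class OnlineBitmap:
    """One bit per user ID: 1 means online, 0 means offline."""

    def __init__(self, max_users):
        # 8 user IDs fit in each byte, rounded up.
        self.bits = bytearray((max_users + 7) // 8)

    def set_online(self, user_id):
        self.bits[user_id >> 3] |= 1 << (user_id & 7)

    def set_offline(self, user_id):
        self.bits[user_id >> 3] &= ~(1 << (user_id & 7)) & 0xFF

    def is_online(self, user_id):
        return bool(self.bits[user_id >> 3] & (1 << (user_id & 7)))


# Size check for the full 32-bit ID space: 2**32 bits expressed in megabytes.
FULL_SIZE_MB = 2**32 // (8 * 1024 * 1024)  # 512

bitmap = OnlineBitmap(max_users=1024)      # small range for demonstration
bitmap.set_online(1000)
```

Looking up or flipping a user's status is a single byte access, which is why the structure suits a service that every login and heartbeat touches.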

Therefore, the kick-off function is shown in the figure.

When the user logs in, the interface machine first checks whether a session for that user already exists on the machine and updates it if so. It then sends a login packet to the presence service, which checks whether the user is already online; if so, it updates the status information and sends a kick-off packet to the interface machine recorded for the previous login (identified by its IP). When that interface machine receives the kick-off packet, it checks whether it holds a session for the user ID in the packet, and if so, sends the kick-off packet to the client and deletes the session.
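The flow can be sketched with plain method calls standing in for network packets. Everything here is an assumption for illustration: the class and method names, the `kicked` list that stands in for sending a kick-off packet to a client, and one simplification in particular, namely that the presence service only sends a kick-off when the previous login was on a different interface machine, while a repeat login on the same machine is handled locally.

```python
class PresenceService:
    """Global view: which interface machine each online user is attached to."""

    def __init__(self):
        self.status = {}    # user_id -> IP of the interface machine holding the session
        self.machines = {}  # IP -> InterfaceMachine (stands in for the network)

    def register(self, machine):
        self.machines[machine.ip] = machine

    def on_login(self, user_id, ip):
        previous_ip = self.status.get(user_id)
        self.status[user_id] = ip
        # Simplification (assumption): only kick across machines.
        if previous_ip is not None and previous_ip != ip:
            self.machines[previous_ip].on_kick(user_id)


class InterfaceMachine:
    def __init__(self, ip, presence):
        self.ip = ip
        self.presence = presence
        self.sessions = {}  # user_id -> client handle
        self.kicked = []    # clients that were sent a kick-off packet
        presence.register(self)

    def login(self, user_id, client):
        old_client = self.sessions.get(user_id)
        if old_client is not None:
            self.kicked.append(old_client)  # same machine: kick the earlier login locally
        self.sessions[user_id] = client     # create or update the local session
        self.presence.on_login(user_id, self.ip)

    def on_kick(self, user_id):
        client = self.sessions.pop(user_id, None)
        if client is not None:              # only act if a session really exists here
            self.kicked.append(client)
```

Logging the same user in on two machines in turn leaves exactly one session alive, on the machine used last.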

In practice, there are still many details to pay attention to.

Back to the situation where users log in to the same interface machine one after another:

The kick-off process in the figure is correct, but what happens if steps 10 and 13 arrive in reverse order (which is common with UDP)? You can work it out yourself: the kick-off packet will kick off A', the second login. This is not what we want. What should be done?

The solution comes down to several details. ① When the interface machine receives login-success packet No. 13, it first replaces session A with session A', and only then sends the kick-off packet to client A (so that two live sessions do not end up kicking each other off). ② The kick-off packet must carry identifying information beyond the user ID: the unique identifier of a session should take the form ID+XXX (I used ID+LoginTime at first), where XXX distinguishes one login from another. ③ When the interface machine receives a kick-off packet, it only needs to check whether ID+XXX matches the current session to decide whether to send the kick-off packet to the client.
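The three details can be sketched on a single interface machine as follows. This is a hedged illustration: `SessionTable`, `on_login_success`, `on_kick_packet` and the `sent_kicks` list are hypothetical names, and the per-login token (the "XXX" part) is treated as an opaque string.

```python
class SessionTable:
    """Sessions on one interface machine, keyed by user ID plus a per-login token."""

    def __init__(self):
        self.sessions = {}    # user_id -> token of the current login
        self.sent_kicks = []  # (user_id, token) pairs kicked; stands in for client packets

    def on_login_success(self, user_id, new_token):
        old_token = self.sessions.get(user_id)
        self.sessions[user_id] = new_token              # detail 1: replace the session first
        if old_token is not None and old_token != new_token:
            self.sent_kicks.append((user_id, old_token))  # then kick the old client

    def on_kick_packet(self, user_id, token):
        # Details 2 and 3: act only if the packet names the session that is still current.
        if self.sessions.get(user_id) != token:
            return False                                # stale kick-off, ignore it
        del self.sessions[user_id]
        self.sent_kicks.append((user_id, token))
        return True
```

If the kick-off for the first login arrives after the second login has already replaced the session, the token no longer matches and the newer login survives; if it arrives first, the old session is removed and the new one is simply created afterwards.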

In reality, problems are always strange, but there are always more solutions than problems.

For example, in one project I ran into clock drift (of a few seconds) between the interface machine and the presence service. In that case the unique identifier in the kick-off packet cannot take the form user ID+LoginTime; instead, a unique UUID can be generated for each login. There are many similar problems, so I won't go through them all.
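Generating such an identifier is straightforward. The sketch below uses Python's standard `uuid` module; the function name `new_login_token` and the `ID:uuid` layout are assumptions chosen for illustration, not the format used in the original system.

```python
import uuid


def new_login_token(user_id):
    """Return a per-login identifier that does not depend on any machine's clock."""
    return f"{user_id}:{uuid.uuid4().hex}"


first = new_login_token(10001)
second = new_login_token(10001)
```

Because the random part is generated once at login and then only compared for equality, it does not matter whether the interface machine and the presence service agree on the time.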

To sum up, this article introduces the overall structure of the friend system and the implementation of some of its modules. Implementing any single module of a distributed system is not especially difficult. The real difficulty lies in handling the problems caused by a complex network environment (packet loss, delay, and so on) and by server failures (for example, adding redundant servers to cope with downtime, which in turn introduces new problems).
