Message Synchronization Mechanism Based on the TimeLine Model

2025-07-19 Update. From: SLTechnology News&Howtos

Although our current IM system has been split into microservices, its core message delivery model is still the one shown in the following figure; see "the complete Design of a massive online user Instant messaging system (IM)".

Under this model, the basic steps of message synchronization are as follows (the step numbers do not correspond to the numbers in the figure):

1. Store the message in the recipient's offline inbox.

2. Push the message to online users.

3. The online user returns an ack for the received message.

4. The server deletes that message from the user's offline inbox.

Offline users pull their offline messages directly after logging in.
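
The four steps above can be sketched with a small in-memory model. This is only an illustration under simplifying assumptions: the class `OfflineInboxServer`, its method names, and the `pushed` list that stands in for the network push are all hypothetical and not taken from the system described here.

```python
from collections import defaultdict


class OfflineInboxServer:
    """Toy in-memory model of the inbox-based flow (all names are hypothetical)."""

    def __init__(self):
        # user_id -> {msg_id: payload}; stands in for offline inbox storage
        self.inbox = defaultdict(dict)
        self.online = set()
        # (user_id, msg_id) pairs; stands in for the real network push
        self.pushed = []

    def send(self, user_id, msg_id, payload):
        # Step 1: store the message in the recipient's offline inbox
        self.inbox[user_id][msg_id] = payload
        # Step 2: push it only if the recipient is online
        if user_id in self.online:
            self.pushed.append((user_id, msg_id))

    def ack(self, user_id, msg_id):
        # Steps 3 and 4: on ack, delete the message from the inbox
        self.inbox[user_id].pop(msg_id, None)

    def pull_offline(self, user_id):
        # After login, a user pulls whatever is still in the inbox
        return sorted(self.inbox[user_id].items())


server = OfflineInboxServer()
server.online.add("alice")
server.send("alice", 1, "hi")     # pushed, then acked below
server.ack("alice", 1)
server.send("bob", 2, "hello")    # bob is offline, so the message waits
```

Note that every ack turns into a delete against storage, and every additional client type would need its own inbox copy. Both costs are discussed next.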

This synchronization scheme has its merits:

1. The process is intuitive.

2. It needs fewer network round trips (compared with the TimeLine model described later).

But the scheme has more shortcomings:

1. We have both an App and a Web client, so an offline message must be written for each end. Because offline messages are fanned out on write, every additional copy puts more load on the server.

2. After the ack comes back, the server has to delete the corresponding message from storage. The performance of this delete step is also a problem.

This model is adequate for a relatively simple IM application. However, as messaging scenarios become more complex, especially after an SDK is introduced, it shows many drawbacks. An application built on the SDK may have many kinds of client ends, and the server cannot write a separate copy of each offline message for every one of them.

For the SDK, we use the TimeLine model to synchronize messages between client and server.

The following is DingTalk's approach, which compares a traditional architecture with a modern one. Our current IM message synchronization sits somewhere in between.

In the traditional architecture, messages are synchronized first and stored afterwards. For an online user, the message is synchronized to the receiver in real time and is not persisted once synchronization succeeds. For an offline user, or when the message cannot be synchronized in real time, the message is persisted to the offline store; when the receiver reconnects, it pulls all unread messages from the offline store, and each message is deleted from the offline store once it has been synchronized to the receiver. In such a system, the server's main job is to maintain the connection state of sender and receiver and to provide online message synchronization and offline message caching, so that a message can be delivered from sender to receiver. Because the server does not keep messages permanently, it cannot support message roaming.

In the modern architecture, messages are stored first and synchronized afterwards. The advantage is that once the receiver acknowledges a message, that message is guaranteed to have been saved in the cloud. Each message is saved in two stores. One is the message repository, which holds all messages of a session and mainly supports message roaming. The other is the message synchronization library, which is mainly used for multi-terminal synchronization on the receiver side. When a message from the sender reaches the server, the server saves it to the message repository first and then to the message synchronization library. Once the message has been persisted, the server pushes it directly to receivers that are online. Online push is not a required path, only a faster one.

Receivers that are offline, or for which the online push fails, synchronize through a single unified path: the receiver actively pulls all unsynchronized messages from the server. The server cannot know when or from where a receiver will synchronize, so it must keep every message that still needs to be synchronized to that receiver; this is the main function of the message synchronization library. A newly added device needs message roaming, which is the main role of the message repository: the full history of any session can be pulled from it.
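
The store-first-then-synchronize order can be sketched as follows. This is a minimal in-memory illustration, not the actual implementation: `StoreFirstServer`, its method names, and the use of plain lists for the two stores are assumptions made for the example.

```python
from collections import defaultdict


class StoreFirstServer:
    """Toy model of 'store first, then synchronize' (all names are hypothetical)."""

    def __init__(self):
        # session_id -> full history; supports message roaming
        self.message_repo = defaultdict(list)
        # receiver_id -> messages to synchronize; supports multi-terminal sync
        self.sync_repo = defaultdict(list)
        self.online = set()
        self.pushed = []  # stands in for the online push channel

    def handle_send(self, session_id, receiver_id, text):
        # Persist to both stores before any delivery attempt
        self.message_repo[session_id].append(text)
        self.sync_repo[receiver_id].append(text)
        # Online push is an optimization, not a required path
        if receiver_id in self.online:
            self.pushed.append((receiver_id, text))

    def pull_unsynced(self, receiver_id, already_synced):
        # Unified fallback: the receiver pulls what it has not yet seen
        return self.sync_repo[receiver_id][already_synced:]

    def roam(self, session_id):
        # A new device can fetch the whole history of a session
        return list(self.message_repo[session_id])


server = StoreFirstServer()
server.handle_send("a-b", "b", "m1")   # b is offline: stored, not pushed
server.online.add("b")
server.handle_send("a-b", "b", "m2")   # b is online: stored, then pushed
```

Because both writes happen before the push, a failed or skipped push never loses a message; the receiver can always recover it through `pull_unsynced`.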

The above is a simple comparison between the traditional architecture and the modern architecture. The synchronization and storage flow in the modern architecture is not much more complicated, yet it supports multi-terminal synchronization and message roaming. Its core is the two message stores, the "message synchronization library" and the "message repository", which are the foundation of message synchronization and storage.

Let's see what the Timeline model looks like.

The figure shows an abstract representation of the Timeline model. A Timeline can be thought of simply as a message queue with the following characteristics:

Each message has a sequence ID (SeqId), and the SeqId of a later message in the queue must be larger than that of every earlier message. SeqIds must keep growing, but they do not have to increase by exactly 1 each time.

New messages are always appended at the end, which guarantees that the SeqId of a new message is larger than that of any message already in the queue.

Given a SeqId, you can jump directly to a specific message and read it, or you can read all messages in a given range.
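
These three properties map naturally onto a sorted, append-only list. The sketch below is one possible in-memory representation, assuming integer SeqIds; the class `Timeline` and its methods `append`, `get`, and `scan` are hypothetical names chosen for illustration.

```python
import bisect


class Timeline:
    """Append-only queue ordered by SeqId (a sketch; names are hypothetical)."""

    def __init__(self):
        self._seq_ids = []    # kept in ascending order
        self._messages = []   # _messages[i] belongs to _seq_ids[i]

    def append(self, seq_id, message):
        # New messages go at the end and must carry a larger SeqId;
        # gaps between SeqIds are allowed
        if self._seq_ids and seq_id <= self._seq_ids[-1]:
            raise ValueError("SeqId must be larger than the last SeqId")
        self._seq_ids.append(seq_id)
        self._messages.append(message)

    def get(self, seq_id):
        # Random access to one message by SeqId; None if it does not exist
        i = bisect.bisect_left(self._seq_ids, seq_id)
        if i < len(self._seq_ids) and self._seq_ids[i] == seq_id:
            return self._messages[i]
        return None

    def scan(self, after_seq_id, up_to_seq_id=None):
        # Range read: all (seq_id, message) with after < seq_id <= up_to
        lo = bisect.bisect_right(self._seq_ids, after_seq_id)
        if up_to_seq_id is None:
            hi = len(self._seq_ids)
        else:
            hi = bisect.bisect_right(self._seq_ids, up_to_seq_id)
        return list(zip(self._seq_ids[lo:hi], self._messages[lo:hi]))


tl = Timeline()
tl.append(10, "a")
tl.append(13, "b")   # a gap is fine: SeqIds only need to grow
tl.append(20, "c")
```

Here `tl.scan(10)` returns the messages with SeqId 13 and 20, which is exactly the "read everything after my last synchronized point" operation that synchronization relies on.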

With these features, message synchronization can be implemented easily on top of a Timeline. In the example in the figure, the sender is A and the receiver is B, and B has multiple receiving ends: B1, B2, and B3. A sends messages to B, and they must be synchronized to all of B's ends. The messages to be synchronized are exchanged through a Timeline: all messages sent by A to B are saved in this Timeline, and each of B's ends pulls messages from it independently. After an end finishes synchronizing, it records locally the SeqId of the latest synchronized message, that is, its latest synchronization point, as the starting point for the next synchronization. The server does not save the synchronization state of each end (I think the server could also record each end's synchronization point), and each end can pull messages from any point at any time.
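
As a rough illustration of independent per-end synchronization points, the sketch below models the shared Timeline as a plain list of `(seq_id, message)` pairs and gives each end its own locally stored point. The class `Device` and its fields are hypothetical names used only for this example.

```python
class Device:
    """One receiving end of B, with its own local sync point (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.sync_point = 0   # SeqId of the latest message synchronized so far
        self.received = []

    def sync(self, timeline):
        # Pull everything after the local sync point; the server keeps no
        # per-device state in this sketch
        new = [(s, m) for s, m in timeline if s > self.sync_point]
        for seq_id, message in new:
            self.received.append(message)
            self.sync_point = seq_id
        return len(new)


# The Timeline of messages from A to B, in ascending SeqId order
timeline = [(1, "m1"), (2, "m2")]

b1 = Device("B1")
b2 = Device("B2")

first = b1.sync(timeline)     # B1 pulls m1 and m2
timeline.append((5, "m3"))    # A sends another message
second = b1.sync(timeline)    # B1 pulls only m3
late = b2.sync(timeline)      # B2 syncs later and gets all three at once
```

Each end ends up with the same messages even though they synchronized at different times, and calling `sync` again with nothing new is harmless.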

After reading about the TimeLine model, I was puzzled. If there is both push and pull, how does the client determine where its synchronization point is? In particular, what happens if a new message is pushed while the user has just opened the app and a pull is still in progress?

Here I would like to thank Brother Bing (an expert at LinkedIn) for his hint: he said that their messages are all pulled. If messages are always pulled, then what is pushed?

If you take a closer look at the diagram of the modern architecture, step 3 says "push notification". What is pushed is only a notice that a new message exists. When the client receives this notification, it pulls the messages to be synchronized, and the client and the server each maintain this end's synchronization point. To save network round trips, the client does not confirm to the server after pulling, so the points kept by the client and the server are not exactly the same, but this does not affect the business logic; this detail will be described in a separate article later. Because messages only ever arrive by pull, maintaining the synchronization point becomes very simple: the client just saves the SeqId of the latest message it has pulled.
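
The notify-then-pull flow can be sketched like this. It is a simplified single-process model: the classes `Server` and `Client`, the `online` flag, and the direct method call that stands in for a push notification are all assumptions for illustration, and the server-side copy of the synchronization point is left out.

```python
class Client:
    """Keeps only the SeqId of the latest pulled message (hypothetical names)."""

    def __init__(self):
        self.last_seq_id = 0
        self.messages = []
        self.online = True

    def on_notify(self, server):
        # The notification carries no message body; it only triggers a pull
        self.sync(server)

    def sync(self, server):
        for seq_id, message in server.pull(self.last_seq_id):
            self.messages.append(message)
            self.last_seq_id = seq_id


class Server:
    def __init__(self):
        self.timeline = []      # (seq_id, message) in ascending order
        self.subscribers = []
        self._next_seq = 1

    def publish(self, message):
        # Store first, then notify online clients that something is new
        self.timeline.append((self._next_seq, message))
        self._next_seq += 1
        for client in self.subscribers:
            if client.online:
                client.on_notify(self)

    def pull(self, after_seq_id):
        return [(s, m) for s, m in self.timeline if s > after_seq_id]


server = Server()
client = Client()
server.subscribers.append(client)

server.publish("m1")      # client is online: notified, pulls m1
client.online = False
server.publish("m2")      # missed notifications while offline
server.publish("m3")
seen_while_offline = list(client.messages)
client.online = True
client.sync(server)       # on reconnect, one pull catches up
```

Since every message reaches the client through `pull`, a notification that arrives during an ongoing pull can at worst trigger one more pull that returns nothing new.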

At this point, we have a message synchronization model that supports multiple ends.

So is there any room for optimization in this scheme?

Compared with our current approach, this method increases the number of network round trips. Is there a way to save that network overhead and still enjoy the TimeLine model's friendly support for multiple ends?

I have read an article saying that WeChat numbers each user's messages with strictly incrementing IDs, that is, the messages in each user's TimeLine are numbered consecutively: the user's first message is 1, the second is 2, and so on.

The development cost of such a numbering service is fairly high, so why does WeChat do it? I now think one of the reasons is to reduce network round trips. Pushing a notification and then pulling the messages costs one extra round trip. If messages are numbered strictly consecutively, the traditional message push and the new push notification can be combined: the message pushed to the client carries a strictly incrementing message ID, and the client can work out from that ID whether it needs to pull. If the pushed message ID is exactly 1 larger than the largest message ID the client already has, nothing was missed and no pull is needed.
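
The client-side decision can be sketched as below. This only illustrates the reasoning in the paragraph above and is not WeChat's actual implementation; the class `Client`, the `pull` callback, and the `pull_count` counter are hypothetical.

```python
class Client:
    """Decides from a pushed message ID whether a pull is needed (a sketch)."""

    def __init__(self, pull):
        self.last_seq_id = 0
        self.messages = []
        self.pull = pull       # callable(after_seq_id) -> [(seq_id, message)]
        self.pull_count = 0    # counts the extra round trips actually made

    def on_push(self, seq_id, message):
        if seq_id <= self.last_seq_id:
            return             # already have it: ignore the duplicate
        if seq_id == self.last_seq_id + 1:
            # Exactly the next ID: nothing was missed, no pull needed
            self.messages.append(message)
            self.last_seq_id = seq_id
            return
        # A gap means at least one message was missed: fall back to a pull
        self.pull_count += 1
        for s, m in self.pull(self.last_seq_id):
            self.messages.append(m)
            self.last_seq_id = s


# Server-side log with strictly consecutive IDs (illustrative data)
log = [(1, "m1"), (2, "m2"), (3, "m3"), (4, "m4")]


def pull(after_seq_id):
    return [(s, m) for s, m in log if s > after_seq_id]


client = Client(pull)
client.on_push(1, "m1")   # consecutive: applied directly
client.on_push(2, "m2")   # consecutive: applied directly
client.on_push(4, "m4")   # push of 3 was lost: the gap triggers one pull
client.on_push(3, "m3")   # late duplicate: ignored
```

This check only works because the IDs are consecutive. With SeqIds that merely grow, a jump from 2 to 4 would not tell the client whether message 3 ever existed.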

There are also many challenges at the implementation level. The difficulty is how to map the logical model onto a physical model or onto specific middleware; the details will be described later.
