How to Understand the Information Processing Architecture of Netflix

2025-01-19 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains how to understand the information processing architecture of Netflix. Many people run into these questions when working on real systems, so the sections below walk through what Netflix collects, why it collects it, and how the data is processed.

Netflix, an online movie rental provider, has been rated the top website for customer satisfaction five times in a row. In the past seven years, Netflix's streaming service has grown from thousands of users watching occasionally to millions of users who watch more than 2 billion hours per month on average. Netflix could not be this successful without collecting and analyzing user behavior data. So what data does Netflix collect, what is the data used for, and what does the processing architecture look like?

When a user starts watching a movie or TV show on Netflix, the data system creates a "viewing session" (a "view") and collects all event information describing that session. The viewing session data architecture supports many use cases, from user experience to data analysis. The three main ones are:

What videos has a user watched? The system needs each user's full viewing history in order to recommend relevant content and to display that history in the "recently watched" row on the page. What users watch is also an important signal for measuring their interests and for making product and content decisions.

Where did the user leave off in a video? For each movie or TV show, Netflix records how far each user has watched and the point at which they stopped. This lets users resume a video on the same device or on another one.

What else is being watched on the account right now? Account sharing among family members lets everyone watch their favorite videos at any time, but it also means that someone has to stop watching when the number of concurrent streams on the account exceeds its limit. To handle this scenario, Netflix's viewing session data system collects periodic signals for each session to determine whether a member is still watching the video.
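The three scenarios above can be sketched as queries over one in-memory record per viewing session. This is a minimal illustration, not Netflix's data model: the class names, the field names, and the 60-second `HEARTBEAT_TIMEOUT_S` value are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical threshold: a session with no signal for this long counts as ended.
HEARTBEAT_TIMEOUT_S = 60


@dataclass
class ViewingSession:
    account_id: int
    video_id: str
    device_id: str
    position_s: int = 0          # last reported playback position, in seconds
    last_heartbeat: float = 0.0  # timestamp of the latest periodic signal


@dataclass
class ViewingStore:
    # Sessions keyed by (account_id, video_id, device_id).
    sessions: dict = field(default_factory=dict)

    def heartbeat(self, account_id, video_id, device_id, position_s, now):
        """Record a periodic signal; creates the session on first contact."""
        key = (account_id, video_id, device_id)
        session = self.sessions.get(key)
        if session is None:
            session = ViewingSession(account_id, video_id, device_id)
            self.sessions[key] = session
        session.position_s = position_s
        session.last_heartbeat = now

    def history(self, account_id):
        """Scenario 1: videos the account has watched, most recent first."""
        mine = [s for s in self.sessions.values() if s.account_id == account_id]
        mine.sort(key=lambda s: s.last_heartbeat, reverse=True)
        seen = []
        for s in mine:
            if s.video_id not in seen:
                seen.append(s.video_id)
        return seen

    def resume_position(self, account_id, video_id):
        """Scenario 2: latest position for a video, across all devices."""
        matches = [s for s in self.sessions.values()
                   if s.account_id == account_id and s.video_id == video_id]
        if not matches:
            return 0
        return max(matches, key=lambda s: s.last_heartbeat).position_s

    def active_streams(self, account_id, now):
        """Scenario 3: sessions whose last signal is recent enough."""
        return [s for s in self.sessions.values()
                if s.account_id == account_id
                and now - s.last_heartbeat <= HEARTBEAT_TIMEOUT_S]


store = ViewingStore()
store.heartbeat(1, "movie-a", "tv", position_s=120, now=1000.0)
store.heartbeat(1, "movie-a", "phone", position_s=300, now=1030.0)
store.heartbeat(1, "show-b", "tablet", position_s=45, now=1050.0)
```

With this data, `store.resume_position(1, "movie-a")` picks the phone's position because its signal is the most recent, and `store.active_streams(1, now=1080.0)` excludes the TV session, whose last signal is older than the timeout.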

These scenarios depend on a powerful and stable data processing system. Netflix's current architecture evolved from an early single-database application, when the main requirements were to serve video to users with low latency and to handle a fast-growing data set from millions of Netflix streaming devices. Over the past three years Netflix has continually improved the architecture, and the system now handles hundreds of billions of events every day.

The current architecture diagram is as follows:

The main interface of the whole architecture is the viewing session service, which is divided into two layers: a stateful layer and a stateless layer. The stateful layer stores the latest data for all active views in memory. The data is partitioned across N stateful nodes by taking the user account ID mod N. When a stateful node comes online, it goes through a selection process to determine which partition of the data it owns. All persistent data is stored in Cassandra, with Memcached layered on top of Cassandra to provide a low-latency read path, although session data read this way may be stale. If a stateful node fails, the viewing data for its partition can be neither read nor written. The stateless layer was introduced to address this problem: it improves availability by returning possibly stale data to users when a stateful node cannot be reached.
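The partitioning and fallback behavior described above can be sketched in a few lines. This is an illustration under stated assumptions, not Netflix's implementation: `StatefulNode`, `ViewingService`, and `publish_to_cache` are hypothetical names, and a plain dictionary stands in for the Memcached read path.

```python
class NodeUnavailable(Exception):
    """Raised when a stateful node cannot be reached."""


class StatefulNode:
    """Holds the latest data for the active views of its partition in memory."""

    def __init__(self):
        self.views = {}     # account_id -> latest view data
        self.online = True

    def read(self, account_id):
        if not self.online:
            raise NodeUnavailable()
        return self.views.get(account_id)

    def write(self, account_id, data):
        if not self.online:
            raise NodeUnavailable()
        self.views[account_id] = data


class ViewingService:
    def __init__(self, num_nodes):
        self.nodes = [StatefulNode() for _ in range(num_nodes)]
        self.cache = {}     # stand-in for the cached read path; may be stale

    def node_for(self, account_id):
        # mod N partitioning: account ID modulo the number of stateful nodes
        return self.nodes[account_id % len(self.nodes)]

    def write(self, account_id, data):
        self.node_for(account_id).write(account_id, data)

    def publish_to_cache(self, account_id):
        # Hypothetical step: copy the node's current state into the cache.
        data = self.node_for(account_id).read(account_id)
        self.cache[account_id] = dict(data) if data is not None else None

    def read(self, account_id):
        """Return (data, is_stale); fall back to the cache if the node is down."""
        try:
            return self.node_for(account_id).read(account_id), False
        except NodeUnavailable:
            return self.cache.get(account_id), True


service = ViewingService(num_nodes=4)
service.write(10, {"video": "movie-a", "position_s": 120})
service.publish_to_cache(10)
service.write(10, {"video": "movie-a", "position_s": 300})
fresh = service.read(10)               # node is up: latest data, not stale
service.node_for(10).online = False    # simulate a failed stateful node
fallback = service.read(10)            # node is down: older cached copy, flagged stale
```

The fallback read keeps the service answering while the node is down, at the cost of returning the position from before the last write. Writes for that partition still fail, which matches the limitation described above.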

But even with many improvements, the above architecture still has some flaws:

The simple mod N partitioning used by the stateful layer is prone to hot spots, because viewing activity is not evenly distributed across accounts, whereas the Cassandra layer is not affected by these hot spots. In addition, moving from a single AWS Region to multiple AWS Regions would require a custom mechanism for communicating state between the stateful layers in different Regions, which greatly increases the complexity of the system.

The viewing session service encapsulates the collection, processing, and provision of session data. As the system has evolved and gained features, the service has taken on more and more responsibilities, which makes it harder to operate and maintain.

Memcached provides very good throughput and latency characteristics, but a technology with native support for first-class data types and operations, such as append, would better meet the requirements.
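The third flaw can be illustrated by comparing two ways of adding an entry to a member's view list. The sketch below assumes the cached value is a whole list that is updated by reading it, modifying it, and writing it back; the source does not spell out this detail. `BlobCache` and `ListStore` are hypothetical in-memory stand-ins, not the Memcached or Redis client APIs.

```python
class BlobCache:
    """Stand-in for a cache used as an opaque value store: get and set only."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        value = self._data.get(key)
        return list(value) if value is not None else []  # caller gets a copy

    def set(self, key, value):
        self._data[key] = list(value)


class ListStore:
    """Stand-in for a store with a native list type and an append operation."""

    def __init__(self):
        self._data = {}

    def append(self, key, item):
        self._data.setdefault(key, []).append(item)

    def get(self, key):
        return list(self._data.get(key, []))


# Two writers interleave on the blob cache: both read before either writes.
blob = BlobCache()
seen_by_a = blob.get("views:1")
seen_by_b = blob.get("views:1")
blob.set("views:1", seen_by_a + ["movie-a"])
blob.set("views:1", seen_by_b + ["show-b"])   # overwrites writer A's update

# The same two updates as appends: no read step, so nothing is overwritten.
lists = ListStore()
lists.append("views:1", "movie-a")
lists.append("views:1", "show-b")
```

After this sequence the blob cache holds only `show-b`, while the list store holds both entries. A store-side append also avoids transferring the whole list on every update.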

In order to expand the system to meet the needs of the next order of magnitude, Netflix is rethinking its infrastructure. The main design principles of the new system include:

Availability is more important than consistency.

Microservices. Components that were combined in the stateful architecture are split into separate services according to their main purpose: collecting, processing, or providing data. State management is delegated to the persistence layer, which leaves the application layer stateless, and the components are decoupled by event queues.

Polyglot persistence. Use several persistence technologies, each for what it does best: Cassandra for high-volume, low-latency writes, and Redis for high-volume, low-latency reads.
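The three principles can be sketched together as a small pipeline. This is only an illustration under assumptions: the function names are hypothetical, a `queue.Queue` stands in for the event queue, a list stands in for the write-optimized store (Cassandra in the text), and a dictionary stands in for the read-optimized store (Redis in the text).

```python
import queue

event_queue = queue.Queue()   # decouples the collecting and processing services
write_store = []              # stand-in for the write-optimized store: an event log
read_store = {}               # stand-in for the read-optimized store: per-account view


def collect(account_id, video_id, position_s):
    """Collecting service: accepts an event and enqueues it; keeps no state."""
    event_queue.put({"account_id": account_id,
                     "video_id": video_id,
                     "position_s": position_s})


def process_pending():
    """Processing service: drains the queue and updates both stores."""
    processed = 0
    while True:
        try:
            event = event_queue.get_nowait()
        except queue.Empty:
            return processed
        write_store.append(event)                            # durable write path
        positions = read_store.setdefault(event["account_id"], {})
        positions[event["video_id"]] = event["position_s"]   # read-optimized view
        processed += 1


def provide(account_id):
    """Providing service: answers reads from the read-optimized store only."""
    return dict(read_store.get(account_id, {}))


collect(1, "movie-a", 120)
collect(1, "movie-a", 300)
collect(2, "show-b", 45)
before = provide(1)           # nothing processed yet, so the read is empty
count = process_pending()
after = provide(1)            # reflects the latest processed event per video
```

Reads never wait on the queue, so `provide` stays available even when processing lags; the trade-off is that it may briefly return older data, which is the availability-over-consistency choice stated in the first principle.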

The new architecture that follows the above principles is implemented as follows:

Of course, this diagram shows only Netflix's current design, and we do not know how much of it has been implemented. Netflix says that re-architecting a critical system to scale by another order of magnitude is a very difficult task that requires lengthy development, testing, and validation, and that migration is not easy. Guided by these architectural principles, however, Netflix believes that the next-generation system it is building can meet its large-scale, fast-growing needs.

This concludes "How to Understand the Information Processing Architecture of Netflix". Thank you for reading. To learn more about the industry, follow the website for further practical articles.
