

Revealing the IT infrastructure behind LOL: a journey through deployment diversity

2025-01-14 Update | Source: SLTechnology News&Howtos > Network Security


Shulou(Shulou.com)06/01 Report--

Starting with this issue, we will share a series of Tungsten Fabric user case articles to discover more application scenarios of TF together. The protagonist of the "Revealing LOL" series is TF user Riot Games. As the developer and operator of League of Legends (LOL), Riot Games faces the challenge of complex deployments on a global scale. Let's reveal the "heroes" behind LOL and see how they run online services.

By Jonathan McCaffrey (Source: Riot Games)

My name is Jonathan McCaffrey, and I work on Riot's infrastructure team. This is the first article in a series where we'll dive into how we deploy and operate backend features globally. Before delving into technical details, it's important to understand how Rioters think about feature development. At Riot, player value is paramount, and development teams often work directly with the player community to deliver features and improvements. To provide the best player experience, we need to move quickly and be able to change plans rapidly based on feedback. The infrastructure team's mission is to pave the way for our developers to do just that: the more Riot's teams are empowered, the faster they can deliver features to players.

Of course, easier said than done! Given the diversity of our deployments, many challenges arise: our servers are spread across public clouds, private data centers, and partner environments like Tencent and Garena, all of which are geographically and technologically diverse.

This complexity places a huge burden on feature teams when they are ready to ship components. That's where the infrastructure team comes in: we remove some of the barriers to deployment with our container-based internal cloud environment, which we call "rCluster." In this article, I'll discuss Riot's journey from manual deployments to launching features with rCluster. To illustrate rCluster's offerings and technologies, I'll walk through the release of the Hextech Crafting system (Hextech Crafting is the name of League of Legends' loot-chest unboxing system).

a little history

When I first started at Riot seven years ago, we didn't have many deployment or server-management processes; Riot was a visionary startup with a small budget and a need to grow fast. While building the production infrastructure for League of Legends, we rushed to meet the demands of the game, developers' demands for more features, and regional teams' demands to open new regions around the world. We provisioned servers and applications manually, with little thought given to principles or strategic planning.

Along the way, we turned to Chef for many common deployment and infrastructure tasks. At the same time, we began using public clouds more and more for big data and web workloads. These changes also triggered numerous changes in our network design, vendor selection, and team structure.

Our data centers housed thousands of servers, and new servers were installed for nearly every new application. Each new server lived in a manually created VLAN, with routing and firewall rules configured for secure access between networks. Although this process helped security and clearly defined fault domains, it was time-consuming and laborious. Even more troubling, most new features at the time were designed as small web services, so the number of independent applications in our LoL ecosystem proliferated.

Most importantly, development teams lacked confidence in their ability to test applications, especially around deployment concerns such as configuration and network connectivity. Because applications were tightly tied to physical environments, differences between production data center environments were not replicated in QA (testing), Staging (pre-release), and PBE (the Public Beta Environment). Each environment was handmade, unique, and ultimately inconsistent. (Translator's note: this passage describes two problems: applications were tightly coupled to their environments, and because environments differed between teams and departments, those inconsistencies could cause problems when applications went live.)

As we grappled with the challenges of manually provisioning servers and networks in an ecosystem with an ever-growing number of applications, Docker began to gain popularity among our development teams as a way to address configuration consistency and development-environment issues. Once we started using it, it became clear that Docker could do much more and would play a key role in how we approached infrastructure.

for 2016 and beyond

The infrastructure team set a goal of addressing these issues for players, developers, and Riot for the 2016 season. By the end of 2015, we had moved from deploying features manually to deploying features like Hextech Crafting across Riot's regions in an automated and consistent manner. Our solution was rCluster, a new system that leverages Docker and software-defined networking (SDN) in a microservices architecture. Switching to rCluster eliminated the inconsistencies in our environments and deployment processes and allowed product teams to focus on their products.

Let's dig deeper into this technology to see how rCluster supports features like Hextech Crafting in the background. To explain, Hextech Crafting is a League of Legends feature that gives players a new way to unlock in-game items.

Internally called "Loot," this feature consists of three core components:

Loot service: a Java application that serves Loot requests via an HTTP/JSON REST API.

Loot cache: a caching cluster using Memcached, with a small Go sidecar for monitoring, configuration, and start/stop operations.

Loot database: a MySQL database cluster with one master and multiple slaves.

When you open the crafting screen, the following will happen:

The player opens the crafting screen in the client.

The client makes an RPC call to the front-end application (known as "feapp"), which proxies calls between players and internal backend services.

The feapp looks up the Loot service in service discovery to find its IP and port.

The feapp makes an HTTP GET call to the Loot service.

The Loot service checks the Loot cache to see whether the player's inventory is there.

The inventory is not in the cache, so the Loot service queries the Loot database for the items the player currently owns and populates the cache with the result.

The Loot service responds to the GET call.

The feapp sends an RPC response back to the client.
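The cache-aside read described above (check the cache, fall back to the database on a miss, then populate the cache) can be sketched as a minimal simulation. The in-memory dicts stand in for Memcached and MySQL, and all names here are illustrative, not Riot's actual code:

```python
# Minimal simulation of the Loot service's cache-aside read path.
# The dicts below stand in for the Memcached cluster and the MySQL
# database; nothing here is Riot's real implementation.

loot_cache = {}  # stands in for Memcached
loot_db = {"player-123": ["chest", "key", "skin-shard"]}  # stands in for MySQL

def get_inventory(player_id):
    """Return a player's inventory, filling the cache on a miss."""
    if player_id in loot_cache:
        return loot_cache[player_id], "hit"
    # Cache miss: read from the database and populate the cache.
    items = loot_db.get(player_id, [])
    loot_cache[player_id] = items
    return items, "miss"

items, status = get_inventory("player-123")    # first call misses
items2, status2 = get_inventory("player-123")  # second call hits the cache
```

The first lookup pays the database cost; subsequent lookups for the same player are served from the cache until it is invalidated.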

Working with the Loot team, we built the server and cache layers as Docker containers, with their deployment configurations defined in JSON files, as follows:

Loot Server JSON Example:
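The example itself did not survive in this copy of the article; a hypothetical reconstruction of what such a deployment descriptor might look like (all field names and values are illustrative, not Riot's actual schema):

```json
{
  "name": "euw1.loot.lootserver",
  "service": {
    "appname": "loot.lootserver",
    "discovery": "loot"
  },
  "containers": [
    {
      "image": "loot/lootserver",
      "ports": [8080],
      "resources": { "cpu": 2, "memory": "4G" },
      "count": 12
    }
  ]
}
```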

Loot Cache JSON Example:
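This example was also lost from this copy; a hypothetical reconstruction in the same illustrative schema, showing the Memcached container paired with its Go sidecar:

```json
{
  "name": "euw1.loot.memcached",
  "service": {
    "appname": "loot.memcached",
    "discovery": "loot-cache"
  },
  "containers": [
    {
      "image": "loot/memcached",
      "ports": [11211],
      "count": 3
    },
    {
      "image": "loot/memcached-sidecar",
      "ports": [8081]
    }
  ]
}
```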

However, to actually deploy this functionality and make real progress on the problems described above, we needed to create clusters that could support Docker in North America, South America, Europe, Asia, and elsewhere. This required us to solve a number of difficult problems, such as:

Scheduling containers

Networking with Docker

Continuous delivery

Running dynamic applications

These components of the rCluster system will be described in more detail in subsequent articles, and here I will briefly outline each component.

Scheduling

We implemented container scheduling in the rCluster ecosystem using software we wrote called Admiral. Admiral talks to the Docker daemons on a range of physical machines to understand their current live state. Users make requests over HTTPS by sending the JSON shown above, which Admiral uses to update its desired state for the relevant containers. Admiral then continuously scans both the cluster's live state and the desired state to figure out what actions need to be taken. Finally, Admiral calls the Docker daemons again to start and stop containers, converging on the desired state.
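The convergence step described above can be sketched as a simple reconciliation function. This is an illustrative simulation, not Admiral's actual code; the sets stand in for the desired and live container inventories:

```python
# Illustrative desired-state vs. live-state reconciliation in the
# style described for Admiral. Each set stands in for a container
# inventory (desired: what should run; live: what Docker reports).

def reconcile(desired, live):
    """Return the (start, stop) actions needed to converge live
    state onto desired state."""
    to_start = sorted(desired - live)  # wanted but not running
    to_stop = sorted(live - desired)   # running but no longer wanted
    return to_start, to_stop

desired = {"lootserver-1", "lootserver-2", "loot-memcached-1"}
live = {"lootserver-1", "loot-memcached-1", "old-build-7"}

to_start, to_stop = reconcile(desired, live)
```

Running this comparison in a loop is what lets the scheduler heal crashed containers and drain hosts: any divergence between the two views produces corrective start/stop actions.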

If a container crashes, Admiral can detect the difference between the real-time state and the expected state and launch the container on another host to correct it. This flexibility makes it easier to manage servers because we can seamlessly "drain" them, maintain them, or re-enable them to handle workloads.

Admiral is similar to Marathon, an open source tool, so we are looking at porting efforts to leverage Mesos, Marathon and DC/OS. If this work bears fruit, we will discuss it in a future article.

Networking with Docker

Once the container is up and running, we need to provide network connectivity between the Loot application and the rest of the ecosystem. To do this, we leveraged OpenContrail to provide a dedicated network for each application and let our development team manage its policies themselves using JSON files in GitHub.

Loot Server Network:
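The policy file itself was lost from this copy; a hypothetical reconstruction of such a per-application network policy (all source labels and ports are illustrative, not Riot's actual configuration):

```json
{
  "inbound": [
    { "source": "feapp:euw1", "ports": ["main"] },
    { "source": "loot.lootserver:euw1", "ports": ["main"] },
    { "source": "riot.offices:global", "ports": ["main", "jmx"] }
  ],
  "ports": {
    "main": ["8080/tcp"],
    "jmx": ["1099/tcp"]
  }
}
```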

Loot Cache Network:
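Likewise lost from this copy; a hypothetical reconstruction in the same illustrative style, allowing only the Loot server to reach Memcached while offices can reach the sidecar:

```json
{
  "inbound": [
    { "source": "loot.lootserver:euw1", "ports": ["memcached"] },
    { "source": "riot.offices:global", "ports": ["sidecar"] }
  ],
  "ports": {
    "memcached": ["11211/tcp"],
    "sidecar": ["8081/tcp"]
  }
}
```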

When an engineer changes this configuration in GitHub, a transformation job is run and API calls are made in Contrail to create and update policies for their application's private network.

Contrail uses a technology called overlay networking to implement these private networks. In our case, Contrail uses GRE tunnels between compute hosts and gateway routers to manage traffic entering and leaving the overlay tunnels and heading to the rest of the network. The OpenContrail system is inspired by, and conceptually very similar to, standard MPLS L3VPN. Further architectural details can be found here. (Note: OpenContrail has since been renamed Tungsten Fabric.)

In implementing this system, we must address some key challenges:

Integrating Contrail with Docker

Allowing other parts of the network (outside rCluster) seamless access to the new overlay networks

Allowing applications in one cluster to communicate with applications in another cluster

Running overlay networks on AWS

Building edge-facing applications (such as HA load balancers) in overlays

Continuous Delivery

For Loot applications, the CI flow is as follows:

The overall goal here is that when the Master repo changes, a new application container will be created and deployed to the QA environment. With this workflow, teams can quickly iterate through their code and see the changes reflected in the actual game. Tight feedback loops make it possible to quickly improve the experience, which is the main goal of Riot's "player focus" project.

Running dynamic applications

So far, we've talked about how to build and deploy features like Hextech Crafting, but anyone who has spent much time working with container environments like this knows that's not the whole problem.

In the rCluster model, containers have dynamic IP addresses that change constantly as containers move between hosts. This is a radical departure from our previous static servers and deployment approach, and it requires new tools and processes to work.

Some of the key issues are as follows:

How do we monitor an application whose capacity and endpoints are constantly changing?

How does one application know the endpoint of another if that endpoint keeps changing?

How do you triage an application's problems if you can't ssh into the container, and its logs disappear every time a new container starts?

If containers are baked at build time, how do you configure things like database passwords, or options that differ between regions such as Turkey and North America?
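The last question, separating build-time images from runtime configuration, is commonly solved by injecting settings into the container's environment when the scheduler starts it. A minimal sketch under that assumption; the variable names, regions, and toggle are hypothetical, not Riot's actual scheme:

```python
# A container image is baked once; per-region settings such as
# database credentials or feature toggles are injected as environment
# variables at container start time, not at build time.

def load_config(environ):
    """Resolve runtime configuration from the container environment."""
    return {
        "db_password": environ.get("LOOT_DB_PASSWORD", ""),
        "region": environ.get("LOOT_REGION", "NA"),
        # A hypothetical region-specific toggle, resolved at start-up.
        "new_crafting_ui": environ.get("LOOT_REGION") == "TR",
    }

# Simulated environment as a scheduler might set it for a Turkish shard.
cfg = load_config({"LOOT_DB_PASSWORD": "s3cret", "LOOT_REGION": "TR"})
```

The same image can then serve any region; only the injected environment differs between deployments.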

To solve these problems, we had to build a microservices platform that handles things like service discovery, configuration management, and monitoring. In the final part of this series, we'll delve into the details of that system and the problems it solves.

conclusion

Hopefully this article has given you an overview of the various issues we're trying to solve to make Riot easier to deliver player value. As mentioned earlier, we'll focus on rCluster's use of scheduling, networking with Docker, and running dynamic applications in subsequent articles.

If you are on a similar journey or would like to participate in the discussion, you are welcome to contact us.

Follow WeChat: TF Chinese Community
