How to use UCloud's Terraform-based resource orchestration tool

2025-02-22 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains in detail how to use UCloud's Terraform-based resource orchestration tool. It is offered as a reference for interested readers.

Background

As users' resource consumption on UCloud has grown exponentially, managing resources by hand with API/SDK scripts no longer meets their needs. The UCloud R&D team therefore built a set of resource orchestration tools on top of Terraform to lower the cost of managing cloud resources, give users a secure, reliable and highly consistent product experience, and reduce the risk of migrating to the cloud as far as possible.

Terraform represents cutting-edge technology and standards in the industry. Building on it, together with UCloud CLI and other tools, we wrote a new generation of UCloud resource orchestration tools that extend Terraform and make infrastructure programmable. In one case where traffic is distributed to cloud hosts through ULB, the build time dropped from 3 minutes 20 seconds with the traditional method to 43 seconds with the new scheme, and the efficiency, stability and descriptiveness of orchestration improved significantly.

What is Terraform?

Terraform is HashiCorp's open source, multi-cloud resource orchestration tool. It has a mature ecosystem and partnerships with a number of mainstream cloud vendors.

Users describe their infrastructure in a dedicated configuration language, HCL (HashiCorp Configuration Language). Terraform parses the configuration, builds the relationships between resources, generates an execution plan, and manages the entire infrastructure lifecycle by calling the UCloud public cloud API.

Compared with other cloud resource management methods, the main features of Terraform are:

It is broadly compatible: more than 40 public cloud vendors at home and abroad (4 of them in China, including UCloud) and more than 200 software service providers support it.

It follows the IaC (Infrastructure as Code) approach. Infrastructure is described in a domain-specific language, which removes semantic ambiguity from infrastructure automation and reduces the uncertainty introduced by human factors.

It generates a readable execution plan before performing any orchestration action, so critical infrastructure changes can be fully reviewed, which keeps the infrastructure reliable.

It models the relationships between resources as a DAG (Directed Acyclic Graph). Thanks to the topological properties of a DAG, when resource attributes or relationships change, independent changes are executed in parallel as far as possible.

(the picture is from Terraform)
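To make the DAG point concrete, here is a minimal sketch of how a dependency graph yields groups of resources that can be changed concurrently. The resource names are hypothetical, and the batch-by-batch walk is a simplification for illustration, not Terraform's actual graph walker.

```python
from graphlib import TopologicalSorter

# Hypothetical resource graph: each key depends on the resources in its set.
DEPENDENCIES = {
    "vpc": set(),
    "subnet": {"vpc"},
    "security_group": set(),
    "host_a": {"subnet", "security_group"},
    "host_b": {"subnet", "security_group"},
    "load_balancer": {"host_a", "host_b"},
}


def parallel_batches(dependencies):
    """Group resources into batches; members of one batch do not depend on each other."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # nodes whose dependencies are all done
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(parallel_batches(DEPENDENCIES), start=1):
        print(f"batch {number}: {batch}")
```

In this sketch the two hosts land in the same batch because neither depends on the other, while the load balancer has to wait for both.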

The following table compares resource orchestration with traditional resource management:

Table 1: Comparison between resource orchestration and traditional resource management

As the comparison shows, resource orchestration has clear advantages over traditional resource management in an automated DevOps environment. UCloud resource orchestration currently covers the core IaaS products, and support for more products will be added over time.

Application scenario

Users benefit from Terraform quickly. Without a resource orchestration tool, initializing cloud services takes a lot of time, and keeping infrastructure safe during changes often requires complex change logic.

The UCloud resource orchestration tool can solve the following common problems:

- Automating resource management in CI/CD

- Scaling applications in and out during peak periods

- Deploying complex resource topologies (such as a two-region, three-data-center application architecture)

For example, Post Krypton, a provider of SaaS solutions, has integrated the UCloud Terraform orchestration system into its own business.

The following figure is a schematic diagram of Post Krypton's business architecture. The company uses several cloud services at the same time and needs a unified platform for multi-cloud resource management. Building such a platform in-house would mean integrating with every cloud vendor and requiring developers to understand the product details of each cloud service in depth, which would inevitably raise R&D and operating costs.

For SaaS workloads, Terraform can adjust resources flexibly and dynamically: users change a few parameters and then manage resources very quickly from templates. Compared with a self-built management platform, UCloud Terraform can greatly reduce users' operating costs and improve their efficiency.

Life cycle

Take the creation of UCloud cloud resources by executing Terraform for the first time as an example. The lifecycle of this resource orchestration action is shown below:

Figure: Terraform Lifecycle

The boxes in the figure are as follows:

- Terraform core process: parses resource definition files, builds the directed acyclic graph and manages state storage

- Provider process: provides resource orchestration capabilities, including those implemented by cloud vendors (such as UCloud's resource orchestration implementation) and by applications (such as TLS self-signed certificates)

- Provisioner process: provides post-processing operations for resource orchestration, such as executing shell commands and uploading files

With the directed acyclic graph in the center as the dividing line, the left side shows the capabilities provided by Terraform itself and the right side shows the capabilities provided by cloud vendors.

The sound abstractions of Terraform core keep resource orchestration safe and stable, and give UCloud resource orchestration a solid engineering foundation.

UCloud resource orchestration in practice

A production resource orchestration system usually depends on a large number of back-end cloud resource management services. In engineering terms, the following fundamental requirements must be guaranteed first:

- Ensure a high success rate for resource orchestration in complex end-user environments. This is the most basic and most important requirement.

- Ensure product consistency, so that users can migrate smoothly and changes go unnoticed.

- Ensure the engineering quality of the product. As an access path to critical infrastructure, resource orchestration itself needs to be stable and reliable.

Below, we share in detail some of UCloud's practices in optimizing fault tolerance, access capability and engineering capability while developing the Terraform-based resource orchestration tool.

Fault tolerance optimization

Fault tolerance is an important measure of system availability. As an entry point to UCloud services, resource orchestration must be stable enough to handle faults sensibly, which includes tolerating upstream service anomalies and correcting invalid input.

First of all, the killer feature of Terraform is that it separates the execution plan from the execution itself. Before an orchestration action actually changes production infrastructure, users can generate an execution plan, compare the resource definition file with the current resource state, and review the changes to critical infrastructure.
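The idea of planning before executing can be sketched as a pure comparison between desired and current state. The resource names and attributes below are hypothetical, and real Terraform plans are far richer (attribute-level diffs, forced replacement and so on); this only illustrates that a plan can be computed and reviewed without touching any resource.

```python
def make_plan(desired, current):
    """Compare desired and current state and return (action, name) pairs."""
    plan = []
    for name in sorted(desired.keys() | current.keys()):
        if name not in current:
            plan.append(("create", name))   # defined but not yet provisioned
        elif name not in desired:
            plan.append(("delete", name))   # provisioned but no longer defined
        elif desired[name] != current[name]:
            plan.append(("update", name))   # attributes drifted or were edited
    return plan


# Hypothetical states for illustration only.
desired = {"host_a": {"cpu": 2}, "host_b": {"cpu": 4}, "eip": {"bandwidth": 10}}
current = {"host_a": {"cpu": 2}, "host_b": {"cpu": 2}, "disk": {"size": 20}}

if __name__ == "__main__":
    for action, name in make_plan(desired, current):
        print(f"{action:<6} {name}")
```

Because `make_plan` has no side effects, its output can be reviewed or rejected before anything is applied.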

In implementing resource orchestration, UCloud customizes the Diff process of some resources through the CustomDiff feature of the Terraform execution plan. For example, only one high-speed channel (UDPN) may exist between two regions. If such a channel already exists before the orchestration runs, the tool blocks the whole orchestration at the planning stage, which improves efficiency for end users.

Figure: customize Diff to check input in the execution plan
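A plan-time check of this kind might look like the sketch below. It is written in Python for illustration only; the function name, the exception type and the region names are assumptions, and the real check lives in the provider's Go code behind Terraform's CustomDiff hook.

```python
class PlanError(Exception):
    """Raised at plan time, before any resource is touched."""


def check_udpn_plan(planned, existing):
    """Reject a plan that would create a second channel between the same two regions."""
    taken = {frozenset(pair) for pair in existing}  # direction does not matter
    for pair in planned:
        key = frozenset(pair)
        if key in taken:
            regions = " and ".join(sorted(key))
            raise PlanError(f"a UDPN channel between {regions} already exists")
        taken.add(key)  # also catches duplicates within the plan itself


if __name__ == "__main__":
    # Hypothetical region names.
    existing = [("region-a", "region-b")]
    try:
        check_udpn_plan([("region-b", "region-a")], existing)
    except PlanError as error:
        print(f"plan rejected: {error}")
```

Failing here costs nothing, whereas failing midway through execution could leave earlier resources already created.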

For error handling, the UCloud orchestration tool reviews the whole lifecycle of the orchestration workflow, strictly reduces error information to a formal four-tuple (verb, additional action, resource name, ID), and converts it into a human-readable description for the user. For input errors, it pinpoints the offending source line and offers some interactive error correction.

Figure: sample natural-language rendering of an error four-tuple
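As a rough illustration of the four-tuple idea, the sketch below renders a structured error as a sentence. The field names, the extra `cause` field and the message wording are assumptions made for this example; the source does not show the tool's actual format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrchestrationError:
    verb: str           # what was being done, e.g. "create"
    extra_action: str   # additional action in progress; empty string if none
    resource: str       # resource name, e.g. "host"
    resource_id: str    # resource ID; hypothetical values below
    cause: str          # underlying reason (an addition for this sketch)

    def describe(self):
        """Render the structured error as a human-readable sentence."""
        text = f"failed to {self.verb} {self.resource} {self.resource_id!r}"
        if self.extra_action:
            text += f" while {self.extra_action}"
        return f"{text}: {self.cause}"


if __name__ == "__main__":
    error = OrchestrationError(
        "create", "waiting for it to start", "host", "host-123", "timed out after 300s"
    )
    print(error.describe())
```

Keeping the error structured until the last step means every message follows the same shape, whichever resource produced it.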

Secondly, through the API consistency project described below, UCloud marks every operation as idempotent or not (that is, whether the operation has side effects that lead to real resource creation) and automatically retries all idempotent (side-effect-free) operations. This greatly improves the fault tolerance of the orchestration tool and ensures that automatic retry is truly safe. For non-idempotent operations, thanks to Terraform's state management, you can simply re-run the orchestration plan and retry only the creation steps that failed.

The UCloud orchestration tool also wraps asynchronous operations in a synchronous interface. Using Terraform's built-in waiting mechanism, after creating a resource the tool polls until the resource reaches its finished state before returning success, which ensures the atomicity of the operation and the consistency of the resource state.

Finally, the retry and wait mechanisms above use exponentially growing intervals (Exponential Backoff) and graceful exit (Graceful Shutdown) to further improve the fault tolerance of resource orchestration.
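The polling wrapper can be sketched as follows. This is a Python illustration under assumptions: the state names are hypothetical, and `wait_for_state` is not a Terraform or UCloud function, only a stand-in for the waiting mechanism described above.

```python
import time


def wait_for_state(refresh, target, pending, timeout=60.0, interval=1.0, sleep=time.sleep):
    """Poll refresh() until it returns target; fail on unexpected states or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        state = refresh()
        if state == target:
            return state
        if state not in pending:
            # The resource left the expected path, so waiting longer will not help.
            raise RuntimeError(f"unexpected state {state!r} while waiting for {target!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {state!r} after {timeout} seconds")
        sleep(interval)


if __name__ == "__main__":
    # Simulated resource that needs two polls before it is running.
    states = iter(["Initializing", "Starting", "Running"])
    result = wait_for_state(
        lambda: next(states),
        target="Running",
        pending={"Initializing", "Starting"},
        interval=0.01,
    )
    print(result)
```

The caller sees a single blocking call that either returns the finished state or raises, which is what makes the asynchronous creation look atomic.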
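The combination of idempotency-aware retry and exponential backoff can be sketched like this. The exception type, the default delays and the `idempotent` flag are assumptions for illustration; graceful shutdown and random jitter are left out to keep the example short.

```python
import time


class TransientError(Exception):
    """Stand-in for a retryable upstream failure (hypothetical)."""


def backoff_delays(base=0.5, factor=2.0, cap=30.0, attempts=5):
    """Exponentially growing delays, capped so they never grow unbounded."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


def call_with_retry(operation, idempotent, delays, sleep=time.sleep):
    """Retry only operations marked idempotent; others get exactly one attempt."""
    if not idempotent:
        return operation()
    for delay in delays:
        try:
            return operation()
        except TransientError:
            sleep(delay)  # back off before the next attempt
    return operation()    # final attempt; any error now propagates


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_query():
        calls["count"] += 1
        if calls["count"] < 3:
            raise TransientError("upstream busy")
        return "ok"

    print(call_with_retry(flaky_query, idempotent=True, delays=backoff_delays(base=0.01)))
```

Keeping the idempotency decision outside the retry loop mirrors the text: the retry helper never guesses whether repeating a call is safe, it is told.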

Access capability optimization

Terraform-based resource orchestration has some inherent limitations. It is well suited to building infrastructure, but not to ad hoc day-to-day tasks such as list queries and power operations.

To restart hosts in batch with Terraform, for example, you would query the relevant data with a data source, define an output variable, and then pass the value of that output variable as a parameter to an external script. In such ad hoc scenarios, Terraform has no clear advantage over configuration management tools such as Ansible.

Therefore, in addition to resource orchestration, UCloud developed the UCloud CLI tool to extend what resource orchestration can do. For example, the CLI can query and restart resources created by the UCloud orchestration tool.

UCloud has integrated resource orchestration with UCloud CLI, so the orchestration tool can directly use the CLI's credential configuration. The orchestration tool can also call UCloud CLI to perform additional resource management operations.
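The hand-off from an output variable to an external script might look like the sketch below. It parses the JSON printed by `terraform output -json`; the output name `host_ids` and the host IDs are hypothetical, and the restart step is only printed because the actual restart command is not shown in this article.

```python
import json

# Sample of what `terraform output -json` prints for one list output.
# In a real script this text would come from running that command, e.g.
# subprocess.run(["terraform", "output", "-json"], capture_output=True, text=True).stdout
SAMPLE_OUTPUT = """
{
  "host_ids": {
    "sensitive": false,
    "type": ["list", "string"],
    "value": ["uhost-aaa", "uhost-bbb"]
  }
}
"""


def read_output(raw_json, name):
    """Return the value of one Terraform output from the JSON document."""
    outputs = json.loads(raw_json)
    return outputs[name]["value"]


if __name__ == "__main__":
    for host_id in read_output(SAMPLE_OUTPUT, "host_ids"):
        # Placeholder for whatever restart call the external script makes.
        print(f"would restart {host_id}")
```

Even in this small form the workflow needs a data source, an output and a script, which illustrates why a dedicated CLI is more convenient for one-off operations.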

Figure: example of Terraform and CLI integration usage

With resource orchestration and UCloud CLI connected, resource orchestration can reuse the CLI's ad hoc query capability, while the CLI can reuse the resource topology information held by resource orchestration, such as host lists and network CIDR information. This greatly expands the product access capabilities of both.

Engineering capability optimization

From the start of the project, UCloud resource orchestration has treated consistency and availability for end users as its core requirements. Meeting them means overcoming several key engineering challenges:

- Let users migrate smoothly across versions and across clouds as far as possible.

- Automate the management of the underlying APIs that the orchestration tool depends on, improving the tool's availability at the source.

- Put adequate quality assurance in place, because resource orchestration is an access path to critical infrastructure.

Smooth migration

First, for upgrades of the orchestration tool itself, UCloud strictly follows Terraform's schema change policy. Whenever a resource attribute changes in a breaking way, the tool ships a version migration, so that end users' resource state is migrated to the new version automatically and smoothly when they upgrade.

Secondly, for migration between cloud platforms, UCloud implements a general naming-style conversion that maps the upper camel case (CamelCase) used by UCloud interfaces to the lowercase underscore style (snake_case) commonly used in Terraform, and it follows the product naming recommended by Terraform to reduce the cost of cross-cloud migration. End users only need to make minor changes to their templates to access UCloud through the resource orchestration tool.
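A version migration of stored state can be sketched as a chain of small upgrade steps. This is a Python illustration only: the attribute names, the version numbers and the state layout are all hypothetical, and the real migrations are implemented in the provider's Go code.

```python
def migrate_v0_to_v1(attributes):
    """Hypothetical breaking change: the attribute "type" was renamed."""
    upgraded = dict(attributes)
    upgraded["instance_type"] = upgraded.pop("type")
    return upgraded


def migrate_v1_to_v2(attributes):
    """Hypothetical breaking change: disk size became an integer."""
    upgraded = dict(attributes)
    upgraded["disk_size"] = int(upgraded["disk_size"])
    return upgraded


MIGRATIONS = {0: migrate_v0_to_v1, 1: migrate_v1_to_v2}
CURRENT_VERSION = 2


def upgrade_state(state):
    """Apply every migration between the stored version and the current one."""
    version = state.get("schema_version", 0)
    attributes = state["attributes"]
    while version < CURRENT_VERSION:
        attributes = MIGRATIONS[version](attributes)
        version += 1
    return {"schema_version": version, "attributes": attributes}


if __name__ == "__main__":
    old_state = {"schema_version": 0, "attributes": {"type": "n-basic-2", "disk_size": "20"}}
    print(upgrade_state(old_state))
```

Chaining one step per version means a user several releases behind still ends up on the current schema without manual edits.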
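The style conversion itself is small enough to show in full. The sketch below is one common way to write it in Python, not UCloud's implementation, and the field names are illustrative.

```python
import re


def camel_to_snake(name):
    """Convert an UpperCamelCase API field name to snake_case."""
    # Separate an acronym from the capitalized word that follows it: "EIPId" -> "EIP_Id".
    partial = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", name)
    # Separate a lowercase letter or digit from the capital that follows: "ImageId" -> "Image_Id".
    partial = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", partial)
    return partial.lower()


if __name__ == "__main__":
    for field in ("ImageId", "SecurityGroupId", "EIPId", "Name"):
        print(f"{field} -> {camel_to_snake(field)}")
```

Names that begin with a single capital letter followed by another capitalized word (for example a product prefix) are split after that first letter by this rule, so a real converter would likely need an exception table for product names.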

Change automation

As an important way of accessing UCloud products, resource orchestration depends heavily on all of them. A small mistake in an interface change or integration can have destructive consequences.

An important goal of the consistency project is therefore to respond quickly to new product features while keeping labor costs as low as possible, automating changes and reducing errors.

To manage APIs uniformly and prevent information silos between products, UCloud long ago built a public, unified API management platform. It consolidates the definitions of all production APIs into a single API registry and describes API schemas formally in a custom format. The platform models an API usage scenario as a test set (Test Set) and a single API call as a test case (Test Case), and uses a custom expression syntax to generate random parameters for the test cases.

Figure: schematic diagram of API management platform

On top of the API management platform, the UCloud resource orchestration team wrote a generator for the API SDK that translates the strict, formal API definitions into Go SDK code. The team also wrote a recursive descent expression parser that translates the expression syntax in test cases into equivalent Go code. The result is a direct mapping between API definitions and Go code, so upstream changes are synchronized at low cost.

Figure: translating API SDK code by writing API modeling tools
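To show what a recursive descent translator looks like, the sketch below parses a tiny arithmetic grammar with function calls and emits a fully parenthesized, Go-style expression string. The grammar and the function name `rand` are invented for this example; the platform's real expression syntax is not shown in this article, and the real parser is written in Go.

```python
import re

TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")


def tokenize(text):
    tokens = []
    for number, identifier, symbol in TOKEN.findall(text):
        if number:
            tokens.append(("num", number))
        elif identifier:
            tokens.append(("id", identifier))
        else:
            tokens.append(("sym", symbol))
    tokens.append(("end", ""))
    return tokens


class Parser:
    """Grammar: expr := term (("+"|"-") term)* ; term := factor (("*"|"/") factor)* ;
    factor := NUMBER | NAME | NAME "(" [expr ("," expr)*] ")" | "(" expr ")"."""

    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def take(self, kind, value=None):
        token = self.tokens[self.pos]
        if token[0] != kind or (value is not None and token[1] != value):
            raise SyntaxError(f"unexpected token {token[1]!r}")
        self.pos += 1
        return token[1]

    def expr(self):
        left = self.term()
        while self.peek() in (("sym", "+"), ("sym", "-")):
            operator = self.take("sym")
            left = f"({left} {operator} {self.term()})"
        return left

    def term(self):
        left = self.factor()
        while self.peek() in (("sym", "*"), ("sym", "/")):
            operator = self.take("sym")
            left = f"({left} {operator} {self.factor()})"
        return left

    def factor(self):
        kind, _ = self.peek()
        if kind == "num":
            return self.take("num")
        if kind == "id":
            name = self.take("id")
            if self.peek() != ("sym", "("):
                return name  # plain variable reference
            self.take("sym", "(")
            arguments = []
            if self.peek() != ("sym", ")"):
                arguments.append(self.expr())
                while self.peek() == ("sym", ","):
                    self.take("sym", ",")
                    arguments.append(self.expr())
            self.take("sym", ")")
            return f"{name}({', '.join(arguments)})"
        self.take("sym", "(")
        inner = self.expr()
        self.take("sym", ")")
        return inner


def translate(text):
    """Translate one expression into an explicitly parenthesized code string."""
    parser = Parser(text)
    code = parser.expr()
    parser.take("end")  # reject trailing input
    return code


if __name__ == "__main__":
    print(translate("rand(1, 10) + size * 2"))
```

Each grammar rule maps to one method, which is the defining trait of recursive descent and what makes such a parser easy to extend when the expression syntax grows.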

In addition, UCloud introduced an intermediate layer between the API management platform and the SDK by writing an API modeling tool. This layer labels the idempotency of every API in a uniform way, which gives the orchestration tool a genuinely safe retry mechanism.

This completes the interface consistency project across the whole call chain. It delivers a full semantic mapping from the API management platform to the SDK to Terraform, lowers the cost of developing and maintaining the SDK, and removes the uncertainty introduced by manual changes.

Quality engineering

Resource orchestration is the recommended way to manage cloud resources at scale and operates on critical infrastructure, so the quality of the orchestration tool itself is crucial.

Table 2: resource orchestration continuous integration checklist

As shown in Table 2, the UCloud resource orchestration tool, as an open source project, has three quality cycles:

- Open source collaboration cycle: Travis CI runs code style checks and unit tests, and no real API requests are made.

- Main branch merge cycle: UCloud uses GitLab CI on Kubernetes for style checks, unit tests and integration tests. The integration tests call the production API to operate on real cloud resources, and a daily regression runs in the early hours of each day.

- Official release cycle: before a release is published to the official Terraform repository, our partner HashiCorp runs full acceptance tests with TeamCity. The new version is released once all tests pass.

To keep the code from rotting over time and to remove hidden dangers early, such as spelling errors, leaked security keys and unreasonable abstractions, the UCloud access product team chose three static checking tools that quantify code quality along different dimensions:

- GoReportCard, for the most basic style checks

- SonarCloud, to find bugs and security problems in the code

- Gocyclo, which calculates the cyclomatic complexity of a function (a measure of how complex the function's control flow is)

Through regular code optimization, the quantitative code quality indicators are kept at an A+ rating.
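For readers unfamiliar with cyclomatic complexity, the sketch below approximates it for Python source by counting decision points. This is a simplified illustration of the metric, not how Gocyclo works internally, and it ignores some constructs (for example comprehension conditions and `match` statements).

```python
import ast

# Each of these nodes adds one independent path through the code.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source):
    """Approximate complexity: 1 plus the number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a or b or c" short-circuits twice, adding two paths.
            complexity += len(node.values) - 1
    return complexity


SAMPLE = '''
def classify(n):
    if n < 0 or n > 100:
        return "out of range"
    for divisor in (2, 3):
        if n % divisor == 0:
            return "divisible"
    return "other"
'''

if __name__ == "__main__":
    print(cyclomatic_complexity(SAMPLE))
```

The sample function scores 5: one base path, two `if` statements, one `or` and one `for` loop.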

After a long period of development, Terraform has become a common resource orchestration tool in the industry. In recent years, competitors at home and abroad have begun to support Terraform-based resource orchestration, which shows how strong the demand for a general-purpose orchestration system is.

UCloud studied the internal mechanisms of Terraform in depth and, on that basis, explored the next generation of its own resource orchestration system. The team optimized the design repeatedly during development, built out the underlying engineering along the whole call chain, and ensured the reliability and stability of resource orchestration through thorough quality engineering practice.

That concludes this introduction to UCloud's Terraform-based resource orchestration tool. I hope it is of some help. If you found the article useful, feel free to share it so more people can read it.
