How to scale out an OpenStack configuration

This article explains in detail how to scale out an OpenStack configuration. I hope you find it a practical reference and take something useful away from it.

Starting point

Planning for cloud scalability is a balancing act across many variables, and no single solution meets every expansion need. Even so, monitoring a broad range of system metrics during day-to-day operations brings many benefits.

A common starting point for planning is the number of cores in the cloud. A few basic formulas give rough capacity estimates. For example, (overcommit ratio × physical cores) / virtual cores per instance estimates how many virtual machines can run, and disk space per instance × number of instances estimates the storage the system requires. These estimates help determine how much additional resource needs to be added to the cloud.

The default OpenStack instance flavors are as follows:

| Name | Virtual cores | Memory | Disk space | Ephemeral space |
|------|---------------|--------|------------|-----------------|
| m1.tiny | 1 | 512 MB | 0 GB | 0 GB |
| m1.small | 1 | 2 GB | 10 GB | 20 GB |
| m1.medium | 2 | 4 GB | 10 GB | 40 GB |
| m1.large | 4 | 8 GB | 10 GB | 80 GB |
| m1.xlarge | 8 | 16 GB | 10 GB | 160 GB |

We assume the following conditions:

There are 200 physical cores; most instances are of the m1.medium type (2 virtual cores, 50 GB of storage in total); and the CPU overcommit ratio is the default 16:1, set by the cpu_allocation_ratio parameter in the nova.conf file.
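For reference, a minimal nova.conf excerpt that sets this ratio might look like the following; the exact section and default value can vary between OpenStack releases, so treat this as a sketch rather than the configuration used in this book.

```ini
# nova.conf (excerpt) -- CPU overcommit ratio used by the scheduler
[DEFAULT]
cpu_allocation_ratio = 16.0
```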

Applying the formula, (16 × 200) / 2 = 1600, so this hardware can support 1600 virtual machine instances, which in turn need 1600 × 50 GB = 80 TB of storage.
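The same arithmetic can be scripted so the estimate is easy to redo as the assumptions change. The sketch below simply encodes the numbers assumed above; the variable names are this example's own and are not anything OpenStack defines.

```bash
#!/bin/sh
# Rough capacity estimate using the assumptions above.
PHYSICAL_CORES=200        # total physical cores in the cloud
OVERCOMMIT=16             # CPU overcommit ratio (cpu_allocation_ratio)
VCPUS_PER_INSTANCE=2      # m1.medium
DISK_PER_INSTANCE_GB=50   # 10 GB root disk + 40 GB ephemeral for m1.medium

INSTANCES=$(( PHYSICAL_CORES * OVERCOMMIT / VCPUS_PER_INSTANCE ))
STORAGE_TB=$(( INSTANCES * DISK_PER_INSTANCE_GB / 1000 ))

echo "Maximum instances: $INSTANCES"       # 1600
echo "Storage required:  ${STORAGE_TB} TB" # 80
```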

Sizing the API services, database, and message queue requires more information, in particular how the cloud is used. For example, compare two kinds of virtual machines: one backs a web platform, the other is used for integration testing during development. The number of web VMs may change only every few months, while the test VMs are created and destroyed constantly, which puts far more request load on the cloud controller. It is also worth collecting the average lifetime of virtual machines: the longer instances run on average, the lighter the load on the cloud controller.

Beyond starting and stopping virtual machines, you also need to consider the impact of users accessing the services, especially nova-api and its database. The most common user operation is listing virtual machine instances and their details; when the cloud has many users, this operation alone can impose a significant load. Users are often unaware they are even doing it: for example, leaving the OpenStack dashboard open on the instances page causes the browser to refresh the virtual machine list every 30 seconds.

After weighing these factors, you can roughly determine the cloud controller's hardware requirements. Typically, a server with 8 cores and 8 GB of memory is enough to handle a rack of compute nodes.

Beyond the virtual machine performance that users see, you also need to consider metrics of the hardware itself, balancing budget against performance requirements, such as storage performance (disks per core), memory availability (memory per core), network bandwidth (Gbps per core), and overall CPU performance (CPU per core).

See the **Monitoring** section to find out which metrics to track when deciding whether the cloud needs to be scaled out.

Adding controller nodes

You can scale the cloud out simply by adding more nodes. Adding a compute node is straightforward: install and configure it the same way as the existing ones. The cloud controller, however, needs more care, and for high availability you should consider the following points in the design.

In the example in this book, multiple services run on the cloud controller node. Services that communicate only through the message queue (such as nova-scheduler and nova-console) can easily be moved to other hardware; other services are more troublesome.

For user-facing services such as the dashboard, nova-api, and object storage, it is recommended to distribute access requests through load balancing. Any standard HTTP load-balancing method works: DNS round robin, hardware load balancers, or software such as Pound or HAProxy. One thing to watch with the dashboard is the VNC proxy, which uses the WebSocket protocol and is awkward to balance with a layer-7 load balancer. See Horizon session storage (http://docs.openstack.org/developer/horizon/topics/deployment.html#session-storage).
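As one hedged illustration of this approach, an HAProxy configuration for nova-api and the dashboard could look roughly like the excerpt below; the addresses, ports, and backend names are assumptions for the example, not values from this book.

```
# haproxy.cfg (sketch) -- two controllers behind one virtual IP
listen nova-api
    bind 192.168.1.100:8774
    mode http
    balance roundrobin
    server controller1 192.168.1.11:8774 check
    server controller2 192.168.1.12:8774 check

listen dashboard
    bind 192.168.1.100:80
    mode http
    balance source    # keep a client on the same backend for its session
    server controller1 192.168.1.11:80 check
    server controller2 192.168.1.12:80 check
```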

Some services (such as nova-api and glance-api) can run multiple worker processes by changing a flag in their configuration files, which spreads the work across the cores of a multi-core CPU.
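The worker counts are plain configuration options; a sketch is shown below, though the exact option names and sections differ between services and OpenStack releases.

```ini
# nova.conf (excerpt)
[DEFAULT]
osapi_compute_workers = 8
metadata_workers = 8

# glance-api.conf (excerpt)
[DEFAULT]
workers = 8
```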

MySQL can be load balanced with several configurations, and RabbitMQ has a built-in clustering mechanism. Refer to the operations section for more information on configuring these services.
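As a brief sketch of the RabbitMQ side, joining a second broker to an existing cluster usually amounts to the commands below; the node name rabbit@controller1 is an assumption for the example.

```bash
# On the node being added to the cluster:
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@controller1
rabbitmqctl start_app
rabbitmqctl cluster_status   # verify that both nodes appear
```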

Cloud system isolation

OpenStack provides several methods for segregating a cloud: cells, regions, availability zones, and host aggregates. Each provides different capabilities and can be described as follows:

Cells

Usage scenario: a single API endpoint or a second level of scheduling is required.
Example: the cloud spans multiple sites, and virtual machines can be scheduled to run at any site or at a specified one.
Overhead: each cell runs a full nova installation plus the nova-cells service, except that only the top-level (API) cell runs nova-api.
Shared services: Keystone, nova-api.

Regions

Usage scenario: the deployment is split into regions with no coordination between them, and each region needs its own API endpoint.
Example: the cloud spans multiple sites; virtual machines can be directed to a specific site while the sites share underlying infrastructure services.
Overhead: each region needs its own API endpoint and a complete nova installation.
Shared services: Keystone.

Availability Zones

Usage scenario: logical separation within a nova deployment for hardware isolation and redundancy.
Example: a single-site cloud whose underlying hardware is fed by two separate power supplies.
Overhead: configuration in nova.conf.
Shared services: Keystone, all nova services.

Host Aggregates

Usage scenario: grouping hosts with similar characteristics for virtual machine scheduling.
Example: scheduling virtual machines onto a set of trusted servers.
Overhead: configuration in nova.conf.
Shared services: Keystone, all nova services.

The above isolation schemes can be divided into two categories:

Cells and regions segregate entire nova deployments, while availability zones and host aggregates divide resources within a single nova deployment.

Cells & Regions

OpenStack cells provide a distributed mode of operation without introducing overly complex technology or disrupting the existing nova architecture. The hosts in the cloud are partitioned into groups called cells, and cells are organized as a tree. The top-level cell (the API cell) runs the nova-api service but no nova-compute services; each child cell runs every nova-* service except nova-api. Each cell has its own message queue and database and runs the nova-cells service, which manages communication between the API cell and its child cells.

With cells, a single API service can front multiple cloud environments. In addition to nova-scheduler's normal selection of host resources, a second level of scheduling (selecting a cell) is introduced. This indirect scheduling gives more flexible control over where virtual machines run.

Regions, unlike cells, offer stronger isolation: each region has its own API endpoint, and users must explicitly specify a region to run virtual machine instances in it. No additional services beyond the standard ones need to run in a region.

The OpenStack dashboard currently supports only a single region, so each region needs to run its own dashboard service. Regions provide a robust way to share some infrastructure between otherwise separate installations and to build fault tolerance at a higher level.
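To make the endpoint separation concrete, a second region is essentially another set of Keystone endpoints, and clients pick the region explicitly. The region name and URL below are placeholders for the example.

```bash
# Register the compute endpoint of a second region in Keystone
openstack endpoint create --region RegionTwo compute public http://r2-controller:8774/v2.1

# Clients then target a region explicitly
openstack --os-region-name RegionTwo server list
```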

Availability Zones & Host Aggregates

Availability zones and host aggregates partition a single nova deployment. They are configured in similar ways but serve different purposes. Host aggregates group hosts logically for load balancing and instance distribution, while availability zones provide physical isolation and redundancy from other zones (for example, servers on a separate power supply or a physically distinct network). Host aggregates can be seen as a way to further group servers within an availability zone, for example into sets that share resources such as storage and network, or that offer a special property such as trusted computing hardware.

Host aggregates are commonly used to group servers that nova-scheduler can target, for example to restrict certain flavors or images to a subset of hosts.
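A hedged example of that kind of restriction: tag an aggregate with a property and pin a flavor to it, assuming the AggregateInstanceExtraSpecsFilter is enabled in the scheduler (the aggregate, host, and property names here are invented for the illustration).

```bash
# Mark an aggregate and tie a flavor to the hosts carrying that mark
openstack aggregate create ssd-nodes
openstack aggregate add host ssd-nodes compute-01
openstack aggregate set --property ssd=true ssd-nodes
openstack flavor set --property aggregate_instance_extra_specs:ssd=true m1.medium
```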

Availability zones logically group the servers used by the compute or object storage services; servers with the same properties are typically assigned to the same zone. For example, if the power for some racks in the data center comes from a different feed than the rest, the servers in those racks can be placed in their own availability zone.
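In recent releases, a compute availability zone is created by attaching a zone name to a host aggregate; older releases used nova.conf options instead. The sketch below uses made-up host, zone, and image names.

```bash
# Put the hosts on the separate power feed into their own zone
openstack aggregate create --zone power-feed-b rack-b
openstack aggregate add host rack-b compute-07
openstack aggregate add host rack-b compute-08

# Boot an instance into that zone explicitly (image name is a placeholder)
openstack server create --availability-zone power-feed-b \
    --flavor m1.medium --image cirros vm1
```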

Availability zones can also be used to separate different classes of hardware. This is especially useful for the OpenStack object storage service, where the servers may be configured with different kinds of disks. When requesting resources, users can place virtual machine instances or volumes in different zones. By choosing zones deliberately, users ensure their application runs on servers backed by different underlying hardware, and therefore remains available when one set of hardware fails.

Hardware expansion

While preparing the resources needed to deploy and install OpenStack, another very important task is planning ahead for hardware expansion. This book recommends keeping at least one spare rack available beyond the existing OpenStack cloud environment, and preparing a plan for when and how to expand the hardware.

Hardware purchase

The "cloud system" is always described as a flexible environment, and the service environment running on the cloud is often turned on and off at will. This description of the cloud system is correct, but it does not mean that the underlying servers that support the cloud system are constantly changing. Only the stability and correct configuration of the underlying hardware can ensure the normal and stable operation of the cloud system on it. Usually, the main work needs to be focused on establishing a stable basic hardware environment to provide users with a flexible cloud system environment.

OpenStack can be deployed on any hardware compatible with a supported Linux distribution, such as the Ubuntu 21.04 used in this book.

The server hardware does not have to be identical, but at a minimum the same CPU type is required to support migrating instances between hosts.
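One related knob, offered only as a sketch: libvirt-based deployments can pin guests to a common baseline CPU model in nova.conf so that live migration between slightly different hosts keeps working. The model name below is just an example and option names vary by release.

```ini
# nova.conf (excerpt)
[libvirt]
cpu_mode  = custom
cpu_model = SandyBridge
```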

The typical hardware recommended for OpenStack is commodity hardware, meaning the standard configurations offered by most hardware suppliers on the market. If you can buy new, purchase by system role (compute, object storage, cloud controller). Alternatively, with a smaller budget, existing servers can be upgraded to meet the performance and virtualization requirements and used to deploy OpenStack.

Planning capacity

Adding capacity to OpenStack is relatively simple. Refer to the scalability section; the capacity of the cloud controller deserves particular attention. It is recommended to leave additional capacity for compute and object storage nodes where possible. New nodes do not have to match the configuration or vendor of existing nodes.

On compute nodes, nova-scheduler allocates and manages resources according to the number of cores and the amount of memory. Note that with mixed CPUs, the user experience will be inconsistent because the CPUs differ in speed. When adding object storage nodes, each node should be given a weight that reflects its performance.
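For object storage, the weight lives in the ring; a sketch of adding two devices with different weights is shown below (IP addresses, device names, and weights are placeholders).

```bash
# Give the node with faster disks a proportionally higher weight
swift-ring-builder object.builder add r1z1-10.0.0.21:6000/sdb1 100
swift-ring-builder object.builder add r1z1-10.0.0.22:6000/sdb1 150
swift-ring-builder object.builder rebalance
```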

Monitoring resource usage and user growth helps you understand when new resources need to be purchased. Detailed and useful monitoring metrics are described in the monitoring section.

High-load testing

Server hardware is most prone to failure at the very start and the very end of its useful life. To avoid scrambling to handle hardware failures in production, use high-load (burn-in) testing to flush out failures early: run CPU or disk benchmark software for several days in a row to push the hardware to its limit.
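A minimal burn-in sketch using stress-ng and fio as example tools (any comparable CPU and disk benchmarks work; the durations, sizes, and file path are arbitrary):

```bash
# Load every CPU and most of memory for three days
stress-ng --cpu 0 --vm 2 --vm-bytes 75% --timeout 72h

# Hammer the disks with 24 hours of random 4 KiB I/O
fio --name=burnin --filename=/tmp/burnin.dat --size=10G \
    --rw=randrw --bs=4k --ioengine=libaio --iodepth=32 \
    --direct=1 --time_based --runtime=86400
```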

This concludes the article on "How to scale out an OpenStack configuration". I hope the content above has been helpful and has taught you something new; if you found the article worthwhile, please share it so more people can see it.
