
How OLM manages a growing number of Operators


This article explains in detail how OLM manages a growing number of Operators. It is shared here as a reference, and I hope you will come away with a better understanding of the relevant concepts after reading it.

Introduction: OLM (Operator Lifecycle Manager), as part of the Operator Framework, helps users automatically install, upgrade, and manage the lifecycle of Operators. OLM itself is installed and deployed as an Operator, so in effect it uses Operators to manage Operators, and it provides declarative, automated management capabilities for them, fully in line with the declarative interaction model of Kubernetes. Below we look at the basic architecture and installation of OLM.

OLM component model definition

OLM emerged to help users who lack domain knowledge in areas such as big data or monitoring to deploy and manage complex distributed applications such as etcd clusters, big data analytics, or monitoring services. In terms of design goals, OLM aims to provide general management capabilities for cloud-native applications in the following areas:

Lifecycle management: manage the upgrades and lifecycle of Operators themselves as well as of the resource models they watch

Service discovery: discover which Operators exist in the cluster, which resource models they manage, and which Operators can be installed in the cluster

Packaging: provide a standard format for distributing, installing, and upgrading Operators and their dependent components

Interaction: building on the capabilities above, provide a standardized way (such as a CLI) to interact with user-defined cloud services in the cluster.

These design goals can be broken down into the following concrete requirements:

Namespaced deployment: Operators and the resource models they manage must be deployable within namespace boundaries, which is a prerequisite for logical isolation and RBAC-based access control in a multi-tenant environment.

Definition via custom resources (CR): CRs are the preferred way to define the read-write interaction between users and an Operator; within an Operator, the resource models it owns or that are managed by other Operators are declared through CRDs, and the Operator's own behavior configuration should likewise be defined through fields in its CRD.

Dependency resolution: an Operator only needs to care about packaging itself and its managed resources, not about the cluster it runs in, and dependencies are expressed through dynamic declarations. For example, vault-operator needs an etcd cluster as its backend storage when it is deployed. In that case we should not bundle the etcd-operator container directly inside vault-operator; instead, OLM should resolve the dependency from a dependency declaration. This requires a common dependency definition specification among Operators (see the sketch after this list).

Idempotent deployment: dependency resolution and resource installation can be repeated safely, and failures during application installation are recoverable.

Garbage collection: rely on Kubernetes's native garbage collection as much as possible. When an OLM extension model such as a ClusterServiceVersion is deleted, its associated running resources should be cleaned up, while resources managed by other ClusterServiceVersions must not be deleted.

Support for labels and resource discovery.
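As a sketch of such a dependency declaration (the names below are illustrative, following the vault/etcd example above rather than a real catalog entry), a CSV can list the CRDs it owns alongside the CRDs it requires from other Operators:

apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: vault-operator.v0.1.0                         # illustrative version
spec:
  customresourcedefinitions:
    owned:
    - name: vaultservices.vault.security.coreos.com   # CRD managed by this Operator
      version: v1alpha1
      kind: VaultService
    required:
    - name: etcdclusters.etcd.database.coreos.com     # dependency resolved by OLM, not bundled
      version: v1beta2
      kind: EtcdCluster

OLM will refuse to install such a CSV until an Operator providing the required API is available, which is exactly the dependency-resolution behavior described above.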

Based on these design goals, OLM defines the following models and components in its implementation.

First, OLM itself consists of two Operators: the OLM Operator and the Catalog Operator. Between them they manage the CRD-based extension models that OLM introduces, such as ClusterServiceVersion, InstallPlan, Subscription, and CatalogSource.

During the Operator installation lifecycle, the OLM Operator creates Deployments, ServiceAccounts, and the RBAC roles and role bindings, while the Catalog Operator is responsible for creating resources such as CRDs and CSVs.
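Assuming OLM is already installed (see the installation section below), the extension models it registers can be listed roughly like this:

# list the extension models registered by OLM (names as of the 0.13.x release line)
kubectl get crd | grep operators.coreos.com
# typically includes catalogsources, clusterserviceversions, installplans,
# operatorgroups and subscriptions under the operators.coreos.com group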

Before introducing OLM's two Operators, let's look at the definition of ClusterServiceVersion (CSV). As a basic element of the OLM workflow, it collects the metadata and runtime information of a user application managed by OLM, including:

Application metadata (name, description, version, links, icons, labels, etc.), which we will see in the practical examples in the next chapter

Install strategy, including the set of Deployments and the permissions (service accounts, RBAC roles and bindings) required during Operator installation, sketched after this list

CRDs: including the CRD types, the services that own them, the other native Kubernetes resources the Operator interacts with, and descriptors for spec, status, and other fields that carry the model's semantic information.
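A minimal ClusterServiceVersion sketch illustrating the install strategy portion of these fields (all names and the image are illustrative, not taken from a real package):

apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: example-operator.v0.1.0            # illustrative name
spec:
  displayName: Example Operator
  version: 0.1.0
  install:
    strategy: deployment
    spec:
      permissions:                          # namespaced RBAC bound to the service account
      - serviceAccountName: example-operator
        rules:
        - apiGroups: [""]
          resources: ["pods", "services", "configmaps"]
          verbs: ["*"]
      deployments:                          # the Deployment OLM creates for the Operator
      - name: example-operator
        spec:
          replicas: 1
          selector:
            matchLabels:
              name: example-operator
          template:
            metadata:
              labels:
                name: example-operator
            spec:
              serviceAccountName: example-operator
              containers:
              - name: example-operator
                image: quay.io/example/example-operator:v0.1.0   # illustrative image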

With a basic understanding of ClusterServiceVersion, let's take a look at the OLM Operator.

The OLM Operator works on the basis of ClusterServiceVersions. Once the dependent resources declared in a CSV have been successfully registered in the target cluster, the OLM Operator installs the application instances those resources correspond to. Note that the OLM Operator does not concern itself with creating and registering the CRD models for the dependencies declared in a CSV; these can be created manually by the user with kubectl or handled by the Catalog Operator. This design gives users a gradual path for adopting the OLM architecture and eventually using it in full. In addition, the OLM Operator can watch the custom models corresponding to these dependent resources either globally across all namespaces or within a single specified namespace.

Next, the Catalog Operator. It is mainly responsible for resolving the dependent resource definitions declared in a CSV, and it drives CSV version updates by watching the version definitions in the channels of the installation packages in a catalog.

Users can specify the desired package and the channel to pull updates from by creating a Subscription model. When an available update is found, a corresponding InstallPlan is created in the user's namespace. Users can also create an InstallPlan manually; an InstallPlan instance contains the target CSV and the relevant approval policy, and the Catalog Operator creates an execution plan to provision the dependent resource models required by the CSV. Once the user approves it, the Catalog Operator creates the resources listed in the InstallPlan; at that point the dependency conditions watched by the OLM Operator, mentioned above, are satisfied, and the OLM Operator creates the Operator instance defined in the CSV.
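As a rough sketch, a Subscription looks like the following; the package, channel, and catalog names are illustrative and depend on the catalogs available in your cluster:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: etcd-subscription
  namespace: operators
spec:
  name: etcd                          # package name in the catalog
  channel: singlenamespace-alpha      # channel to track for updates
  source: operatorhubio-catalog       # CatalogSource that provides the package
  sourceNamespace: olm
  installPlanApproval: Automatic      # Manual requires approving the generated InstallPlan

With installPlanApproval set to Manual, the generated InstallPlan waits for approval before the Catalog Operator creates the resources, matching the approval flow described above.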

Introduction to the OLM architecture

In the previous section, we learned about OLM's basic component model and the related definitions. In this section, we introduce its basic architecture.

First, the Operator Framework provides two important meta-Operators and their corresponding extension resources (such as ClusterServiceVersion and InstallPlan, introduced in the previous section) for the lifecycle management of user application Operators. The custom CSV model defines the combination of resources a user needs to deploy an Operator, including how the Operator is deployed, which types of custom resources it manages, and which native Kubernetes resources it uses.

As we saw in the previous section, the OLM Operator requires that the custom resource models an Operator manages are already registered in the target cluster before it installs the corresponding Operator instance; this registration can be done manually by a cluster administrator with kubectl, or handled by the Catalog Operator. The Catalog Operator not only registers the target CRD models but is also responsible for automatically upgrading the resource model versions. Its workflow includes:

Maintaining a cache and index of the CRD and CSV models for version control and model registration

Watching for unresolved InstallPlans created by users:

Finding a CSV that satisfies the dependency conditions and adding it to the resolved resources

Adding all CRD models owned or required by the target Operator to the resolved resources

Finding and managing the CSV that corresponds to each required CRD

Watching all resolved InstallPlans and creating all the corresponding dependent resources once manual or automatic approval is complete

Watching CatalogSource and Subscription models and creating corresponding InstallPlans based on their changes (an example CatalogSource is sketched below).
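A sketch of such a catalog definition, assuming an illustrative index image rather than a real one:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: my-catalog                               # illustrative name
  namespace: olm
spec:
  sourceType: grpc                               # serve the catalog over gRPC from an index image
  image: quay.io/example/operator-index:latest   # illustrative index image
  displayName: My Operator Catalog
  publisher: example.com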

Once the OLM Operator observes that the dependent resources required by a CSV have been registered or changed, it starts installing or upgrading the application Operator, which then runs its own workflow to create and manage the corresponding custom resource instances in the Kubernetes cluster.

Installation of OLM

Having looked at OLM's architecture, let's look at installing OLM. The community repository contains the templates for OLM's deployment resources, and users can complete a customized installation simply by modifying the corresponding deployment parameters.

The official releases page lists the latest release and the installation instructions for each version.

Taking version 0.13.0 as an example, run the automated installation script with the following commands:

curl -L https://github.com/operator-framework/operator-lifecycle-manager/releases/download/0.13.0/install.sh -o install.sh
chmod +x install.sh
./install.sh 0.13.0

Alternatively, to install OLM manually, apply its deployment templates directly:

kubectl apply -f https://github.com/operator-framework/operator-lifecycle-manager/releases/download/0.13.0/crds.yaml
kubectl apply -f https://github.com/operator-framework/operator-lifecycle-manager/releases/download/0.13.0/olm.yaml
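Either way, a quick sanity check can confirm the installation; the commands below assume the default olm namespace created by the official templates:

# the OLM and catalog operators should be running in the olm namespace
kubectl get deployments -n olm
# the packageserver CSV deployed by the default install should reach the Succeeded phase
kubectl get csv -n olm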

After cloning the OLM code repository locally, users can run make run-local to start minikube and use minikube's built-in Docker daemon to build the OLM image locally. The command builds and runs a local OLM using local-values.yaml in the repository's deploy directory as the configuration file. kubectl -n local get deployments can then be used to verify that the OLM components have been installed and are running.
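A rough sketch of that local development flow, assuming minikube is available locally:

# clone the repository and run OLM locally against minikube
git clone https://github.com/operator-framework/operator-lifecycle-manager.git
cd operator-lifecycle-manager
make run-local
# verify the locally built components
kubectl -n local get deployments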

In addition, to support customized installations, OLM can generate and install customized deployment templates from a values file. The configurable template parameters are as follows:

# sets the apiversion to use for rbac-resources. Change to `authorization.openshift.io` for openshift
rbacApiVersion: rbac.authorization.k8s.io
# namespace is the namespace the operators will _run_
namespace: olm
# watchedNamespaces is a comma-separated list of namespaces the operators will _watch_ for OLM resources.
# Omit to enable OLM in all namespaces
watchedNamespaces: olm
# catalog_namespace is the namespace where the catalog operator will look for global catalogs.
# entries in global catalogs can be resolved in any watched namespace
catalog_namespace: olm
# operator_namespace is the namespace where the operator runs
operator_namespace: operators
# OLM operator run configuration
olm:
  # OLM operator doesn't do any leader election (yet), set to 1
  replicaCount: 1
  # The image to run. If not building a local image, use sha256 image references
  image:
    ref: quay.io/operator-framework/olm:local
    pullPolicy: IfNotPresent
  service:
    # port for readiness/liveness probes
    internalPort: 8080
# catalog operator run configuration
catalog:
  # Catalog operator doesn't do any leader election (yet), set to 1
  replicaCount: 1
  # The image to run. If not building a local image, use sha256 image references
  image:
    ref: quay.io/operator-framework/olm:local
    pullPolicy: IfNotPresent
  service:
    # port for readiness/liveness probes
    internalPort: 8080

Users can build customized templates and install them into a given cluster as follows:

Create a configuration file with a name such as my-values.yaml, using the template above as a reference for the required parameters

Using the my-values.yaml configured above, run package_release.sh to generate the deployment templates

# the first parameter is a helm-chart-compatible target version
# the second parameter is the output directory for the generated templates
# the third parameter is the path to the configuration file
./scripts/package_release.sh 1.0.0-myolm ./my-olm-deployment my-values.yaml

Deploy the template files from the output directory by running kubectl apply -f ./my-olm-deployment/templates/

Finally, the namespace in which the Catalog Operator looks for global catalogs can be set through the GLOBAL_CATALOG_NAMESPACE environment variable; by default, the installation process creates the olm namespace and deploys the Catalog Operator there.
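As an illustrative sketch of overriding that variable on a running installation, assuming the default Deployment name catalog-operator in the olm namespace:

# point the catalog operator at a different global catalog namespace
kubectl -n olm set env deployment/catalog-operator GLOBAL_CATALOG_NAMESPACE=my-catalogs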

Dependency resolution and upgrade management

Just as apt/dpkg and yum/rpm manage system packages, OLM faces dependency resolution and upgrade management problems when managing the versions of running Operator instances. To keep all Operators available at runtime, OLM must ensure during dependency resolution and upgrades that:

An Operator is not installed if it depends on APIs that are not registered in the cluster

An Operator upgrade is not performed if it would break the dependency conditions of its associated components.

Let's use some examples to see how OLM currently handles dependency resolution across version iterations:

First, CRD upgrades. When the CRD to be upgraded belongs to a single CSV, OLM upgrades it immediately; when the CRD belongs to multiple CSVs, the upgrade must satisfy the following conditions:

All currently served versions of the CRD must also be present in the new CRD

All CR (custom resource) instances associated with the currently served CRD versions must validate against the new CRD's schema.

When you need to add a new CRD version, the officially recommended steps are:

Suppose the CRD currently in use is at version v1alpha1 and you want to add a new version v1beta1 and make it the new storage version:

versions:
- name: v1alpha1
  served: true
  storage: false
- name: v1beta1
  served: true
  storage: true

If your CSV needs to use the new CRD version, make sure the CRD version referenced in the owned field of the CSV is updated, as shown below:

customresourcedefinitions:
  owned:
  - name: cluster.example.com
    version: v1beta1
    kind: cluster
    displayName: Cluster

Push the updated CRD and CSV to the target repository directory.

When we need to deprecate or remove a CRD version, OLM does not allow an in-use version to be deleted immediately; it must first be deprecated by setting its served field to false, after which the unused version is removed during the next CRD upgrade. The officially recommended steps to deprecate or remove a specific CRD version are:

Set the served field of the version being deprecated to false, indicating that it is no longer served and will be removed at the next upgrade, for example:

versions:
- name: v1alpha1
  served: false
  storage: true

If the storage field of the version being deprecated is currently true, set it to false and set storage to true on the new version, for example:

versions:
- name: v1alpha1
  served: false
  storage: false
- name: v1beta1
  served: true
  storage: true

Update the CRD model based on the above modifications

In a subsequent upgrade, versions that are no longer served are removed from the CRD, and the final CRD versions become:

versions:
- name: v1beta1
  served: true
  storage: true

Note that when removing a specific CRD version, we must make sure the version is also removed from the storedVersions list in the CRD's status. OLM removes such a stored version for us when it detects that it is no longer used by the new CRD. In addition, when the old version is removed, make sure the CRD version referenced in the associated CSV is updated as well.
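A quick way to inspect that list, using the illustrative CRD name from the examples above:

# show which versions the API server has persisted objects in
kubectl get crd cluster.example.com -o jsonpath='{.status.storedVersions}'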

Let's look at two examples that can cause upgrade failures, and then at OLM's dependency resolution logic:

Example 1: suppose we have two different CRD types, A and B.

The Operator using A depends on B

The Operator using B has a Subscription

The Operator using B is upgraded to a new version that provides C and deprecates the old version B.

The result of such an upgrade is that version B of the CRD no longer has an Operator or APIService serving it, and A, which depends on it, no longer works.

Example 2: suppose we have two custom APIs, A and B.

The Operator using A depends on B

The Operator using B depends on A

The Operator using A wants to upgrade to A2 and deprecate the old version A; the new A2 depends on B2

The Operator using B wants to upgrade to B2 and deprecate the old version B; the new B2 depends on A2.

In this case, if we only try to upgrade A without upgrading B at the same time, the system cannot complete the corresponding Operator upgrades even if it can find suitable upgrade versions.

To avoid the problems illustrated by these version iterations, OLM adopts the following dependency resolution logic.

Suppose we have a set of Operators running in a namespace:

For each Subscription in the namespace: if the Subscription has not been checked before, OLM looks up the latest CSV for it in the corresponding source/package/channel and tentatively adds an Operator matching that version; if it is a known Subscription, OLM queries the source/package/channel for updates.

For each API version that a CSV depends on, OLM selects a providing Operator according to source priority. If a new Operator is found, it is tentatively added along with the version it depends on; if no providing Operator is found, the dependent API is still added to the resolution set.

If any API cannot be satisfied from the sources, the system downgrades the dependent Operator (falls back to its previous version). To eventually satisfy the dependency conditions, this downgrade process may continue, and in the worst case all Operators in the namespace stay at their original versions.

If a new Operator is resolved and its dependency conditions are satisfied, it is eventually created in the cluster; at the same time, a related Subscription is created against the channel/package or source it was discovered from, so that future version updates continue to be tracked.

With the fundamentals of OLM dependency resolution and upgrade management covered, let's look at the workflow around OLM upgrades. As we have seen, ClusterServiceVersion (CSV), CatalogSource, and Subscription are the three extension models in the OLM framework most closely related to upgrades. In the OLM ecosystem, CatalogSources store Operator metadata such as CSVs; based on CatalogSources, OLM uses the Operator registry APIs to query for available or upgradable Operators. Within a CatalogSource, channels identify the different packaged versions of an installation package.

When users want to upgrade an Operator, they use a Subscription to subscribe to the channel of the package they want installed. If the package specified in the Subscription is not yet installed in the target cluster, OLM installs the latest version of the Operator available from that catalog/package/channel.

In a CSV definition, the replaces field declares which Operator version is to be replaced. On receiving an update request, OLM looks across channels for installable CSV definitions and builds a DAG (directed acyclic graph) of upgrades, with channels acting as entry points into the graph. During an upgrade, if OLM finds an uninstalled intermediate version between the current version and the latest upgradable version, it automatically builds an upgrade path and installs every intermediate version on that path. For example, suppose the running Operator is at version 0.1.1. After receiving an update request, OLM finds the latest upgradable version 0.1.3 through the subscribed channel, as well as an intermediate version 0.1.2. OLM first installs the Operator from the 0.1.2 CSV to replace the current version, and then installs 0.1.3 to replace 0.1.2 once that installation succeeds.
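A sketch of how such an upgrade chain is expressed through replaces, using an illustrative package name and showing only the fields relevant here:

# CSV for the intermediate version: installed first to replace 0.1.1
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: example-operator.v0.1.2
spec:
  version: 0.1.2
  replaces: example-operator.v0.1.1
---
# CSV for the target version: installed once 0.1.2 has succeeded
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: example-operator.v0.1.3
spec:
  version: 0.1.3
  replaces: example-operator.v0.1.2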

Of course, in some cases, for example when an intermediate version has a serious security vulnerability, upgrading through every version in turn is neither reasonable nor safe. In that case, the skips field can customize the upgrade path to skip specific intermediate versions, as shown below:

apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: etcdoperator.v0.9.2
  namespace: placeholder
  annotations:
spec:
  displayName: etcd
  description: Etcd Operator
  replaces: etcdoperator.v0.9.0
  skips:
  - etcdoperator.v0.9.1

If you need to skip a whole range of versions, use the following annotation in the CSV:

olm.skipRange: <semver range>

The version range follows the semver range format. An example CSV using skipRange is:

apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: elasticsearch-operator.v4.1.2
  namespace: placeholder
  annotations:
    olm.skipRange: '>=4.1.0 <4.1.2'
