

How to implement a progressive grayscale release system based on Mixerless Telemetry


Many newcomers are not clear about how to implement a progressive grayscale release system based on Mixerless Telemetry. To help solve this problem, the following walks through the process in detail; readers with this need can follow along and hopefully get something out of it.

Weave Flagger, a CNCF member project, provides continuous integration and continuous delivery capabilities. Flagger groups progressive releases into three categories:

Grayscale release / Canary release (Canary): progressively shifts traffic to the grayscale version (progressive traffic shifting)

A/B testing (A/B Testing): routes user requests to the A or B version based on request information (HTTP headers and cookies traffic routing)

Blue-green release (Blue/Green): switches and replicates traffic between versions (traffic switching and mirroring)
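For intuition, the sketch below shows, in plain Istio terms and independently of Flagger, how the first two strategies are typically expressed in a VirtualService: a header match for A/B testing and a weighted split for canary traffic shifting. The resource name, header, and subset names are illustrative assumptions, not taken from this article's setup.

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: podinfo-routing-demo   # illustrative name
spec:
  hosts:
    - podinfo
  http:
    # A/B testing: requests carrying a specific header go to version B
    - match:
        - headers:
            x-user-group:
              exact: beta
      route:
        - destination:
            host: podinfo
            subset: v2
    # Canary: progressively shift a percentage of the remaining traffic
    - route:
        - destination:
            host: podinfo
            subset: v1
          weight: 90
        - destination:
            host: podinfo
            subset: v2
          weight: 10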

This article introduces the progressive grayscale release practice of Flagger on ASM.

Setup Flagger

1 deploy Flagger

Execute the following command to deploy flagger (see: demo_canary.sh for the complete script).

alias k="kubectl --kubeconfig $USER_CONFIG"
alias h="helm --kubeconfig $USER_CONFIG"

cp $MESH_CONFIG kubeconfig
k -n istio-system create secret generic istio-kubeconfig --from-file kubeconfig
k -n istio-system label secret istio-kubeconfig istio/multiCluster=true

h repo add flagger https://flagger.app
h repo update
k apply -f $FLAAGER_SRC/artifacts/flagger/crd.yaml
h upgrade -i flagger flagger/flagger --namespace=istio-system \
  --set crd.create=false \
  --set meshProvider=istio \
  --set metricsServer=http://prometheus:9090 \
  --set istio.kubeconfig.secretName=istio-kubeconfig \
  --set istio.kubeconfig.key=kubeconfig
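Once the Helm release is installed, a quick sanity check is to confirm that the Flagger controller is running. This is a minimal check added here, assuming the release keeps the default resource name flagger in the istio-system namespace:

kubectl --kubeconfig "$USER_CONFIG" -n istio-system get deploy flagger
kubectl --kubeconfig "$USER_CONFIG" -n istio-system logs deploy/flagger --tail=20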

2 deploy Gateway

During the grayscale publishing process, Flagger requests ASM to update the VirtualService used for grayscale traffic configuration, and this VirtualService uses a Gateway named public-gateway. To do this, we create the relevant Gateway configuration file public-gateway.yaml as follows:

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: public-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "*"

Execute the following command to deploy Gateway:

kubectl --kubeconfig "$MESH_CONFIG" apply -f resources_canary/public-gateway.yaml

3 deploy flagger-loadtester

flagger-loadtester is a component used during the grayscale release phase; it is used to probe the application running in the grayscale POD instances and to generate load during canary analysis.

Execute the following command to deploy flagger-loadtester:

kubectl --kubeconfig "$USER_CONFIG" apply -k "https://github.com/fluxcd/flagger//kustomize/tester?ref=main"

4 deploy PodInfo and its HPA

We first use the HPA configuration that ships with the Flagger distribution (an operations-level HPA based on CPU utilization), and switch to an application-level HPA after walking through the complete process.

Execute the following command to deploy PodInfo and its HPA:

kubectl --kubeconfig "$USER_CONFIG" apply -k "https://github.com/fluxcd/flagger//kustomize/podinfo?ref=main"

Progressive grayscale release

1 deploy Canary

Canary is the core CRD of Flagger-based grayscale release; see How it works for details. We first deploy the following Canary configuration file podinfo-canary.yaml to walk through the complete progressive grayscale process, and later introduce application-level monitoring metrics to make the progressive grayscale release application-aware.

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: podinfo
  namespace: test
spec:
  # deployment reference
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  # the maximum time in seconds for the canary deployment
  # to make progress before it is rollback (default 600s)
  progressDeadlineSeconds: 60
  # HPA reference (optional)
  autoscalerRef:
    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    name: podinfo
  service:
    # service port number
    port: 9898
    # container port number or name (optional)
    targetPort: 9898
    # Istio gateways (optional)
    gateways:
      - public-gateway.istio-system.svc.cluster.local
    # Istio virtual service host names (optional)
    hosts:
      - '*'
    # Istio traffic policy (optional)
    trafficPolicy:
      tls:
        # use ISTIO_MUTUAL when mTLS is enabled
        mode: DISABLE
    # Istio retry policy (optional)
    retries:
      attempts: 3
      perTryTimeout: 1s
      retryOn: "gateway-error,connect-failure,refused-stream"
  analysis:
    # schedule interval (default 60s)
    interval: 1m
    # max number of failed metric checks before rollback
    threshold: 5
    # max traffic percentage routed to canary
    # percentage (0-100)
    maxWeight: 50
    # canary increment step
    # percentage (0-100)
    stepWeight: 10
    metrics:
      - name: request-success-rate
        # minimum req success rate (non 5xx responses)
        # percentage (0-100)
        thresholdRange:
          min: 99
        interval: 1m
      - name: request-duration
        # maximum req duration P99
        # milliseconds
        thresholdRange:
          max: 500
        interval: 30s
    # testing (optional)
    webhooks:
      - name: acceptance-test
        type: pre-rollout
        url: http://flagger-loadtester.test/
        timeout: 30s
        metadata:
          type: bash
          cmd: "curl -sd 'test' http://podinfo-canary:9898/token | grep token"
      - name: load-test
        url: http://flagger-loadtester.test/
        timeout: 5s
        metadata:
          cmd: "hey -z 1m -q 10 -c 2 http://podinfo-canary.test:9898/"
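A quick reading of the analysis settings above: the weight shifted to the canary starts at stepWeight (10%) and increases by 10% per interval (1m) until it reaches maxWeight (50%), so a healthy rollout advances through roughly five one-minute analysis rounds before promotion. During each round, a request success rate below 99% or a P99 latency above 500 ms counts as a failed metric check, and after threshold (5) failed checks Flagger rolls the release back.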

Execute the following command to deploy Canary:

kubectl --kubeconfig "$USER_CONFIG" apply -f resources_canary/podinfo-canary.yaml

After the Canary is deployed, Flagger copies the Deployment named podinfo to podinfo-primary and scales podinfo-primary up to the minimum number of PODs defined by the HPA, then gradually scales the number of PODs of the original podinfo Deployment down to 0. In other words, podinfo becomes the grayscale (canary) Deployment, and podinfo-primary becomes the production Deployment.

At the same time, Flagger creates three Services: podinfo, podinfo-primary, and podinfo-canary. The first two point to the podinfo-primary Deployment, and the last points to the podinfo Deployment.
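To verify this initialization (a quick check added here, using the same kubeconfig as above), list the Deployments and Services that Flagger has created in the test namespace:

kubectl --kubeconfig "$USER_CONFIG" -n test get deployments,services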

2 upgrade podinfo

Upgrade the version of the grayscale Deployment from 3.1.0 to 3.1.1 by executing the following command:

kubectl --kubeconfig "$USER_CONFIG" -n test set image deployment/podinfo podinfod=stefanprodan/podinfo:3.1.1

3 progressive grayscale release

At this point, Flagger begins the progressive grayscale release process described in the first article of this series. The main steps are as follows:

Gradually scale up the grayscale (canary) PODs and verify

Progressively shift traffic to the canary and verify

Roll the production Deployment over to the new version and verify

Shift 100% of the traffic back to production

Scale the grayscale PODs down to 0

We can observe this progressive traffic-shifting process with the following command:

while true; do kubectl --kubeconfig "$USER_CONFIG" -n test describe canary/podinfo; sleep 10s; done

The output log information is as follows:

Events:
  Type     Reason  Age                From     Message
  ----     ------  ----               ----     -------
  Warning  Synced  39m                flagger  podinfo-primary.test not ready: waiting for rollout to finish: observed deployment generation less then desired generation
  Normal   Synced  38m (x2 over 39m)  flagger  all the metrics providers are available!
  Normal   Synced  38m                flagger  Initialization done! podinfo.test
  Normal   Synced  37m                flagger  New revision detected! Scaling up podinfo.test
  Normal   Synced  36m                flagger  Starting canary analysis for podinfo.test
  Normal   Synced  36m                flagger  Pre-rollout check acceptance-test passed
  Normal   Synced  36m                flagger  Advance podinfo.test canary weight 10
  Normal   Synced  35m                flagger  Advance podinfo.test canary weight 20
  Normal   Synced  34m                flagger  Advance podinfo.test canary weight 30
  Normal   Synced  33m                flagger  Advance podinfo.test canary weight 40
  Normal   Synced  29m (x4 over 32m)  flagger  (combined from similar events): Promotion completed! Scaling down podinfo.test
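Each "Advance podinfo.test canary weight N" event corresponds to Flagger rewriting the route weights in the VirtualService it manages for podinfo. A simplified sketch of what the routing section roughly looks like at the 10% step (the actual resource contains more fields; the values shown are illustrative):

http:
  - route:
      - destination:
          host: podinfo-primary
        weight: 90
      - destination:
          host: podinfo-canary
        weight: 10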

Optionally, the corresponding traffic view can be observed in Kiali.

With this, we have completed a full progressive grayscale release. The remaining sections are extended reading.

Application-level scaling during grayscale release

Building on the progressive grayscale release process above, let's look at the HPA configuration referenced in the Canary resource:

autoscalerRef:
  apiVersion: autoscaling/v2beta2
  kind: HorizontalPodAutoscaler
  name: podinfo

This HPA named podinfo ships with Flagger; it scales the grayscale Deployment up when its CPU utilization reaches 99%. The complete configuration is as follows:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: podinfo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  minReplicas: 2
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          # scale up if usage is above
          # 99% of the requested CPU (100m)
          averageUtilization: 99

In a previous article of this series we described the practice of application-level scaling; here we apply it to the grayscale release process.

1 HPA aware of application QPS

Execute the following command to deploy an HPA that is aware of the number of application requests; it scales up when the QPS reaches 10 (see advanced_canary.sh for the complete script):

kubectl --kubeconfig "$USER_CONFIG" apply -f resources_hpa/requests_total_hpa.yaml
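The contents of requests_total_hpa.yaml are not reproduced in this article. As a rough illustration only, a QPS-aware HPA might look like the sketch below, assuming a Prometheus adapter exposes a per-POD custom metric derived from istio_requests_total (the metric name istio_requests_per_second and the replica bounds are hypothetical):

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: podinfo-total
  namespace: test
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Pods
      pods:
        metric:
          # hypothetical custom metric exposed via a Prometheus adapter
          name: istio_requests_per_second
        target:
          type: AverageValue
          # scale up when the average QPS per POD exceeds 10
          averageValue: "10"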

Accordingly, the Canary configuration is updated to:

autoscalerRef:
  apiVersion: autoscaling/v2beta2
  kind: HorizontalPodAutoscaler
  name: podinfo-total

2 upgrade podinfo

Upgrade the version of the grayscale Deployment from 3.1.0 to 3.1.1 by executing the following command:

kubectl --kubeconfig "$USER_CONFIG" -n test set image deployment/podinfo podinfod=stefanprodan/podinfo:3.1.1

3 verify progressive grayscale release and HPA

Observe the traffic-shifting process with the following command:

while true; do k -n test describe canary/podinfo; sleep 10s; done

During the progressive grayscale release (after the Advance podinfo.test canary weight 10 event appears), we use the following commands to send requests through the ingress gateway in order to increase the QPS:

INGRESS_GATEWAY=$(kubectl --kubeconfig $USER_CONFIG -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
hey -z 20m -c 2 -q 10 http://$INGRESS_GATEWAY

Use the following command to observe the progress of the progressive grayscale release:

watch kubectl --kubeconfig $USER_CONFIG get canaries --all-namespaces

Use the following command to observe the change in the number of replicas managed by the HPA:

watch kubectl --kubeconfig $USER_CONFIG -n test get hpa/podinfo-total

During the progressive grayscale release, once the traffic shifted to the canary reaches 30%, the grayscale Deployment has been scaled up to 4 replicas:

Application-level monitoring metrics during grayscale release

Building on the application-level scaling during grayscale release above, let's now look at the metrics configuration in the Canary resource:

analysis:
  metrics:
    - name: request-success-rate
      # minimum req success rate (non 5xx responses)
      # percentage (0-100)
      thresholdRange:
        min: 99
      interval: 1m
    - name: request-duration
      # maximum req duration P99
      # milliseconds
      thresholdRange:
        max: 500
      interval: 30s
  # testing (optional)

1 Flagger built-in monitoring metrics

So far, the metrics used in the Canary configuration have been two of Flagger's built-in monitoring metrics: request success rate (request-success-rate) and request latency (request-duration). Flagger defines these built-in metrics differently for each platform it supports; for Istio they are based on the Mixerless Telemetry data introduced in the first article of this series.
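Since the figure with the per-platform definitions is not reproduced here, the following is only a sketch of the kind of Prometheus query that request-success-rate corresponds to on Istio (the authoritative definition lives in Flagger itself); it divides the rate of non-5xx requests by the rate of all requests for the canary workload, with the namespace and workload names shown as examples:

sum(rate(istio_requests_total{reporter="destination",
    destination_workload_namespace="test",
    destination_workload="podinfo",
    response_code!~"5.*"}[1m]))
/
sum(rate(istio_requests_total{reporter="destination",
    destination_workload_namespace="test",
    destination_workload="podinfo"}[1m])) * 100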

2 Custom monitoring metrics

To show the flexibility of telemetry data for validating the grayscale environment during a grayscale release, we again take istio_requests_total as an example and create a MetricTemplate named not-found-percentage, which calculates the percentage of requests returning a 404 error code out of the total number of requests.

The configuration file metrics-404.yaml is as follows (for the full script, see advanced_canary.sh):

apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: not-found-percentage
  namespace: istio-system
spec:
  provider:
    type: prometheus
    address: http://prometheus.istio-system:9090
  query: |
    100 - sum(
        rate(
            istio_requests_total{
              reporter="destination",
              destination_workload_namespace="{{ namespace }}",
              destination_workload="{{ target }}",
              response_code!="404"
            }[{{ interval }}]
        )
    )
    /
    sum(
        rate(
            istio_requests_total{
              reporter="destination",
              destination_workload_namespace="{{ namespace }}",
              destination_workload="{{ target }}"
            }[{{ interval }}]
        )
    ) * 100

Create the above MetricTemplate by executing the following command:

k apply -f resources_canary2/metrics-404.yaml

Accordingly, the configuration of metrics in Canary is updated as follows:

analysis:
  metrics:
    - name: "404s percentage"
      templateRef:
        name: not-found-percentage
        namespace: istio-system
      thresholdRange:
        max: 5
      interval: 1m
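As a worked example of this check (the numbers are invented for illustration): if during one 1m interval 3 out of 20 requests to the grayscale version return 404, the template evaluates to 100 - (17/20)*100 = 15, which exceeds max: 5 and therefore counts as one failed metric check toward the Canary's rollback threshold.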

3 final verification

Finally, we run the complete experiment in one go. The script advanced_canary.sh is shown below:

#!/usr/bin/env sh
SCRIPT_PATH="$(
    cd "$(dirname "$0")" >/dev/null 2>&1
    pwd -P
)/"
cd "$SCRIPT_PATH" || exit
source config

alias k="kubectl --kubeconfig $USER_CONFIG"
alias m="kubectl --kubeconfig $MESH_CONFIG"
alias h="helm --kubeconfig $USER_CONFIG"

echo "# I Bootstrap #"
echo "1 Create a test namespace with Istio sidecar injection enabled:"
k delete ns test
m delete ns test
k create ns test
m create ns test
m label namespace test istio-injection=enabled

echo "2 Create a deployment and a horizontal pod autoscaler:"
k apply -f $FLAAGER_SRC/kustomize/podinfo/deployment.yaml -n test
k apply -f resources_hpa/requests_total_hpa.yaml
k get hpa -n test

echo "3 Deploy the load testing service to generate traffic during the canary analysis:"
k apply -k "https://github.com/fluxcd/flagger//kustomize/tester?ref=main"
k get pod,svc -n test
echo "."
sleep 40s

echo "4 Create a canary custom resource:"
k apply -f resources_canary2/metrics-404.yaml
k apply -f resources_canary2/podinfo-canary.yaml
k get pod,svc -n test
echo "."
sleep 120s

echo "# III Automated canary promotion #"
echo "1 Trigger a canary deployment by updating the container image:"
k -n test set image deployment/podinfo podinfod=stefanprodan/podinfo:3.1.1

echo "2 Flagger detects that the deployment revision changed and starts a new rollout:"
while true; do k -n test describe canary/podinfo; sleep 10s; done

Execute the complete lab script using the following command:

sh progressive_delivery/advanced_canary.sh

The experimental results are as follows:

# I Bootstrap #
1 Create a test namespace with Istio sidecar injection enabled:
namespace "test" deleted
namespace "test" deleted
namespace/test created
namespace/test created
namespace/test labeled
2 Create a deployment and a horizontal pod autoscaler:
deployment.apps/podinfo created
horizontalpodautoscaler.autoscaling/podinfo-total created
NAME            REFERENCE            TARGETS              MINPODS   MAXPODS   REPLICAS   AGE
podinfo-total   Deployment/podinfo   <unknown>/10 (avg)   1         5         0          0s
3 Deploy the load testing service to generate traffic during the canary analysis:
service/flagger-loadtester created
deployment.apps/flagger-loadtester created
NAME                                      READY   STATUS     RESTARTS   AGE
pod/flagger-loadtester-76798b5f4c-ftlbn   0/2     Init:0/1   0          1s
pod/podinfo-689f645b78-65n9d              1/1     Running    0          28s
NAME                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
service/flagger-loadtester   ClusterIP   172.21.15.223   <none>        80/TCP    1s
.
4 Create a canary custom resource:
metrictemplate.flagger.app/not-found-percentage created
canary.flagger.app/podinfo created
NAME                                      READY   STATUS    RESTARTS   AGE
pod/flagger-loadtester-76798b5f4c-ftlbn   2/2     Running   0          41s
pod/podinfo-689f645b78-65n9d              1/1     Running   0          68s
NAME                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
service/flagger-loadtester   ClusterIP   172.21.15.223   <none>        80/TCP    41s
.
# III Automated canary promotion #
1 Trigger a canary deployment by updating the container image:
deployment.apps/podinfo image updated
2 Flagger detects that the deployment revision changed and starts a new rollout:
Events:
  Type     Reason  Age                  From     Message
  ----     ------  ----                 ----     -------
  Warning  Synced  10m                  flagger  podinfo-primary.test not ready: waiting for rollout to finish: observed deployment generation less then desired generation
  Normal   Synced  9m23s (x2 over 10m)  flagger  all the metrics providers are available!
  Normal   Synced  9m23s                flagger  Initialization done! podinfo.test
  Normal   Synced  8m23s                flagger  New revision detected! Scaling up podinfo.test
  Normal   Synced  7m23s                flagger  Starting canary analysis for podinfo.test
  Normal   Synced  7m23s                flagger  Pre-rollout check acceptance-test passed
  Normal   Synced  7m23s                flagger  Advance podinfo.test canary weight 10
  Normal   Synced  6m23s                flagger  Advance podinfo.test canary weight 20
  Normal   Synced  5m23s                flagger  Advance podinfo.test canary weight 30
  Normal   Synced  4m23s                flagger  Advance podinfo.test canary weight 40
  Normal   Synced  23s (x4 over 3m23s)  flagger  (combined from similar events): Promotion completed! Scaling down podinfo.test
