KubeSphere Log Backup and Recovery Practice

2025-01-17 Update From: SLTechnology News&Howtos


This article walks through log backup and recovery practice in KubeSphere. The content is detailed; interested readers can follow along, and we hope it is helpful.

Why you need log backup

The KubeSphere logging system uses Fluent Bit + ElasticSearch for log collection and storage, and uses Curator for index lifecycle management, cleaning up stale logs on a schedule. For scenarios with log audit and disaster recovery requirements, KubeSphere's default 7-day log retention policy is far from enough, and backing up the ElasticSearch data disks alone does not guarantee data recoverability or integrity.

The ElasticSearch open source community provides the Snapshot API for long-term snapshot storage and restore. This article describes how to adapt the built-in ElasticSearch component (version 6.7.0) of KubeSphere (version 2.1.0) and put log backup into practice to meet audit and disaster recovery needs.

Note: for log export scenarios with a small amount of data and specific query conditions, you can use KubeSphere's one-click export feature, or try the elasticsearch-dump tool. KubeSphere users running an external commercial ElasticSearch can also directly enable the Snapshot Lifecycle Management feature provided by ElasticSearch X-Pack.
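For reference, a hedged sketch of exporting a single day's index with elasticsearch-dump; the endpoint and index name are assumptions based on the environment used later in this article:

```shell
# Export the documents of one log index to a local JSON file.
# Requires the elasticdump CLI (npm install -g elasticdump).
elasticdump \
  --input=http://elasticsearch-logging-data.kubesphere-logging-system.svc:9200/ks-logstash-log-2019.11.12 \
  --output=ks-logstash-log-2019.11.12.json \
  --type=data
```

This is only practical for small indices; for large volumes, use the snapshot approach below.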

Prerequisites

Before taking a snapshot, we need to register a repository in the ElasticSearch cluster to store the snapshot files. A snapshot repository can use a shared file system such as NFS; other storage types, such as AWS S3, require separate repository plugin support.

Let's take NFS as an example. The shared snapshot repository must be mounted on all master and data nodes of the ElasticSearch cluster, and the path.repo parameter must be configured in elasticsearch.yml. NFS supports the ReadWriteMany access mode, which makes it a good fit here.
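Concretely, the repository path must be whitelisted in every node's elasticsearch.yml; a minimal fragment, assuming the repository is mounted at /usr/share/elasticsearch/backup (the mount path used later in this article):

```yaml
# elasticsearch.yml — whitelist the shared snapshot repository path,
# otherwise registering the repository fails with a path.repo error
path.repo: ["/usr/share/elasticsearch/backup"]
```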

As a first step, prepare an NFS server; this tutorial uses the QingCloud vNAS service, with a shared directory path of /mnt/shared_dir.

Then prepare a StorageClass of type NFS in the KubeSphere environment, which we will use later to request a persistent volume for the snapshot repository. The environment for this article was configured with NFS storage at installation time, so no additional action is required. Readers who still need to set this up can refer to the KubeSphere official documentation: modify conf/common.yaml and re-execute the install.sh script.

1. ElasticSearch Setup

In KubeSphere, the ElasticSearch master nodes form the StatefulSet elasticsearch-logging-discovery, and the data nodes form elasticsearch-logging-data. The environment in this tutorial has one master and two data nodes:

```shell
$ kubectl get sts -n kubesphere-logging-system
NAME                              READY   AGE
elasticsearch-logging-data        2/2     18h
elasticsearch-logging-discovery   1/1     18h
```

The first step is to prepare a persistent volume for the ElasticSearch cluster's snapshot repository:
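The original manifest for this step did not survive extraction, so here is a minimal sketch of the PVC, assuming the claim name elasticsearch-logging-backup (the claimName referenced by the StatefulSet modification later in this article) and the nfs-client StorageClass used elsewhere in this environment:

```yaml
# PVC backing the shared snapshot repository.
# ReadWriteMany is required because every ES node mounts the same volume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: elasticsearch-logging-backup
  namespace: kubesphere-logging-system
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 100Gi   # size is an assumption; adjust to your retention needs
  storageClassName: nfs-client
```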

Export the current StatefulSet definitions so they can be modified:

```shell
kubectl get sts -n kubesphere-logging-system elasticsearch-logging-data -oyaml > elasticsearch-logging-data.yml
kubectl get sts -n kubesphere-logging-system elasticsearch-logging-discovery -oyaml > elasticsearch-logging-discovery.yml
```

Next, modify the yaml files. We take the elasticsearch-logging-data.yml generated above as an example; modify the master node's yaml file in the same way.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    # ...
  name: elasticsearch-logging-data
  namespace: kubesphere-logging-system
  # --------------------------------------------------
  # comment out or delete the meta fields other than
  # labels, name, and namespace
  # --------------------------------------------------
  # resourceVersion: "109019"
  # selfLink: /apis/apps/v1/namespaces/kubesphere-logging-system/statefulsets/elasticsearch-logging-data
  # uid: 423adffe-271f-4657-9078-1a75c387eedc
spec:
  # ...
  template:
    # ...
    spec:
      # ...
      containers:
      - name: elasticsearch
        # ...
        volumeMounts:
        - mountPath: /usr/share/elasticsearch/data
          name: data
        # --------------------------------------------------
        # add the backup volume mount
        # --------------------------------------------------
        - mountPath: /usr/share/elasticsearch/backup
          name: backup
        - mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          name: config
          subPath: elasticsearch.yml
        # ...
      initContainers:
      - name: sysctl
        # ...
      - name: chown
        # --------------------------------------------------
        # modify the command to adjust the owner of the
        # snapshot repository folder
        # --------------------------------------------------
        command:
        - /bin/bash
        - -c
        - |
          set -e; set -x;
          chown elasticsearch:elasticsearch /usr/share/elasticsearch/data;
          for datadir in $(find /usr/share/elasticsearch/data -mindepth 1 -maxdepth 1 -not -name ".snapshot"); do
            chown -R elasticsearch:elasticsearch $datadir;
          done;
          chown elasticsearch:elasticsearch /usr/share/elasticsearch/logs;
          for logfile in $(find /usr/share/elasticsearch/logs -mindepth 1 -maxdepth 1 -not -name ".snapshot"); do
            chown -R elasticsearch:elasticsearch $logfile;
          done;
          chown elasticsearch:elasticsearch /usr/share/elasticsearch/backup;
          for backupdir in $(find /usr/share/elasticsearch/backup -mindepth 1 -maxdepth 1 -not -name ".snapshot"); do
            chown -R elasticsearch:elasticsearch $backupdir;
          done
        # ...
        volumeMounts:
        - mountPath: /usr/share/elasticsearch/data
          name: data
        # --------------------------------------------------
        # add the backup volume mount
        # --------------------------------------------------
        - mountPath: /usr/share/elasticsearch/backup
          name: backup
        # ...
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists
      - effect: NoSchedule
        key: dedicated
        value: log
      volumes:
      - configMap:
          defaultMode: 420
          name: elasticsearch-logging
        name: config
      # --------------------------------------------------
      # reference the PVC created in the first step
      # --------------------------------------------------
      - name: backup
        persistentVolumeClaim:
          claimName: elasticsearch-logging-backup
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 20Gi
      storageClassName: nfs-client
      volumeMode: Filesystem
# --------------------------------------------------
# comment out or delete the status field
# --------------------------------------------------
# status:
#   ...
```

After the modification, delete the old ElasticSearch StatefulSets and apply the new yaml:

```shell
kubectl delete sts -n kubesphere-logging-system elasticsearch-logging-data
kubectl delete sts -n kubesphere-logging-system elasticsearch-logging-discovery
kubectl apply -f elasticsearch-logging-data.yml
kubectl apply -f elasticsearch-logging-discovery.yml
```

In the last step, after all ElasticSearch nodes have started, call the Snapshot API to create a repository named ks-log-snapshots with compression enabled:

```shell
curl -X PUT "elasticsearch-logging-data.kubesphere-logging-system.svc:9200/_snapshot/ks-log-snapshots?pretty" \
  -H 'Content-Type: application/json' -d'
{
  "type": "fs",
  "settings": {
    "location": "/usr/share/elasticsearch/backup",
    "compress": true
  }
}'
```

A return of "acknowledged": true indicates success. At this point the ElasticSearch cluster is ready to take snapshots; from here on, you only need to call the Snapshot API regularly to achieve incremental backups. Automating those incremental backups is the job of Curator.
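As a sanity check before automating anything (this step is an addition, not from the original walkthrough), you can trigger one snapshot by hand; the snapshot name here is arbitrary, and the index pattern is an assumption matching the Curator configuration used later in this article:

```shell
# Create a snapshot of the ks-logstash-log-* indices in the
# ks-log-snapshots repository and block until it completes.
curl -X PUT "elasticsearch-logging-data.kubesphere-logging-system.svc:9200/_snapshot/ks-log-snapshots/snapshot-test?wait_for_completion=true&pretty" \
  -H 'Content-Type: application/json' -d'
{
  "indices": "ks-logstash-log-*",
  "ignore_unavailable": true,
  "include_global_state": true
}'
```

A successful response reports the snapshot state as SUCCESS along with per-shard statistics.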

2. Scheduled snapshots with Curator

ElasticSearch Curator helps manage ElasticSearch indices and snapshots. Next, we use Curator to automate scheduled log backups. By default, the KubeSphere logging component includes Curator (deployed as a CronJob that runs at 1:00 a.m.) to manage indices, and we can reuse this same Curator. Its execution rules can be found in its ConfigMap.

Here we add two actions, snapshot and delete_snapshots, to the value of the action_file.yml field, and raise their priority above delete_indices. This configuration creates snapshots named snapshot-%Y%m%d%H%M%S and retains them for 45 days. For more information, see the Curator Reference.

```yaml
actions:
  1:
    action: snapshot
    description: >-
      Snapshot ks-logstash-log prefixed indices with the default
      snapshot name pattern of 'snapshot-%Y%m%d%H%M%S'.
    options:
      repository: ks-log-snapshots
      name: 'snapshot-%Y%m%d%H%M%S'
      ignore_unavailable: False
      include_global_state: True
      partial: False
      wait_for_completion: True
      skip_repo_fs_check: False
      # If disable_action is set to True, Curator will ignore the current action
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      # You may change the index pattern below to fit your case
      value: ks-logstash-log-
  2:
    action: delete_snapshots
    description: >-
      Delete snapshots from the selected repository older than 45 days
      (based on creation_date), for 'snapshot' prefixed snapshots.
    options:
      repository: ks-log-snapshots
      ignore_empty_list: True
      # If disable_action is set to True, Curator will ignore the current action
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: snapshot-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y%m%d%H%M%S'
      unit: days
      unit_count: 45
  3:
    action: delete_indices
    # the original content remains unchanged
```

3. Log recovery and viewing

When we need to review logs from a few days ago, say November 12, we can restore them from a snapshot. First, check the available snapshots:

```shell
curl -X GET "elasticsearch-logging-data.kubesphere-logging-system.svc:9200/_snapshot/ks-log-snapshots/_all?pretty"
```

Then restore the index for the specified date from the latest snapshot (or choose to restore everything). This API restores the log indices onto the data disks, so make sure the data disks have enough free space. Alternatively, you can back up the PV behind the snapshot repository directly, mount it to another ElasticSearch cluster, and restore the logs there.

```shell
curl -X POST "elasticsearch-logging-data.kubesphere-logging-system.svc:9200/_snapshot/ks-log-snapshots/snapshot-20191112010008/_restore?pretty" \
  -H 'Content-Type: application/json' -d'
{
  "indices": "ks-logstash-log-2019.11.12",
  "ignore_unavailable": true,
  "include_global_state": true
}'
```

Depending on the size of the logs, the wait can last several minutes. We can then view the restored logs through the KubeSphere log dashboard.
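While waiting, the restore can be monitored with the standard _cat/recovery API (this command is an addition to the original walkthrough; the index name matches the restore example above):

```shell
# Show per-shard recovery progress for the restored index;
# stage reaches "done" and files_percent reaches 100% when finished.
curl -X GET "elasticsearch-logging-data.kubesphere-logging-system.svc:9200/_cat/recovery/ks-logstash-log-2019.11.12?v&h=index,shard,time,type,stage,files_percent"
```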


KubeSphere (https://github.com/kubesphere/kubesphere) is an open-source, application-centric container management platform that can be deployed on any infrastructure and provides an easy-to-use UI, greatly reducing the complexity of daily development, testing, and operations. It aims to address the storage, networking, security, and usability pain points of Kubernetes itself, and helps enterprises handle scenarios such as agile development with automated monitoring and maintenance, end-to-end application delivery, microservice governance, multi-tenant management, multi-cluster management, service and network management, image registries, AI platforms, and edge computing.

That concludes this walkthrough of KubeSphere log backup and recovery practice. We hope the content above is helpful; if you found the article useful, feel free to share it with others.
