2025-01-17 Update From: SLTechnology News & Howtos > Servers
Shulou (Shulou.com) 06/02 Report --
The Longhorn project has now been officially released! Longhorn is a new approach to distributed block storage for cloud and container deployments. Following microservices principles, Longhorn builds distributed block storage from small, independent components packaged as containers, and uses container orchestration to coordinate those components into a resilient distributed system.
Why Longhorn?
Today, as cloud- and container-based deployments expand, distributed block storage systems are becoming more and more complex, and the number of volumes on a single storage controller keeps increasing. In the early 2000s, a storage controller served only a few dozen volumes, but modern cloud environments require tens of thousands to millions of distributed block storage volumes. The storage controller has become a highly complex distributed system.
Distributed block storage is itself simpler than other forms of distributed storage, such as file systems: no matter how many volumes there are in the system, each volume can only be mounted by a single host. This raises a question: can the large block storage controller be split into multiple smaller ones? To split it this way, we need to ensure that these volumes are still built from a common pool of disks, and we need a way to orchestrate the storage controllers so that they work together.
To take this idea to its limit, we created the Longhorn project. This is a direction we think is worth exploring. Each controller serves exactly one volume, which greatly simplifies the design of the storage controller. Because the failure domain of the controller software is limited to a single volume, a controller crash affects only that one volume.
Longhorn takes full advantage of recent advances in orchestrating large numbers of containers and virtual machines. For example, instead of building a highly complex controller that can scale to 100,000 volumes, Longhorn creates 100,000 separate controllers, keeping each storage controller simple and portable. We can then use state-of-the-art orchestration systems such as Swarm, Mesos, and Kubernetes to schedule these independent controllers, share resources across a set of disks, and have them work together to form a flexible distributed block storage system.
Longhorn's microservice-based design has many other advantages. Because each volume has its own controller, upgrading the controller and replica containers of a volume does not cause a noticeable interruption in IO operations. Longhorn can create a long-running job to orchestrate the upgrade of all live volumes while ensuring that it does not disrupt ongoing operations. To guard against unexpected problems, Longhorn can upgrade just a small portion of the volumes first and roll back to the old version if a problem occurs during the upgrade. These practices are widely used in modern microservice applications, but they are not common in storage systems. We hope Longhorn helps bring more microservices practices into the storage field.
Overview of Longhorn features
Form a shared resource pool from local disks or network storage attached to compute or dedicated storage hosts.
Create block storage volumes for containers and virtual machines. You can specify the size of the volume, its IOPS requirements, and the number of synchronous replicas you want across hosts (the hosts that provide storage resources for the volume). Replicas are thin-provisioned on the underlying disks or network storage.
Create a dedicated storage controller for each volume. This is probably the most distinctive feature of Longhorn compared to most existing distributed storage systems, which typically use complex controller software to serve hundreds to millions of volumes. Unlike those systems, Longhorn has only one volume per controller, which turns each volume into a microservice.
Schedule multiple replicas across compute or storage hosts. Longhorn monitors the health of each replica, repairs problems, and regenerates replicas when necessary.
Operate the storage controller and replicas as Docker containers. For example, a volume with three replicas consists of four containers.
Provide multiple storage "front ends" for each volume. Common front ends include a Linux kernel device (mapped under /dev/longhorn) and an iSCSI target. The Linux kernel device is suitable for backing Docker volumes, while the iSCSI target is better suited for backing QEMU/KVM and VMware volumes.
Create volume snapshots and AWS EBS-style backups. You can create up to 254 snapshots per volume, each of which can be backed up individually to NFS or S3-compatible secondary storage. Only changed bytes are copied and stored during a backup operation.
Specify schedules for recurring snapshot and backup operations. You can specify the frequency of these operations (hourly, daily, weekly, monthly, or yearly), the exact time at which they run (for example, 3:00 a.m. every Sunday), and how many recurring snapshots and backup sets to retain.
Quick Start Guide
Longhorn is easy to install and use. You just need Docker and the open-iscsi package installed, and you can set up everything you need to run Longhorn on a single Ubuntu 16.04 server.
Run the following commands to set up Longhorn on a single host:
git clone https://github.com/rancher/longhorn
cd longhorn/deploy
./longhorn-setup-single-node-env.sh
The script pulls and starts several containers, including the etcd key-value store, the Longhorn volume manager, the Longhorn UI, and the Longhorn Docker volume plugin container. When the script completes, it produces the following output:
Longhorn is up at port 8080
You can access the UI by connecting to http://<server-ip>:8080. The following is a screenshot of the volume details view:
You can now create a persistent Longhorn volume from Docker CLI:
docker volume create -d longhorn vol1
docker run -it --volume-driver longhorn -v vol1:/vol1 ubuntu bash
The single-host Longhorn installation runs etcd and all volume replicas on the same host, so it is not suitable for production. The Longhorn GitHub page has further instructions for setting up a production-grade multi-host deployment, which uses separate etcd servers, a Docker Swarm mode cluster, and a separate NFS server for storing backups.
Longhorn and other storage systems
We wrote Longhorn as an experiment: a distributed block storage system built with containers and microservices. Longhorn is not intended to compete with or replace existing storage software and storage systems, for the following reasons:
Longhorn focuses only on distributed block storage; distributed file storage, by contrast, is much harder to build. Storage systems such as Ceph, Gluster, Infinit (acquired by Docker), Quobyte, Portworx, and StorageOS, as well as storage systems from NetApp, EMC, and others, provide distributed file systems, a unified storage experience, enterprise data management, and many other enterprise-grade features that Longhorn does not support.
Longhorn requires an NFS share or an S3-compatible object store for volume backups. It must therefore be used together with network file storage from NetApp, EMC Isilon, or other vendors, or with S3-compatible object storage endpoints such as AWS S3, Minio, SwiftStack, and Cloudian.
Longhorn lacks enterprise-class storage features such as deduplication, compression, and auto-tiering, as well as the ability to stripe large volumes into smaller chunks. As a result, Longhorn volumes are limited by the size and performance of a single disk. The iSCSI target runs as a user-level process; it lacks the performance, reliability, and multipathing support of the enterprise-class iSCSI implementations found in distributed storage products such as Dell EqualLogic, SolidFire, and Datera.
We built Longhorn to be simple, hoping it would test our ideas about building storage with containers and microservices. It is written entirely in Go (often called golang), a preferred language for modern systems programming.
Below we describe Longhorn in more detail so that you can preview its functional design at this stage. Not all of the functionality described has been fully implemented yet, but we will keep working to make the vision of the Longhorn project a reality.
Volumes as microservices
The Longhorn volume manager container runs on every host in the Longhorn cluster. In Rancher or Swarm terminology, the Longhorn manager container is a global service; in Kubernetes, the Longhorn volume manager runs as a DaemonSet. The Longhorn volume manager handles API calls from the UI and from the volume plugins for Docker and Kubernetes. You can find a description of the Longhorn API here. The following figure shows the Longhorn control path in Docker Swarm and Kubernetes.
When the Longhorn manager is asked to create a volume, it creates a controller container on the host the volume is attached to, and replica containers on the hosts where the replicas are placed. Replicas should be placed on different hosts to ensure maximum availability.
In the following figure, there are three containers using Longhorn volumes. Each Docker volume has a dedicated controller that runs as a container, and each controller has two replicas, each of which is also a container. The arrows in the figure represent the read/write data flow between the Docker volume, the controller container, the replica containers, and the disks. Because each volume has its own controller, a failure of one controller does not affect the functionality of the other volumes.
For example, in a large-scale deployment of 100,000 Docker volumes with two replicas each, there will be 100,000 controller containers and 200,000 replica containers. A storage orchestration system is needed to schedule, monitor, coordinate, and repair all of these controllers and replicas.
Storage orchestration
Storage orchestration is responsible for scheduling controllers and replicas, monitoring the various components, and recovering from errors. The Longhorn volume manager performs all the storage orchestration operations required to manage the volume lifecycle. You can find details of the storage orchestration performed by the Longhorn volume manager here.
The controller functions much like a typical mirroring RAID controller: it reads from and writes to the replicas and monitors their health. All writes are replicated synchronously. Because each volume has its own dedicated controller, and the controller resides on the same host the volume is attached to, we do not need a high-availability (HA) configuration for the controller.
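The mirroring behavior can be sketched in a few lines of illustrative Python (Longhorn itself is written in Go; the class and method names below are hypothetical): a write goes synchronously to every healthy replica, while a read can be served by any one of them.

```python
# Illustrative sketch of a mirroring controller, not Longhorn's actual
# (Go) implementation; all class and method names here are hypothetical.
class Replica:
    def __init__(self):
        self.blocks = {}       # offset -> data
        self.healthy = True

    def write(self, offset, data):
        self.blocks[offset] = data

    def read(self, offset):
        return self.blocks.get(offset)

class Controller:
    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, offset, data):
        # Synchronous replication: the write goes to every healthy replica.
        for r in self.replicas:
            if r.healthy:
                r.write(offset, data)

    def read(self, offset):
        # Any healthy replica can serve a read.
        for r in self.replicas:
            if r.healthy:
                return r.read(offset)
        raise IOError("no healthy replica")

ctrl = Controller([Replica(), Replica()])
ctrl.write(0, b"hello")
```

Because every write lands on all healthy replicas, the data survives the loss of any single replica, which is what lets Longhorn skip an HA setup for the controller itself.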
The Longhorn volume manager is responsible for picking the hosts where replicas are placed. It then checks the health of all replicas and, when necessary, takes the appropriate action to rebuild faulty ones.
Replica operations
Longhorn replicas are built on Linux sparse files, which support thin provisioning. Currently, we do not keep additional metadata to indicate which blocks are in use. The block size is 4K.
When you take a snapshot, a differencing disk is created. As the number of snapshots grows, the differencing disk chain can get quite long. To improve read performance, Longhorn maintains a read index that records, for each 4K block, which differencing disk holds the valid data. In the following figure, the volume has eight blocks. The read index has eight entries and is populated lazily as read operations occur. A write operation resets the read index to point at the live data.
Each 4K block consumes one byte in the read index, which is held in memory. The one-byte size of each entry is what limits each volume to at most 254 snapshots.
The read index consumes a certain amount of in-memory data structure for each replica. A 1TB volume, for example, consumes 256MB of memory for its read index. We may therefore consider placing the read index in a memory-mapped file in the future.
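The arithmetic behind these figures is straightforward, as a quick sketch shows:

```python
# Arithmetic behind the read-index figures quoted above.
BLOCK = 4 * 1024                  # 4K block size
TB = 1 << 40                      # 1 TB volume

index_bytes = TB // BLOCK         # one byte of read index per 4K block
print(index_bytes)                # 268435456 bytes, i.e. 256 MB

# Each one-byte entry must name a disk in the differencing-disk chain,
# which is why the snapshot count per volume is capped below 2**8.
```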
Replica reconstruction
When the controller detects that a replica has failed, it marks that replica as being in an error state. The Longhorn volume manager is responsible for initiating and coordinating the rebuild of the faulty replica, as follows:
The Longhorn volume manager creates a blank replica and calls the controller to add it to the replica set.
To add the blank replica, the controller does the following:
Pause all read and write operations.
Add the blank replica in WO (write-only) mode.
Take a snapshot of all existing replicas, so that each replica instantly has a blank differencing disk at the head of its chain.
Resume all read and write operations; only writes are sent to the newly added replica, since it is write-only.
Start a background process to synchronize all differencing disks except the latest from a good replica to the blank replica.
After synchronization completes, the data of all replicas is consistent, and the volume manager sets the new replica to RW (read-write) mode.
Finally, the Longhorn volume manager calls the controller to remove the faulty replica from the replica set.
Rebuilding a replica this way is not very efficient. We can improve rebuild performance by reusing the sparse files left over in the failed replica.
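The rebuild sequence above can be modeled in a short illustrative sketch (hypothetical names; this is not Longhorn's actual Go code):

```python
# Illustrative model of the replica rebuild sequence; all names are
# hypothetical, not Longhorn's actual (Go) implementation.
class Replica:
    def __init__(self, mode="RW"):
        self.mode = mode           # "RW", "WO", or "ERR"
        self.disks = []            # differencing-disk chain, oldest first

class Controller:
    def __init__(self, replicas):
        self.replicas = replicas
        self.paused = False

def rebuild(ctrl, good, blank):
    ctrl.paused = True                 # 1. pause all reads and writes
    blank.mode = "WO"                  # 2. add the blank replica, write-only
    ctrl.replicas.append(blank)
    for r in ctrl.replicas:            # 3. snapshot every replica: a fresh
        r.disks.append({})             #    blank disk appears at the head
    ctrl.paused = False                # 4. resume I/O; writes also reach blank
    # 5. background sync: copy every disk except the newest from a good replica
    blank.disks[:-1] = [dict(d) for d in good.disks[:-1]]
    blank.mode = "RW"                  # 6. data is consistent -> promote to RW
```

After `rebuild` returns, the blank replica holds the same differencing-disk chain as the good one (live writes having flowed into its newest disk since step 4), and the faulty replica can be dropped from the set.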
Backing up snapshots
I like the way Amazon EBS works: every snapshot is automatically backed up to S3, and nothing is kept in primary storage. We decided, however, to make Longhorn's snapshots and backups more flexible, so snapshot and backup operations are performed separately. You can simulate an EBS-style snapshot by taking a snapshot, backing up the difference between this snapshot and the previous one, and then deleting the previous snapshot. We have also built a recurring backup mechanism to help automate such operations.
We achieve efficient incremental backups by detecting and transferring the blocks that changed between snapshots. This task is relatively easy because each snapshot is a differencing file that stores only the changes since the previous snapshot. To avoid storing a very large number of small blocks, backup operations use 2MB blocks. This means that if any 4K block within a 2MB boundary changes, the entire 2MB block must be backed up. We think this strikes a reasonable balance between manageability and efficiency.
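The 2MB-granularity change detection amounts to mapping each changed 4K offset to its enclosing 2MB block; a minimal sketch (the function name is hypothetical), assuming a list of changed offsets is available:

```python
# Illustrative: map changed 4K offsets in a snapshot to the 2MB backup
# blocks that must be transferred. The function name is hypothetical.
BLOCK_4K = 4 * 1024
BACKUP_BLOCK = 2 * 1024 * 1024

def backup_blocks(changed_offsets):
    # Any changed 4K block inside a 2MB boundary forces out that whole
    # 2MB block, so deduplicate by 2MB block index.
    return sorted({off // BACKUP_BLOCK for off in changed_offsets})

# Two 4K writes inside the first 2MB region and one inside the third:
print(backup_blocks([0 * BLOCK_4K, 2 * BLOCK_4K, 1280 * BLOCK_4K]))  # [0, 2]
```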
In the following figure, we have backed up snap2 and snap3. Each backup keeps its own set of 2MB blocks, and the two backups share one green block and one blue block. Each 2MB block is backed up only once. This means that when we delete a backup from secondary storage, we cannot simply delete all the blocks it uses; instead, we periodically run garbage collection to clean up unused blocks in secondary storage.
Longhorn stores all backups of a given volume under a common directory. The following simplified view describes how Longhorn stores volume backups. Volume-level metadata is stored in volume.cfg. The metadata file for each backup, such as snap2.cfg, is relatively small because it contains only the offsets and checksums of all the 2MB blocks in that backup. The 2MB blocks (.blk files) of all backups belonging to the same volume are stored in a common directory, so they can be shared across multiple backups, and they are compressed. Because 2MB blocks are addressed by checksum, we also get a degree of deduplication across the blocks of the same volume.
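The checksum-based sharing and garbage collection can be illustrated with a toy content-addressed store; this is a sketch of the layout described above (hypothetical names, not Longhorn's on-disk format):

```python
# Toy content-addressed backup store, modeled on the layout described
# above (volume.cfg / snap*.cfg / shared .blk files). Illustrative only.
import hashlib
import zlib

class BackupStore:
    def __init__(self):
        self.blocks = {}    # checksum -> compressed 2MB block (".blk" files)
        self.backups = {}   # backup name -> list of (offset, checksum)

    def add_backup(self, name, chunks):
        entries = []
        for offset, data in chunks:
            csum = hashlib.sha256(data).hexdigest()
            # Identical blocks are stored once and shared across backups.
            self.blocks.setdefault(csum, zlib.compress(data))
            entries.append((offset, csum))
        self.backups[name] = entries     # the "snapX.cfg" metadata

    def delete_backup(self, name):
        # Blocks may still be referenced by other backups, so deleting a
        # backup removes only its metadata, never blocks directly.
        del self.backups[name]

    def garbage_collect(self):
        # Periodic sweep: drop blocks no surviving backup references.
        live = {c for entries in self.backups.values() for _, c in entries}
        self.blocks = {c: b for c, b in self.blocks.items() if c in live}
```

Adding two backups that share a block stores that block once; deleting one backup and sweeping reclaims only the blocks no remaining backup references.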
Two deployment models
The Longhorn volume manager performs the task of scheduling replicas onto nodes, and we can adjust the scheduling algorithm to place controllers and replicas in different ways. The controller is always placed on the host the volume is attached to. Replicas, on the other hand, can be placed either on the same set of compute servers that run the controllers or on a set of dedicated storage servers. The former constitutes a hyper-converged deployment model; the latter, a dedicated storage server model.
Rancher Labs firmly believes that open source is the future of technology and has always adhered to open source principles, so Longhorn is likewise 100% open source software. You can download Longhorn on GitHub.