How to use GCS offloader to unload data stored in BookKeeper

2025-01-15 Update From: SLTechnology News&Howtos

This article describes how to use the GCS offloader to offload data stored in BookKeeper to GCS.

For large amounts of data that do not need fast access, consider tiered storage, a feature built into Apache Pulsar. Tiered storage builds naturally on Pulsar's segment-based architecture.

With tiered storage, you can offload data from Apache BookKeeper to highly scalable, inexpensive cloud-native storage (such as Google Cloud Storage or AWS S3) or to file systems. This lets you build high-performance messaging clusters while reducing operation and maintenance costs.

The Google Cloud Storage (GCS) offloader is a Pulsar plug-in hosted on StreamNative Hub.

Installation

Follow these steps to install GCS offloader.

Prerequisites

Apache jclouds 2.2.0 or later

Steps

1. Download the Pulsar package in any of the following ways:

Download from the Apache mirror:

https://archive.apache.org/dist/pulsar/pulsar-2.5.1/apache-pulsar-2.5.1-bin.tar.gz

Download from the Pulsar download page:

https://pulsar.apache.org/download

Download with the wget command (https://www.gnu.org/software/wget):

wget https://archive.apache.org/dist/pulsar/pulsar-2.5.1/apache-pulsar-2.5.1-bin.tar.gz

2. Download and extract the Pulsar offloaders package.

wget https://downloads.apache.org/pulsar/pulsar-2.5.1/apache-pulsar-offloaders-2.5.1-bin.tar.gz

tar xvfz apache-pulsar-offloaders-2.5.1-bin.tar.gz

Note:

When running Pulsar on a bare-metal cluster, make sure that the extracted `offloaders` directory is available in the Pulsar directory on every broker.

When running Pulsar in Docker or deploying Pulsar from a Docker image (for example, on Kubernetes or DC/OS), you can use the `apachepulsar/pulsar-all` image instead of the `apachepulsar/pulsar` image. The `apachepulsar/pulsar-all` image already bundles the tiered storage offloaders.

3. Move the extracted `offloaders` directory into your local Pulsar directory, then list its contents.

mv apache-pulsar-offloaders-2.5.1/offloaders apache-pulsar-2.5.1/offloaders

ls offloaders

Output

As the output below shows, Pulsar supports GCS and AWS S3 through Apache jclouds.

tiered-storage-file-system-2.5.1.nar tiered-storage-jcloud-2.5.1.nar
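The check in the last step can also be scripted. The following is a minimal sketch, not part of Pulsar: the function names and the directory path are assumptions made for illustration. It lists the `.nar` archives in an offloaders directory and reports whether the jclouds-based offloader, which the GCS driver relies on, is present.

```python
from pathlib import Path


def find_offloader_archives(offloaders_dir):
    """Return the sorted file names of all .nar archives in the directory."""
    return sorted(p.name for p in Path(offloaders_dir).glob("*.nar"))


def has_jcloud_offloader(offloaders_dir):
    """Return True if a tiered-storage-jcloud archive is present."""
    return any(
        name.startswith("tiered-storage-jcloud-")
        for name in find_offloader_archives(offloaders_dir)
    )


if __name__ == "__main__":
    # Hypothetical path: adjust it to your local Pulsar directory.
    offloaders = "apache-pulsar-2.5.1/offloaders"
    print(find_offloader_archives(offloaders))
    print(has_jcloud_offloader(offloaders))
```

Running a check like this on every broker host is one way to confirm that the bare-metal requirement in the note above is met.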

Usage

The following are the detailed steps for using the GCS offloader in Pulsar.

Step 1: configure the GCS offloader driver

Before using the GCS offloader, you need to configure some properties of the GCS offloader driver. For more information on how to configure these properties, see:

https://hub.streamnative.io/offloaders/gcs/2.5.1/#configuration

This example assumes that the following configuration has been made in `standalone.conf` and Pulsar is running in stand-alone mode.

managedLedgerOffloadDriver=google-cloud-storage
gcsManagedLedgerOffloadBucket=pulsar-topic-offload-1
gcsManagedLedgerOffloadRegion=europe-west3
gcsManagedLedgerOffloadServiceAccountKeyFile=/Users/user-name/Downloads/affable-ray-226821-6251d04987e9.json
offloadersDirectory=offloaders
managedLedgerMinLedgerRolloverTimeMinutes=2
managedLedgerMaxEntriesPerLedger=5000
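Before starting the broker, it can help to check that the offload-related keys are all set. The sketch below is an illustration only: the helper names are hypothetical, the parser handles just plain `key=value` lines, and treating these five keys as the ones to check is an assumption based on the example configuration.

```python
# Keys taken from the example configuration; treating all five as
# mandatory is an assumption of this sketch.
EXPECTED_KEYS = (
    "managedLedgerOffloadDriver",
    "gcsManagedLedgerOffloadBucket",
    "gcsManagedLedgerOffloadRegion",
    "gcsManagedLedgerOffloadServiceAccountKeyFile",
    "offloadersDirectory",
)


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_gcs_offload_config(props):
    """Return a list of problems; an empty list means the check passed."""
    problems = [f"missing or empty: {k}" for k in EXPECTED_KEYS if not props.get(k)]
    driver = props.get("managedLedgerOffloadDriver")
    if driver and driver != "google-cloud-storage":
        problems.append(f"unexpected driver: {driver}")
    return problems


if __name__ == "__main__":
    snippet = "managedLedgerOffloadDriver=google-cloud-storage\n"
    for problem in check_gcs_offload_config(parse_properties(snippet)):
        print(problem)
```

In practice you would pass the contents of `standalone.conf` (or `broker.conf`) to `parse_properties` instead of the inline snippet.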

-

Step 2: create a GCS bucket

1. Go to the Google Cloud console (https://console.cloud.google.com/) and select Storage in the left sidebar.

2. Select Browser and click Create bucket.

To ensure that the broker can access the bucket, you need to assign the Storage Object Creator and Storage Object Viewer roles to the service account (see step 4).

3. Set the name of the bucket.

The bucket name should be the same as the `gcsManagedLedgerOffloadBucket` value configured in step 1.

For more information about `gcsManagedLedgerOffloadBucket`, see:

https://hub.streamnative.io/offloaders/gcs/2.5.1/#step-1-configure-gcs-offloader-driver

4. Set the bucket region.

The bucket region should be the same as the `gcsManagedLedgerOffloadRegion` value configured in step 1.

For more information about `gcsManagedLedgerOffloadRegion`, see:

https://hub.streamnative.io/offloaders/gcs/2.5.1/#step-1-configure-gcs-offloader-driver

5. Click Create.

You have now successfully created a GCS bucket.

-

Step 3: create a GCS service account

1. Go to the Google Cloud console and select IAM & Admin in the left sidebar.

2. Select Service Accounts and click Create Service Account.

3. Set the service account name.

The service account ID is generated automatically from the name.

4. Click Create.

5. To grant permissions to the service account, click Next.

6. Click Create key.

7. Select JSON, click Create, and save the generated JSON file locally.

The path of the JSON file should match the `gcsManagedLedgerOffloadServiceAccountKeyFile` value configured in step 1.

For more information about `gcsManagedLedgerOffloadServiceAccountKeyFile`, see:

https://hub.streamnative.io/offloaders/gcs/2.5.1/#step-1-configure-gcs-offloader-driver

8. Copy the key ID from the JSON file to the key ID dialog box, and then click Finish.
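The key ID mentioned in the last step is stored in the `private_key_id` field of the downloaded key file. The following sketch (the function name is hypothetical) reads a key file's contents, checks that it looks like a service account key, and returns only its non-secret identifying fields, so the private key itself is never printed.

```python
import json

# Fields that a downloaded service account key file is expected to contain.
EXPECTED_FIELDS = ("type", "project_id", "private_key_id", "private_key", "client_email")


def summarize_key_file(text):
    """Return non-secret identifying fields of a key file, or raise ValueError."""
    key = json.loads(text)
    missing = [field for field in EXPECTED_FIELDS if field not in key]
    if missing:
        raise ValueError(f"key file is missing fields: {missing}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected key type: {key['type']}")
    return {
        "project_id": key["project_id"],
        "private_key_id": key["private_key_id"],
        "client_email": key["client_email"],
    }
```

To use it, read the file referenced by `gcsManagedLedgerOffloadServiceAccountKeyFile` and pass its text to `summarize_key_file`. The returned `client_email` is the account name to which roles are granted in the next step.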

-

Step 4: assign permissions to the GCS service account

1. On the IAM & Admin page, click IAM, and then click Add.

2. Fill in the name of the GCS service account created in step 3.

3. Assign the Storage Object Creator and Storage Object Viewer roles to the service account.

4. Click Save.
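In IAM policies the two roles appear under role IDs rather than console display names. As a small illustration (the helper is hypothetical, and how you obtain the list of granted role IDs is left open), the following sketch maps the display names to their role IDs and reports which ones are still missing.

```python
# Console display name -> IAM role ID.
REQUIRED_ROLES = {
    "Storage Object Creator": "roles/storage.objectCreator",
    "Storage Object Viewer": "roles/storage.objectViewer",
}


def missing_roles(granted_role_ids):
    """Return the display names of required roles absent from the granted set."""
    granted = set(granted_role_ids)
    return [name for name, role_id in REQUIRED_ROLES.items() if role_id not in granted]
```

An empty result means the service account holds both roles that the broker needs to write and read offloaded ledgers.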

-

Step 5: offload data from BookKeeper to GCS

Run the following commands from your local Pulsar directory (for example, `~/path/to/apache-pulsar-2.5.1`).

1. Start Pulsar (stand-alone mode).

./bin/pulsar standalone -a 127.0.0.1

2. To ensure that the generated data is not deleted immediately, it is recommended that you set a retention policy.

https://pulsar.apache.org/docs/en/next/cookbooks-retention-expiry/#retention-policies

A retention policy can be a size limit, a time limit, or both. The higher the values, the longer the data is retained.

./bin/pulsarctl namespaces set-retention public/default --size 10G --time 3d

For more information about the `pulsarctl namespaces set-retention` command (including flags, descriptions, default values, and shorthands), see:

https://streamnative.io/docs/pulsarctl/v0.4.0/#-em-set-retention-em-

3. Produce data with pulsar-perf.

./bin/pulsar-perf produce -r 1000 -s 2048 test-topic

4. The offload operation starts only after a ledger rollover. To make sure the offload succeeds, wait until several ledgers have rolled over. The retention policy configured above also ensures that the broker does not delete the data when a ledger rolls over.

To view ledger information, use the `pulsarctl topics internal-stats` command.

./bin/pulsarctl topics internal-stats test-topic

Output

The following output shows that ledgers have rolled over: ledger 10, ledger 11, and ledger 12 exist.

{
  "entriesAddedCounter": 107982,
  "numberOfEntries": 107982,
  "totalSize": 508276193,
  "currentLedgerEntries": 1953,
  "currentLedgerSize": 9167863,
  "lastLedgerCreatedTimestamp": "2020-05-12T00:07:27.273+08:00",
  "waitingCursorsCount": 0,
  "pendingAddEntriesCount": 1,
  "lastConfirmedEntry": "12:1951",
  "state": "LedgerOpened",
  "ledgers": [
    {"ledgerId": 10, "entries": 52985, "size": 249500259, "offloaded": false},
    {"ledgerId": 11, "entries": 53045, "size": 249614295, "offloaded": false},
    {"ledgerId": 12, "entries": 0, "size": 0, "offloaded": false}
  ],
  "cursors": {}
}

For more information about the `pulsarctl topics internal-stats` command (including flags, descriptions, default values, and shorthands), see:

https://streamnative.io/docs/pulsarctl/v0.4.0/#-em-internal-stats-em-

5. After the ledgers have rolled over, you can trigger the offload operation manually (shown below).

Alternatively, you can configure the offload operation to be triggered automatically. For more information, see:

https://hub.streamnative.io/offloaders/gcs/2.5.1/#configure-gcs-offloader-to-run-automatically

./bin/pulsarctl topics offload --size-threshold 10m public/default/test-topic

Output

Offload triggered for persistent://public/default/test-topic for messages before 12:0:-1

For more information about the `pulsarctl topics offload` command (including flags, descriptions, default values, and shorthands), see:

https://streamnative.io/docs/pulsarctl/v0.4.0/#-em-offload-em-

6. Check the status of the offload operation.

./bin/pulsarctl topics offload-status -w public/default/test-topic

The offload operation may take some time.

Output

Offload was a success

For more information about the `pulsarctl topics offload-status` command (including flags, descriptions, default values, and shorthands), see:

https://streamnative.io/docs/pulsarctl/v0.4.0/#-em-offload-status-em-

After the operation completes, the data has been offloaded to GCS.
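How the size threshold in step 5 relates to the ledgers listed in step 4 can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual code: it assumes that the newest ledgers stay in BookKeeper as long as their cumulative size fits within the threshold, that everything older is offloaded, and that a `10m` threshold means 10 * 1024 * 1024 bytes. All function names are hypothetical, and the sample data is a trimmed copy of the internal-stats output from step 4.

```python
SAMPLE_STATS = {
    "currentLedgerSize": 9167863,
    "ledgers": [
        {"ledgerId": 10, "entries": 52985, "size": 249500259, "offloaded": False},
        {"ledgerId": 11, "entries": 53045, "size": 249614295, "offloaded": False},
        {"ledgerId": 12, "entries": 0, "size": 0, "offloaded": False},
    ],
}


def pending_offload(stats):
    """IDs of closed ledgers (all but the last, open one) not yet offloaded."""
    closed = stats["ledgers"][:-1]
    return [ledger["ledgerId"] for ledger in closed if not ledger.get("offloaded")]


def first_ledger_within_threshold(stats, size_threshold_bytes):
    """Return the ID of the oldest ledger that stays in BookKeeper.

    Walks the ledgers from newest to oldest and sums their sizes. Once the
    sum exceeds the threshold, the previously visited ledger marks the
    boundary: ledgers before it are candidates for offloading. Returns None
    if everything fits within the threshold.
    """
    ledgers = stats["ledgers"]
    if not ledgers:
        return None
    sizes = [ledger["size"] for ledger in ledgers]
    # The open ledger is listed with size 0; use currentLedgerSize instead.
    sizes[-1] = stats.get("currentLedgerSize", sizes[-1])
    suffix_size = 0
    previous_id = ledgers[-1]["ledgerId"]
    for ledger, size in zip(reversed(ledgers), reversed(sizes)):
        suffix_size += size
        if suffix_size > size_threshold_bytes:
            return previous_id
        previous_id = ledger["ledgerId"]
    return None


if __name__ == "__main__":
    print(pending_offload(SAMPLE_STATS))
    print(first_ledger_within_threshold(SAMPLE_STATS, 10 * 1024 * 1024))
```

With the sample data and a 10 MiB threshold, the boundary falls at ledger 12, which is consistent with the offload position beginning with 12 in the output of step 5. After a successful offload, running `internal-stats` again should show `"offloaded": true` for the older ledgers, so `pending_offload` would return an empty list.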
