How to Implement Hadoop Heterogeneous Storage

2025-04-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains how to implement heterogeneous storage in Hadoop: what the feature is, which storage types it supports, and which commands are used to apply it.

1. What is Hadoop heterogeneous storage

Hadoop introduced a new feature in version 2.6.0: heterogeneous storage. The key lies in the word "heterogeneous". Because each storage medium has different read and write characteristics, heterogeneous storage lets each one play to its strengths. Data that is rarely accessed can stay on the most common medium, ordinary disk (DISK), while hot data can be stored on SSD, which ensures efficient reads and can be ten or even a hundred times faster than ordinary disk for reads and writes.

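2. Currently, Hadoop supports the following storage types:

* RAM_DISK: data is stored in memory

* SSD: data is stored on solid-state disk

* DISK (default): data is stored on ordinary disk

* ARCHIVE: data is stored on archival media with high storage density, intended for rarely accessed data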

In the order RAM_DISK -> SSD -> DISK -> ARCHIVE, access speed goes from fast to slow, and the storage cost per bit goes from high to low.
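For these types to take effect, each DataNode has to know which medium backs each of its data directories. In stock HDFS this is done by prefixing the entries of `dfs.datanode.data.dir` in hdfs-site.xml with the storage type in square brackets; a directory without a tag is treated as DISK. The sketch below only prints such a property so that the format is visible. The helper name `datanode_dirs_property` and the mount points are hypothetical examples, not part of Hadoop or of the original article.

```shell
#!/usr/bin/env bash
# Sketch: build a dfs.datanode.data.dir value with storage-type tags.
# The helper name and the mount points used with it are hypothetical.

# Usage: datanode_dirs_property TYPE=PATH [TYPE=PATH ...]
# Prints an hdfs-site.xml <property> element on stdout.
datanode_dirs_property() {
  local value="" pair type path
  for pair in "$@"; do
    type="${pair%%=*}"   # text before the first '='
    path="${pair#*=}"    # text after the first '='
    case "$type" in
      RAM_DISK|SSD|DISK|ARCHIVE) ;;
      *) echo "unknown storage type: $type" >&2; return 1 ;;
    esac
    # Append "[TYPE]file://PATH", comma-separated after the first entry.
    value="${value:+$value,}[$type]file://$path"
  done
  printf '<property>\n  <name>dfs.datanode.data.dir</name>\n  <value>%s</value>\n</property>\n' "$value"
}
```

For example, `datanode_dirs_property SSD=/mnt/ssd0/dn DISK=/mnt/disk0/dn` prints a property that declares one SSD-backed and one disk-backed directory. The DataNode normally has to be restarted before a change to this setting is picked up.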

2.1 Using the commands

0. hdfs storagepolicies -listPolicies (view the supported storage policies)

1. hadoop fs -mkdir /data/ssddata (create a directory)

2. hdfs storagepolicies -setStoragePolicy -path /data/ssddata -policy One_SSD

(With One_SSD, one replica of each block is stored on SSD and the other replicas on ordinary disk. Blocks of files written under this directory afterwards are placed according to this policy.)

3. hdfs storagepolicies -getStoragePolicy -path /data/ssddata (view the storage policy of this directory)

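4. Unset the storage policy. A file has no policy specified (unspecified) when it is created, and this command returns it to that default; the policy of the nearest ancestor directory then applies.

hdfs storagepolicies -unsetStoragePolicy -path /data/normal/ip2.txt

hdfs mover [-p <files/dirs> | -f <local file name>] (migrate existing blocks so that their placement satisfies the storage policy; -p takes a list of HDFS paths, -f a local file that lists them)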
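The directory creation, policy assignment, verification and mover run can be strung together in a small script. The following is a minimal sketch under stated assumptions, not a production tool: the names `run`, `apply_storage_policy` and `DRY_RUN` are illustrative and not part of Hadoop, while the `hadoop` and `hdfs` subcommands are the ones listed above (plus the `-p` option of `hadoop fs -mkdir`, which creates missing parent directories). By default it only prints the commands instead of executing them.

```shell
#!/usr/bin/env bash
# Sketch: apply a storage policy to a directory and migrate existing blocks.
# DRY_RUN, run and apply_storage_policy are illustrative names, not Hadoop features.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Usage: apply_storage_policy HDFS_DIR POLICY
apply_storage_policy() {
  local dir="$1" policy="$2"
  if [ -z "$dir" ] || [ -z "$policy" ]; then
    echo "usage: apply_storage_policy HDFS_DIR POLICY" >&2
    return 2
  fi
  run hadoop fs -mkdir -p "$dir" || return 1
  run hdfs storagepolicies -setStoragePolicy -path "$dir" -policy "$policy" || return 1
  run hdfs storagepolicies -getStoragePolicy -path "$dir" || return 1
  # Setting a policy does not relocate blocks that already exist; the mover does that.
  run hdfs mover -p "$dir"
}
```

Calling `apply_storage_policy /data/ssddata One_SSD` in dry-run mode shows the four commands in order; setting `DRY_RUN=0` before the call executes them on a machine where the Hadoop client is configured. Keeping the mover as the last step matters only when the directory already contains data, since newly written files follow the policy from the start.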

3. Indexes

In data retrieval, indexing the data is an important step. Relational databases have traditionally achieved fast retrieval by building indexes, and OLTP databases mostly use B-tree or B+ tree indexes. Lucene, as is well known, uses an inverted index (the concept is not described again here), and its range of tokenizers makes its full-text retrieval very powerful. The well-known Elasticsearch, for example, uses Lucene as its index engine. Unfortunately, ES does not support heterogeneous storage, and it hits a bottleneck when a single shard holds a large amount of data.

That concludes this look at how to implement Hadoop heterogeneous storage. The specific usage still needs to be verified in practice.
