How to analyze Hitachi Content Platform

2025-01-18 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

This article looks at how to analyze Hitachi Content Platform (HCP). It walks through the product's background, architecture, and features in detail, in the hope of giving readers who are evaluating it a simpler and more practical way to do so.

Today, I'm going to share with you the technical architecture of HCP.

HCP stands for Hitachi Content Platform, Hitachi's content management platform. The earliest form of object storage was called CAS, or content-addressable storage. From the name HCP, you might guess that it originated from early CAS.

Congratulations, you're right. HCP comes from Archivas, a startup founded in 2003.

Archivas was founded in Massachusetts in 2003 and raised a total of US $28 million over three rounds of venture capital.

Archivas's flagship product, the ArC (Archivas Cluster) software, is a content-addressable storage (CAS) product. Facing competition from EMC's Centera, the early Archivas chose a strategy of openness and low prices.

Archivas began an OEM partnership with HDS in 2006 and was acquired by HDS in 2007 for about US $120 million.
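To make the idea of content addressing concrete, here is a minimal sketch in Python. It is not HCP's or Archivas's implementation; the class name, the in-memory dictionary, and the choice of SHA-256 are all assumptions made for illustration. The point is only that an object's address is derived from its content, so identical content lands at the same address.

```python
import hashlib


class ContentAddressableStore:
    """Toy in-memory CAS (hypothetical): an object's key is the SHA-256 of its bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        # Identical content hashes to the same address, so it is kept only once.
        self._blobs.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Re-hashing on read detects corruption or tampering of the stored bytes.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentAddressableStore()
first = store.put(b"quarterly-report")
second = store.put(b"quarterly-report")  # same bytes, same address, no second copy
```

Because the address doubles as an integrity check, this style of storage suits archival and compliance workloads, and storing identical content once is a natural side effect.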

The latest version of HCP is 8.3, and the architecture is relatively old. I heard from former HDS employees online that a code refactoring was planned and that version 9.0, with a new architecture, would be released in 2019, but unfortunately I haven't seen it yet. If you know when it will launch, please let us know.

HCP is the object storage product, and it has three derivatives: HCP Anywhere is the equivalent of a network drive, HCP Anywhere Edge is a cloud file gateway, and HCI is an intelligent data mining platform. This article only discusses HCP.

HCP emphasizes its ecosystem and works closely with ISVs. In 2018, Hitachi Vantara advertised more than 4000 clusters and more than 2000 customers worldwide, and claimed to be the number one on-prem object storage solution (EMC would presumably disagree).

HCP started as a CAS product and has many successful deployments for compliance at financial institutions, because it has many of the features that compliance requires.

It offers features such as shredding that I have not seen in other object stores. HCP also supports deduplication natively, and it is the only mainstream object store that does.

Although HCP can be deployed as pure software, an appliance deployment is generally recommended. An appliance has two kinds of nodes, G nodes and S nodes. Each node type scales to a maximum of 80 nodes.

The G node is the access node (single controller, with RAID 6 protection on the node's local disks); at least four are required. The S node is a high-density storage node (dual controller, connected to the G10 over Ethernet) that uses erasure coding within the node (size = 64 MB). S nodes are optional.

The G node also supports FC SAN, NFS, and public cloud as back-end storage, so, like MinIO, it behaves more like an object gateway.

Metadata and indexes are stored on the G10, usually with 2 copies, to support high availability of the G nodes.

G10 local storage or shared FC storage is used directly for data that needs high performance, and S nodes are generally used for data with lower performance requirements.

When data is stored on shared SAN storage or on S nodes, you can keep a single copy of the data: a G node failure is handled by HA failover through multipathing, and the S node itself is fully redundant, so there is no single point of failure.

HCP also supports virtual machine deployment (up to 40 nodes) and public cloud deployment (no node limit). On-prem deployments natively support SMB and NFS, and data written through any protocol can be read through the others.

Hitachi Vantara has recently released an all-flash node, the G10 All Flash. S nodes are dual-controller and fully redundant. All network interfaces are 10 Gigabit. S nodes can also be expanded to a maximum of 80.

Let's take a look at its software architecture.

HCP's functions are mostly implemented in the access node, including advanced features such as multi-protocol access, attribute-based (metadata) search, multi-tenancy, multi-site, cross-site EC, deduplication, and WORM. For search, HCP uses the older open source Solr/Lucene stack, and performance is poor when objects are searched while they are being written. Many object storage vendors now adopt Elasticsearch for better performance.

HCP advertises these as its key values, but I think many other products have them as well.

HCP supports 800 million to 1.25 billion index entries per node, the whole system supports a maximum of 100 billion objects, and the maximum size of a single object is 5 TB. Judging from the specifications, it is still quite powerful. The latest specifications for HCP products are at the following link:
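As a quick sanity check on these numbers, the short Python sketch below multiplies the per-node index range by the 80-node maximum quoted earlier. It reads the per-node range as 800 million to 1.25 billion entries, and it assumes (this is not stated in the source) that the system-wide limit is simply the per-node figure times the node count.

```python
# Figures taken from the article; the multiplication model itself is an assumption.
MAX_NODES = 80
INDEX_PER_NODE_LOW = 800_000_000      # 800 million entries per node
INDEX_PER_NODE_HIGH = 1_250_000_000   # 1.25 billion entries per node


def system_index_capacity(nodes: int, per_node: int) -> int:
    """Total index entries if every node contributes its full per-node capacity."""
    return nodes * per_node


low_total = system_index_capacity(MAX_NODES, INDEX_PER_NODE_LOW)
high_total = system_index_capacity(MAX_NODES, INDEX_PER_NODE_HIGH)

print(f"{low_total:,} to {high_total:,} objects")
```

Under that assumption, the upper end works out to 100 billion, which is consistent with the system-wide object limit quoted above.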

https://knowledge.hitachivantara.com/Documents/Storage/Content_Platform/8.3.x/Release_Notes/01_Content_Platform_v8.3.0_Release_Notes_-_Customer#Specifications

Since HCP comes from CAS, it natively supports file protocols and POSIX semantics.

These are HCP's background services. The feature set is very rich, especially deduplication, which is hardly seen in other object stores.

HCP's multi-tenancy is very mature, and Gartner considers it the best in the industry.

HCP's multi-site support differs from EMC ECS in that replication of the object data is asynchronous. However, it can replicate metadata first, so the window of inconsistency is relatively short. You can also choose to replicate metadata only, so that the data itself never leaves the country, which helps with compliance.
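The metadata-first idea can be pictured as a priority queue in which metadata updates always jump ahead of bulk data transfers. The Python sketch below is only an illustration of that ordering; the class and constant names are hypothetical, and the source does not describe how HCP actually schedules replication.

```python
import heapq
import itertools

# Lower value is replicated first (hypothetical priority classes).
METADATA, DATA = 0, 1


class ReplicationQueue:
    """Toy asynchronous replication queue that always sends metadata before data."""

    def __init__(self):
        self._heap = []
        # The counter breaks ties so items of the same class keep FIFO order.
        self._counter = itertools.count()

    def enqueue(self, kind: int, object_name: str) -> None:
        heapq.heappush(self._heap, (kind, next(self._counter), object_name))

    def drain(self):
        """Yield (kind_label, object_name) pairs in replication order."""
        while self._heap:
            kind, _, object_name = heapq.heappop(self._heap)
            yield ("metadata" if kind == METADATA else "data", object_name)


queue = ReplicationQueue()
queue.enqueue(DATA, "scan-001.tiff")
queue.enqueue(METADATA, "scan-001.tiff")
queue.enqueue(DATA, "scan-002.tiff")
queue.enqueue(METADATA, "scan-002.tiff")
order = list(queue.drain())
```

With this ordering the remote site learns that an object exists, and what its attributes are, before the payload arrives. A metadata-only policy would simply never enqueue the `DATA` items.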

HCP's cross-site EC (GEO-EC) supports up to 6 sites, but only one site failure is allowed.
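A scheme that tolerates exactly one site failure can be illustrated with single XOR parity: split the object into one chunk per data site and store the XOR of those chunks at one more site. The Python sketch below shows that technique under stated assumptions; the source does not say which code HCP's Geo-EC uses, and the function names and the zero-padding approach are made up for illustration. Only the six-site limit comes from the article.

```python
from functools import reduce

MAX_SITES = 6  # upper limit quoted in the article


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, sites: int) -> list:
    """Split data into sites-1 equal chunks and append one XOR parity chunk."""
    if not 2 <= sites <= MAX_SITES:
        raise ValueError("sites must be between 2 and MAX_SITES")
    k = sites - 1
    chunk_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(chunk_len * k, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]
    return chunks + [reduce(xor_bytes, chunks)]


def recover(chunks: list, lost_index: int) -> bytes:
    """Rebuild the chunk of one failed site by XOR-ing all surviving chunks."""
    survivors = [c for i, c in enumerate(chunks) if i != lost_index]
    return reduce(xor_bytes, survivors)


def decode(chunks: list, original_len: int) -> bytes:
    """Reassemble the object from the data chunks, dropping the padding."""
    return b"".join(chunks[:-1])[:original_len]


def usable_fraction(sites: int) -> float:
    """Share of raw capacity that holds data rather than parity."""
    return (sites - 1) / sites
```

In this sketch, spreading an object over more sites improves space efficiency (five sixths usable at six sites versus two thirds at three), but a second simultaneous site failure is unrecoverable, which matches the single-failure limit described above.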

HCP's Geo-EC supports three ways of implementing delayed erasure coding, and users can choose among them to suit different scenarios.

HCP uses a two-layer EC scheme. Rebuilding a failed hard disk therefore does not need to cross the network, but the space utilization is not as good as IBM's.

IDC rates HCP more highly, while Gartner's assessment is only average.

From the evaluation of Gartner in the 2019 report "Critical Capabilities for Object Storage", we can see several characteristics of HCP:

1. A good ecosystem, with 100 ISVs and 200 certified applications

2. Difficult to deploy and upgrade

3. Slow product updates and a lack of in-depth monitoring tools

4. No unified multi-site management (unified monitoring is provided), and the management interface is difficult to use

In terms of scores, Gartner gave HCP the lowest score in the industry for manageability. If you have used HCP, tell us: is it really that hard to use?

Its use-case scores are decent, but they are still well behind IBM's.

Generally speaking, although HCP's architecture is relatively old, its long history means it is still rich in functions. The more distinctive features are:

Deduplication

Rebuilding a failed hard disk within the node

Multi-tenancy

Cross-site EC

Asynchronous bucket replication

Tiering, including to the cloud, etc.

However, compared with other modern object storage architectures, there are also some shortcomings, such as:

Inflexible data protection settings for replicas and EC

Management and expansion are complicated.

No synchronous replication, so a stretched cluster with RPO=0 cannot be supported

No small-file merging and no load balancing

No dedicated high-performance HDFS client, etc.

That concludes this analysis of Hitachi Content Platform. If you have similar questions, the analysis above may serve as a reference.
