How to make Hadoop run faster

2025-02-24 Update From: SLTechnology News&Howtos

This article explains how to make Hadoop run faster: how Hadoop itself approaches speed, where batch processing falls short, and how stream processing can help.

Hadoop addresses the speed problem in three ways:

1. Use a distributed file system: this spreads the load across machines and lets the system scale out.

2. Optimize write speed: to achieve faster writes, the Hadoop architecture is designed to write records first and process them later.

3. Use batch processing (Map/Reduce) to balance the speed of data transfer against the speed of processing.
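The "write first, process later" pattern can be sketched in a few lines of plain Python. This is only an in-memory illustration and does not use the Hadoop API; the function names `map_phase` and `reduce_phase` and the sample records are assumptions made for the example.

```python
from collections import defaultdict

def map_phase(records):
    """Emit (word, 1) pairs for every word in every stored record."""
    for record in records:
        for word in record.split():
            yield word.lower(), 1

def reduce_phase(pairs):
    """Sum the counts emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Step 1: records are written first, with no processing at write time.
# (Hypothetical sample data.)
stored_records = ["error disk full", "warning disk slow", "error network down"]

# Step 2: a batch job later processes everything that was stored.
word_counts = reduce_phase(map_phase(stored_records))
```

In a real cluster the map and reduce steps would run in parallel on many machines, but the order of events is the same: nothing is computed until the batch job starts.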

Challenges posed by batch processing

The challenge of batch processing is that data has to arrive in intervals for the process to run normally. If the data source delivers data continuously, the system can be overwhelmed and may crash.

If we widen the batch window, each processing run takes longer, so the analysis reports that depend on it reach us later. Many systems batch their data during off-peak hours, but that window is very limited. As the volume of data grows, so does the time needed to process it. If this continues, a backlog of unprocessed data builds up, and in the end a day may no longer be long enough to process a day's data.
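The backlog effect is simple arithmetic, and a small simulation makes it concrete. The sketch below assumes a fixed processing throughput and a fixed off-peak window; the function name `simulate_backlog` and all figures are hypothetical.

```python
def simulate_backlog(daily_volumes_gb, throughput_gb_per_hour, window_hours):
    """Return the unprocessed backlog (in GB) at the end of each day."""
    capacity = throughput_gb_per_hour * window_hours  # GB one batch window can absorb
    backlog = 0.0
    history = []
    for volume in daily_volumes_gb:
        pending = backlog + volume            # yesterday's leftovers plus today's data
        backlog = max(0.0, pending - capacity)
        history.append(backlog)
    return history

# Hypothetical figures: 100 GB/hour during a 6-hour off-peak window
# (600 GB of capacity per day), while daily volume grows from 500 GB to 800 GB.
backlog_by_day = simulate_backlog([500, 600, 700, 800], 100, 6)
# backlog_by_day == [0.0, 0.0, 100.0, 300.0]
```

Once the daily volume passes the capacity of the window, the backlog never shrinks on its own; it grows faster every day the volume keeps rising.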

Increase speed through stream processing

The concept of stream processing is very simple. We do not have to wait until all the data has been recorded before processing it; we can record and process at the same time.
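As a minimal sketch of "record and process at the same time", the generator below updates a running word count as each record arrives, so a result exists after every record instead of only at the end of a batch. The name `stream_counts` and the sample records are assumptions for illustration; no streaming framework is involved.

```python
from collections import Counter

def stream_counts(records):
    """Update word counts as each record arrives and yield a snapshot each time."""
    counts = Counter()
    for record in records:            # records may be an unbounded iterator
        counts.update(record.lower().split())
        yield dict(counts)            # a result is available after every record

# Hypothetical incoming records, consumed one at a time.
incoming = iter(["error disk full", "warning disk slow", "error network down"])
snapshots = list(stream_counts(incoming))
```

Here `snapshots[0]` already holds the counts for the first record, which is the key difference from a batch job that produces nothing until all input has been stored.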

Take a production line as an analogy. We can wait until every component has arrived before we begin to assemble the car, or we can package the components at the factory, send them to a specific production line, and assemble them as soon as they arrive. It is obvious which one is faster.

Data processing is like that production line, and stream processing packages the data and sends it to a specific "production line". In traditional industries, even if the manufacturer pre-assembles the parts, a production line is still needed for final assembly. Similarly, stream processing is not intended to replace Hadoop. It takes a large share of the work off the system, which increases the system's overall processing speed.
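One way stream processing can take work off the batch system is pre-aggregation: raw events are collapsed into per-window summaries before they are stored, so the later batch job reads fewer rows. The sketch below assumes events are `(timestamp_in_seconds, key)` pairs; the name `preaggregate` and the sample events are hypothetical. For brevity it returns all windows at the end, whereas a real streaming layer would flush each window as it closes.

```python
from collections import defaultdict

def preaggregate(events, window_seconds=60):
    """Collapse raw (timestamp, key) events into one count per key per time window."""
    buckets = defaultdict(int)
    for timestamp, key in events:
        window_start = timestamp - (timestamp % window_seconds)
        buckets[(window_start, key)] += 1
    # One summary row per (window, key), sorted for stable output.
    return [(start, key, count) for (start, key), count in sorted(buckets.items())]

# Hypothetical raw events: six records spread over two one-minute windows.
raw_events = [(0, "login"), (5, "login"), (30, "click"),
              (61, "login"), (90, "click"), (119, "click")]
summary_rows = preaggregate(raw_events)
# summary_rows == [(0, "click", 1), (0, "login", 2), (60, "click", 2), (60, "login", 1)]
```

Six raw events become four summary rows here; with real traffic, where many events share a key and a window, the reduction is far larger.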

Curt Monash pointed out in his research that traditional databases will eventually end up in RAM, and that processing streams in memory makes for a better stream processing system. A real-time big data analytics case that uses Twitter data demonstrates this kind of processing.

Faster processing at Google: replacing Map/Reduce with streaming

Because no alternative solutions existed at the time, many big data systems had to use Map/Reduce even where its performance was poor. One example is the use of Map/Reduce to maintain a global search index. Google has since greatly reduced its use of Map/Reduce in index processing and added a real-time processing mode, which cut indexing time to 1% of the original.
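The difference between rebuilding an index in batch and updating it in real time can be illustrated with a toy inverted index. This is not Google's system; it is a minimal sketch with hypothetical names (`build_index`, `add_document`) and sample documents, and it ignores document updates and deletions.

```python
from collections import defaultdict

def build_index(documents):
    """Batch style: rebuild the whole inverted index from every document."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def add_document(index, doc_id, text):
    """Incremental style: touch only the entries for words in the new document."""
    for word in text.lower().split():
        index[word].add(doc_id)

# Hypothetical corpus.
documents = {1: "hadoop batch processing", 2: "stream processing"}
index = build_index(documents)

# A new document arrives: update the index in place instead of rebuilding it.
add_document(index, 3, "hadoop stream")
```

The batch version does work proportional to the whole corpus on every run, while the incremental version does work proportional to the size of the new document, which is why a newly arrived document becomes searchable much sooner.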

On the web, some types of data keep growing. This is why HBase relies on trigger-based processing, and why Twitter expects to handle even larger streams in the future.

© 2024 shulou.com SLNews company. All rights reserved.
