2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article gives an overview of the main big data technologies, grouped by the role they play: data acquisition and transmission, data processing, data storage, and data application.
Main technologies of data acquisition and transmission
These technologies fall into two categories: offline batch transfer, and real-time data acquisition and transmission.
The best-known offline batch tool is Sqoop; the most commonly used real-time tools are Flume and Kafka.
Sqoop: an open source offline data transfer tool, mainly used to move data between Hadoop (Hive) and traditional relational databases (MySQL, Oracle).
Flume: a real-time log collection platform; a highly available, highly reliable, distributed system for collecting, aggregating, and transporting massive volumes of log data.
Kafka: Flume usually collects data at a different rate than downstream systems can process it, so real-time platform architectures place message middleware in between as a buffer. Kafka is one of the most widely used distributed messaging systems for this purpose, valued for its horizontal scalability and high throughput. It is built on a publish-subscribe model. Similar message middleware products include RabbitMQ, ActiveMQ, and ZeroMQ.
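The buffering role described for Kafka can be sketched with nothing but the Python standard library. This is an illustration of the idea only, not Kafka's API: the names `run_pipeline`, `collector`, and `processor` are hypothetical, and a bounded in-memory `queue.Queue` stands in for the message middleware. One notable difference is that this queue blocks the producer when it is full, whereas Kafka persists messages to disk so that consumers can catch up later.

```python
import queue
import threading
import time


def run_pipeline(n_events=20, buffer_size=5):
    """Push n_events through a bounded buffer; return them in consumed order."""
    # Stand-in for the message middleware (an assumption for illustration).
    buffer = queue.Queue(maxsize=buffer_size)
    consumed = []
    sentinel = object()  # marks the end of the stream

    def collector():
        # Fast upstream side, playing the role of a log collector.
        for i in range(n_events):
            buffer.put(f"log-{i}")  # blocks while the buffer is full
        buffer.put(sentinel)

    def processor():
        # Slower downstream side, playing the role of a processing job.
        while True:
            event = buffer.get()
            if event is sentinel:
                break
            time.sleep(0.001)  # simulate processing cost
            consumed.append(event)

    threads = [
        threading.Thread(target=collector),
        threading.Thread(target=processor),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed


if __name__ == "__main__":
    events = run_pipeline()
    print(len(events), events[0], events[-1])
```

The collector and the processor never call each other directly; they only share the buffer, which is what lets the two sides run at different speeds.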
Main technologies of data processing
MapReduce: abstracts complex parallel computations running on large clusters into two functions: map and reduce.
Hive: a SQL abstraction layer built on top of the Hadoop architecture.
Spark: a scalable, in-memory computing engine that can read and write data in any format supported by Hadoop.
Storm: a real-time data processing framework that is low-latency, distributed, scalable, and highly fault-tolerant, and that can guarantee messages are not lost.
Flink: an open source computing platform for distributed real-time stream processing and batch processing; it supports both streaming and batch applications on the same Flink runtime.
Beam: a unified programming model whose pipelines run on execution engines such as Flink and Spark; it aims to unify not only batch and stream processing, but also the paradigm and standard for big data processing as a whole.
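To make the MapReduce entry above concrete, here is a minimal single-process sketch of the map and reduce functions applied to a word count. It assumes nothing about Hadoop's actual API: the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and the grouping step that a real cluster performs across machines is simulated with a dictionary.

```python
from collections import defaultdict
from functools import reduce


def map_phase(line):
    """Map: turn one input line into (word, 1) pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key; on a cluster the framework does this between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big cluster"]))
```

Because each call to `map_phase` depends only on its own line, and each call to `reduce_phase` only on its own key, both phases can be spread across many machines, which is the point of the abstraction.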
Main technologies of data storage
HDFS: the Hadoop Distributed File System.
HBase: a distributed, column-family-oriented storage system built on HDFS. For scenarios such as real-time reads and writes and random access to very large data sets, HBase is currently a mainstream technology choice.
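The column-family model mentioned for HBase can be pictured as a map from row key, to column family, to column qualifier, to value, with rows kept in row-key order. The toy class below sketches only that shape in memory. It is not the HBase client API: `ColumnFamilyTable` and its methods are hypothetical names, and features such as cell versions and timestamps are left out.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy in-memory table: row key -> column family -> qualifier -> value."""

    def __init__(self, families):
        # Column families are fixed when the table is created.
        self.families = set(families)
        self._rows = defaultdict(lambda: {f: {} for f in self.families})

    def put(self, row_key, family, qualifier, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # Qualifiers inside a family can be added freely, row by row.
        self._rows[row_key][family][qualifier] = value

    def get(self, row_key, family=None):
        """Random access by row key, optionally limited to one family."""
        if row_key not in self._rows:
            return {}
        row = self._rows[row_key]
        if family is not None:
            return dict(row[family])
        return {f: dict(cols) for f, cols in row.items()}

    def scan(self, start, stop):
        """Yield rows with start <= key < stop, in row-key order."""
        for key in sorted(self._rows):
            if start <= key < stop:
                yield key, self.get(key)


if __name__ == "__main__":
    table = ColumnFamilyTable(["info", "stats"])
    table.put("user#002", "info", "name", "Li")
    table.put("user#001", "info", "name", "Wang")
    table.put("user#001", "stats", "visits", 3)
    print(table.get("user#001"))
    print([key for key, _ in table.scan("user#000", "user#002")])
```

Lookups go straight to a row key, and range scans walk keys in sorted order, which is why row-key design matters so much for access patterns in this kind of store.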
Data application technology
Drill: a distributed query engine for low-latency queries over big data. Drill exposes an ANSI SQL-compatible interface and can query local files, HDFS, Hive, HBase, and MongoDB as storage back ends. It supports file formats such as Parquet, CSV, TSV, and JSON, including schemaless data, with the aim of querying them as quickly as tables in a traditional database.
R: a programming language for data analysis.
TensorFlow: a computation framework based on data flow graphs. Nodes in a TensorFlow graph represent operations on data, and edges represent the data passed between those operations.
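The data flow graph idea behind TensorFlow can be sketched in a few lines of plain Python. This is a toy model, not TensorFlow's API: the `Node` class and the helpers `constant`, `add`, `multiply`, and `evaluate` are hypothetical names chosen for illustration. Each node holds an operation, and its `inputs` are the edges along which data arrives.

```python
class Node:
    """A graph node: an operation plus the upstream nodes feeding it."""

    def __init__(self, op, *inputs):
        self.op = op          # the data operation this node performs
        self.inputs = inputs  # edges: nodes whose outputs flow into this one


def constant(value):
    # A source node with no inputs; its operation just returns the value.
    return Node(lambda: value)


def add(a, b):
    return Node(lambda x, y: x + y, a, b)


def multiply(a, b):
    return Node(lambda x, y: x * y, a, b)


def evaluate(node, cache=None):
    """Compute a node's value by first evaluating everything upstream of it."""
    cache = {} if cache is None else cache
    if id(node) not in cache:
        args = [evaluate(inp, cache) for inp in node.inputs]
        cache[id(node)] = node.op(*args)  # each node runs at most once
    return cache[id(node)]


if __name__ == "__main__":
    x = constant(3.0)
    y = constant(4.0)
    z = add(multiply(x, x), y)  # builds the graph for x * x + y
    print(evaluate(z))
```

Building `z` performs no arithmetic; it only wires nodes together. The computation happens when `evaluate` walks the graph, and that separation of graph description from execution is what allows a framework to optimize or distribute the work.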
That concludes this overview of the main big data technologies. If you have similar questions about how these tools fit together, the breakdown above is a reasonable starting point.
© 2024 shulou.com SLNews company. All rights reserved.