Three major technical directions in the big data field

2025-01-14 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

There are three major technical directions in the big data field:

1. Hadoop big data development

2. Data mining, data analysis, and machine learning

3. Big data operations and maintenance, and cloud computing

What do you need to learn for big data?

Python: Python's ranking has risen steadily since last year on the back of artificial intelligence, and it now sits at the top of language popularity rankings.

Its syntax is simple and clear, and the low-level details are well encapsulated. It is a high-level language that is easy to use.

In big data and data science, nearly all cluster software supports Python, and Python has a rich set of data science libraries, so Python is a must-learn.

Linux: knowing Linux helps you understand the runtime and network configuration of big data software such as Hadoop, Hive, HBase, and Spark. Learning the shell lets you read scripts, which makes it easier to understand and configure a big data cluster.

Hadoop: Hadoop includes several components: HDFS, MapReduce, and YARN. HDFS is where data is stored, much like the hard disk of a computer; files live there. MapReduce processes and computes over the data. YARN is the component that makes Hadoop a platform: with it, other software in the big data ecosystem can run on Hadoop, which makes better use of HDFS's large storage capacity and saves resources. For example, there is no need to build a separate Spark cluster; Spark can run directly on the existing Hadoop YARN.
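To make the MapReduce step above concrete, here is a minimal single-process sketch of the map, shuffle, and reduce phases applied to word counting. It is plain Python, not Hadoop code; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are chosen for illustration only, and a real job would distribute these phases across the cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data on hadoop", "hadoop stores big files"]))
```

The point of the split is that the map and reduce functions touch only one line or one key at a time, which is what allows a framework to run many of them in parallel.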

ZooKeeper: ZooKeeper is a highly available, high-performance, consistent open source coordination service designed for distributed applications. Its basic service is distributed locking. Because ZooKeeper is open source, developers have found other uses for it on top of distributed locks: configuration maintenance, group services, distributed message queues, distributed notification and coordination, and so on.
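The distributed lock idea can be sketched in memory. The class below is a stand-in, not a ZooKeeper client: FakeLockService and its methods are hypothetical names. It imitates the commonly described recipe in which each client receives an increasing sequence number and the client holding the lowest number owns the lock.

```python
import itertools


class FakeLockService:
    """In-memory stand-in for a coordination service (not a ZooKeeper client)."""

    def __init__(self):
        self._counter = itertools.count()
        self._waiting = {}  # sequence number -> client name

    def enqueue(self, client):
        """Register a client and return the sequence number it was given."""
        seq = next(self._counter)
        self._waiting[seq] = client
        return seq

    def holder(self):
        """Return the client with the lowest sequence number, or None."""
        if not self._waiting:
            return None
        return self._waiting[min(self._waiting)]

    def release(self, seq):
        """Remove a client's entry, for example on unlock or session end."""
        self._waiting.pop(seq, None)


if __name__ == "__main__":
    service = FakeLockService()
    first = service.enqueue("worker-a")
    service.enqueue("worker-b")
    print(service.holder())
    service.release(first)
    print(service.holder())
```

In a real deployment the ordering and the cleanup of entries from crashed clients are handled by the coordination service itself; this sketch only shows the ordering rule.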

Sqoop: Sqoop is used to import data from MySQL into Hadoop. You can also skip it, export the MySQL table to a file directly, and then put that file on HDFS. Either way, in a production environment pay attention to the load this places on MySQL.
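The manual alternative (export a table to a file, then upload the file) can be sketched as follows. This is an illustration under stated assumptions: sqlite3 from the standard library stands in for MySQL so the example runs anywhere, and export_table is a hypothetical helper, not part of Sqoop. The resulting file would then be uploaded with a command such as hdfs dfs -put.

```python
import csv
import io
import sqlite3


def export_table(conn, table, out):
    """Write every row of `table` to `out` as tab-separated text.

    Returns the number of rows written. The table name is interpolated
    into the query, so it must come from trusted code, not user input.
    """
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    count = 0
    for row in cursor:  # stream row by row instead of loading everything
        writer.writerow(row)
        count += 1
    return count


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for the MySQL source
    conn.execute("CREATE TABLE orders (id INTEGER, item TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, "book"), (2, "pen")])
    buffer = io.StringIO()
    print(export_table(conn, "orders", buffer))
    print(buffer.getvalue())
```

Streaming rows one at a time keeps memory use flat, but a full-table scan still loads the source database, which is the production concern mentioned above.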

Hive: for anyone who knows SQL syntax, Hive is a powerful tool. It lets you process big data easily without having to write MapReduce programs.

HBase: HBase is the NoSQL database of the Hadoop ecosystem. Its data is stored as key-value pairs, and each key is unique, so it can be used for data deduplication. It can also store far more data than MySQL, so it is often used as the storage destination once big data processing is complete.
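Why unique keys give deduplication can be shown with a plain dictionary standing in for a table. This is not HBase client code, and deduplicate is a hypothetical helper: it only illustrates that writing the same row key twice leaves a single entry, with the later write visible.

```python
def deduplicate(records):
    """Keep one value per row key; a later write replaces an earlier one."""
    table = {}
    for row_key, value in records:
        table[row_key] = value
    return table


if __name__ == "__main__":
    writes = [("user1", "a"), ("user2", "b"), ("user1", "c")]
    print(deduplicate(writes))
```

In practice the choice of row key decides what counts as a duplicate, so it is usually derived from the fields that identify a record.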

Kafka: the overall architecture of Kafka is very simple and explicitly distributed. There can be multiple producers, brokers (Kafka itself), and consumers. Producers and consumers implement Kafka's registration interface. Data is sent from a producer to a broker, and the broker acts as an intermediate cache and distributor, delivering data to the consumers registered with the system. The broker's role is similar to a cache sitting between active data and offline processing systems. Communication between client and server is based on the TCP protocol, which is simple, high-performance, and independent of programming language.
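The producer, broker, and consumer roles can be sketched with an in-memory stand-in. InMemoryBroker below is a hypothetical class, not the Kafka API: it keeps an append-only list per topic and remembers how far each consumer has read, which is enough to show the broker acting as a buffer between writers and readers.

```python
from collections import defaultdict


class InMemoryBroker:
    """Toy broker: one append-only message list per topic."""

    def __init__(self):
        self._log = defaultdict(list)     # topic -> messages in arrival order
        self._offsets = defaultdict(int)  # (consumer, topic) -> next index to read

    def publish(self, topic, message):
        """Producer side: append a message to the topic."""
        self._log[topic].append(message)

    def poll(self, consumer, topic):
        """Consumer side: return the messages this consumer has not yet seen."""
        key = (consumer, topic)
        start = self._offsets[key]
        messages = self._log[topic][start:]
        self._offsets[key] = len(self._log[topic])
        return messages


if __name__ == "__main__":
    broker = InMemoryBroker()
    broker.publish("clicks", "page-1")
    broker.publish("clicks", "page-2")
    print(broker.poll("reporting", "clicks"))
    print(broker.poll("reporting", "clicks"))
```

Because each consumer tracks its own position, a slow offline job and a fast online job can read the same topic independently.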

Spark: Spark makes up for the slow data processing of MapReduce. Its defining feature is that it loads data into memory for computation instead of repeatedly reading from slow hard drives. It is especially well suited to iterative computation, so people who work on algorithms are particularly fond of it. Spark is written in Scala, and it can be used from either Java or Scala because both run on the JVM.

Machine learning (ML): machine learning is an interdisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied across every area of artificial intelligence. It relies mainly on induction and synthesis rather than deduction. The core machine learning algorithms are largely established, so the field is relatively easy to learn.
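As a small illustration of induction from data, the sketch below fits a straight line to example points with ordinary least squares, one of the simplest learning methods. The function name fit_line and the sample numbers are made up for the example; real projects would normally use an established library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
    print(slope, intercept)
```

The rule (slope and intercept) is not written by hand; it is inferred from the examples, which is the sense in which machine learning is inductive.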

Deep learning (DL): the concept of deep learning originates from research on artificial neural networks, and it has developed rapidly in recent years. Examples of deep learning applications include AlphaGo, face recognition, and image detection. Deep learning specialists are scarce at home and abroad, but the subject is relatively difficult and its algorithms change quickly, so it helps to learn from experienced teachers.
