In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-17 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >
Share
Shulou(Shulou.com)06/01 Report--
This article is about how to carry out the application of Spark Streaming framework in 5G, the editor thinks it is very practical, so I share it with you to learn. I hope you can get something after reading this article.
This time I'd like to share some of our views on the current streaming engine and its applicability in 5G and IoT scenarios.
In the run-up to the development of 5G and IoT scenarios, Ericsson studied a variety of scalable and flexible flow processing frameworks to solve data pipelining problems and improve overall performance. We use machine learning stream data for adaptive learning and intelligent decision-making to achieve automation in various fields. It is a great challenge to use machine learning algorithms to learn models and obtain information from stream data step by step.
We will discuss the challenges of AI in streaming data and how to use the streaming framework (mainly the Spark Streaming framework) to solve these problems.
Spark Streaming framework
The following is divided into input, processing (ETL and ML) and output phases. We will also introduce a variety of machine learning and data analysis techniques used in the flow processing framework for efficient control and optimization.
Input phase:
Although there are different input sources (such as files, databases, and various endpoints), what is important at this stage is how to use Apache Kafka efficiently within the Spark Streaming framework. In addition to the default receiver-based approach, there is also a direct technique that solves performance and repetition problems. In our telecom field, the data rate of network detection can reach 1TB/ seconds, and direct solves this problem very well. In addition to performance, we also need a simple way to maintain distribution technology in complex telecommunications systems and meet 99.9999% accuracy, which also puts forward great requirements for failure situations. On the other hand, direct technology can reduce the complexity of dealing with faults and reduce the amount of maintenance of cross-system duplicate data.
Processing phase:
Extract, convert and load (ETL):
In the past, when we practiced flow processing, we usually talked about Bolts running in parallel on executors, and our main task was to determine the deployment topology to achieve uniform distribution and maximum utilization of available resources. Then we began to discuss micro-batches and its better efficiency and fault tolerance compared to pure flow processing. In addition, we often talk about Lambda schemas that combine batch and streaming into a single query. At present, due to the increasing popularity of Spark Streaming framework, the industry has begun to turn to Structured Stream Querying, which even treats wide tables as streaming data and processes them incrementally. Structured Stream Querying allows us to process newly arrived data with a higher priority in response to queries.
In the field of telecommunications, we have a variety of transformations, such as digital mapping, cleanup, null substitution, variable conversion, and so on. Since no micro-batch operations are involved, we use Apache Flink to handle all of these operations in a pure stream. For operations such as missing value substitution, the average of the last N values, and so on (anything that requires historical data), we use Spark Streaming's Structural Querying.
Machine Learning (ML):
In our telecommunications field, we need to create training models and test data in a streaming manner. We try various methods to update the model when new data flows in, and find that the hierarchical model is easier to achieve incremental model update. These hierarchical data models can be easily deployed using the Spark Streaming framework because it internally supports micro-batch processing prepared for these models. We also learned that using the flexibility and pure flow characteristics of Apache Flink, the implementation of reinforcement learning is easy to complete, and compared with other frameworks, the performance indicators of these implementations are very competitive.
Sink phase:
After the data processing layer, we can store the data in various options, such as permanent data storage, distributed memory, returning to the message bus, or just visual data points. In our internal research, we stored the processed data in Cassandra (No-SQL data Store), which pays more attention to availablity than partition tolerance. Given the experience of using Apache Cassandra in communication applications, we find that it can be fine-tuned to meet consistency and availability scenarios. When you can't improve the usability of hbase, you can try to use Cassandra and do so by adjusting consistency.
We also need to store the data in the "best" site. The resource may be created by the executor on site An in Storage A, but the client application always queries it from site B, which will require us to determine where to store the resource on site B to ensure the locality of the data, which we do through the internal optimization of Sink Level.
The above is how to apply the Spark Streaming framework in 5G. The editor believes that there are some knowledge points that we may see or use in our daily work. I hope you can learn more from this article. For more details, please follow the industry information channel.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.