2025-02-24 Update From: SLTechnology News&Howtos
This article introduces the basic knowledge of Apache Flink. The concepts covered here come up frequently in real-world work, so read carefully and you should come away with something useful!
Apache Flink is an open-source stream processing framework developed by the Apache Software Foundation. Its core is a distributed streaming dataflow engine written in Java and Scala. Flink executes arbitrary dataflow programs in a data-parallel and pipelined manner, and its pipelined runtime can execute both batch and stream processing programs. In addition, the Flink runtime natively supports the execution of iterative algorithms.
Today, Apache Flink powers business-critical applications at many companies and enterprises around the world. Numerous first-tier Internet and IT companies at home and abroad have adopted Flink for big data analytics and real-time data processing, including Alibaba, Huawei, China Telecom, China Mobile, Kuaishou, Xiaomi, OPPO, and Vipshop, which makes Flink an essential framework to learn for real-time analysis and processing of big data.
Official introduction
Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink runs in all common cluster environments and performs computations at in-memory speed and at any scale.
Dealing with unbounded and bounded data
Any kind of data is produced as a stream of events: credit card transactions, sensor measurements, machine logs, and records of user interactions on websites or mobile applications all form streams.
Data can be treated as unbounded or bounded streams.
1. An unbounded stream has a defined start but no defined end. It produces data endlessly, so unbounded data must be processed continuously: an event needs to be handled promptly after it is ingested, because we cannot wait for all the data to arrive; the input is infinite and will never be complete at any point in time. Processing unbounded data usually requires that events be ingested in a specific order, such as the order in which they occurred, so that the completeness of the results can be reasoned about.
2. A bounded stream has a defined start and a defined end. A bounded stream can be processed after all of its data has been ingested. All data in a bounded stream can be sorted, so ordered ingestion is not required. Processing of bounded streams is usually referred to as batch processing.
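As a rough illustration of the difference described above, here is a minimal sketch in plain Python (not the Flink API): a bounded computation can produce one final answer, while an unbounded computation must continuously emit updated results, since the input never ends.

```python
import itertools

def bounded_sum(records):
    """Bounded stream: all data is available, so a final result can be computed."""
    return sum(records)

def unbounded_running_sums(records):
    """Unbounded stream: results must be emitted continuously as events arrive."""
    total = 0
    for r in records:
        total += r
        yield total  # emit an updated result per event; there is no "final" answer

# Bounded: a finite dataset, processed to completion.
print(bounded_sum([1, 2, 3, 4]))  # 10

# Unbounded: an endless source; we can only ever observe a prefix of the output.
endless = itertools.count(1)  # 1, 2, 3, ... forever
print(list(itertools.islice(unbounded_running_sums(endless), 4)))  # [1, 3, 6, 10]
```

Note that the unbounded function never terminates on its own; the consumer decides how much of the result stream to observe, which mirrors why unbounded processing must be continuous.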
Apache Flink excels at processing both unbounded and bounded data sets. Precise control of time and state allows the Flink runtime to run any kind of application on unbounded streams. Bounded streams are processed internally by algorithms and data structures specifically designed for fixed-size data sets, yielding excellent performance.
Deploy applications anywhere
Apache Flink is a distributed system and requires computing resources in order to execute applications. Flink integrates with all common cluster resource managers, such as Hadoop YARN, Apache Mesos, and Kubernetes, but it can also run as a standalone cluster.
Flink is designed to work well with each of these resource managers, which it achieves through resource-manager-specific deployment modes. Flink can interact with each resource manager in the manner appropriate to it.
When you deploy a Flink application, Flink automatically identifies the required resources based on the application's configured parallelism and requests them from the resource manager. If a failure occurs, Flink replaces the failed container by requesting new resources. All communication to submit or control an application happens through REST calls, which simplifies Flink's integration into a wide variety of environments.
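To make "resources based on the configured parallelism" concrete: with Flink's default slot sharing, one parallel slice of the whole pipeline fits into one task slot, so a job needs as many slots as its highest operator parallelism. The helper below is a hypothetical illustration of that rule, not part of any Flink API.

```python
def required_slots(operator_parallelism):
    """With default slot sharing, one parallel pipeline of the job occupies one
    task slot, so the job needs as many slots as its maximum operator parallelism."""
    return max(operator_parallelism.values())

# Hypothetical job: a source with parallelism 2, a map with 4, a sink with 1.
job = {"source": 2, "map": 4, "sink": 1}
print(required_slots(job))  # 4 slots would be requested from the resource manager
```

Operators placed in separate slot sharing groups change this arithmetic, but the default behavior is what most deployments see.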
Run applications of any size
Flink is designed to run stateful streaming applications at any scale. Applications are parallelized into possibly thousands of tasks that are distributed and executed concurrently across a cluster, so an application can make use of virtually unlimited amounts of CPU, memory, disk, and network I/O. Flink also makes it easy to maintain very large application state: its asynchronous and incremental checkpointing algorithm minimizes the impact on processing latency while guaranteeing exactly-once state consistency.
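The parallelization described above relies on routing events to parallel task instances by key, so that all events for the same key land on the same task (this is what Flink's keyBy does). A minimal sketch in plain Python, assuming simple hash partitioning:

```python
import hashlib

def partition(key, parallelism):
    """Route a record to one of `parallelism` parallel task instances by key.
    A stable hash guarantees that the same key always maps to the same task."""
    digest = int(hashlib.md5(str(key).encode()).hexdigest(), 16)
    return digest % parallelism

# Events with the same key ("user-a") are always routed to the same task,
# which is what allows each task to keep per-key state locally.
events = [("user-a", 1), ("user-b", 2), ("user-a", 3)]
for key, value in events:
    print(key, "-> task", partition(key, 4))
```

Flink's actual key-group assignment is more elaborate (it must support rescaling), but the core idea of deterministic key-to-task routing is the same.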
Flink users have reported some impressive scalability numbers from their production environments:
1. Applications processing multiple trillions of events per day.
2. Applications maintaining several terabytes of state.
3. Applications running on thousands of cores.
Take advantage of in-memory performance
Stateful Flink applications are optimized for local state access. Task state is always kept in memory, or, if the state size exceeds the available memory, in disk-based data structures that support efficient access. Tasks therefore perform all computations by accessing local, usually in-memory, state, which yields very low processing latency. Flink guarantees exactly-once state consistency in failure scenarios by periodically and asynchronously persisting the local state to durable storage.
This concludes "What are the basics of Flink". Thank you for reading. If you want to learn more about the industry, follow this site, where the editor will keep publishing high-quality practical articles for you!