
Why has Apache Flink become the new-generation big data computing engine?

2025-01-16 Update From: SLTechnology News&Howtos


Shulou (Shulou.com) 06/01 report --

As is well known, Apache Flink (hereinafter Flink) originated in Europe and was donated to the Apache Foundation by its founding team in 2014. Like other young projects, it is fresh, it is open source, and it suits the speed and flexibility that a fast-changing world values. The big data era poses new challenges to our ability to manage data, and the birth of Flink gives enterprise users unprecedented room and potential to obtain faster and more accurate computing power. As a widely recognized new-generation big data computing engine, what makes Flink the first choice of well-known companies at home and abroad, such as Alibaba, Tencent, Didi, Meituan, ByteDance, Netflix, and Lyft, for building their stream computing platforms?

Listen to what the Flink core contributors have to say! From November 28 to 30, at Flink Forward Asia 2019, Apache Flink core contributors and senior industry experts will unlock the unique technical appeal of Flink from every angle.

Surprise session spoiler: Ask Me Anything. The conference will invite Apache Flink core contributors to an on-site Ask Me Anything session, where questions about Flink SQL, the Runtime, Hive integration, and other Flink topics can be asked on the spot. Stephan, the father of Flink, may also join the live interaction; if you have ever wondered why Flink's logo is a squirrel, you can raise your hand and ask him face to face.

"Using Apache Flink as an integrated data processing platform" -- Cui Xingcan, Apache Flink Committer, postdoctoral researcher at York University. Apache Flink has been widely adopted in many real-time job scenarios. After several iterations in recent versions, we find that it has the potential to become an integrated data processing platform: one that can handle both dynamic and static data, perform both distributed and centralized computation, and support both operational and interactive tasks. In this presentation we show some exploratory attempts to use Apache Flink as the integrated back-end of a general data processing pipeline. Specifically, we will first introduce this general pipeline and briefly describe the characteristics of each stage. We will then explain in detail how to "shape" Flink, without touching its core, to meet a variety of data processing needs, including a partial explanation of Flink's operating mechanisms. Finally, with the goal of building Flink into a truly integrated data processing platform, we will outline future work.

"Bring Cross-DC and Cross-Data-Source SQL Engine to Apache Flink" -- Zhang Shaoquan, Tencent senior engineer. SuperSQL ("drift computing") is a high-performance big data SQL engine developed by Tencent Big Data that spans data centers, clusters, and data sources, serving federated analysis and real-time queries over different types of data sources located in different data centers and clusters. It tackles the data-silo problem in big data, lowers the barriers to using data, improves the efficiency of data use, and maximizes data value. This talk will cover the details of the SuperSQL project, including its background and positioning, the main technical challenges, the overall architecture, technical details, performance, and future planning.

"New Flink Source API: Make it easy" -- Qin Jiangjie, Apache Flink PMC, Apache Kafka PMC, Alibaba senior technical expert. Flink already has a rich connector ecosystem. However, to build a production-ready connector for Flink one still has to deal with a series of problems, including coordination among concurrent instances, consistency semantics, the threading model, and fault tolerance, and a Source is more complex than a Sink. To make it easier for users to implement high-quality connectors, the Flink community has introduced a new Source API in FLIP-27 that takes care of these complexities so that users can quickly write a high-quality connector. This presentation will introduce the design ideas behind the new Flink Source API and show how to use the new Source connector API to quickly build a production-ready Flink source connector; a minimal usage sketch follows after these talk summaries.

"In-depth exploration of the Flink SQL unified stream-batch query engine and best practices" -- Wu Li, Apache Flink Committer, Alibaba technical expert, and Li Jinsong, Apache Beam Committer, Alibaba technical expert. Flink SQL, as a core module of Apache Flink, has attracted more and more attention from users and, with its easy-to-use API and high-performance SQL engine, plays an increasingly important role in production. Starting from the newly released Flink SQL, this presentation will share the technical details of Flink SQL's core features and tuning experience from both the streaming and the batch perspective, giving the audience a deeper understanding of Flink SQL and of how to tune Flink SQL jobs.
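To make the FLIP-27 discussion above concrete, here is a minimal, illustrative sketch, not taken from the talk, of how a job consumes data through a connector built on the new Source interface. It uses the KafkaSource that later Flink releases ship on top of FLIP-27; the broker address, topic name, and group id are assumptions made for the example.

```java
// Minimal sketch (not from the talk): consuming a Kafka topic through a
// connector built on the FLIP-27 Source interface, as shipped in later
// Flink releases. Broker, topic, and group id are assumed values.
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class NewSourceApiExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // The builder wires offset initialization and deserialization; the
        // Source implementation itself handles the threading model,
        // checkpointing, and fault tolerance mentioned in the talk.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")   // assumed broker
                .setTopics("events")                     // assumed topic
                .setGroupId("ffa-demo")                  // assumed group id
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        // fromSource() is the entry point for FLIP-27 style sources.
        DataStream<String> events =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source");

        events.print();
        env.execute("New Source API sketch");
    }
}
```

The point of the new API is that split discovery, offset checkpointing, and the reader threading model live inside the Source implementation, which is why the job code above stays this short.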

The conference organizing committee has also carefully prepared training courses for developers who use Flink and want to study it in depth. Flink experts from Alibaba and Ververica will lead developers through a day and a half of intensive training. Apache Flink PMC members head the team, and senior technical experts from Alibaba and from Flink's founding team serve as lecturers, forming a comprehensive curriculum for the developer training courses. The courses meet different learning needs: whether you are at entry level or advanced, you can choose content that matches your background and steadily build up your technical and practical skills. The main course outline is as follows.

Intermediate 1: Apache Flink developer training. This course is a hands-on introduction to Apache Flink for Java and Scala developers who want to learn to build streaming applications. The training focuses on core concepts such as distributed data streams, event time, and state, and the exercises give you the opportunity to see how these concepts are expressed in the API and how to combine them to solve practical problems. Topics: fundamentals of stream processing and Apache Flink; the DataStream API and preparing for Flink development (with exercises); stateful stream processing (with exercises); time, timers, and ProcessFunction (with exercises; see the sketch after this course outline); connecting multiple streams (with exercises); testing (with exercises). Prerequisites: no knowledge of Apache Flink is required.

Intermediate 2: Apache Flink operations training. This course is a hands-on introduction to deploying and operating Apache Flink applications. The target audience includes developers and operators responsible for deploying Flink applications and maintaining Flink clusters. The demos focus on the core concepts involved in running Flink and on the main tools for deploying, upgrading, and monitoring Flink applications. Topics: introduction to stream processing and Apache Flink; Flink's distributed architecture in the data center; containerized deployment (hands-on); state backends and fault tolerance (hands-on); upgrades and state migration (hands-on); metrics (hands-on); capacity planning. Prerequisites: no prior knowledge of Apache Flink is required.

Intermediate 3: SQL developer training. Apache Flink supports SQL as a unified API for stream and batch processing. SQL can be used in a wide variety of scenarios and is easier to build and maintain than jobs written against the lower-level APIs. In this training you will learn how to realize the full potential of writing Apache Flink jobs in SQL. We will examine different streaming SQL cases, including joining streams, dimension-table joins, window aggregation, maintaining materialized views, and pattern matching with the MATCH_RECOGNIZE clause (a new standard introduced in SQL:2016). Topics: introduction to SQL on Flink; querying dynamic tables with SQL; joining dynamic tables; pattern matching with MATCH_RECOGNIZE; ecosystem and writing to external tables. Prerequisites: no prior knowledge of Apache Flink is needed, but basic SQL knowledge is required.
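As a taste of the "time, timers, and ProcessFunction" topic in the developer training, here is a minimal, illustrative sketch, not part of the course material, of a KeyedProcessFunction that raises an alert when a key has seen no events for 60 seconds of event time. The class name, field names, and the 60-second timeout are assumptions chosen for the example.

```java
// Minimal sketch (not course material): a KeyedProcessFunction that
// registers an event-time timer per key, illustrating the "time, timers
// and ProcessFunction" topic. Names and the 60s timeout are assumed.
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class InactivityAlert extends KeyedProcessFunction<String, Long, String> {
    private transient ValueState<Long> lastSeen;

    @Override
    public void open(Configuration parameters) {
        // Keyed state: the latest event timestamp seen for the current key.
        lastSeen = getRuntimeContext().getState(
                new ValueStateDescriptor<>("lastSeen", Long.class));
    }

    @Override
    public void processElement(Long timestamp, Context ctx, Collector<String> out) throws Exception {
        // Remember the latest event time for this key and arm a timer
        // that fires 60 seconds of event time later.
        lastSeen.update(timestamp);
        ctx.timerService().registerEventTimeTimer(timestamp + 60_000L);
    }

    @Override
    public void onTimer(long timerTimestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
        // Only alert if no newer event arrived after this timer was set.
        Long last = lastSeen.value();
        if (last != null && timerTimestamp == last + 60_000L) {
            out.collect("key " + ctx.getCurrentKey() + " inactive since " + last);
        }
    }
}
```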
Advanced: Apache Flink tuning and troubleshooting. Over the past few years, while working with many Flink users, we have learned a great deal about the challenges of moving streaming jobs from the early proof-of-concept phase into everyday production. In this training we focus on these challenges and help you eliminate them. We provide a useful set of troubleshooting tools and introduce best practices and techniques for monitoring, watermarks, serialization, state backends, and more. In the breaks between hands-on sessions, participants will have the opportunity to apply the newly learned material to diagnose several misbehaving Flink jobs, and we will summarize the common reasons why a job makes no progress, why its throughput falls short of expectations, or why it is delayed. Topics: time and watermarks; state handling and state backends; Flink's fault-tolerance mechanism; checkpoints and savepoints; the DataStream API and ProcessFunction. A minimal configuration sketch covering these topics follows at the end of this section.

The training courses are taught in small, high-quality classes with limited seats, and registration closes once the classes are full, so anyone with training needs should book as soon as possible. To attend the training, purchase a VIP package: VIP package 1 is for the intermediate training and VIP package 2 for the advanced training. VIP package 1 grants access to all intermediate courses, while VIP package 2 grants access to all courses, including both the advanced and the intermediate training.

If you are also curious about Flink's main directions of exploration in the future, about how Flink pushes big data and computing to the limit, and about Flink's new scenarios, new plans, and best practices, come to the conference! This group of front-line technical experts will surely refresh your understanding of Apache Flink.
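For the checkpoint, savepoint, state backend, and watermark topics mentioned in the advanced course, the following is a minimal, illustrative configuration sketch, not training material. The intervals, the checkpoint directory, and the choice of the RocksDB state backend are assumptions made for the example, and the APIs shown are those of recent Flink releases.

```java
// Minimal sketch (not training material): checkpointing, state backend,
// and watermark configuration. Intervals, paths, and the RocksDB backend
// choice are assumed values for illustration.
import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointConfigSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Take an exactly-once checkpoint every 60 seconds; checkpoints are
        // what Flink restores from after a failure, while savepoints are the
        // manually triggered variant used for upgrades and state migration.
        env.enableCheckpointing(60_000L, CheckpointingMode.EXACTLY_ONCE);
        env.getCheckpointConfig().setMinPauseBetweenCheckpoints(30_000L);
        env.getCheckpointConfig().setCheckpointStorage("file:///tmp/flink-checkpoints");

        // Keep operator state in RocksDB so that large state spills to disk.
        env.setStateBackend(new EmbeddedRocksDBStateBackend());

        // A watermark strategy that tolerates events arriving up to 5 seconds
        // out of order; it would be passed to env.fromSource(...) when a
        // source is attached.
        WatermarkStrategy<Long> watermarks = WatermarkStrategy
                .<Long>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                .withTimestampAssigner((event, ts) -> event);

        // ... build the job topology here, then:
        // env.execute("checkpointing sketch");
    }
}
```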

This article is original content of the Yunqi community and may not be reproduced without permission.
