
What are the Flink-related interview questions?

2025-02-25 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

This article introduces common Flink interview questions through practical cases. Many readers run into these topics, so the editor will walk you through how to handle them. I hope you read carefully and come away with something useful!

1. The submission process of Flink Job

A Flink Job submitted by a user is transformed into a DAG of tasks to run, passing through the StreamGraph, JobGraph, and ExecutionGraph stages and involving interaction between the JobManager and the TaskManagers, and between the JobManager and the Client. Flink is built on the Akka toolkit and is message-driven. Submitting a Flink Job also includes creating the ActorSystem, starting the JobManager, and starting and registering the TaskManagers.

2. What are the so-called "three-tier diagrams" in Flink?

The DAG computation graph of a Flink task is generated in roughly the following three stages:

StreamGraph

The StreamGraph is the computing topology closest to the logic expressed in the code: StreamTransformations are added to the StreamExecutionEnvironment in the execution order of the user code to form the stream graph.

JobGraph

Generated from the StreamGraph: nodes that can be chained together are merged, edges between nodes are set, resource-sharing slots are arranged and associated nodes are placed, files required by the task are uploaded, checkpoint configuration is set, and so on. It is equivalent to a partially initialized and optimized task graph.

ExecutionGraph

Transformed from the JobGraph, it contains everything needed for the concrete execution of the task and is the execution graph closest to the underlying implementation.
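The StreamGraph-to-JobGraph step above merges consecutive chainable operators into single vertices. The sketch below models that idea in plain Python; the `StreamNode` class, the chainability test (same parallelism plus a forward edge), and the output format are all simplifications invented for this example, not Flink's actual internals, which check several more conditions.

```python
# Hypothetical sketch (not Flink's actual code): collapsing chainable
# operators when a StreamGraph-like node list becomes a JobGraph.
from dataclasses import dataclass

@dataclass
class StreamNode:
    name: str
    parallelism: int
    # For this sketch, a node is chainable with its predecessor when both
    # have the same parallelism and the connecting edge is a simple
    # forward edge (Flink's real check involves more conditions).
    forward_edge: bool = True

def chain_operators(nodes):
    """Merge consecutive chainable StreamNodes into JobGraph-style vertices."""
    chains = []
    for node in nodes:
        if (chains
                and node.forward_edge
                and chains[-1][-1].parallelism == node.parallelism):
            chains[-1].append(node)       # extend the current chain
        else:
            chains.append([node])         # start a new JobGraph vertex
    return ["->".join(n.name for n in chain) for chain in chains]
```

For instance, a `source -> map` pair with matching parallelism collapses into one vertex, while a keyed aggregation behind a shuffle edge starts a new one.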

3. What role does JobManager play in the cluster?

JobManager is responsible for task scheduling and resource management of the whole Flink cluster. It receives the submitted application from the client, allocates TaskSlot resources to it according to the TaskSlot usage on the TaskManagers in the cluster, and commands the TaskManagers to start the application.

JobManager is equivalent to the Master node of the cluster; there is only one active JobManager at a time, and it is responsible for task management and resource management across the cluster.

JobManager and TaskManager communicate through the Actor System to track the execution of tasks, and the JobManager sends the application's execution status to the client through the Actor System.

At the same time, during task execution, the Flink JobManager triggers Checkpoint operations. After each TaskManager node receives a Checkpoint trigger instruction, it completes its part of the Checkpoint. All Checkpoint coordination is done in the Flink JobManager.

When the task completes, Flink feeds the execution information back to the client and releases the resources in the TaskManagers for the next job submission.

4. What role does JobManager play during cluster startup?

The main responsibilities of JobManager are to receive Flink jobs, schedule Tasks, collect job status, and manage TaskManagers. It contains an Actor that handles the following messages:

RegisterTaskManager: sent by a TaskManager that wants to register with the JobManager. Successful registration is acknowledged with an AcknowledgeRegistration message.

SubmitJob: sent by the Client that submits the job to the system. The submitted information is job description information in the form of JobGraph.

CancelJob: request to cancel the job with the specified id. CancellationSuccess will be returned for success, otherwise CancellationFailure will be returned.

UpdateTaskExecutionState: sent by TaskManager to update the status of the execution node (ExecutionVertex). True is returned if successful, otherwise false is returned.

RequestNextInputSplit: the Task on TaskManager requests the next input split. If it succeeds, it returns NextInputSplit, otherwise it returns null.

JobStatusChanged: it means a change in the status of the job (RUNNING, CANCELING, FINISHED, etc.). This message is sent by ExecutionGraph.
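The message list above can be pictured as a dispatch loop inside the JobManager's Actor. The toy class below sketches that message-driven style in plain Python; the `ToyJobManager` class, its `handle` method, and the `JobSubmitSuccess` reply are invented for this illustration, while the message names and the CancellationSuccess/CancellationFailure replies follow the text.

```python
# Illustrative sketch of a message-driven JobManager loop, modeled on the
# Akka-style messages listed above. Not Flink's actual implementation.
class ToyJobManager:
    def __init__(self):
        self.task_managers = set()   # registered TaskManager ids
        self.jobs = {}               # job_id -> status

    def handle(self, msg, **kw):
        """Dispatch one incoming message and return the reply message."""
        if msg == "RegisterTaskManager":
            self.task_managers.add(kw["tm_id"])
            return "AcknowledgeRegistration"
        if msg == "SubmitJob":
            self.jobs[kw["job_id"]] = "RUNNING"
            return "JobSubmitSuccess"    # reply name assumed for this sketch
        if msg == "CancelJob":
            if kw["job_id"] in self.jobs:
                self.jobs[kw["job_id"]] = "CANCELED"
                return "CancellationSuccess"
            return "CancellationFailure"
        raise ValueError(f"unknown message {msg}")
```

The point of the sketch is the shape: every interaction is a message in, a state change, and a reply message out, which is what "based on Akka and message-driven" means in practice.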

5. What role does TaskManager play in the cluster?

TaskManager is equivalent to the Slave node of the whole cluster, which is responsible for the execution of specific tasks and the resource application and management of the corresponding tasks on each node.

The client compiles and packages the Flink application and submits it to the JobManager; the JobManager then assigns the task to TaskManager nodes with free resources, based on the TaskManager resources registered with it, and starts the task.

TaskManager receives the tasks to be deployed from the JobManager, starts the Tasks using Slot resources, establishes network connections for data access, receives data, and begins processing. Data exchange between TaskManagers is carried out as data streams.

It can be seen that Flink tasks run in a multi-threaded way, which is very different from MapReduce's multi-JVM, multi-process model. Flink can thereby greatly improve CPU efficiency: multiple tasks and subtasks share system resources through TaskSlots, and each TaskManager manages its pool of TaskSlot resources effectively.
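The multi-threaded model above can be sketched in a few lines: several "tasks" run as threads inside one process and share data structures, which is cheap but requires coordination. The function and names below are invented for this illustration; the lock around the shared dict stands in for the shared-structure coordination the text mentions.

```python
# Sketch of the multi-threaded execution model described above: one
# TaskManager-like process runs several tasks as threads that share
# state (here, a common results dict guarded by a lock). Illustrative only.
import threading

def run_tasks_in_one_process(task_inputs):
    """task_inputs: dict of task name -> list of records to process."""
    results = {}
    lock = threading.Lock()

    def task(name, records):
        total = sum(records)            # stand-in for per-task work
        with lock:                      # shared structures need coordination
            results[name] = total

    threads = [threading.Thread(target=task, args=(n, r))
               for n, r in task_inputs.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

A multi-process design would pay serialization and IPC costs for the same exchange; sharing one address space is where the CPU-efficiency claim comes from.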

6. What role does TaskManager play in the process of cluster startup?

The startup process for TaskManager is relatively simple:

Startup class: org.apache.flink.runtime.taskmanager.TaskManager

Core startup method: selectNetworkInterfaceAndRunTaskManager

After startup it registers itself directly with the JobManager, and once registration completes it initializes its modules.

7. How is the scheduling of Flink computing resources realized?

The finest-grained resource in a TaskManager is the Task Slot, which represents a fixed-size subset of resources; each TaskManager divides its resources evenly among its slots.

By adjusting the number of task slots, users can define how tasks are isolated from each other. If each TaskManager has one slot, each task group runs in a separate JVM; if each TaskManager has multiple slots, multiple tasks run in the same JVM.

Tasks in the same JVM process can share TCP connections (via multiplexing) and heartbeat messages, which reduces network transfer of data; they can also share some data structures, reducing per-task overhead to a certain extent.

Each slot can accept either a single task or a pipeline composed of multiple consecutive tasks. For example, a FlatMap function may occupy one task slot by itself, while a keyed aggregation function and a sink function share another.
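A useful consequence of slot sharing is that a job needs only as many slots as its highest operator parallelism, not the sum over all operators. The helper below sketches that accounting; the function, its tuple format, and the "sharing group" key are simplifications invented for this example (Flink's actual SlotSharingGroup logic is richer).

```python
# Hypothetical slot-sharing sketch: tasks in the same sharing group can
# be co-located, so the slots a job needs per group equal that group's
# maximum parallelism rather than the sum. Simplified from Flink's model.
def slots_needed(tasks):
    """tasks: list of (name, parallelism, sharing_group) tuples."""
    by_group = {}
    for _name, parallelism, group in tasks:
        by_group[group] = max(by_group.get(group, 0), parallelism)
    return sum(by_group.values())
```

With everything in the default group, a FlatMap at parallelism 2, an aggregation at 2, and a sink at 1 need just 2 slots in total, matching the sharing behavior described above.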

8. Briefly describe the data abstraction and data exchange process of Flink.

To avoid inherent drawbacks of the JVM, such as the low storage density of Java objects and the impact of full GC on throughput and response time, Flink manages memory itself. MemorySegment is Flink's memory abstraction: by default, a MemorySegment can be thought of as an abstraction over a 32 KB block of memory, which can be either a byte[] on the JVM heap or off-heap memory (a DirectByteBuffer).

On top of the MemorySegment abstraction, Flink uses Buffers when data objects move from an operator to the TaskManager and are prepared for sending to the next node.

The intermediate abstraction that bridges a Java object to a Buffer is yet another abstraction, the StreamRecord.
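To make the MemorySegment idea concrete, here is a minimal Python sketch of a fixed-size segment addressed by byte offset. The class and method names are invented for this illustration; a real Flink MemorySegment wraps a JVM byte[] or DirectByteBuffer and offers a much larger typed accessor API, but the offset-addressed, fixed-size nature is the same.

```python
# Minimal sketch of a MemorySegment-like abstraction: a fixed 32 KB
# block addressed by offset, backed here by a Python bytearray instead
# of a JVM byte[] or DirectByteBuffer. Names are illustrative only.
import struct

class ToyMemorySegment:
    SIZE = 32 * 1024                    # 32 KB, the default noted above

    def __init__(self):
        self._buf = bytearray(self.SIZE)

    def put_int(self, offset, value):
        # Big-endian 4-byte int written at a raw byte offset.
        struct.pack_into(">i", self._buf, offset, value)

    def get_int(self, offset):
        return struct.unpack_from(">i", self._buf, offset)[0]
```

Storing data as raw bytes in such segments, rather than as object graphs, is what gives Flink its high storage density and GC-friendliness.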

9. How is the distributed snapshot mechanism in Flink implemented?

The core of Flink's fault-tolerance mechanism is making consistent snapshots of the distributed data flows and operator states. These snapshots act as consistent checkpoints to which the system can roll back in the event of a failure. The mechanism Flink uses to make these snapshots is described in "Lightweight Asynchronous Snapshots for Distributed Dataflows". It is inspired by the standard Chandy-Lamport algorithm for distributed snapshots and is tailored to Flink's execution model.

Barriers are injected into the parallel data stream at the sources. The point where the barrier for snapshot n is inserted (call it Sn) is the position in the source up to which data is included in the snapshot. In Apache Kafka, for example, this position would be the offset of the last record in the partition. The position Sn is reported to the checkpoint coordinator (Flink's JobManager).

The barriers then flow downstream. When an intermediate operator has received the barrier for snapshot n from all of its input streams, it emits a barrier for snapshot n into all of its output streams. Once a sink operator (the end of the streaming DAG) has received barrier n from all of its input streams, it acknowledges snapshot n to the checkpoint coordinator. When all sinks have acknowledged, the snapshot is complete.

Once snapshot n is complete, the job will never again request the records before Sn from the data source, because those records (and everything derived from them) have already passed through the entire dataflow topology, i.e. they have been processed.
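The "wait for the barrier on every input, then forward it" step above is the heart of the algorithm, and it fits in a tiny sketch. The class below is invented for this illustration (real Flink operators also buffer or process records around the barrier, depending on the alignment mode); it only models when an operator is allowed to emit barrier n downstream.

```python
# Sketch of the barrier flow described above: an operator with several
# input streams waits (aligns) until it has seen barrier n on every
# input before snapshotting and emitting barrier n downstream.
class ToyOperator:
    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.seen = set()               # inputs that delivered barrier n

    def on_barrier(self, input_id):
        """Returns True iff the barrier is now forwarded downstream."""
        self.seen.add(input_id)
        if len(self.seen) == self.num_inputs:
            self.seen.clear()           # reset, ready for snapshot n+1
            return True                 # snapshot local state, emit barrier
        return False                    # still aligning on other inputs
```

A sink works the same way, except that instead of emitting a barrier it sends its acknowledgment to the checkpoint coordinator.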

10. How is FlinkSQL implemented?

Flink hands SQL validation, SQL parsing, and SQL optimization over to Apache Calcite. Calcite is also used in many other open source projects, such as Apache Hive, Apache Drill, Apache Kylin, and Cascading. Calcite sits at the heart of this architecture.

Building the abstract syntax tree is left to Calcite. A SQL query is turned into a SQL node tree by the Calcite parser, then validated and built into Calcite's abstract syntax tree (the logical plan). Calls on the Table API, on the other hand, construct the Table API's own abstract syntax tree, which the RelBuilder provided by Calcite converts into Calcite's abstract syntax tree. Either way, the tree is then converted into a logical execution plan and a physical execution plan in turn.

After the job is submitted, it is distributed to the TaskManagers to run; at run time, the generated code is compiled with the Janino compiler and then executed.
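To give a feel for the logical-plan rewriting a rule-based optimizer like Calcite performs, here is a toy rule that pushes a filter below a projection. The nested-tuple plan representation and the rule itself are invented for this sketch and are nothing like Calcite's actual RelNode classes; the sketch also assumes the predicate only references projected columns, a precondition a real optimizer would verify.

```python
# Toy illustration of rule-based logical-plan rewriting, in the spirit
# of what Calcite does during optimization. Plan nodes are nested
# tuples: ("filter", predicate, child) and ("project", columns, child).
def push_filter_down(plan):
    """Rewrite filter(project(x)) into project(filter(x)) when it applies."""
    if (isinstance(plan, tuple) and plan[0] == "filter"
            and isinstance(plan[2], tuple) and plan[2][0] == "project"):
        pred = plan[1]
        _, cols, child = plan[2]
        # Filtering before projecting lets the scan discard rows earlier.
        return ("project", cols, ("filter", pred, child))
    return plan                          # rule does not apply; keep plan
```

Real optimizers apply many such rules repeatedly until the plan reaches a fixed point, then pick a physical plan; this sketch shows a single rule application.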

That concludes "What are the Flink-related interview questions?". Thank you for reading. If you want to learn more about the industry, you can follow the site; the editor will keep publishing practical, high-quality articles for you!
