How to use Flink operator state

2025-02-25 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/01 Report--

This article explains how to use Flink operator state. It covers the topic in detail and should be a useful reference for interested readers.

1. Operator state classification

The scope of operator state is limited to a single parallel subtask of an operator. All records processed by the same parallel subtask can access the same state, so the state is shared within that subtask. Operator state cannot be accessed by another parallel subtask of the same operator or of a different operator.

Flink provides three basic data structures for operator state. The descriptions below focus on how each kind of state is redistributed when a job is restarted from a savepoint with a changed parallelism (scaling up or down):

List state: represents the state as a list of entries.

An operator with list state redistributes the list entries when it is scaled up or down. Conceptually, the list entries of all parallel subtasks are collected and then distributed evenly across the smaller or larger number of tasks. If there are fewer list entries than the operator's new parallelism, some tasks start with empty state.
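The redistribution of list state described above can be illustrated with a small plain-Java sketch. It does not use Flink and is not Flink's internal code; the class name, the method name, and the choice of contiguous near-equal chunks are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; hypothetical names, not a Flink API.
class ListStateRedistributionSketch {

    // Collects the entries of all old subtasks and splits them into
    // newParallelism contiguous chunks whose sizes differ by at most one.
    static <T> List<List<T>> redistribute(List<List<T>> oldSubtaskStates, int newParallelism) {
        List<T> all = new ArrayList<>();
        for (List<T> subtaskState : oldSubtaskStates) {
            all.addAll(subtaskState);
        }

        List<List<T>> result = new ArrayList<>();
        int base = all.size() / newParallelism;   // minimum entries per new subtask
        int extra = all.size() % newParallelism;  // the first 'extra' subtasks get one more
        int pos = 0;
        for (int i = 0; i < newParallelism; i++) {
            int size = base + (i < extra ? 1 : 0);
            result.add(new ArrayList<>(all.subList(pos, pos + size)));
            pos += size;
        }
        return result;
    }

    public static void main(String[] args) {
        // Parallelism 2 -> 3: five entries are spread over three subtasks.
        List<List<Integer>> old = Arrays.asList(Arrays.asList(1, 2, 3), Arrays.asList(4, 5));
        System.out.println(redistribute(old, 3)); // [[1, 2], [3, 4], [5]]

        // Fewer entries than subtasks: some subtasks start with empty state.
        System.out.println(redistribute(Arrays.asList(Arrays.asList(7)), 3)); // [[7], [], []]
    }
}
```

The second call shows the case mentioned above where the number of entries is smaller than the new parallelism, so two of the three subtasks start empty.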

Union list state: also represents the state as a list of entries. It differs from regular list state in how the state is restored after a failure or when the application is started from a savepoint. If the parallelism changes, an operator with union list state broadcasts all entries of the state list to every task, and each task then decides which entries to keep and which to discard.

For example, if an operator previously ran with a parallelism of 2, it had two subtasks and therefore two state lists. If the parallelism is changed to 3, a copy of both previous lists is sent to each of the three parallel subtasks, so every subtask holds the complete state and then decides for itself which entries to use.
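The 2-to-3 example above can be sketched in plain Java as follows. This is only an illustration without Flink: the class and method names are hypothetical, and the modulo rule each subtask uses to pick its entries is an assumed example of "deciding which state to use", not something Flink prescribes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; hypothetical names, not a Flink API.
class UnionListStateSketch {

    // On restore, every new subtask receives the union of all old lists.
    static <T> List<List<T>> restoreUnion(List<List<T>> oldSubtaskStates, int newParallelism) {
        List<T> union = new ArrayList<>();
        for (List<T> subtaskState : oldSubtaskStates) {
            union.addAll(subtaskState);
        }
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            result.add(new ArrayList<>(union)); // each subtask gets its own full copy
        }
        return result;
    }

    // Assumed selection rule: a subtask keeps the entries whose position,
    // modulo the parallelism, equals its own subtask index.
    static <T> List<T> keepOwnEntries(List<T> fullList, int subtaskIndex, int parallelism) {
        List<T> kept = new ArrayList<>();
        for (int i = 0; i < fullList.size(); i++) {
            if (i % parallelism == subtaskIndex) {
                kept.add(fullList.get(i));
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Two old subtasks, restored with parallelism 3.
        List<List<String>> old = Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c"));
        List<List<String>> restored = restoreUnion(old, 3);
        for (int i = 0; i < restored.size(); i++) {
            System.out.println("subtask " + i + " received " + restored.get(i)
                    + ", keeps " + keepOwnEntries(restored.get(i), i, 3));
        }
    }
}
```

Because every subtask sees the full list, the selection logic lives in user code, which is the main practical difference from regular list state.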

Broadcast state: unlike normal operator state, every parallel subtask holds the same state. Each parallel subtask still accesses only its own copy, but all copies are identical. Broadcast state is the best fit for the special case where an operator has multiple tasks and the state of every task must be the same.

An operator with broadcast state copies the state to every new task when scaling up, which works because broadcast state guarantees that all tasks hold the same state. When scaling down, the surplus tasks can simply be stopped, since the remaining tasks hold copies of the same state and nothing is lost.
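A minimal plain-Java sketch of this rescaling behavior is shown below. It models each subtask's broadcast state as a map; the class name, method name, and the use of a map of strings are assumptions for illustration and do not reflect Flink's internal implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; hypothetical names, not a Flink API.
class BroadcastStateRescaleSketch {

    // All old copies are assumed identical, so any of them can serve as the
    // source for a new subtask. Scaling down simply yields fewer copies.
    static <K, V> List<Map<K, V>> rescale(List<Map<K, V>> oldSubtaskStates, int newParallelism) {
        List<Map<K, V>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            Map<K, V> source = oldSubtaskStates.get(i % oldSubtaskStates.size());
            result.add(new HashMap<>(source)); // each subtask owns an independent copy
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> rules = Collections.singletonMap("threshold", "10");
        List<Map<String, String>> old = Arrays.asList(rules, rules);

        System.out.println(rescale(old, 3)); // scaling up: three identical copies
        System.out.println(rescale(old, 1)); // scaling down: one copy, nothing lost
    }
}
```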

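2. Using operator state

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.streaming.api.checkpoint.ListCheckpointed;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    import java.util.Collections;
    import java.util.List;

    // SensorReading is a user-defined POJO (id, timestamp, value) that is not shown here.
    public class StateTest1_OperatorState {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.setParallelism(1);

            // Read a socket text stream
            DataStream<String> inputStream = env.socketTextStream("localhost", 7777);

            // Convert each line to a SensorReading
            DataStream<SensorReading> dataStream = inputStream.map(line -> {
                String[] fields = line.split(",");
                return new SensorReading(fields[0], Long.parseLong(fields[1]), Double.parseDouble(fields[2]));
            });

            // Define a stateful map operation that counts the records in the current partition
            SingleOutputStreamOperator<Integer> resultStream = dataStream.map(new MyCountMapper());

            resultStream.print();
            env.execute();
        }

        // Custom MapFunction
        public static class MyCountMapper implements MapFunction<SensorReading, Integer>, ListCheckpointed<Integer> {
            // A local variable that serves as the operator state
            private Integer count = 0;

            @Override
            public Integer map(SensorReading value) throws Exception {
                count++;
                return count;
            }

            // Called on each checkpoint: the returned list is stored as list state
            @Override
            public List<Integer> snapshotState(long checkpointId, long timestamp) throws Exception {
                return Collections.singletonList(count);
            }

            // Called on recovery: sum the entries assigned to this subtask
            @Override
            public void restoreState(List<Integer> state) throws Exception {
                for (Integer num : state) {
                    count += num;
                }
            }
        }
    }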

Operator state is declared like an ordinary member variable, but the operator's function class must also implement the corresponding interface, such as ListCheckpointed, which defines the custom logic for taking state snapshots and restoring from them. In newer Flink releases, ListCheckpointed is deprecated in favor of CheckpointedFunction.
