Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to realize the bulkIteration iterative operation of Flink

2025-02-27 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Share

Shulou(Shulou.com)06/01 Report--

This article introduces the relevant knowledge of "how to realize Flink's bulkIteration iteration operation". In the actual case operation process, many people will encounter such difficulties. Next, let Xiaobian lead you to learn how to deal with these situations! I hope you can read carefully and learn something!

Iterative algorithms are used in many areas of data analysis, such as machine learning or graph computation. In order to extract useful information from big data, iterative calculation is often needed in the process of processing. There are many big data processing frameworks, such as spark, mr. In practice, these iterative implementations are difficult.

The magic of Flink is that it directly supports iterative computation. Flink's idea of iteration is also very simple, that is, to implement a step function, and then embed it into the iterative operator. There are two iterative operators:Iterate and Delta Iterate. Both operators call the step function until they receive a signal to terminate iteration.

This section is mainly about theory.

Iterative operators include simple iterative forms: for each iteration, the step function consumes the full amount of data (the result of this iteration and the previous iteration) and computes the output of the next iteration (e.g. map, reduce, join, etc.)

1. Iteration Input

The initial input to the first iteration may come from the data source or previous operators.

2. Step function

Each iteration executes the step function. It is a data flow composed of map, reduce, join and other operators, customized according to the business.

3. Next Partial Solution:

At each iteration, part of the output of the step function returns to participate in further iterations.

4. maximum number of iterations

If there are no other termination conditions, it will terminate when the number of aggregations reaches this value.

5. Custom Aggregator Convergence:

Iteration allows you to specify custom aggregators and convergence criteria, such as sum aggregates the number of records to emit (aggregator) and terminates if this number is zero (convergence criteria).

Case: Cumulative Count

This example is basically given data input, increment by one at a time, output the result.

Iterative Input: Input numbers from 1 to 5.

Step function: Add one to a number.

Part of the result: it's actually a map function.

Iteration result: The maximum number of iterations is ten, so the final output is 11-15.

code operation

When programming, this iteration method mentioned in this article is called bulk Iteration, which needs to call iterate(int). The function returns an IterativeDataSet. Of course, we can perform some operations on it, such as map. The only argument to the Iterate function is the maximum number of iterations.

Iteration is a loop. As you can see from the previous figure, we need to perform closed-loop operation, so we need to use closeWith(Dataset) operation at this time. The parameter is the dataset that needs loop iteration. Optionally, you can specify a termination criterion, such as closeWith(DataSet, DataSet), to terminate the iteration by determining whether the second dataset is empty. If no termination condition is specified, the iteration terminates after the maximum number of iterations.

Here's an example of computing pi iteratively.

package Streaming.iteration;

import org.apache.flink.api.common.functions.MapFunction;

import org.apache.flink.api.java.DataSet;

import org.apache.flink.api.java.ExecutionEnvironment;

import org.apache.flink.api.java.operators.IterativeDataSet;

public class IteratePi {

public static voidmain(String[] args) throws Exception{

final ExecutionEnvironmentenv = ExecutionEnvironment.getExecutionEnvironment();

// Create initialIterativeDataSet

IterativeDataSet initial= env.fromElements(0).iterate(100);

DataSet iteration= initial.map(new MapFunction(){

@Override

public Integermap(Integer i) throws Exception{

double x = Math.random();

double y = Math.random();

return i + ((x * x + y * y < 1) ? 1 : 0);

}

});

// Iterativelytransform the IterativeDataSet

DataSet count = initial.closeWith(iteration);

count.map(new MapFunction(){

@Override

public Double map(Integercount) throws Exception {

return count /(double) 10000 * 4;

}

}).print();

// execute theprogram

env.execute("IterativePi Example");

}

}

"Flink's bulkIteration operation how to achieve" content is introduced here, thank you for reading. If you want to know more about industry-related knowledge, you can pay attention to the website. Xiaobian will output more high-quality practical articles for everyone!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Internet Technology

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report