2025-01-19 Update From: SLTechnology News&Howtos shulou
Shulou(Shulou.com)06/01 Report--
This article introduces how to use the CoGroup operator in Flink. It covers the sample environment, a complete example program, and the printed result.
CoGroup operator: two data streams are grouped by key within the same window, and the two groups for each key are handed together to one function, which produces a single output stream. Unlike join, the function is called for a key even when only one of the streams has elements for it, so elements end up in the output stream whether or not they have a match on the other side.
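To make the difference from join concrete, here is a minimal sketch in plain Java collections rather than the Flink API. It imitates what happens to the contents of one window: coGroup produces one result per key found in either input, while an inner join produces one result per matching pair. The class name `CoGroupSketch`, the {name, key} row layout, and the unmatched sample row "Zhao Jiu" are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Plain-Java illustration of coGroup vs. inner join semantics (not Flink code). */
final class CoGroupSketch {

    /** Groups {name, key} rows by key, preserving first-seen key order. */
    static Map<String, List<String>> groupByKey(List<String[]> rows) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String[] row : rows) {
            groups.computeIfAbsent(row[1], k -> new ArrayList<>()).add(row[0]);
        }
        return groups;
    }

    /** coGroup: one result per key present in EITHER input; a missing side is an empty list. */
    static List<String> coGroup(List<String[]> first, List<String[]> second) {
        Map<String, List<String>> g1 = groupByKey(first);
        Map<String, List<String>> g2 = groupByKey(second);
        Set<String> keys = new LinkedHashSet<>(g1.keySet());
        keys.addAll(g2.keySet());
        List<String> out = new ArrayList<>();
        for (String key : keys) {
            List<String> left = g1.getOrDefault(key, Collections.<String>emptyList());
            List<String> right = g2.getOrDefault(key, Collections.<String>emptyList());
            out.add(key + ": " + left + " + " + right);
        }
        return out;
    }

    /** Inner join: one result per matching PAIR; keys present on one side only are dropped. */
    static List<String> innerJoin(List<String[]> first, List<String[]> second) {
        List<String> out = new ArrayList<>();
        for (String[] a : first) {
            for (String[] b : second) {
                if (a[1].equals(b[1])) {
                    out.add(a[1] + ": " + a[0] + " + " + b[0]);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> first = Arrays.asList(
                new String[] {"Zhang San", "man"},
                new String[] {"Li Si", "girl"});
        List<String[]> second = Arrays.asList(
                new String[] {"Wu Ba", "man"},
                new String[] {"Zhao Jiu", "unknown"}); // hypothetical unmatched row

        coGroup(first, second).forEach(System.out::println);
        innerJoin(first, second).forEach(System.out::println);
    }
}
```

With this input, `coGroup` returns three entries (`man: [Zhang San] + [Wu Ba]`, `girl: [Li Si] + []`, `unknown: [] + [Zhao Jiu]`), while `innerJoin` returns only `man: Zhang San + Wu Ba`. The empty lists correspond to the empty `Iterable` a CoGroupFunction receives when one stream has nothing for a key.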
Sample environment
java.version: 1.8.x
flink.version: 1.11.1
Sample data source (project download from Gitee)
Setting up the Flink development environment and example data
CoGroup.java
package com.flink.examples.functions;

import com.flink.examples.DataSource;
import com.google.gson.Gson;
import org.apache.flink.api.common.eventtime.SerializableTimestampAssigner;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.functions.CoGroupFunction;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;

import java.time.Duration;
import java.util.Arrays;
import java.util.List;

/**
 * @Description CoGroup operator: two data streams are grouped by key within the same window
 * and merged into a single data stream (unlike join, elements are merged into the output
 * whether or not their key has a match in the other stream).
 */
public class CoGroup {

    /**
     * Co-groups two data streams, assigns them to the same window, then merges and prints them.
     * @param args
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
        // Interval at which watermarks are generated automatically
        // env.getConfig().setAutoWatermarkInterval(200);

        List<Tuple3<String, String, Integer>> tuple3List1 = DataSource.getTuple3ToList();
        List<Tuple3<String, String, Integer>> tuple3List2 = Arrays.asList(
                new Tuple3<>("Wu Qi", "girl", 18),
                new Tuple3<>("Wu Ba", "man", 30)
        );

        // Data stream 1
        DataStream<Tuple3<String, String, Integer>> dataStream1 = env.fromCollection(tuple3List1)
                // Add watermarks. Without them, the time window keeps waiting for an
                // event-time watermark and apply is never executed.
                .assignTimestampsAndWatermarks(WatermarkStrategy
                        .<Tuple3<String, String, Integer>>forBoundedOutOfOrderness(Duration.ofSeconds(2))
                        .withTimestampAssigner((element, timestamp) -> System.currentTimeMillis()));

        // Data stream 2
        DataStream<Tuple3<String, String, Integer>> dataStream2 = env.fromCollection(tuple3List2)
                // Add watermarks. Without them, the time window keeps waiting for an
                // event-time watermark and apply is never executed.
                .assignTimestampsAndWatermarks(WatermarkStrategy
                        .<Tuple3<String, String, Integer>>forBoundedOutOfOrderness(Duration.ofSeconds(2))
                        .withTimestampAssigner(new SerializableTimestampAssigner<Tuple3<String, String, Integer>>() {
                            @Override
                            public long extractTimestamp(Tuple3<String, String, Integer> element, long timestamp) {
                                return System.currentTimeMillis();
                            }
                        }));

        // Co-group dataStream1 and dataStream2; elements are collected whether or not
        // the other stream has a matching key.
        // Data stream 3
        DataStream<String> newDataStream = dataStream1.coGroup(dataStream2)
                // Key of stream 1: field f1
                .where(new KeySelector<Tuple3<String, String, Integer>, String>() {
                    @Override
                    public String getKey(Tuple3<String, String, Integer> value) throws Exception {
                        return value.f1;
                    }
                })
                // Key of stream 2: field f1
                .equalTo(t3 -> t3.f1)
                .window(TumblingEventTimeWindows.of(Time.seconds(1)))
                .apply(new CoGroupFunction<Tuple3<String, String, Integer>, Tuple3<String, String, Integer>, String>() {
                    @Override
                    public void coGroup(Iterable<Tuple3<String, String, Integer>> first,
                                        Iterable<Tuple3<String, String, Integer>> second,
                                        Collector<String> out) throws Exception {
                        StringBuilder sb = new StringBuilder();
                        Gson gson = new Gson();
                        // Elements from data stream 1 for this key and window
                        for (Tuple3<String, String, Integer> tuple3 : first) {
                            sb.append(gson.toJson(tuple3)).append("\n");
                        }
                        // Elements from data stream 2 for this key and window
                        for (Tuple3<String, String, Integer> tuple3 : second) {
                            sb.append(gson.toJson(tuple3)).append("\n");
                        }
                        out.collect(sb.toString());
                    }
                });

        newDataStream.print();
        env.execute("flink CoGroup job");
    }
}
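The comments in the example say that without watermarks the window never reaches apply. The following plain-Java sketch (not Flink code) shows the arithmetic behind that, as I understand it: a tumbling window without offset starts at the timestamp rounded down to the window size, a bounded-out-of-orderness watermark trails the highest timestamp seen by the bound plus 1 ms, and an event-time window fires once the watermark reaches the window's last millisecond. The class and method names are assumptions for illustration, and the sketch assumes non-negative timestamps.

```java
/** Plain-Java sketch of tumbling-window assignment and watermark-driven firing (not Flink code). */
final class WindowSketch {

    /** Start of the tumbling window (no offset) containing the timestamp; the window is [start, start + size). */
    static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    /** Watermark under bounded out-of-orderness: highest timestamp seen, minus the bound, minus 1 ms. */
    static long watermark(long maxTimestampMs, long outOfOrdernessMs) {
        return maxTimestampMs - outOfOrdernessMs - 1;
    }

    /** An event-time window fires once the watermark reaches its last included millisecond. */
    static boolean fires(long windowStartMs, long sizeMs, long watermarkMs) {
        return watermarkMs >= windowStartMs + sizeMs - 1;
    }

    public static void main(String[] args) {
        long size = 1000;   // corresponds to Time.seconds(1) in the example
        long bound = 2000;  // corresponds to Duration.ofSeconds(2) in the example

        // Two elements stamped 5 ms apart can still land in different 1-second windows.
        System.out.println(windowStart(10_998, size)); // 10000
        System.out.println(windowStart(11_003, size)); // 11000

        // The window [10000, 11000) fires only when an element stamped >= 13000 has been seen.
        long start = windowStart(10_998, size);
        System.out.println(fires(start, size, watermark(12_999, bound))); // false
        System.out.println(fires(start, size, watermark(13_000, bound))); // true
    }
}
```

Two consequences for the example program follow. Because it stamps elements with System.currentTimeMillis(), which window an element joins depends on the wall clock, so a run that straddles a second boundary may split one key across two windows. And because the collections are consumed in far less than two seconds, the windows are most likely flushed by the final maximal watermark that Flink emits when a finite source ends, not by the periodic watermarks.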
Print the result
{"f0":"Zhang San","f1":"man","f2":20}
{"f0":"Wang Wu","f1":"man","f2":29}
{"f0":"Wu Ba","f1":"man","f2":30}
{"f0":"Wu Ba","f1":"man","f2":30}

{"f0":"Li Si","f1":"girl","f2":24}
{"f0":"Liu Liu","f1":"girl","f2":32}
{"f0":"Wu Qi","f1":"girl","f2":18}
{"f0":"Wu Qi","f1":"girl","f2":18}

That concludes this look at how to use Flink's CoGroup. Theory sticks best when paired with practice, so try running the example yourself.
© 2024 shulou.com SLNews company. All rights reserved.