Which is more useful: Spark or Flink?

2025-01-18 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article looks at the question "Spark or Flink: which is easier to use?". Many people run into this dilemma in real projects, so let's walk through how to think about it.

To begin with, Spark started out as a batch processing engine and added stream processing later, so its streaming model is micro-batch first. Choose it when micro-batching is acceptable.

Flink started out with real-time stream processing and added batch processing later, so it is better suited to real-time scenarios.

So does production really require such high real-time performance?

For example, take a stream of 100,000 QPS. If you process it in real time with Flink and the sink is MySQL, the job is event-driven and low-latency, but every single event triggers an insert or update against the database. That is clearly unreliable, because the database cannot sustain the load.
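To put that load in numbers, here is a small back-of-the-envelope sketch. The function name and the figure of 500 rows per statement are illustrative assumptions, not values from the article; only the 100,000 QPS rate comes from the text.

```python
import math


def statements_per_second(events_per_second: int, rows_per_statement: int) -> int:
    """Number of write statements the database sees each second."""
    return math.ceil(events_per_second / rows_per_statement)


EVENTS_PER_SECOND = 100_000  # the 100,000 QPS figure from the text

# One statement per event: what a purely event-driven sink issues.
print(statements_per_second(EVENTS_PER_SECOND, 1))    # 100000

# Hypothetical grouping of 500 rows per statement, for comparison only.
print(statements_per_second(EVENTS_PER_SECOND, 500))  # 200
```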

If you add batching to Flink's sink, throughput certainly improves, but at the cost of latency, and there is another problem:

Suppose the business moves to a new topic or a new Kafka cluster, and you migrate the Flink job after the data has been migrated. If the last batch on the old topic never reaches the batch-size threshold, the buffered records are never flushed and never reach the database, because no new data arrives to trigger the sink's flush.

In this scenario you still need to add a timeout-detection thread that flushes any data that has been buffered for longer than a set period.
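A minimal sketch of that idea in plain Python, not Flink's sink API: the class `BufferingSink`, its parameters, and the `write_batch` callback are all hypothetical names chosen for illustration. The buffer flushes when it reaches the size threshold or when its oldest record has waited too long, whichever comes first.

```python
import threading
import time


class BufferingSink:
    """Hypothetical sink: flushes on batch size OR when the buffer goes stale."""

    def __init__(self, write_batch, batch_size=500, max_wait_seconds=5.0,
                 clock=time.monotonic):
        self._write_batch = write_batch   # e.g. one multi-row INSERT per call
        self._batch_size = batch_size
        self._max_wait = max_wait_seconds
        self._clock = clock               # injectable so the demo is deterministic
        self._buffer = []
        self._oldest = None               # arrival time of the oldest buffered record
        self._lock = threading.Lock()

    def add(self, record):
        with self._lock:
            if not self._buffer:
                self._oldest = self._clock()
            self._buffer.append(record)
            if len(self._buffer) >= self._batch_size:
                self._flush_locked()

    def flush_if_stale(self):
        """Flush a partial batch that has waited too long; True if it flushed."""
        with self._lock:
            if self._buffer and self._clock() - self._oldest >= self._max_wait:
                self._flush_locked()
                return True
            return False

    def _flush_locked(self):
        # Caller holds the lock; a real sink might write outside the lock.
        batch, self._buffer = self._buffer, []
        self._oldest = None
        self._write_batch(batch)

    def start_timeout_thread(self, check_interval=1.0):
        """Start the timeout-detection thread; set the returned event to stop it."""
        stop = threading.Event()

        def loop():
            while not stop.wait(check_interval):
                self.flush_if_stale()

        threading.Thread(target=loop, daemon=True).start()
        return stop


# Demo with a fake clock: 4 records arrive, then the topic goes quiet.
now = [0.0]
written = []
sink = BufferingSink(written.append, batch_size=3, max_wait_seconds=5.0,
                     clock=lambda: now[0])
for i in range(4):
    sink.add(i)
print(written)         # [[0, 1, 2]]  (record 3 is stuck in the buffer)

now[0] = 6.0           # six seconds pass with no new data
sink.flush_if_stale()  # what the timeout thread would do on its next check
print(written)         # [[0, 1, 2], [3]]
```

In a running job you would call `start_timeout_thread()` once instead of invoking `flush_if_stale()` by hand. The extra lock, timer, and shutdown handling are exactly the additional moving parts the article is pointing at.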

Isn't that troublesome?

So in many cases, real-time processing may not actually be that important.

In addition, Spark Streaming is already extremely stable, while Flink has more bugs.

Take a bug in KafkaJsonTableSource as an example. If the data format is JSON, it can be deserialized, parsed, and registered as Row directly. But if a single record is not valid JSON, the Flink job fails, because Flink's internal operators are built for exactly-once processing and will not move past the record until it has been processed. This problem does not occur in Spark.
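For illustration, the tolerant behaviour one would want here can be sketched in plain Python. This is not Flink or Spark code; `parse_json_records` is a hypothetical helper that sets malformed records aside instead of failing the whole run.

```python
import json


def parse_json_records(lines):
    """Return (rows, rejected): parsed JSON objects and the raw lines that failed."""
    rows, rejected = [], []
    for line in lines:
        try:
            value = json.loads(line)
        except json.JSONDecodeError:
            rejected.append(line)  # malformed record: keep it aside, keep going
            continue
        if isinstance(value, dict):
            rows.append(value)
        else:
            rejected.append(line)  # valid JSON, but not an object that maps to a row
    return rows, rejected


rows, rejected = parse_json_records(['{"id": 1}', 'not json', '{"id": 2}'])
print(rows)      # [{'id': 1}, {'id': 2}]
print(rejected)  # ['not json']
```

Keeping the rejected lines, rather than silently dropping them, leaves room to log them or route them elsewhere for inspection.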

Other bugs are not listed here.

For developers, though, mastering both is best, and Flink really is excellent in the field of stream processing.

That concludes "Spark or Flink: which is easier to use?". Thank you for reading. For more industry content, you can follow the website.
