What timers are available in big data development?

Updated: 2025-01-15 | Source: SLTechnology News&Howtos (Shulou.com)

This article looks at the timer tools available in big data development. They are practical to know, so they are collected here for reference.

Preface

Put simply, a timer runs a task once a specified delay has elapsed, much like the morning alarm that wakes us at the same time every day. Timers are everywhere in our systems: regularly backing up data, pulling files, refreshing data, and so on. There are also many timer tools to choose from, such as Timer, ScheduledExecutorService, Spring Scheduler, HashedWheelTimer (time wheel), Quartz, and Xxl-job/Elastic-job. This article briefly introduces and compares these tools and the scenarios each one suits.

Timer

Timer is the earliest timer provided by the JDK. It is easy to use and fairly simple in function: it can trigger a task after a fixed delay, at a fixed time, or at a fixed frequency.

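import java.util.Timer;
import java.util.TimerTask;
// Run the task after a 1000 ms delay, then repeat every 1000 ms.
Timer timer = new Timer();
timer.schedule(new TimerTask() {
    @Override
    public void run() {
        System.out.println(System.currentTimeMillis() + "= task1");
    }
}, 1000, 1000);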

Times are given in milliseconds, so the task above first runs after a one-second delay and then repeats every second. Internally, Timer stores tasks in a TaskQueue and executes them on a single TimerThread:

private final TaskQueue queue = new TaskQueue();
private final TimerThread thread = new TimerThread(queue);

TimerThread runs a while (true) loop that keeps taking tasks from the TaskQueue and executing them. Each task added to the TaskQueue is ordered by its nextExecutionTime, so TimerThread always gets the task that is due soonest.
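The ordering by nextExecutionTime can be illustrated with a small sketch. It uses java.util.PriorityQueue as a stand-in for Timer's internal TaskQueue, and the Task class and its task names are hypothetical, so treat it as an illustration of the idea rather than the JDK's actual code:

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class TaskOrderingSketch {
    /** Hypothetical stand-in for a TimerTask: a name plus its next run time. */
    static final class Task {
        final String name;
        final long nextExecutionTime;

        Task(String name, long nextExecutionTime) {
            this.name = name;
            this.nextExecutionTime = nextExecutionTime;
        }
    }

    /** Adds the tasks in the given order and returns their names in execution order. */
    static String[] drainInExecutionOrder(Task... tasks) {
        // Min-heap keyed on nextExecutionTime: the head is always the task due soonest.
        PriorityQueue<Task> queue =
                new PriorityQueue<>(Comparator.comparingLong((Task t) -> t.nextExecutionTime));
        for (Task t : tasks) {
            queue.add(t);
        }
        String[] order = new String[tasks.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = queue.poll().name;
        }
        return order;
    }

    public static void main(String[] args) {
        String[] order = drainInExecutionOrder(
                new Task("backup", 3000),
                new Task("heartbeat", 1000),
                new Task("refresh", 2000));
        System.out.println(String.join(", ", order)); // heartbeat, refresh, backup
    }
}
```

Whatever order the tasks are added in, the one with the smallest nextExecutionTime comes out first, which is why the timer thread only ever needs to look at the head of the queue.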

Timer has two major disadvantages:

An uncaught exception thrown by a TimerTask terminates the timer thread, so the whole Timer stops working.

Because all tasks run on a single thread, a task that runs for too long delays the others and degrades their timing accuracy.
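The first disadvantage can be reproduced with the following minimal sketch. The class and method names are made up for the demonstration, and the one-second sleep is simply an assumed margin to let the timer thread run the failing task:

```java
import java.util.Timer;
import java.util.TimerTask;

class TimerFailureDemo {
    /** Returns true if the Timer rejects new tasks after one task threw an exception. */
    static boolean timerDiesAfterException() {
        Timer timer = new Timer(true); // daemon thread, so the JVM can still exit
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                throw new RuntimeException("boom"); // uncaught: kills the timer thread
            }
        }, 0);
        try {
            Thread.sleep(1000); // assumed to be enough time for the task to run and fail
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        try {
            timer.schedule(new TimerTask() {
                @Override
                public void run() {
                    // never reached if the timer thread has already died
                }
            }, 0);
            return false;
        } catch (IllegalStateException e) {
            return true; // the Timer no longer accepts tasks
        }
    }

    public static void main(String[] args) {
        System.out.println("timer dead after exception: " + timerDiesAfterException());
    }
}
```

The stack trace of the RuntimeException is printed to standard error by the dying timer thread; after that, every later call to schedule fails with an IllegalStateException.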

Because of these shortcomings, JDK 1.5 introduced a new timer, ScheduledExecutorService.

ScheduledExecutorService

JDK 1.5 introduced thread pools. ScheduledExecutorService is an interface; its concrete implementation, ScheduledThreadPoolExecutor, extends ThreadPoolExecutor.

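import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
// Pool of 2 threads; run the task after 1000 ms, then at a fixed rate of once every 1000 ms.
ScheduledExecutorService scheduledExecutorService = Executors.newScheduledThreadPool(2);
scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
    @Override
    public void run() {
        System.out.println(Thread.currentThread() + "= =" + System.currentTimeMillis() + "= task1");
    }
}, 1000, 1000, TimeUnit.MILLISECONDS);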

Compared with Timer, you can configure multiple threads to run scheduled tasks, and a task that throws an exception does not stop the ScheduledExecutorService itself (only further runs of that task are suppressed). The thread pool has several core configuration parameters:

corePoolSize: the number of core threads. While the pool has fewer threads than this, a new thread is created for each submitted task, and core threads are not reclaimed.

maximumPoolSize: the maximum number of threads allowed. Once the thread count has reached corePoolSize and the workQueue is full, additional threads are created up to this limit.

keepAliveTime: how long threads beyond corePoolSize may stay idle before they are terminated.

unit: the time unit of keepAliveTime.

workQueue: once the thread count has reached corePoolSize, new tasks are placed in this queue to wait.

threadFactory: the factory used to create threads.

handler: the rejection policy of the thread pool. Built-in policies are AbortPolicy, DiscardPolicy, DiscardOldestPolicy, and CallerRunsPolicy; you can also implement your own.
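As a rough illustration of how these parameters fit together, the sketch below passes all seven to the ThreadPoolExecutor constructor. The concrete values (2 core threads, 4 maximum, a 30-second keep-alive, a queue of 10) are arbitrary assumptions chosen for the example, not recommendations:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolConfigDemo {
    /** Builds a pool with every core parameter spelled out explicitly. */
    static ThreadPoolExecutor newPool() {
        return new ThreadPoolExecutor(
                2,                                          // corePoolSize
                4,                                          // maximumPoolSize
                30L,                                        // keepAliveTime
                TimeUnit.SECONDS,                           // unit of keepAliveTime
                new ArrayBlockingQueue<>(10),               // workQueue (bounded)
                Executors.defaultThreadFactory(),           // threadFactory
                new ThreadPoolExecutor.CallerRunsPolicy()); // handler (rejection policy)
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newPool();
        System.out.println("core=" + pool.getCorePoolSize()
                + ", max=" + pool.getMaximumPoolSize());
        pool.shutdown();
    }
}
```

Note that Executors.newScheduledThreadPool only asks for corePoolSize; the scheduled pool supplies its own delay queue, so most of these knobs matter mainly for plain ThreadPoolExecutor instances.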

A task added to ScheduledExecutorService is wrapped in a ScheduledFutureTask and placed in a DelayedWorkQueue, which is a BlockingQueue. As with Timer, tasks are ordered by their trigger time, and the Worker threads in the pool take tasks from the queue and execute them.
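The claim that a failing task does not bring down the scheduler can be checked with a small experiment. This is only a sketch: the class name, the 50 ms period, and the latch counts are assumptions picked for the demonstration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SchedulerIsolationDemo {
    /** Returns {runs of the failing task, runs of the healthy task}. */
    static int[] run() {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
        AtomicInteger failing = new AtomicInteger();
        AtomicInteger healthy = new AtomicInteger();
        CountDownLatch failedOnce = new CountDownLatch(1);
        CountDownLatch healthyThrice = new CountDownLatch(3);

        // This task throws on its first run; its later runs are suppressed.
        pool.scheduleAtFixedRate(() -> {
            failing.incrementAndGet();
            failedOnce.countDown();
            throw new IllegalStateException("boom");
        }, 0, 50, TimeUnit.MILLISECONDS);

        // This task keeps running on the same pool regardless.
        pool.scheduleAtFixedRate(() -> {
            healthy.incrementAndGet();
            healthyThrice.countDown();
        }, 0, 50, TimeUnit.MILLISECONDS);

        try {
            failedOnce.await(5, TimeUnit.SECONDS);
            healthyThrice.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdownNow();
        return new int[] {failing.get(), healthy.get()};
    }

    public static void main(String[] args) {
        int[] runs = run();
        System.out.println("failing runs=" + runs[0] + ", healthy runs=" + runs[1]);
    }
}
```

The failing task runs exactly once, while the healthy task continues on schedule. In practice this also means a periodic task should catch its own exceptions if it is meant to keep running.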

Spring Scheduler

Spring lets you configure scheduled tasks either in XML or with annotations.

Spring also supports cron expressions and can be configured to invoke a specified method of a specified class directly, which is simpler and more convenient for users. Internally, however, it still uses a ScheduledThreadPoolExecutor thread pool.

HashedWheelTimer

HashedWheelTimer is a timer provided by Netty, used for jobs such as sending heartbeats at regular intervals. It implements the time wheel algorithm: the timer is a ring structure that can be compared to a clock face. The ring is made up of slots, and each slot can hold many tasks. As time passes, the pointer advances and the tasks in the current slot are executed. A task's slot is determined by a modulo operation, somewhat like a HashMap.

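import io.netty.util.HashedWheelTimer;
import io.netty.util.Timeout;
import io.netty.util.TimerTask;
import java.util.concurrent.TimeUnit;
// Wheel with 16 slots and a tick of 1000 ms; run the task once after 1 second.
HashedWheelTimer hashedWheelTimer = new HashedWheelTimer(1000, TimeUnit.MILLISECONDS, 16);
hashedWheelTimer.newTimeout(new TimerTask() {
    @Override
    public void run(Timeout timeout) throws Exception {
        System.out.println(System.currentTimeMillis() + "= executed");
    }
}, 1, TimeUnit.SECONDS);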

The three initialization parameters are:

tickDuration: the duration of each tick, that is, of each slot

unit: the time unit of tickDuration

ticksPerWheel: the number of slots in the time wheel

In the example above, each tick lasts 1 second and the wheel has 16 slots. A task delayed by 1 second is placed in slot 1 (slots are numbered from 0). A task delayed by 18 seconds is placed in slot 2 with remainingRounds=1, which records how many full rounds must still pass: each time the pointer reaches that slot, remainingRounds is decremented by 1, and the task runs only once remainingRounds has reached 0.
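The slot arithmetic from this example can be written out as a short sketch. It is a simplified model that assumes delays are whole ticks and that the pointer starts at tick 0; the class and method names are hypothetical, and this is not Netty's actual implementation:

```java
class WheelMathSketch {
    /** Slot index for a task added at currentTick with a delay of delayTicks. */
    static int slotFor(long currentTick, long delayTicks, int ticksPerWheel) {
        return (int) ((currentTick + delayTicks) % ticksPerWheel);
    }

    /** Number of full wheel rotations that must pass before the task may run. */
    static long remainingRoundsFor(long delayTicks, int ticksPerWheel) {
        return delayTicks / ticksPerWheel;
    }

    public static void main(String[] args) {
        int ticksPerWheel = 16; // tickDuration is 1 second, so one tick equals one second
        for (long delay : new long[] {1, 18}) {
            System.out.println("delay=" + delay + "s"
                    + " slot=" + slotFor(0, delay, ticksPerWheel)
                    + " remainingRounds=" + remainingRoundsFor(delay, ticksPerWheel));
        }
        // delay=1s slot=1 remainingRounds=0
        // delay=18s slot=2 remainingRounds=1
    }
}
```

Because a task's position depends only on a division and a modulo, adding a task costs the same no matter how many tasks the wheel already holds.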

Quartz

The timers described above all schedule tasks within a single process, whereas Quartz provides distributed scheduling. All scheduled tasks can be stored in a database, and each business node competes to acquire the tasks that are due. If one node fails, task scheduling is not affected.

For more information about Quartz, please refer to my previous articles:

Spring integrates Quartz distributed scheduling

Quartz database table analysis

Analysis of Quartz scheduling Source Code

Scheduling Analysis based on Netty+Zookeeper+Quartz

Quartz has shortcomings of its own: its scheduling relies on pessimistic database locks, which can lead to unbalanced load across nodes, and scheduling and execution are coupled together, so the scheduler can be affected by the business code it runs.

Xxl-job/Elastic-job

Because of these shortcomings, distributed scheduling solutions built on Quartz have appeared, including Xxl-job and Elastic-job.

The overall idea is to split the scheduler and the executor into separate processes. The scheduler still relies on Quartz's own scheduling mechanism, but what it schedules is no longer a business-specific QuartzJobBean. Instead it schedules a unified RemoteQuartzJobBean, which calls the executor remotely to run the actual business Bean. Each executor registers itself with a registry (such as Zookeeper) at startup; the scheduler reads the executor information from the registry and uses a load-balancing algorithm to pick the executor that will run the task.

These frameworks also provide an operations and maintenance console for managing tasks; xxl-job is one example.

They offer many other features that are not covered here. See the official websites for details.
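The register-then-select flow can be sketched in a few lines. Everything here is hypothetical: the Registry is a plain in-memory map standing in for something like Zookeeper, round-robin is just one possible load algorithm, and none of the names come from Xxl-job or Elastic-job.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

class SchedulerExecutorSketch {
    /** Hypothetical in-memory registry; a real deployment would use e.g. Zookeeper. */
    static final class Registry {
        private final Map<String, List<String>> executorsByJob = new ConcurrentHashMap<>();

        /** Called by an executor process at startup. */
        void register(String jobName, String executorAddress) {
            executorsByJob
                    .computeIfAbsent(jobName, k -> new CopyOnWriteArrayList<>())
                    .add(executorAddress);
        }

        List<String> lookup(String jobName) {
            return executorsByJob.getOrDefault(jobName, Collections.<String>emptyList());
        }
    }

    /** Hypothetical scheduler side: picks an executor, here by simple round-robin. */
    static final class Scheduler {
        private final Registry registry;
        private final AtomicInteger counter = new AtomicInteger();

        Scheduler(Registry registry) {
            this.registry = registry;
        }

        String pickExecutor(String jobName) {
            List<String> candidates = registry.lookup(jobName);
            if (candidates.isEmpty()) {
                throw new IllegalStateException("no executor registered for " + jobName);
            }
            int index = Math.floorMod(counter.getAndIncrement(), candidates.size());
            return candidates.get(index); // the remote call to this address would go here
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register("reportJob", "10.0.0.1:9999");
        registry.register("reportJob", "10.0.0.2:9999");
        Scheduler scheduler = new Scheduler(registry);
        for (int i = 0; i < 3; i++) {
            System.out.println(scheduler.pickExecutor("reportJob"));
        }
    }
}
```

Keeping the selection logic on the scheduler side is what lets business code run in separate executor processes without affecting the scheduler.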

Choose the right timer

The tools fall into two categories: in-process timers and distributed schedulers.

In-process timers: Timer, ScheduledExecutorService, Spring Scheduler, HashedWheelTimer (time wheel)

Distributed schedulers: Quartz, Xxl-job/Elastic-job

So the first step is to decide, based on your requirements, whether an in-process timer is enough or distributed scheduling is needed.

Second, among the in-process timers, Timer can be ruled out in favor of ScheduledExecutorService. If the system already uses Spring, Spring Scheduler is the natural choice.

Next, compare ScheduledExecutorService and HashedWheelTimer. ScheduledExecutorService uses a DelayedWorkQueue internally, and adding and removing tasks becomes slower as the number of tasks grows. HashedWheelTimer is not affected by the number of tasks, so when there are many short tasks, such as heartbeats, HashedWheelTimer is the best choice. HashedWheelTimer is single-threaded, however, so tasks that run for a long time hurt its accuracy. When there are few tasks and they take longer to run, ScheduledExecutorService is the better choice because it can use multiple threads.

Finally, among the distributed schedulers, choose Quartz only if your requirements for distributed scheduling are modest; otherwise choose Xxl-job or Elastic-job.

