2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article introduces the basics of APScheduler, a scheduled-task framework for Python. The coverage is fairly detailed, and we hope it serves as a useful reference.
I. What is the scheduled-task framework APScheduler?
In real-world development we often run into repetitive or periodic tasks, such as crawling a website's data at a fixed time every day or running model-training code at a fixed time. Such tasks usually have to be set up or scheduled by hand so that they run at the specified time.
On Windows this can be done by hand with the system's scheduled tasks, while on Linux systems crontab is the more common tool. Manual management does not scale well, though: with a dozen or more different scheduled tasks to look after, intervening by hand each time is clumsy, and it is "artificial intelligence" only in the sense of manual labor.
Scheduling these tasks in code frees us from purely manual management. The Python ecosystem offers several options for scheduled tasks:
1. schedule: a third-party module suited to lightweight scheduling, but not to complex time rules.
2. APScheduler: a third-party scheduled-task framework modeled on Quartz, the Java scheduling framework. It handles more complex scenarios than schedule, and its components are modular, which makes it easy to use and to extend.
3. Celery Beat: the scheduled-task component of Celery, a third-party distributed task queue library. It needs a message broker such as RabbitMQ or Redis, so setting up the environment takes some time, and newer Celery versions no longer support Windows.
If you need to express relatively complex time conditions without spending much time on environment setup, APScheduler is a cost-effective choice for managing scheduled tasks.
II. Concepts and components of APScheduler
The framework itself is simple. Before using APScheduler, you need a basic understanding of a few of its concepts: triggers, job stores (task persistence), executors, and schedulers.
(1) Triggers (trigger)
A trigger is the component that decides when a scheduled task fires. In APScheduler, triggers are mainly time-based, and three main types are available:
Date: date trigger. It runs a task once at a specific date and time, and it is the simplest trigger in APScheduler. It is usually used for one-off tasks or jobs.
Interval: interval trigger. Building on the date trigger, it adds interval settings in weeks, days, hours, minutes, and seconds. It is the usual trigger for repetitive tasks: once the interval is set, the task runs at that interval, counted from the start date.
Cron: cron expression trigger. It works like crontab on Linux and is mainly used for more complex date and time rules. Note, however, that APScheduler accepts only 5-field cron expressions; expressions with 6 or more fields are not supported.
(2) Task persistence (job stores)
APScheduler provides several ways to persist tasks. By default tasks are kept in memory, but memory is not a good way to persist them, because they are lost when the process exits. A better approach is to write scheduled tasks to disk through a database, so that they can be recovered as long as the disk is not damaged. The databases commonly used with APScheduler are:
SQLAlchemy: any database reachable through SQLAlchemy, which mainly means traditional relational databases such as MySQL, PostgreSQL, and SQLite.
MongoDB: a document database, often used to store or manipulate unstructured or semi-structured data such as JSON.
Redis: an in-memory database usually used as a data cache. Its data can still be persisted or preserved, for example through master-slave replication.
Typically, you configure the job store when you create the Scheduler instance, or specify one separately for an individual task. Configuration is fairly simple: you only need to supply the corresponding database connection URL.
(3) Executors (executor)
As the name implies, an executor is the object that actually runs a task. In practice a task runs either in a thread or in a separate process, so the executors in APScheduler are usually a thread pool (ThreadPoolExecutor) or a process pool (ProcessPoolExecutor). For coroutine-based or asynchronous task scheduling, you can use the corresponding AsyncIOExecutor, TwistedExecutor, or GeventExecutor instead.
(4) Schedulers (scheduler)
The choice of scheduler depends mainly on your program's environment and on what you use APScheduler for. APScheduler provides the following schedulers:
BlockingScheduler: blocking scheduler, used when the scheduler is the only thing running in the main process.
BackgroundScheduler: background scheduler, used when none of the framework-specific schedulers below applies and you want the scheduler to run in the background of your application, for example alongside a Django or Flask service that is already running.
AsyncIOScheduler: AsyncIO scheduler, used when the code runs asynchronously through the asyncio module.
GeventScheduler: Gevent scheduler, used when the code runs coroutines through the gevent module.
TornadoScheduler: Tornado scheduler, used in the Tornado framework.
TwistedScheduler: Twisted scheduler, used in Twisted-based frameworks or applications.
QtScheduler: Qt scheduler, used when building Qt applications.
In general, if APScheduler does not share a process with a web project or another application, BlockingScheduler is the usual choice: it runs in the current process and blocks while it schedules and processes tasks. If it does run alongside a web project or application, choose BackgroundScheduler, because it works in a background thread and does not interfere with the application's own threads or processes. With these concepts and components in mind, the basic workflow of APScheduler is:
1. Create a scheduler to coordinate the scheduling of all tasks.
2. Attach the appropriate trigger to the target function or method and add it to the scheduler.
3. If tasks need to be persisted, configure the corresponding job store; otherwise tasks are kept in memory by default.
4. When the trigger fires, the task is handed over to the executor for execution.
That concludes this introduction to the basics of the Python scheduled-task framework APScheduler. We hope it has been helpful.