periodtask

Scheduled tasks with timezones and configurable e-mail alerts.

Features

  • cron-like expressions to define a schedule
  • Timezone in cron expressions
  • Multiple cron schedules for each task
  • Configurable e-mail alerts
  • Different policies (SKIP, DELAY, RUN) for handling a scheduled run when a previous instance of the task is still running

Topics

Quick reference

#!/usr/bin/env python3

import logging
import os
import signal

from periodtask import TaskList, Task, SKIP
from periodtask.mailsender import MailSender


logging.basicConfig(level=logging.DEBUG)

# we send STDOUT and STDERR to these loggers
stdout_logger = logging.getLogger('periodtask.stdout')
stderr_logger = logging.getLogger('periodtask.stderr')


# this function will be called with (subject, message, html_message),
# but MailSender needs to know more to send
send_success = MailSender(
    os.environ.get('EMAIL_HOST'),  # the SMTP server host
    int(os.environ.get('EMAIL_PORT')),  # the SMTP port
    os.environ.get('SUCCESS_EMAIL_FROM'),  # the sender
    os.environ.get('SUCCESS_EMAIL_RECIPIENT'),  # the list of recipients
    timeout=10,  # connection timeout in seconds
    use_ssl=False,  # ... and some SMTP specific parameters
    use_tls=False,
    username=None,
    password=None
).send_mail


tasks = TaskList(
    Task(
        'lister',  # name of the task
        ('ls', '-hal'),  # the command to run; see Popen
        # a list of cron-like expressions:
        # sec min hour day month year timezone
        # defaults: 0 */5 * * * * UTC
        ['0 * * * * * UTC'],  # every minute at second 0, UTC
        mail_success=send_success,  # e-mail sending function or None
        mail_failure=None, mail_skipped=None, mail_delayed=None,
        wait_timeout=5,  # killing after 5 seconds if the process still runs
        stop_signal=signal.SIGTERM,  # after sending this signal
        max_lines=10,  # length of STDOUT, STDERR head and tail (None for all)
        run_on_start=True,  # we may want to run the task on startup
        policy=SKIP,  # we skip the schedule if still running
        template_dir='/tmp',  # e-mail template dirs (list or string)
        stdout_logger=stdout_logger,
        stdout_level=logging.DEBUG,  # send STDOUT logs to this level
        stderr_logger=stderr_logger,
        stderr_level=logging.WARNING,  # send STDERR logs to this level
        cwd=None,  # run the command in this directory (None to keep current)
        skip_delayed_email_threshold=5  # at most 5 consecutive skipped/delayed e-mails
    ),
    Task(  # you can specify more than one task
        'catter', ('cat', 'README.rst'), '5 20,40 7-19 MON-FRI *',
        run_on_start=True
    )
)

tasks.start()  # blocking... the process will exit on SIGINT and SIGTERM
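
The mail_… parameters accept any function that takes (subject, message, html_message). As a minimal sketch, assuming you would rather record alerts in the log than send them over SMTP, a function like the one below could be passed instead of MailSender(...).send_mail. The names log_mail and periodtask.mail are hypothetical and not part of the library.

```python
import logging

# Hypothetical logger name, chosen only for this illustration.
mail_logger = logging.getLogger('periodtask.mail')


def log_mail(subject, message, html_message=None):
    """Write the alert to the log instead of sending an e-mail.

    The signature follows the one documented for mail sending functions;
    html_message is accepted but ignored here.
    """
    mail_logger.info('%s\n%s', subject, message)
```

It could then be used as mail_failure=log_mail in a Task definition.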

Cron format reference

The cron expression is made up of 7 parts separated by one or more spaces. If an expression has fewer parts, the missing ones are substituted with defaults. The parts represent

seconds minutes hours days months years timezone

respectively. The default is 0 */5 * * * * UTC.

A point in time (total seconds since the epoch) is first converted to a timestamp in the given timezone, for example 2018-08-08 16:57:30 WED. This timestamp matches the cron expression if the second (30 in the example) matches the seconds part, the minute (57) matches the minutes part, and so on.
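
The conversion step can be sketched with the standard library. This is only an illustration, not the library's implementation: the helper name timestamp_fields is hypothetical, and UTC is used so that the snippet needs no timezone database. A named zone could be passed as zoneinfo.ZoneInfo('Europe/Budapest') on Python 3.9 or later.

```python
from datetime import datetime, timezone

WEEKDAYS = ('MON', 'TUE', 'WED', 'THU', 'FRI', 'SAT', 'SUN')


def timestamp_fields(epoch_seconds, tz=timezone.utc):
    """Split seconds since the epoch into the fields a cron part is matched against."""
    moment = datetime.fromtimestamp(epoch_seconds, tz)
    return {
        'second': moment.second,
        'minute': moment.minute,
        'hour': moment.hour,
        'day': moment.day,
        'month': moment.month,
        'year': moment.year,
        'weekday': WEEKDAYS[moment.weekday()],  # Monday is index 0
    }


# Build an epoch value for an instant that falls on a Wednesday, then split it.
epoch = int(datetime(2018, 8, 8, 16, 57, 30, tzinfo=timezone.utc).timestamp())
fields = timestamp_fields(epoch)
```

Each field is then compared with the corresponding part of the expression.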

Each part is made up of atoms separated by a comma. The part matches if any of the atoms matches.

An atom can be:

*
Any possible value matches.
an integer (12)
The given value matches.
a range (12-23)
Any value in the range (inclusive) matches. Range lower or upper bounds can be empty: -12, 2- or even -. (Note that * is equivalent to -.)
a range (or *) with step (2-10/3)
In case of 2-10/3 the following values match: 2, 5, 8.
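
The numeric atom rules above can be sketched as follows. This is an illustrative matcher written from the description, not the library's own parser. The function names and the lo/hi arguments (the smallest and largest value a part can take, for example 0 and 59 for seconds) are assumptions.

```python
def atom_matches(atom, value, lo, hi):
    """Check one numeric atom: *, an integer, a range, or any of these with /step."""
    rng, _, step_text = atom.partition('/')
    step = int(step_text) if step_text else 1
    if rng == '*':
        start, end = lo, hi
    elif '-' in rng:
        lower, _, upper = rng.partition('-')
        start = int(lower) if lower else lo  # empty lower bound: start of the part
        end = int(upper) if upper else hi    # empty upper bound: end of the part
    else:
        start = end = int(rng)
    # Steps are counted from the lower bound of the range.
    return start <= value <= end and (value - start) % step == 0


def part_matches(part, value, lo, hi):
    """A part matches if any of its comma-separated atoms matches."""
    return any(atom_matches(atom, value, lo, hi) for atom in part.split(','))
```

For example, part_matches('20,40', 40, 0, 59) is true, because the second atom matches.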

For the days part, the day of month is normally taken into account. If the atom is given in a ‘day of week’ format, the weekday (WED in the example) and the day of month are both considered. A day of week atom can be:

a weekday in upper or lowercase (mon)
Valid values are MON, TUE, WED, THU, FRI, SAT, SUN (and the lowercase variants)
a range of weekdays (MON-FRI)
To construct a valid range it is important to know that MON is considered the first day of the week while SUN is the last one.
a range (or a single value) with a ‘first-last clause’ (sun/L)

L means ‘7 days from now is not in the current month’, LL means ‘14 days from now is not in the current month’, etc.

F means ‘7 days before now was not in the current month’, FF means ‘14 days before now was not in the current month’, etc.

Given the above rule, sun/L means ‘last Sunday in the month’.
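
The L and F rules translate directly into date arithmetic. The sketch below follows the wording above and is not taken from the library. The function names are hypothetical, and count stands for the number of L or F letters.

```python
from datetime import date, timedelta


def in_last_weeks(day, count=1):
    """L rule: 7 * count days after `day` is no longer in the same month."""
    later = day + timedelta(days=7 * count)
    return (later.year, later.month) != (day.year, day.month)


def in_first_weeks(day, count=1):
    """F rule: 7 * count days before `day` was not yet in the same month."""
    earlier = day - timedelta(days=7 * count)
    return (earlier.year, earlier.month) != (day.year, day.month)


def is_last_sunday(day):
    """sun/L: the day is a Sunday and the L rule holds."""
    return day.weekday() == 6 and in_last_weeks(day)
```

In August 2018 the Sundays fall on the 5th, 12th, 19th and 26th, so only the 26th satisfies is_last_sunday.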

Examples

  • Last Sunday of each month at 21:00 UTC: 0 0 21 sun/L (note that ‘each month’, ‘each year’ and ‘UTC’ are defaults)
  • Every weekday at 18:15 in Budapest time: 0 15 18 mon-fri * * Europe/Budapest

API

class periodtask.TaskList(*args)

Defines the tasks to run and starts the scheduler. Pass in Task instances to schedule.

start()

Start the scheduler. This will block until SIGTERM or SIGINT is received.

class periodtask.Task(name, command, periods='', run_on_start=False, mail_success=None, mail_failure=None, mail_skipped=None, mail_delayed=None, send_mail_func=None, wait_timeout=10, max_lines=50, stop_signal=<Signals.SIGTERM: 15>, policy=0, template_dir=[], stdout_logger=<Logger periodtask.stdout (WARNING)>, stdout_level=20, stderr_logger=<Logger periodtask.stderr (WARNING)>, stderr_level=20, cwd=None, skip_delayed_email_threshold=5, failure_email_threshold=5)

Represents a task to schedule.

Parameters:
  • name (str) – The name of the task; it will appear in logs and emails.
  • command (tuple) – See the args param of the Popen constructor.
  • periods (list/str) – A cron expression (str) or a list of them. See Cron format reference for more information. By default (when set to an empty string) this is equivalent to 0 */5 * * * * UTC.
  • run_on_start (bool) – Indicates whether the task should run when the scheduler starts, no matter what was given in periods. Useful for manually testing the task.
  • mail_success (func/bool) –

    If set to a truthy value, an email will be sent after the task runs successfully. If this is a function, it will be used to send out the email (unless send_mail_func overrides it). The signature of the function is

    def send_mail(subject, message, html_message=None)
    
  • mail_failure (func/bool) – Controls emails sent when the task fails. Otherwise it is the same as mail_success.
  • mail_skipped (func/bool) – Controls emails sent when the task is skipped due to the defined policy.
  • mail_delayed (func/bool) – Controls emails sent when the task is delayed due to the defined policy.
  • send_mail_func (func) – If set, this must be a function. This function will be used to send emails, no matter what was set in mail_… params.
  • wait_timeout (number) – After sending stop_signal to the task process, we wait this many seconds for the process to stop. If the timeout expires, we kill the process.
  • max_lines (int/tuple) –

    STDOUT and STDERR are collected from the task process. To avoid heavy memory usage, only this many lines are kept in memory. More precisely, the STDOUT head and tail and the STDERR head and tail are lists of lines, and this parameter controls the maximum length of these lists.

    Examples:

    max_lines value     stdout head  stdout tail  stderr head  stderr tail
    2                   2            2            2            2
    (2, 3)              2            2            3            3
    (10, (2, 3))        10           10           2            3
    ((1, 2), (3, 4))    1            2            3            4
  • stop_signal (int) – This signal will be sent to the task process when we want to stop it gracefully.
  • policy (int) –

    Available values are periodtask.SKIP, periodtask.DELAY and periodtask.RUN.

    SKIP
    If a process is (still) running and the task is scheduled, this new process will be skipped. If requested, an email will be sent.
    DELAY
    If a process is (still) running and the task is scheduled, this new process will be delayed and will run immediately when the current process terminates. If requested, an email will be sent.
    RUN
    Tasks will always run when scheduled.
  • template_dir (list/str) – Directories to look for email templates in.
  • stdout_logger (logging.Logger) – The logger to use for the STDOUT of the task process.
  • stdout_level (int) – The STDOUT of the task process will be logged to this level.
  • stderr_logger (logging.Logger) – The logger to use for the STDERR of the task process.
  • stderr_level (int) – The STDERR of the task process will be logged to this level.
  • cwd (str) –

    The task process will run with cwd as the working directory. See the Popen constructor.

  • skip_delayed_email_threshold (int/None) – During a prolonged outage, instead of sending SKIP or DELAYED emails forever, only this many consecutive emails will be sent. When the send queue becomes empty, a NO BLOCK email will be sent. None means no threshold.
  • failure_email_threshold (int/None) – When a task fails more than this many times in a row, no new FAILURE email will be sent. When the task runs successfully again, a RECOVER email will be sent. None means no threshold.
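
The max_lines examples above suggest a simple expansion rule: a single value applies everywhere, the first level of a tuple separates STDOUT from STDERR, and the second level separates head from tail. The sketch below is one way to express this, derived from the table rather than from the library's source. The function name normalize_max_lines is hypothetical.

```python
def normalize_max_lines(value):
    """Expand max_lines into ((stdout_head, stdout_tail), (stderr_head, stderr_tail))."""
    def pair(item):
        # A tuple (or list) is taken as is; a single value is used for both slots.
        return tuple(item) if isinstance(item, (tuple, list)) else (item, item)

    stdout_limit, stderr_limit = pair(value)
    return pair(stdout_limit), pair(stderr_limit)
```

With this rule, normalize_max_lines((10, (2, 3))) gives ((10, 10), (2, 3)), which agrees with the third row of the examples.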

Release Notes

0.8.0

  • Support Python 3.8.10

0.7.0

  • The former email_limitation parameter is replaced by email thresholds. This is a breaking change!

0.6.0

  • Improved handling of max_lines.
  • Added task param failure_email_limitation.

0.5.5

  • Improved wording of builtin e-mail templates.
  • Added the email_limitation parameter to Task.

0.5.3

  • Bugfix: mail_success, mail_failure, mail_skipped, mail_delayed parameters of Task were not handled correctly.

0.5.2

  • template_dir given to Task extends the default template dir, so templates can be overridden individually.