Basic Luigi Python Tasks

By using Luigi's generic Task class (as opposed to a subclass customized for a specific purpose), you can integrate any Python code that you need into your Luigi pipeline.

luigi.Task

Use a Python script to perform data checks, transformations, or any iterative process you may need.

Example Usage

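import luigi
from luigi.contrib.s3 import S3Target  # older Luigi releases expose this as luigi.s3

# OtherTask is assumed to be another luigi.Task defined elsewhere in your pipeline.
class ExamplePythonTask(luigi.Task):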

    def requires(self):
        """
        Which other Tasks need to be complete before
        this Task can start?
        """
        return [OtherTask()]

    def output(self):
        """
        Where will this Task produce output?
        """
        return S3Target('s3://my-output-bucket/my-example-tasks-output')

    def run(self):
        """
        How do I run this Task?
        """
        # We can do anything we want in here, from calling python
        # methods to running shell scripts to calling APIs

The template Task above implements the three methods that define a typical Luigi Task:

  • requires tells Luigi which Task(s) must complete before this Task can run.
  • output tells Luigi where to check for output data or tokens that indicate the Task has completed.
  • run tells Luigi what the Task actually entails. Any Python code in the run method will be executed once the Task's dependencies are satisfied, so long as the Task's output doesn't already exist.
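The way these three methods interact can be sketched without Luigi itself. The following is a simplified, hypothetical model (SketchTask, DownstreamTask, and build are illustrative names, not part of Luigi's API) that assumes a Task counts as complete when its output file exists. It is meant only to show the ordering described above, not how Luigi's scheduler is actually implemented.

```python
import os
import tempfile


class SketchTask:
    """Hypothetical stand-in for luigi.Task; not Luigi's real implementation."""

    def __init__(self, path):
        self.path = path
        self.run_count = 0

    def requires(self):
        # No upstream dependencies by default.
        return []

    def output(self):
        # In this sketch the output is simply a local file path.
        return self.path

    def complete(self):
        # Assumption: a task is complete when its output already exists.
        return os.path.exists(self.output())

    def run(self):
        self.run_count += 1
        with open(self.output(), "w") as handle:
            handle.write("done\n")


class DownstreamTask(SketchTask):
    """A task that depends on one upstream task."""

    def __init__(self, path, upstream):
        super().__init__(path)
        self.upstream = upstream

    def requires(self):
        return [self.upstream]


def build(task):
    """Run dependencies first, then the task, skipping anything already complete."""
    if task.complete():
        return
    for dependency in task.requires():
        build(dependency)
    task.run()


workdir = tempfile.mkdtemp()
upstream = SketchTask(os.path.join(workdir, "upstream.txt"))
downstream = DownstreamTask(os.path.join(workdir, "downstream.txt"), upstream)

build(downstream)  # runs upstream first, then downstream
build(downstream)  # both outputs exist now, so nothing runs again
```

Calling build a second time leaves both run counters unchanged, which mirrors why re-running a pipeline skips Tasks whose output is already in place.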

Example in Context

Mortar's Luigi tutorial includes a Task that runs a custom Python script to perform a quick sanity test on output data.
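As a rough illustration of the kind of check a run method could call, here is a minimal sketch of a sanity test. It is not the tutorial's actual script: the function name sanity_check, the tab delimiter, and the sample rows are all assumptions made for this example. Because an exception raised inside run causes Luigi to mark the Task as failed, raising ValueError is one plausible way to stop the pipeline on bad data.

```python
def sanity_check(lines, expected_fields, delimiter="\t"):
    """Return the number of rows, or raise ValueError if the data looks wrong.

    Hypothetical helper: checks that the data is non-empty and that every
    row has the expected number of delimited fields.
    """
    row_count = 0
    for line_number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(delimiter)
        if len(fields) != expected_fields:
            raise ValueError(
                "line %d has %d fields, expected %d"
                % (line_number, len(fields), expected_fields)
            )
        row_count += 1
    if row_count == 0:
        raise ValueError("output is empty")
    return row_count


# Made-up sample data standing in for a Task's output.
sample_rows = ["alice\t3\n", "bob\t5\n"]
checked = sanity_check(sample_rows, expected_fields=2)
```

Inside a real Task, the lines would come from reading the upstream Task's output target rather than from an in-memory list.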