Use a Python script to perform data checks, transformations, or any iterative process you may need.
    import luigi
    from luigi.contrib.s3 import S3Target


    class ExamplePythonTask(luigi.Task):

        def requires(self):
            """ Which other Tasks need to be complete before this Task can start? """
            # OtherTask is a placeholder for a Task defined elsewhere
            return [OtherTask()]

        def output(self):
            """ Where will this Task produce output? """
            return S3Target('s3://my-output-bucket/my-example-tasks-output')

        def run(self):
            """ How do I run this Task? """
            # We can do anything we want in here, from calling python
            # methods to running shell scripts to calling APIs
The template Task above includes the only three methods required of a Luigi Task:
requires tells Luigi which Task(s) must complete before this Task can run.
output tells Luigi where to check for output data or tokens that indicate the Task has completed.
run tells Luigi what the Task actually entails. Any Python code in the run method will be executed once the Task's dependencies are satisfied, so long as the Task's output doesn't already exist.
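To make the interplay of the three methods concrete, here is a minimal sketch that mimics the contract using only the standard library. It is not Luigi's implementation: SketchTask, Extract, Total and build are hypothetical names invented for this illustration, local files stand in for S3 targets, and Luigi's real scheduler handles much more (workers, retries, parameters). The sketch only shows the ordering rule described above: dependencies run first, and any Task whose output already exists is skipped.

```python
import os
import tempfile


class SketchTask:
    """Hypothetical stand-in for luigi.Task, for illustration only."""

    def __init__(self, workdir):
        self.workdir = workdir

    def requires(self):
        return []  # no upstream Tasks unless a subclass says otherwise

    def output(self):
        raise NotImplementedError

    def run(self):
        raise NotImplementedError

    def complete(self):
        # A Task counts as done when its output already exists.
        return os.path.exists(self.output())


class Extract(SketchTask):
    def output(self):
        return os.path.join(self.workdir, "extract.txt")

    def run(self):
        with open(self.output(), "w") as f:
            f.write("1\n2\n3\n")


class Total(SketchTask):
    def requires(self):
        return [Extract(self.workdir)]

    def output(self):
        return os.path.join(self.workdir, "total.txt")

    def run(self):
        # Read the upstream Task's output and write the sum of its lines.
        with open(self.requires()[0].output()) as f:
            total = sum(int(line) for line in f)
        with open(self.output(), "w") as f:
            f.write(str(total))


def build(task, ran):
    """Run dependencies first, then the task; skip anything already complete."""
    if task.complete():
        return
    for dep in task.requires():
        build(dep, ran)
    task.run()
    ran.append(type(task).__name__)


workdir = tempfile.mkdtemp()
first_pass, second_pass = [], []
build(Total(workdir), first_pass)   # nothing exists yet, so both Tasks run
build(Total(workdir), second_pass)  # output exists, so nothing runs
print(first_pass)   # ['Extract', 'Total']
print(second_pass)  # []
```

The second call does nothing because total.txt is already present, which is the same reason re-running a finished Luigi pipeline is cheap: completed Tasks are detected through their outputs rather than through any record of past runs.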