Luigi is a powerful, versatile data-pipeline framework developed and open-sourced by Spotify. With Luigi, Mortar users can build and manage complex data pipelines that combine many Mortar jobs with other data-processing steps. Luigi is extremely flexible: anything that can be written in Python can be a Luigi Task.
We've created a quick Luigi tutorial to walk you through the basics of building a data pipeline on Mortar. The tutorial demonstrates how Luigi Tasks, the basic units that make up a Luigi pipeline, work and how they connect when one Task depends on another.
The official Luigi documentation has an overview of the Luigi API and the core components of Luigi Tasks, as well as a detailed list of the contents of the Luigi package. (Note that some actions on Mortar, such as running a Hadoop job, are better managed using our Mortar-specific Tasks than with generic Luigi Tasks.)
Erik Bernhardsson of Spotify, one of the principal authors of Luigi, gave a great talk at PyData NYC 2013 (embedded below). Erik's talk nicely sums up what Luigi does and how it has made complex data pipelines at Spotify much more stable and scalable.
The Luigi user group mailing list is a good place to read up on common issues and keep tabs on upcoming features.