The first step in running outside of the Mortar platform is to export a copy of your code. To export your Mortar Project code, follow the steps at Exporting Your Mortar Projects. To export your Web Project code, follow the steps at Exporting Your Web Projects.
To run your Pig scripts, you'll need to start a Hadoop cluster. The easiest way to do that is to use Amazon's Elastic MapReduce service.
You can start an EMR cluster in a number of ways, including from the AWS Management Console, the AWS CLI, or any of the AWS SDKs. See the AWS EMR documentation for more information.
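As one illustration of the SDK route, here is a minimal sketch using boto3's EMR client. The function names `build_emr_request` and `launch_cluster` are hypothetical helpers, and the cluster name, release label, instance types, and bootstrap script paths are all values you must supply yourself; none of them are Mortar's settings. The role names are the AWS default EMR roles and are assumed to already exist in your account.

```python
def build_emr_request(name, release_label, master_type, core_type,
                      instance_count, bootstrap_scripts=()):
    """Build the keyword arguments for boto3's EMR run_job_flow call.

    All values are caller-supplied; this sketch makes no claim about
    which settings Mortar itself used.
    """
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        "Instances": {
            "MasterInstanceType": master_type,
            "SlaveInstanceType": core_type,
            "InstanceCount": instance_count,
            # Keep the cluster running so a separate Luigi/Pig server can use it.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # One bootstrap action per script path (for example, scripts you
        # have uploaded to your own S3 bucket).
        "BootstrapActions": [
            {
                "Name": "bootstrap-%d" % i,
                "ScriptBootstrapAction": {"Path": path, "Args": []},
            }
            for i, path in enumerate(bootstrap_scripts, start=1)
        ],
        # Assumption: the default EMR roles exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
        "VisibleToAllUsers": True,
    }


def launch_cluster(request, region_name):
    """Start the cluster and return its job flow (cluster) ID."""
    import boto3  # imported here so the request can be built without the SDK

    emr = boto3.client("emr", region_name=region_name)
    response = emr.run_job_flow(**request)
    return response["JobFlowId"]
```

Keeping the request builder separate from the launch call lets you inspect or log the settings before any AWS resources are created.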
To mimic the EMR clusters that Mortar launches, you can use the following settings:
Mortar does not install any applications (Pig, Luigi, etc.) onto its EMR clusters; instead, it runs them from a separate Luigi/Pig server (see next section).
Mortar also runs several EMR bootstrap actions to set up Python properly on EMR cluster instances. We have open-sourced these at https://github.com/mortardata/mortar-luigi-example/tree/master/emr-bootstrap-actions. The bootstrap scripts are:
To run your code, you'll need to set up a server with Luigi and Pig installed. The recommended packages and versions are:
Other optional system packages you may want to install (if you use them) are:
Optional Python packages are:
While MortarProjectPigscriptTask ran your Pig job through the Mortar service, PigTask runs your Pig job on your Luigi/Pig server. To ensure your Pig jobs continue to work as they did with Mortar, we have created a version of PigTask that runs your Pig job with the same libraries and settings previously used by Mortar.
The first class, MortarStylePigTask, is a common base class you can extend for all of your specific PigTasks. It sets the Pig properties, parameters, and libraries that were used by Mortar.
The second class, ExcitePigTask, is an example task class that shows you how to run a Pig script from your Mortar project.
You can copy the MortarStylePigTask class directly into your existing Luigi script and then replace each MortarProjectPigscriptTask with a class similar to ExcitePigTask. For each of these tasks you will need to:
When running your Luigi script, you will also need to set the Luigi parameter "mortar-project-root" to the absolute path of the root of your Mortar project, which must be checked out somewhere on your Luigi/Pig server.
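To show how the project root might be used, here is a minimal sketch of a helper that turns the "mortar-project-root" value into the full path of a Pig script. The function name `pig_script_path_for`, the `pigscripts` directory, the `.pig` extension, and the script name `excite` are assumptions made for illustration; adjust them to match your project's actual layout.

```python
import os


def pig_script_path_for(mortar_project_root, script_name):
    """Return the path to a Pig script inside a checked-out Mortar project.

    Assumes scripts live at <root>/pigscripts/<script_name>.pig.
    """
    # A relative root would resolve differently depending on where Luigi
    # is started, so reject it early with a clear message.
    if not os.path.isabs(mortar_project_root):
        raise ValueError(
            "mortar-project-root must be an absolute path, got %r"
            % (mortar_project_root,)
        )
    return os.path.join(mortar_project_root, "pigscripts", script_name + ".pig")
```

A task class could call a helper like this from the method that tells Luigi where its Pig script lives, passing in the value of its project root parameter.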