Mortar Projects make it easy to develop a large Hadoop project with a team. Mortar Projects give you:
Pig and Hadoop on Your Computer: When you create a Mortar Project, you get a local installation of Pig and Hadoop that is ready to use, without needing to install anything yourself. That means faster development and better testing.
Version Control and Code Sharing: Mortar Projects are backed by source control, either through Mortar or your own system, so you can collaborate with team members on a project.
1-Button Deployment: When you're ready to run your project on a Hadoop cluster, a single command is all that's needed to deploy and run in the cloud.
Mortar Projects come in two flavors: Public and Private. Public Mortar Projects can be viewed and forked by anyone. Private Mortar Projects are accessible only to users in your Mortar account. When creating a new project, you will need to decide whether to share the code with the world or keep it within your organization.
Mortar has an example project that contains sample code for different styles of analysis. We'll use Mortar to create your own copy of this project.
Mortar project names share one global namespace, so you'll need to pick a unique name. For this tutorial, you can prepend your Mortar handle to "mortar-examples" to generate a unique name.
mortar projects:fork email@example.com:mortardata/mortar-examples.git <your-handle>-mortar-examples --public
cd <your-handle>-mortar-examples
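One way to build the unique project name described above is to derive it from your handle in the shell. This is only a sketch: the handle value alice is a placeholder, and the variable names are illustrative rather than anything Mortar requires.

```shell
# Placeholder handle (an assumption); replace it with your own Mortar handle.
handle="alice"

# Prepend the handle to "mortar-examples" to get a name unlikely to collide
# with other projects in the global namespace.
project="${handle}-mortar-examples"

echo "$project"
```

You could then pass "$project" in place of <your-handle>-mortar-examples in the fork command and in the cd that follows it.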
To take a quick look at how to run Mortar Projects, let's use the coffee_tweets example.
Open your project in your favorite dev environment and look at the coffee_tweets.pig script. This script calculates the percentage of tweets in each state that indicate "coffee snobbery."
The best way to understand what this script is doing is to run an illustrate on it. This will let us see the data flowing through every alias to figure out what is changing at each step.
Rather than send our code out to the cloud and wait for the response to come back, we can get a much faster result by using Mortar's local mode. This uses Pig and Hadoop installed locally to perform an illustrate, and thus can run very quickly.
Run a local illustrate.
# Uses read-only example AWS keys - use your own keys for your data
export AWS_ACCESS_KEY="AKIAJ54D5RAJFAYAEFZQ"
export AWS_SECRET_KEY="frNw2FM1UqE1VmTRe8TZ7AloIpLeugdRCBW74pJX"
mortar local:illustrate pigscripts/coffee_tweets.pig
The first time you run a local command, Mortar will download and install dependencies. Once that's done for a project, you should see very fast illustrate results. Run the command again to see a speedy illustrate.
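Because the local illustrate reads its AWS keys from the environment, it can help to confirm that both variables are set before running it. The helper below is a minimal sketch; check_aws_keys is a hypothetical shell function, not part of the Mortar CLI.

```shell
# check_aws_keys: hypothetical helper (not a Mortar command).
# Returns 0 if both AWS variables used in this tutorial are set and non-empty,
# otherwise prints the first missing name to stderr and returns 1.
check_aws_keys() {
  for var in AWS_ACCESS_KEY AWS_SECRET_KEY; do
    # Indirect lookup of the variable named in $var (works in POSIX sh).
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $var" >&2
      return 1
    fi
  done
  return 0
}

# Possible usage, once the keys have been exported:
#   check_aws_keys && mortar local:illustrate pigscripts/coffee_tweets.pig
```

The check only confirms that the variables are present; it does not verify that the keys are valid.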
Sometimes when developing, illustrate doesn't give you enough feedback about how your script is working. In these cases, you can try running your code against a small subset of your data to see how it works. To avoid the time and cost of running your job on a remote Hadoop cluster, you can use Mortar's local mode.
mortar local:run pigscripts/coffee_tweets.pig -f params/coffee_tweets/local.small.params
With the -f option, we pass in a parameter file that loads and stores the tweet data on our machine.
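A parameter file for local runs might look something like the sketch below. It assumes the file uses Pig's name=value parameter file format. The file name my.local.params and the parameter names INPUT_PATH and OUTPUT_PATH are hypothetical, so check the .params files that ship with the example project for the names the script actually expects.

```shell
# Create a hypothetical parameter file alongside the project's existing ones.
mkdir -p params/coffee_tweets
cat > params/coffee_tweets/my.local.params <<'EOF'
# Hypothetical parameter names; the real script may use different ones.
INPUT_PATH=data/tweets_small
OUTPUT_PATH=output/coffee_tweets
EOF

# Show what was written.
cat params/coffee_tweets/my.local.params
```

Keeping separate parameter files for local and cluster runs lets the same Pig script read from and write to different locations without being edited.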
Once we are happy with our script, we can deploy it to run on a cluster.
By default the Mortar example project uses AWS spot instances to save money. Running this example on a 2-node spot instance cluster for 1 hour should cost you approximately $0.28 in pass-through AWS costs. Before running this job you will need to add your credit card to your account. You can do that on our Billing Page.
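For a rough sense of how that figure scales, the sketch below derives a per-node-hour rate from the $0.28 estimate and multiplies it out. Linear scaling is an assumption made here for illustration; spot prices fluctuate, so treat the result as a ballpark rather than a quote. The function name estimate_cost is hypothetical.

```shell
# estimate_cost NODES HOURS
# Hypothetical helper: scales the tutorial's figure ($0.28 for 2 nodes for
# 1 hour) linearly with node count and run time. Linear scaling is an
# assumption; actual spot pricing varies.
estimate_cost() {
  LC_ALL=C awk -v nodes="$1" -v hours="$2" 'BEGIN {
    per_node_hour = 0.28 / 2
    printf "%.2f\n", per_node_hour * nodes * hours
  }'
}

# The tutorial's own case: 2 nodes for 1 hour.
estimate_cost 2 1
```

Under the same assumption, estimate_cost 4 3 would give the estimate for a 4-node cluster running for 3 hours.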
Once your credit card has been added, all you need to do is use the jobs:run command:
mortar jobs:run pigscripts/coffee_tweets.pig
As output of this command, you will be given a jobs:status command to run to see the job progress.
mortar jobs:status YOUR_JOB_ID_HERE --poll
You can also check on your job status by logging into the Mortar website and viewing the Jobs Page.
Once your job is done, you can visit the Job Details page to download a list of the states with the most coffee snobs per tweet-capita.
AT THIS POINT, YOU SHOULD BE ABLE TO:
Next, let's see how to start a new Mortar Project from scratch.