Connecting to Amazon S3

Using Amazon S3 with Mortar

Most Mortar users load and store their data in Amazon S3. S3 provides secure, durable, and inexpensive storage and is accessible through many different tools and programming languages. Additionally, Mortar's Elastic MapReduce (EMR) Hadoop clusters run in the same datacenters as S3, so moving data into and out of S3 is very fast.

How Mortar Connects to Amazon S3

Mortar connects to S3 via Amazon Web Services (AWS) Access Keys. These keys allow your Mortar Hadoop clusters to load input data from S3 and push output data back to S3.

S3 to Hadoop

New Mortar accounts are assigned Example Data AWS Access Keys. The Example Data keys can only load data from Mortar's example S3 buckets (mortar-example-data, twitter-gardenhose-mortar, etc.) and store it back to Mortar's example output bucket (mortar-example-data-output).
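As a rough illustration of that restriction, the sketch below checks whether an S3 URL falls inside the example buckets named above. The function names and sample paths are hypothetical, and the bucket set is incomplete because the list above ends in "etc."; this is not how Mortar itself enforces access.

```python
from urllib.parse import urlparse

# Buckets named in the text above. The original list ends with "etc.",
# so this set is knowingly incomplete.
EXAMPLE_INPUT_BUCKETS = {"mortar-example-data", "twitter-gardenhose-mortar"}
EXAMPLE_OUTPUT_BUCKET = "mortar-example-data-output"


def bucket_of(s3_url):
    """Return the bucket name from a URL such as s3://bucket/path/to/key."""
    return urlparse(s3_url).netloc


def example_keys_can_load(s3_url):
    """True if the URL points into one of the known example input buckets."""
    return bucket_of(s3_url) in EXAMPLE_INPUT_BUCKETS


def example_keys_can_store(s3_url):
    """True if the URL points into the example output bucket."""
    return bucket_of(s3_url) == EXAMPLE_OUTPUT_BUCKET


# Hypothetical paths, for illustration only.
print(example_keys_can_load("s3://mortar-example-data/some/input"))
print(example_keys_can_store("s3://my-own-bucket/results"))
```

Any path outside these buckets needs your own AWS Access Keys.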

If you want to load and store your own S3 data, you'll need to add your AWS Access Keys to your Mortar account.

Adding Your IAM AWS Access Keys to Mortar

To connect Mortar to your S3 data, you'll need to create AWS Access Keys with S3 permissions. It's easy to do this using the AWS Identity and Access Management (IAM) console.

Log in to your AWS IAM Console:

IAM Console

Choose the "Users" option from the left-hand sidebar, and click the "Create New Users" button:

Create New Users

Provide a descriptive user name, such as "mortar-iam-user", leave "Generate an access key for each User" checked, and click "Create":

Enter User Names

Open the "Show User Security Credentials" section, and write down the "Access Key ID" and "Secret Access Key" for your newly created user:

Show Security Credentials

While still on the Users tab in the IAM Console, click the checkbox next to the name of the user you created:

Click User

Choose the "Permissions" tab at the bottom of the screen, and click "Attach User Policy":

Attach User Policy

Scroll down inside "Select Policy Template" until you reach "Amazon S3 Full Access", and click "Select":

Manage User Permissions

Click "Apply Policy":

Set Permissions

Add the keys you copied down to your Mortar account on the Mortar AWS Settings page:

Mortar AWS Settings

Advanced: Custom IAM Policy

If you want more fine-grained control over your key permissions, you can use a custom IAM policy:

  • In the Users tab, select "mortar-iam-user" and click "Attach User Policy."
  • Choose "Custom Policy" from the list of options, and hit "Select" to open the policy editor.
  • Name your policy "Mortar-IAM-User-Policy".

Next, you need to write the Policy Document. It is written in Amazon's JSON IAM policy format.

You can start from a policy document that gives read access to the Mortar example datasets and write access to the Mortar example output buckets, and then adapt it as follows.

First, make sure that you keep the s3:ListAllMyBuckets stanza in your policy document. It provides a top-level list of the buckets available to the user, without allowing access to any of them. Most S3 libraries (and Mortar) require this permission to test whether your credentials are valid.

Add your input data buckets to the s3:ListBucket, s3:GetBucketLocation and s3:GetObject stanzas of the example document. These permissions allow your Mortar Hadoop clusters to find your bucket, list objects in the bucket, and read data from those objects.

Add your output data buckets to the s3:ListBucket, s3:GetBucketAcl, s3:GetBucketLocation stanza and the s3:GetObject, s3:GetObjectAcl, s3:DeleteObject, s3:PutObject stanza of the example document. These permissions allow your Mortar Hadoop clusters to find your bucket, list the bucket, get objects from it, delete objects from it, and put objects into it.

Additionally, the GetBucketAcl and GetObjectAcl output bucket permissions allow you to create S3-authenticated links to your output data, visible on the Mortar Job Details page. You can remove these permissions if you like, and those links will not be created.
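The stanzas described above can be assembled programmatically. The following is a minimal sketch, not Mortar's official policy document: the function names and bucket names are hypothetical, and the grouping of actions into statements is an assumption based on the descriptions above. The action names come from this page; the "2012-10-17" version string and the `arn:aws:s3:::` resource format are standard IAM policy syntax.

```python
import json


def s3_arns(buckets, objects=False):
    """Return bucket ARNs, or object ARNs (bucket/*) when objects=True."""
    suffix = "/*" if objects else ""
    return ["arn:aws:s3:::" + bucket + suffix for bucket in buckets]


def build_mortar_policy(input_buckets, output_buckets):
    """Build a policy dict with the stanzas described in the text above."""
    statements = [
        # Top-level bucket listing; used to test that the credentials are valid.
        {"Effect": "Allow",
         "Action": ["s3:ListAllMyBuckets"],
         "Resource": "arn:aws:s3:::*"},
        # Input buckets: find the bucket and list its objects.
        {"Effect": "Allow",
         "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
         "Resource": s3_arns(input_buckets)},
        # Input buckets: read data from objects.
        {"Effect": "Allow",
         "Action": ["s3:GetObject"],
         "Resource": s3_arns(input_buckets, objects=True)},
        # Output buckets: find and list; GetBucketAcl supports authenticated links.
        {"Effect": "Allow",
         "Action": ["s3:ListBucket", "s3:GetBucketAcl", "s3:GetBucketLocation"],
         "Resource": s3_arns(output_buckets)},
        # Output buckets: get, delete, and put objects; GetObjectAcl is optional.
        {"Effect": "Allow",
         "Action": ["s3:GetObject", "s3:GetObjectAcl",
                    "s3:DeleteObject", "s3:PutObject"],
         "Resource": s3_arns(output_buckets, objects=True)},
    ]
    return {"Version": "2012-10-17", "Statement": statements}


# Hypothetical bucket names; replace them with your own.
policy = build_mortar_policy(["my-input-bucket"], ["my-output-bucket"])
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The printed JSON can be pasted into the policy editor. To drop the authenticated links, remove `s3:GetBucketAcl` and `s3:GetObjectAcl` from the two output-bucket statements.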

One gotcha to be aware of: the IAM policy editor will reject your JSON if it has any whitespace before the first curly brace, so remove any leading whitespace if you see an error.
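If you keep your policy in a file or build it by hand, a small check before pasting can catch this. The sketch below is one possible approach, with a hypothetical function name and a placeholder policy string: it strips surrounding whitespace and confirms the result still parses as a JSON object.

```python
import json


def prepare_policy_text(raw_text):
    """Strip surrounding whitespace and confirm the text is a JSON object."""
    text = raw_text.strip()
    parsed = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(parsed, dict):
        raise ValueError("policy document must be a JSON object")
    return text


# Placeholder policy text with leading and trailing whitespace.
pasted = '\n   {"Version": "2012-10-17", "Statement": []}\n'
cleaned = prepare_policy_text(pasted)
print(repr(cleaned[0]))
```

The cleaned text begins directly with the opening curly brace, which is what the editor expects.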

Once you've added a policy to your user, you can add that user's AWS Access Keys to Mortar as described in "Adding Your IAM AWS Access Keys to Mortar" above.