Mortar has joined Datadog, the leading SaaS-based monitoring service for cloud applications.

API Version 2

This is the specification document for Mortar's REST API, version 2.

Basics

All API access is over HTTPS via the api.mortardata.com endpoint. All request and response bodies are sent and received as JSON.


Authentication

Authentication happens via an API key. You can get your API key from the My Settings page.

Once you have your API key, use Basic Authentication over SSL to authenticate all API requests. Use your email as username and your API key as password. For example, to authenticate in curl as myusername@mydomain.com with an API key of 1234567890:

curl --user 'myusername@mydomain.com:1234567890' https://api.mortardata.com/v2/projects
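The same authenticated request can be made from Python using only the standard library. This is a minimal sketch; the email and API key are the same placeholder values as in the curl example above, and the helper name is ours:

```python
import base64
import urllib.request

EMAIL = "myusername@mydomain.com"   # placeholder, as in the curl example
API_KEY = "1234567890"              # placeholder API key

def authorized_request(url):
    """Build a urllib request carrying the HTTP Basic Authorization header."""
    credentials = base64.b64encode(f"{EMAIL}:{API_KEY}".encode()).decode()
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {credentials}")
    return request

req = authorized_request("https://api.mortardata.com/v2/projects")
# urllib.request.urlopen(req) would then perform the call over SSL.
```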

Clusters

This section of the API lets you inspect and control Hadoop clusters.

Get Recent Clusters

Get the details of all your running and recently destroyed clusters.

GET /v2/clusters

Response

200 OK
{
  "clusters": [
    {
      "cluster_id":"4f4c4afb916eb10526111111",
      "status_code": "running",
      "status_description": "Running",
      "task_trackers": [
        {
          "public_address": "ec2-23-20-47-168.compute-1.amazonaws.com",
          "task_tracker_url": "http://ec2-23-20-47-168.compute-1.amazonaws.com:50060/"
        }
      ],
      "start_timestamp": "2012-06-14T14:55:43.392000+00:00",
      "running_timestamp": "2012-06-14T14:59:43.392000+00:00",
      "stop_timestamp": null,
      "job_tracker_url": "http://ec2-50-16-35-145.compute-1.amazonaws.com:50030/jobtracker.jsp",
      "name_node_url": "http://ec2-50-16-35-145.compute-1.amazonaws.com:50090",
      "duration": "6 mins",
      "cluster_type_code": "persistent",
      "cluster_type_description": "Multi - Job",
      "size": 2
    }
  ]
}

The "main success scenario" for a cluster passes through status_code values:

  • pending: Initial state, pending launch
  • starting: Cluster hardware is being started
  • mortar_bootstrapping: Cluster software and packages being installed and started
  • running: Cluster is ready for use
  • stopping: Cluster is in the process of shutting down
  • stopping_copying_logs: Cluster logs are being copied to user bucket
  • destroyed: Cluster has been shut down

Other states include:

  • starting_requested_stop: Cluster is starting, but has been requested to be stopped as soon as possible.
  • failed: Cluster failed to start.
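When monitoring clusters programmatically, it helps to distinguish terminal states from states that will still change. A small sketch based on the status_code values listed above (the helper and set names are ours, not part of the API):

```python
# Terminal states: the cluster can no longer change status.
TERMINAL_STATES = {"destroyed", "failed"}

# Every other documented status eventually transitions further.
ACTIVE_STATES = {
    "pending", "starting", "mortar_bootstrapping", "running",
    "stopping", "stopping_copying_logs", "starting_requested_stop",
}

def is_terminal(status_code):
    """Return True once a cluster has reached a final state."""
    return status_code in TERMINAL_STATES
```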

Stop a Cluster

DELETE /v2/clusters/:cluster_id

Response

200 OK

Describes

Run and fetch results for a Pig DESCRIBE operation. Mortar Projects ONLY.

Run a Describe

POST /v2/describes

Parameters

  • alias: Pig alias to describe.
  • git_ref: Version of code (git hash) to use
  • project_name: Mortar project to use
  • pigscript_name: Pigscript to use (without path or extension)
  • pig_version: Pig version to use. Options are 0.9 (default) or 0.12.

Example Request Body

{
  "alias": "ordered_results",
  "git_ref": "755b260dcb3f5da001a7df52025025ca45c29598",
  "project_name": "my_project",
  "pigscript_name": "coffee_tweets"
}

Response

200 OK
{
  "describe_id": "51a81670d93ee663012c9945"
}

Get Describe Status and Results

GET /v2/describes/:describe_id

Parameters

  • exclude_result: Whether to exclude the result field (default: false)

Response

200 OK
{
  "project_name": "my_project",
  "alias": "ordered_results",
  "git_ref": "755b260dcb3f5da001a7df52025025ca45c29598",
  "script_name": "coffee_tweets",
  "describe_id": "51a81670d93ee663012c9945",
  "status_code": "SUCCESS",
  "status_description": "Success",
  "web_result_url": "https://app.mortardata.com/describes/51a81670d93ee663012c9945",
  "result": {
    "tables": [
      {
        "alias": "tweets",
        "fields": ["id_str:chararray", "in_reply_to_screen_name:chararray",
                  "in_reply_to_status_id_str:chararray"],
        "op": "LOFilter"
      },
      {
        "alias": "ordered_results",
        "fields": ["id_str:chararray", "in_reply_to_screen_name:chararray",
                   "in_reply_to_status_id_str:chararray"],
        "op": "LOSort"
      }
    ]
  }
}

The "main success scenario" for a describe passes through status_code values:

  • QUEUED: Submitted, pending execution
  • PROGRESS: Describe operation in progress
  • SUCCESS: Describe complete; results available

Other states include:

  • KILLED: Describe operation terminated by user
  • FAILURE: Syntax error in pigscript (details in error_message field)
  • GATEWAY_STARTING: Pig server starting (happens on first request in session)
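Because a describe runs asynchronously, clients typically poll the status endpoint until a terminal status_code appears. A sketch of such a loop, assuming `fetch_status` is any callable that GETs /v2/describes/:describe_id and returns the parsed JSON body (the function names are ours):

```python
import time

# Terminal status_code values from the lists above.
TERMINAL = {"SUCCESS", "FAILURE", "KILLED"}

def wait_for_describe(describe_id, fetch_status, poll_seconds=5):
    """Poll the describe status until it reaches a terminal state."""
    while True:
        body = fetch_status(describe_id)
        if body["status_code"] in TERMINAL:
            return body
        time.sleep(poll_seconds)
```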

Gateways

This section of the API lets you inspect the Pig server (also known as Gateway) that will run Pig jobs for you.

Get Gateway

Get the gateway server currently provisioned for your user.

GET /v2/gateways

Response

200 OK
{
    "gateway_id": "54h35ce326fdc53fds6ef7320",
    "status_code": "running",
    "start_timestamp": "2014-09-26T12:32:38.933000+00:00",
    "private_address": "ip-10-121-130-134.ec2.internal",
    "public_address": "ec2-54-194-211-224.compute-1.amazonaws.com"
}

The "main success scenario" for a gateway passes through status_code values:

  • pending: Initial state, pending launch
  • starting: Gateway hardware is being started
  • running: Gateway is ready for use; public_address and private_address available

Note: if no gateway is currently starting or running for your user, this call returns a 404 status. To ensure that a gateway is running, first run a Pig operation, such as Validate, Illustrate, or Run a Job.
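Since the 404 is expected behavior rather than a failure, clients should treat it as "no gateway yet". A sketch, assuming `get_gateway` is any callable that performs the GET and raises urllib.error.HTTPError on non-2xx responses (the helper name is ours):

```python
import urllib.error

def gateway_or_none(get_gateway):
    """Return the gateway details, or None if no gateway exists yet."""
    try:
        return get_gateway()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None  # no gateway starting or running for this user
        raise  # any other HTTP error is a real failure
```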


Illustrates

Run and fetch results for a Pig ILLUSTRATE operation. Mortar Projects ONLY.

Run an Illustrate

POST /v2/illustrates

Parameters

  • alias: Pig alias to illustrate (optional: if not provided, illustrate entire script)
  • git_ref: Version of code (git hash) to use
  • project_name: Mortar project to use
  • pigscript_name: Pigscript to use (without path or extension)
  • pig_version: Pig version to use. Options are 0.9 (default) or 0.12.

Example Request Body

{
  "git_ref": "755b260dcb3f5da001a7df52025025ca45c29598",
  "project_name": "my_project",
  "pigscript_name": "coffee_tweets"
}

Response

200 OK
{
  "illustrate_id": "51a81670d93ee663012c9945"
}

Get Illustrate Status and Results

GET /v2/illustrates/:illustrate_id

Parameters

  • exclude_result: Whether to exclude the result field (default: false)

Response

200 OK
{
  "project_name": "my_project",
  "alias": "ordered_results",
  "git_ref": "755b260dcb3f5da001a7df52025025ca45c29598",
  "script_name": "coffee_tweets",
  "illustrate_id": "51a81670d93ee663012c9945",
  "status_code": "SUCCESS",
  "status_description": "Success",
  "web_result_url": "https://app.mortardata.com/illustrates/51a81670d93ee663012c9945",
  "result": {
    "tables": [
      {
        "alias": "tweets",
        "fields": ["id_str:chararray", "in_reply_to_screen_name:chararray",
                  "text:chararray"],
        "op": "LOFilter",
        "data": [
          [
            "339768386961678338",
            "",
            "me tooooo"
          ]
        ],
        "notices": []
      }
    ]
  },
  "completeness": 79.46428680419922,
  "udf_output": "us_state: Found state IA in full_place: Center Grove, IA"
}

The "main success scenario" for an illustrate passes through status_code values:

  • QUEUED: Submitted, pending execution
  • PROGRESS: Illustrate operation has started
  • BUILDING_PLAN: Illustrate compiling plan for pig script
  • READING_DATA: Illustrate reading source data
  • PRUNING_DATA: Illustrate pruning data to minimal result set
  • FINALIZE_RESULTS: Illustrate post-processing of result data
  • SUCCESS: Illustrate complete; results available

Other states include:

  • KILLED: Illustrate operation terminated by user
  • FAILURE: Syntax error in pigscript (details in error_message field)
  • GATEWAY_STARTING: Pig server starting (happens on first request in session)

Jobs

This section of the API allows you to run and monitor jobs.

Run a Job

Run a pigscript on a Hadoop cluster:

POST /v2/jobs

Parameters

For all projects:

  • cluster_size: Size of Hadoop cluster to launch (number of nodes). Only use if job should be run on a new cluster. Omit if this job should be run on an existing cluster.
  • cluster_id: The cluster_id of the cluster to run this job on. 'local' if the job should be run locally (and not on a Hadoop cluster). Omit if this job should be run on a new Hadoop cluster.
  • parameters: Pig parameters to pass to your script. Can be omitted if your script does not contain any parameters.
  • cluster_type: Which type of Hadoop cluster to launch. Only include if job should be run on a new cluster. Options include:
    • single_job: Cluster will be stopped when job is finished.
    • persistent: (Default) Cluster will remain up after job finishes, and will be stopped after being idle for 1 hour.
    • permanent: Cluster will remain up until stopped by user.
  • notify_on_job_finish: Set to true if you want to receive an email from Mortar when this job finishes. Default is false.
  • use_spot_instances: Whether to launch cluster using Spot Instances. Default is false.
  • pig_version: Pig version to use. Options are 0.9 (default) or 0.12.

For Mortar projects ONLY:

  • git_ref: Version of code (git hash or branch) to use
  • project_name: Mortar project to use
  • pigscript_name or controlscript_name: Script to use (without path or extension)
  • project_script_path: Optional. Path to the script being run relative to project root. Example: 'pigscripts/mongo'. Default is 'pigscripts' or 'controlscripts'.

For Web projects ONLY:

  • script_name: Name of web project script to run

Example Request Body

{
  "git_ref": "master",
  "project_name": "my_project",
  "pigscript_name": "coffee_tweets",
  "cluster_size": 10,
  "cluster_type": "single_job",
  "parameters": { "MY_INPUT_PARAMETER": "my_input_parameter_value",
                  "MY_INPUT_PARAMETER_2": "my other value" }
}

Response

200 OK
{
  "job_id": "4f4c4afb916eb10526000000"
}
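Per the parameter rules above, cluster_size (new cluster) and cluster_id (existing or local cluster) are mutually exclusive, and exactly one should be supplied. A sketch of a payload builder that enforces this (the helper name and its signature are ours, not part of the API):

```python
def job_payload(project_name, pigscript_name, git_ref,
                cluster_size=None, cluster_id=None, **extra):
    """Build a Run a Job request body, enforcing the cluster rules."""
    if (cluster_size is None) == (cluster_id is None):
        raise ValueError("pass exactly one of cluster_size or cluster_id")
    payload = {
        "project_name": project_name,
        "pigscript_name": pigscript_name,
        "git_ref": git_ref,
        **extra,  # e.g. parameters, cluster_type, notify_on_job_finish
    }
    if cluster_size is not None:
        payload["cluster_size"] = cluster_size   # launch a new cluster
    else:
        payload["cluster_id"] = cluster_id       # reuse existing, or "local"
    return payload
```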

Get Job Details

Get the details of a single job.

GET /v2/jobs/:job_id

Response

200 OK
{
  "job_id": "4f4c4afb916eb10526000000",
  "cluster_id": "4f4c4afb916eb10526111111",
  "git_ref": "master",
  "project_name": "my_project",
  "script_name": "coffee_tweets",
  "cluster_size": 10,
  "status": "success",
  "progress": 100,
  "outputs": [
    {
      "alias": "aggregates_by_stock",
      "name": "aggregates_by_stock",
      "records": 2853,
      "location": "s3n://my-output-bucket/2012-03-01/aggregates_by_stock",
      "output_blobs": [
        {
          "bucket": "my-output-bucket",
          "key": "my-output-bucket/2012-03-01/aggregates_by_stock/part-r-00000",
          "output_blob_id": "4f6749d1744ea111151399b4"
        }
      ]
    },
    {
      "alias": "aggregates_by_year",
      "records": 2853,
      "location": "s3n://my-output-bucket/2012-03-01/aggregates_by_year",
      "output_blobs": [
        {
          "bucket": "my-output-bucket",
          "key": "my-output-bucket/2012-03-01/aggregates_by_stock/part-r-00000",
          "output_blob_id": "4f6749d1744ea111151399b4"
        }
      ]
    }
  ],
  "start_timestamp": "2012-02-28T03:35:42.831000+00:00",
  "stop_timestamp": "2012-02-28T03:41:52.613000+00:00",
  "duration": "6 mins",
  "num_hadoop_jobs": 2,
  "num_hadoop_jobs_succeeded": 2,
  "script_parameters" : {
    "MY_INPUT_PARAMETER": "my_input_parameter_value",
    "MY_INPUT_PARAMETER_2": "my other value"
  }
}

The "main success scenario" for a job passes through status codes:

  • starting: Job received and queued, not yet validated.
  • validating_script: Checking the script for syntax and S3 data storage errors.
  • starting_cluster: Starting up the Hadoop cluster
  • running: Running the job
  • success: Job completed successfully. Outputs are available from the "outputs" field

Error states include the following (error message will be found in the "error" field):

  • script_error, plan_error: An error was detected in the script before launching a Hadoop cluster.
  • execution_error: An error occurred during the run on the Hadoop cluster.
  • service_error: An internal error occurred.

Other states include:

  • stopping: User has requested that the job be stopped
  • stopped: Job is stopped
  • GATEWAY_STARTING: Pig server starting (happens on first request in session)

Since Hadoop may write the output of the job to multiple files, the output_blobs list contains one element for each output of the job.
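Collecting every part file a job wrote is therefore a nested walk over outputs and their output_blobs. A sketch over the parsed job-details response shown above (the helper name is ours):

```python
def output_part_blobs(job):
    """Yield (alias, bucket, key) for every part file the job wrote."""
    for output in job.get("outputs", []):
        for blob in output.get("output_blobs", []):
            yield output["alias"], blob["bucket"], blob["key"]
```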


Get Details for All Recent Jobs

Get the details of all your recent jobs. Results are limited by the number of days of job history retention supported by your plan.

GET /v2/jobs

Parameters

  • limit: number of jobs to return (optional: all jobs returned if not provided)
  • skip: job number to start with, ordered by descending start_timestamp (default: 0)

Response

200 OK
{
  "jobs": [
    {
      "job_id":"4f4c4afb916eb10526000000",
      "... All other job details as described above ..."
    },
    {
      "job_id":"4f4c4afb916eb10526000001",
      "... All other job details as described above ..."
    }
  ]
}
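The limit and skip parameters support paging through job history. A sketch, assuming `fetch_page` is any callable wrapping GET /v2/jobs?limit=…&skip=… that returns the parsed JSON body (the function names are ours):

```python
def iter_all_jobs(fetch_page, page_size=50):
    """Yield every job, fetching one page at a time via limit/skip."""
    skip = 0
    while True:
        jobs = fetch_page(limit=page_size, skip=skip)["jobs"]
        if not jobs:
            return
        yield from jobs
        if len(jobs) < page_size:
            return  # short page means we reached the end
        skip += page_size
```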

Stop a Job

Stop a running job. If the job was launched on a single_job cluster, stopping it also stops the cluster.

DELETE /v2/jobs/:job_id

Response

200 OK

Projects

This section of the API manages your Mortar Projects. Mortar Projects ONLY.

Create a Project

POST /v2/projects

Parameters

  • project_name: Name for the new Mortar project (should be a single word)
  • is_private: Whether the project should be private. Default is true.

Example Request Body

{
  "project_name": "my_new_project_name",
  "is_private": true
}

Response

200 OK
{
  "project_id": "51a81670d93ee663012c9945"
}

Get Project Details

Get the details of a single project.

GET /v2/projects/:project_id

Response

200 OK
{
  "project_id": "51a81670d93ee663012c9945",
  "name": "my_new_project_name",
  "git_url": "git@github.com:mortarcode/123456789abcd_mortar_examples.git"
}

Get All Projects

GET /v2/projects

Response

200 OK
{
  "projects": [
    {
      "project_id": "51a81670d93ee663012c9945",
      "... All other project details as described above ..."
    },
    {
      "project_id": "11581670d93ee663012c9923",
      "... All other project details as described above ..."
    }
  ]
}

Validates

Run and fetch results for a Pig VALIDATE operation. Mortar Projects ONLY.

Run a Validate

POST /v2/validates

Parameters

  • git_ref: Version of code (git hash) to use
  • project_name: Mortar project to use
  • pigscript_name: Pigscript to use (without path or extension)
  • pig_version: Pig version to use. Options are 0.9 (default) or 0.12.

Example Request Body

{
  "git_ref": "755b260dcb3f5da001a7df52025025ca45c29598",
  "project_name": "my_project",
  "pigscript_name": "coffee_tweets"
}

Response

200 OK
{
  "validate_id": "51a81670d93ee663012c9945"
}

Get Validate Status and Results

GET /v2/validates/:validate_id

Response

200 OK
{
  "project_name": "my_project",
  "git_ref": "755b260dcb3f5da001a7df52025025ca45c29598",
  "script_name": "coffee_tweets",
  "validate_id": "51a81670d93ee663012c9945",
  "status_code": "SUCCESS",
  "status_description": "Success"
}

The "main success scenario" for a validate passes through status_code values:

  • QUEUED: Submitted, pending execution
  • PROGRESS: Validate operation in progress
  • SUCCESS: Validate complete; script is valid.

Other states include:

  • KILLED: Validate operation terminated by user
  • FAILURE: Syntax error in pigscript (details in error_message field)
  • GATEWAY_STARTING: Pig server starting (happens on first request in session)