
Set Up a Redshift Cluster

The mortar-etl-redshift project comes with an example script to move data from an S3 bucket to a Redshift cluster. This article covers how to start up a Redshift cluster, but if you already have a running Redshift cluster you would like to use, you can skip this step and go to The Example ETL Pipeline.

Choosing Your Cluster

AWS charges by the hour for Redshift (see pricing). If you're unsure how large a cluster you will need, start with the smallest Redshift cluster (node type dw2.large, cluster type Single Node) at $0.25/hour. It is easy to resize your Redshift cluster later if you need better performance. Because you are billed for every hour the cluster is running, be sure to shut down your cluster when you are done with it (you can take a final snapshot to preserve your data).
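If you do decide to upsize later, a resize can be requested through the AWS console or the API. Below is a minimal sketch (not part of the Mortar project) of how that might look with the boto3 library; the cluster identifier and node counts are placeholders you would replace with your own.

```python
# Hypothetical sketch: resize an existing cluster with boto3.
# "my-etl-cluster" is a placeholder identifier, not a real resource.
import boto3

redshift = boto3.client("redshift", region_name="us-east-1")

# Request a resize to a larger multi-node configuration; Redshift performs
# the resize in the background while the cluster remains available read-only.
redshift.modify_cluster(
    ClusterIdentifier="my-etl-cluster",
    ClusterType="multi-node",
    NodeType="dw2.large",
    NumberOfNodes=2,
)
```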


Start Your Redshift Cluster

To start a Redshift cluster, follow the step-by-step tutorial in the official AWS documentation. You will need to complete steps 1-3 and the first part of step 4. Be sure to place your cluster in the US East region for fast and free data transfer to Mortar's Hadoop clusters. You do not need to worry about creating tables or running queries, as the Mortar ETL pipeline takes care of that for you.
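If you prefer to launch the cluster from code rather than the console, here is a minimal sketch, assuming the smallest configuration described above and the boto3 library. The cluster identifier, database name, and credentials are placeholders, not values from the Mortar project.

```python
# Hypothetical sketch: launch the smallest single-node cluster with boto3.
# All names and credentials below are placeholders -- substitute your own.
import boto3

redshift = boto3.client("redshift", region_name="us-east-1")  # US East for free transfer to Mortar

response = redshift.create_cluster(
    ClusterIdentifier="my-etl-cluster",   # placeholder cluster name
    NodeType="dw2.large",                 # smallest node type discussed above
    ClusterType="single-node",
    DBName="dev",                         # placeholder database name
    MasterUsername="masteruser",          # placeholder credentials
    MasterUserPassword="ChangeMe123",
)

# The cluster status reads "creating" until it becomes available.
print(response["Cluster"]["ClusterStatus"])
```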

The AWS documentation recommends using SQL Workbench/J to connect to and query Redshift, but most other SQL clients will work as well.
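Because Redshift speaks the PostgreSQL wire protocol, you can also connect from a script instead of a GUI client. Below is a minimal sketch using the psycopg2 library; the endpoint, database name, and credentials are placeholders matching the hypothetical cluster above.

```python
# Hypothetical sketch: connect to the cluster and run a simple connectivity check.
# The host, dbname, user, and password are placeholders for your own cluster's values.
import psycopg2

conn = psycopg2.connect(
    host="my-etl-cluster.abc123xyz.us-east-1.redshift.amazonaws.com",  # cluster endpoint
    port=5439,            # Redshift's default port
    dbname="dev",
    user="masteruser",
    password="ChangeMe123",
)

with conn.cursor() as cur:
    cur.execute("SELECT current_database(), version();")  # simple sanity check
    print(cur.fetchone())

conn.close()
```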