Amazon S3 is the world’s most widely used data storage service. It provides near-infinite file storage at extremely low cost, with frequent price reductions. It can store any type of file data with extreme durability (99.999999999% durable).
S3 is an excellent choice for file storage when using Mortar. Mortar’s Hadoop is powered by Amazon EMR, which is highly optimized for fast and free data transfer to S3. It’s easy to pull large data sets in from S3, crunch them in Hadoop, and then save the results back to S3 (or any other data destination).
Files can be loaded from S3 into Pig and stored back out again. Data can also be uploaded to Amazon S3 through many other clients, including the AWS Web Console, command-line tools like s3cmd or aws-cli, and SDKs in every major programming language.
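As one illustration of the SDK route, here is a minimal Python sketch that uploads a local directory of files to a bucket. The helper name `upload_directory` and its parameters are hypothetical, chosen for this example. It assumes a client object that exposes `upload_file(filename, bucket, key)`, which is the shape of the S3 client in boto3 (the AWS SDK for Python); boto3 itself is a third-party package and is not imported by the sketch.

```python
import os


def upload_directory(client, local_dir, bucket, prefix=""):
    """Upload every file under local_dir to the bucket and return the keys used.

    `client` is assumed to expose upload_file(filename, bucket, key),
    as the boto3 S3 client does. `prefix` is an optional key prefix.
    """
    uploaded = []
    for root, _dirs, files in os.walk(local_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            # Build the S3 key from the path relative to local_dir,
            # always using forward slashes regardless of platform.
            rel = os.path.relpath(path, local_dir).replace(os.sep, "/")
            key = f"{prefix.rstrip('/')}/{rel}" if prefix else rel
            client.upload_file(path, bucket, key)
            uploaded.append(key)
    return uploaded
```

With boto3 installed and AWS credentials configured, a call might look like `upload_directory(boto3.client("s3"), "data", "my-example-bucket", "input")`, where the directory, bucket name, and prefix are placeholders. Passing the client in as an argument keeps the helper independent of any particular credential setup.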
Mortar’s EMR clusters run in the US-East region, so for fast and free data transfer between EMR and S3, your data should be stored in the US Standard region. This is the default location when you create a new S3 bucket.
If your data is not in the US Standard region, you can run Mortar's EMR clusters in your own AWS account in any region.
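To check which case applies to a bucket, you can look at the location that S3 reports for it. The sketch below is a small Python helper, with the hypothetical name `is_us_standard`, that interprets that value. It relies on the fact that S3's GetBucketLocation call reports US Standard (us-east-1) buckets with an empty location constraint rather than a region name.

```python
def is_us_standard(location_constraint):
    """Return True if a bucket's reported location means US Standard.

    S3 reports US Standard (us-east-1) buckets with a null or empty
    location constraint; the explicit name is accepted here as well.
    """
    return location_constraint in (None, "", "us-east-1")
```

With boto3, the value to pass in would come from `boto3.client("s3").get_bucket_location(Bucket="my-example-bucket")["LocationConstraint"]`, where the bucket name is a placeholder. A result of `False` suggests the bucket lives in another region.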