Mortar's top six performance tips for running Hadoop on S3 are:
- Organize your S3 bucket for speed
- Store fewer, larger files instead of many smaller ones
- Know when to compress your data and when not to
- Avoid underscores in bucket names
- Stream data directly into S3 with Elastic MapReduce (EMR)
- Use partition-aware S3 keys
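As a rough illustration of the last tip, here is a minimal sketch of what a partition-aware key could look like. It assumes the Hive-style `key=value` path convention and a date-based partitioning scheme; the helper name `partitioned_key`, the `logs/clicks` prefix, and the `part-NNNNN.gz` file naming are hypothetical choices made for this example, not something prescribed by the list above.

```python
from datetime import date


def partitioned_key(prefix: str, day: date, part: int, ext: str = "gz") -> str:
    """Build an S3 object key with Hive-style key=value partition segments.

    Hypothetical helper: the year/month/day layout is an assumption chosen
    for illustration, not a layout taken from the original article.
    """
    return (
        f"{prefix.strip('/')}/"
        f"year={day.year:04d}/month={day.month:02d}/day={day.day:02d}/"
        f"part-{part:05d}.{ext}"
    )


if __name__ == "__main__":
    # Objects for one day share a common key prefix, so a job that needs
    # a single day can list and read only that prefix.
    print(partitioned_key("logs/clicks", date(2013, 7, 4), 3))
```

With keys laid out this way, a job that only needs one day's data can restrict its input to the matching prefix instead of scanning the whole bucket.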
More detail on each of these six tips is available in the original article.
To learn more about Genie, Netflix's Hadoop PaaS, and how Netflix built a performant, petabyte-scale data center in the cloud, read the post on their blog or watch their presentation from Hadoop Summit 2013.