Download the Essential Cloud Buyer’s Guide to learn important factors to consider before selecting a provider as well as buying criteria to help you make the best decision for your infrastructure needs, brought to you in partnership with Internap.
A post from Hadoop-as-a-Service (HaaS?) platform Mortar lists six tips to help you speed up Hadoop jobs that use the Amazon Simple Storage Service (S3) for storage. If you follow the world of Hadoop, you probably know that Netflix runs Hadoop on Amazon S3 with its own Hadoop PaaS, Genie.
The top six performance tips Mortar has to offer for your Hadoop instance on S3 are as follows:
- Organize your S3 bucket for speed
- Store fewer, larger files instead of many smaller ones
- Know when to compress your data and when not to
- Avoid underscores in bucket names
- Stream data directly into S3 with Elastic MapReduce (EMR)
- Use partition-aware S3 keys
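
As a rough illustration of three of these tips (bucket naming, fewer and larger files, and partition-aware keys), here is a minimal Python sketch. It is not taken from Mortar's article: the function names, the `year=/month=/day=` key layout, and the size target are all assumptions chosen for the example, and the code only plans names and batches without talking to S3.

```python
from datetime import date


def bucket_name_has_underscore(bucket: str) -> bool:
    """Flag bucket names that contain an underscore (hypothetical helper)."""
    return "_" in bucket


def partitioned_key(table: str, day: date, filename: str) -> str:
    """Build a key whose prefix encodes the date partition.

    The year=/month=/day= layout is an assumed convention, not one
    prescribed by the article.
    """
    return (
        f"{table}/year={day.year}/month={day.month:02d}/"
        f"day={day.day:02d}/{filename}"
    )


def group_small_files(files, target_bytes):
    """Greedily batch (name, size) pairs, in order, so that each batch
    stays at or under target_bytes where possible. Each batch could then
    be concatenated into one larger object before upload."""
    batches, current, current_size = [], [], 0
    for name, size in files:
        # Start a new batch once adding this file would exceed the target.
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    print(bucket_name_has_underscore("my_logs"))
    print(partitioned_key("events", date(2013, 7, 4), "part-00000.gz"))
    # Sizes here are tiny illustrative numbers, not realistic file sizes.
    print(group_small_files([("a", 40), ("b", 40), ("c", 40), ("d", 10)], 100))
```

With keys laid out this way, a job that needs a single day of data can read one prefix instead of scanning the whole bucket; the right target size for merged files depends on your cluster and is left as a parameter.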
You can find more detail on each of these six performance tips in the original article.
To learn more about Netflix's Hadoop PaaS, Genie, and how Netflix built a performant, petabyte-scale data center in the cloud, read about it on their blog, or watch their presentation from Hadoop Summit 2013.