Amazon Glacier May Be the Hot Choice for Archival Data Storage
You Need Glacier
I’m going to bet that you (or your organization) spend a lot of time and a lot of money archiving mission-critical data. Whether you’re currently using disk, optical media, or tape-based storage, it’s probably a more complicated and expensive process than you’d like, one that has you spending time maintaining hardware, planning capacity, negotiating with vendors, and managing facilities.
If so, then you are going to find our newest service, Amazon Glacier, very interesting. With Glacier, you can store any amount of data with high durability at a cost that will allow you to get rid of your tape libraries and robots and all the operational complexity and overhead that have been part and parcel of data archiving for decades.
Glacier provides extremely low-cost archive storage, at a cost as low as $0.01 (one US penny, one one-hundredth of a dollar) per Gigabyte, per month. You can store a little bit, or you can store a lot (Terabytes, Petabytes, and beyond). There's no upfront fee and you pay only for the storage that you use. You don't have to worry about capacity planning and you will never run out of storage space. Glacier removes the problems associated with under- or over-provisioning archival storage, maintaining geographically distinct facilities, and verifying hardware or data integrity, irrespective of the length of your retention periods.
At this point you may be thinking that this sounds just like Amazon S3, but Amazon Glacier differs from S3 in two crucial ways.
First, S3 is optimized for rapid retrieval (generally tens to hundreds of milliseconds per request). Glacier is not (we didn't call it Glacier for nothing). With Glacier, your retrieval requests are queued up and honored at a somewhat leisurely pace. Your archive will be available for downloading in 3 to 5 hours.
Each retrieval request that you make to Glacier is called a job. You can poll Glacier to see if your data is available, or you can ask it to send a notification to the Amazon SNS topic of your choice when the data is available. You can then access the data via HTTP GET requests, including byte range requests. The data will remain available to you for 24 hours.
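To make the polling option concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not part of the original announcement: it assumes a client object shaped like the boto3 Glacier client, whose `describe_job(vaultName=..., jobId=...)` call returns a dictionary with `Completed` and `StatusCode` fields. The function name `wait_for_job`, the 15-minute polling interval, and the polling cap are hypothetical choices made for this example.

```python
import time


def wait_for_job(client, vault_name, job_id,
                 poll_seconds=900, max_polls=40, sleep=time.sleep):
    """Poll a retrieval job until it finishes and return its description.

    `client` is assumed to expose describe_job(vaultName=..., jobId=...)
    in the style of the boto3 Glacier client (an assumption, not something
    the post specifies). With retrievals taking 3 to 5 hours, polling every
    15 minutes for up to 10 hours is a deliberately relaxed default.
    """
    for _ in range(max_polls):
        job = client.describe_job(vaultName=vault_name, jobId=job_id)
        if job["Completed"]:
            # A completed job can still have failed, so check the status.
            if job["StatusCode"] != "Succeeded":
                raise RuntimeError(
                    f"job {job_id} ended with status {job['StatusCode']}"
                )
            return job
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} not complete after {max_polls} polls")
```

Because the retrieval takes hours, the SNS notification route is usually the better fit for production use; polling like this is mostly convenient for scripts and experiments. Once the job has completed, the data can be fetched (in full or by byte range) within the 24-hour availability window.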
Retrieval requests are priced differently, too. You can retrieve up to 5% of your average monthly storage, pro-rated daily, for free each month. Beyond that, you are charged a retrieval fee starting at $0.01 per Gigabyte (see the pricing page for details). So for data that you’ll need to retrieve in greater volume more frequently, S3 may be a more cost-effective service.
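As a rough worked example of these numbers, the following Python sketch estimates the monthly storage bill and the free retrieval allowance. It uses only the figures quoted in this post ($0.01 per Gigabyte per month and the 5% allowance). Reading "pro-rated daily" as the monthly allowance spread evenly over the days of the month is an assumption of this sketch, as are the function names and the 30-day month. It does not attempt to model the retrieval fee itself; see the pricing page for that.

```python
from typing import Tuple

# Figures quoted in the post; check the pricing page for current values.
STORAGE_PRICE_PER_GB_MONTH = 0.01  # "as low as $0.01" per GB per month
FREE_RETRIEVAL_FRACTION = 0.05     # 5% of average monthly storage


def monthly_storage_cost(stored_gb: float) -> float:
    """Estimated monthly storage cost in US dollars."""
    return stored_gb * STORAGE_PRICE_PER_GB_MONTH


def free_retrieval_allowance_gb(avg_stored_gb: float,
                                days_in_month: int = 30) -> Tuple[float, float]:
    """Return (monthly, daily) free retrieval allowance in GB.

    The daily figure assumes the monthly allowance is spread evenly
    across the month, which is this sketch's reading of "pro-rated daily".
    """
    monthly = avg_stored_gb * FREE_RETRIEVAL_FRACTION
    return monthly, monthly / days_in_month


if __name__ == "__main__":
    stored_gb = 10_000  # hypothetical archive of roughly 10 Terabytes
    monthly, daily = free_retrieval_allowance_gb(stored_gb)
    print(f"storage cost:   ${monthly_storage_cost(stored_gb):.2f} per month")
    print(f"free retrieval: {monthly:.0f} GB per month, {daily:.1f} GB per day")
```

For the hypothetical 10,000 GB archive, this works out to about $100 per month for storage, with roughly 500 GB per month (about 16.7 GB per day) retrievable at no charge.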
Glacier In Action
I'm sure that you already have some uses in mind for Glacier. If not, here are some to get you started:
- If you are part of an enterprise IT department, you can store email, corporate file shares, legal records, and business documents: the kind of stuff that you need to keep around for years or decades with little or no reason to access it.
- If you work in digital media, you can archive your books, movies, images, music, news footage, and so forth. These assets can easily grow to tens of Petabytes and are generally accessed very infrequently.
- If you generate and collect scientific or research data, you can store it in Glacier just in case you need to get it back later.
Get Started Now
Glacier is available for use today in the US-East (N. Virginia), US-West (N. California), US-West (Oregon), Asia Pacific (Tokyo) and EU-West (Ireland) Regions.
Interesting. This looks like a great near-line storage solution, and a much more viable one than shelves of USB drives. I wonder how long it will be until a backup service like Carbonite, or a similar new service or utility, uses this? I'll bet it's weeks, if not days. I think this would be a perfect place to store system and data backups for my PCs. Now, if only this could be hooked up to the new backup service being tested with Windows Server 2012... hmm.