Monthly Archives: August 2015

Analysis of Amazon EC2 Costs, Part 1

I work for OnPrem Solution Partners and help to provide technology leadership through their Data & Analytics practice.

We often discuss cloud service providers for various solutions with our clients, and of course the most common cloud service provider today is Amazon Web Services. Two of the services that we most commonly employ are EC2 (for server instances) and RDS (for database hosting). EC2 lets you request servers according to various hardware characteristics (e.g. number of CPUs, GiB of memory, various storage sizes) and according to the operating system installed. RDS lets you request an instance of various kinds of databases without having to go through the work of installing them yourself. Oracle (Standard Edition), MySQL, PostgreSQL, Aurora, and SQL Server (various flavors) are available. There is also Amazon Redshift, which is not strictly part of RDS but is still, in a sense, a paid DBaaS (Database as a Service) offering.

These services are great for providing very quick provisioning of services. EC2 instances can be up in less than 5 minutes after submitting the request to AWS. RDS instances are usually available in the neighborhood of 30 minutes or less. However, there are always cost questions. While it is easy to spin up these instances, how much will they cost over time? And what are the various criteria that affect our cost choices?

Let’s look at raw EC2 instances first, broken down by instance type and a few other categories. We will then later look at using RDS and the relative costs of RDS vs. base EC2. I am ultimately interested in determining the relative costs of running various database platforms in RDS vs. EC2 and also the impact of bringing your own license (BYOL) vs. paying hourly to Amazon for licenses, but we will not get to all of that today — one step at a time.

I obtained this data from Amazon’s website giving pricing information here: https://aws.amazon.com/ec2/pricing/. All prices are for the U.S. West (Oregon) region, and are for the Linux/Unix option so as to remove licensing costs involved with using Windows, RHEL, or other options.

This post is not intended to be a full explanation of all EC2 options. It is a complex system and infrastructure. I purposefully am not including storage costs and any network transmission costs. Amazon does not charge to send data IN to their cloud computing environment, but you do have to pay to send data out. Databases also have additional needs such as backup, etc., and these costs are not included here.

First, raw data is in this spreadsheet. The spreadsheet was last updated from Amazon’s website as of 8/7/2015. There are three worksheets in the spreadsheet.

  1. Amazon RDS Prices – Information about costs of RDS instances, including the:
    1. Pricing model – all On-Demand for now, but I may go back later and fill in more information regarding other pricing models and costs saved if you pay for Reserved Instances or even Spot Instances.
    2. Database type (e.g. Oracle, PostgreSQL, Aurora, etc.),
    3. Hourly price for RDS,
    4. Hourly price for same hardware on EC2,
    5. Whether the instance runs in a Single Availability Zone or Multiple Availability Zone configuration, and
    6. Database edition, which is relevant for platforms such as Microsoft SQL Server that have multiple editions for which you can pay Amazon for a license.
  2. Amazon EC2 Prices – Information about the costs of EC2 instances, including the:
    1. Instance type,
    2. Number of virtual CPUs,
    3. Number of EC2 compute units,
    4. Memory in GiB,
    5. Instance storage in GB,
    6. Cost for usage running Linux/Unix (eliminating any license costs for the underlying OS to simplify our comparisons),
    7. Tier of the instance (e.g. General Purpose, Compute Optimized), and
    8. Generation – Amazon packages the hardware in instance “units” that we can easily rent without having to care in detail about the underlying hardware platform (e.g., is it an Intel or an AMD processor, a Dell or an HP or other box, etc.). However, they refresh and update their hardware offerings periodically – so now the current unit available is a t2.micro, but you can still look at pricing for t1.micro and compare. It is helpful to see in some cases how costs and pricing have changed over successive generations of hardware (although this is not intended to be a full-blown analysis of generational change and costs.)
  3. Instance Hardware Detail – a lookup worksheet for the other two so I can do VLOOKUPs in Excel and not have to hand-type all of the hardware information in.

So, a first simple question. What is the range of possible costs for EC2? Let’s break this down by the “tier size” of each instance (that is, “Micro,” “Small,” “Extra Large,” etc.), then subdivide that by the purpose of the instance: is it optimized for general usage, computation, storage, GPU, or memory? I have also included some previous-generation machines in this comparison, as some RDS options (e.g. Oracle) still appear to run on previous generations of hardware. For comparison purposes we will therefore need to know the prices for these previous-generation EC2 instances.

Note that the chart has logarithmic axes and starts at $100.00/year, as the lowest-cost instance (a t2.micro) clocks in at $112.32/year ($0.013/hour). The most expensive instance is a storage-optimized i2.8xlarge at a breathtaking $58,924.80/year ($6.82/hour). I suppose you get what you are paying for with 32 vCPUs, 244 GiB of memory, and eight 800 GB SSDs. So you have a huge range of pricing to choose from, depending on what you are interested in doing. Well, what kinds of tasks would we like to do? Which hardware instance is the best choice for us if we were to evaluate purely based on instance cost? More specifically:
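As a rough sketch of how the hourly and yearly figures relate: the yearly numbers quoted here are consistent with multiplying the hourly price by 8,640 hours (a 360-day year). That multiplier is an inference from the quoted prices, not something stated in the pricing data, and the function name below is purely illustrative.

```python
# Assumption: 8,640 hours per year (360 days x 24 hours). This is inferred
# from the quoted figures ($0.013/hour -> $112.32/year), not stated anywhere.
HOURS_PER_YEAR = 360 * 24


def yearly_cost(hourly_price):
    """Annualize an On-Demand hourly price, rounded to whole cents."""
    return round(hourly_price * HOURS_PER_YEAR, 2)


print(f"t2.micro:   ${yearly_cost(0.013):,.2f}/year")
print(f"i2.8xlarge: ${yearly_cost(6.82):,.2f}/year")
```

A full 365-day year (8,760 hours) would give slightly higher totals, so treat the multiplier as a parameter if you want to reproduce the numbers under a different convention.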

  1. What price are we paying on a yearly basis for vCPU?
  2. What price are we paying on a yearly basis for ECUs?
  3. What price are we paying on a yearly basis for memory per GiB?
  4. What price are we paying for instance-attached storage?
    • This is a little bit more challenging because some Amazon instances don’t come with any built-in storage and you need to rely on EBS. In some cases EBS might actually be superior for your purposes, but that is a longer discussion. Also, instances vary in the size, type, and number of disks they include. E.g. a compute-optimized instance like a c3.large comes with two 16 GB SSDs, while storage-optimized instances like a d2.xlarge come with three 2 TB standard HDDs (spinning disk). It’s rather hard to compare these numerically, so for instances with internal storage I took a simpler approach and multiplied the number of disks by the capacity of each disk to get a total.
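The per-unit questions above boil down to dividing the yearly cost by a quantity. Here is a minimal sketch using the i2.8xlarge figures quoted earlier (32 vCPUs, 244 GiB of memory, eight 800 GB SSDs, $6.82/hour). The function and key names are hypothetical, the 8,640-hour year is an assumption inferred from the quoted yearly prices, and ECUs are left out because the ECU count for this instance is not given in the text.

```python
HOURS_PER_YEAR = 360 * 24  # assumption inferred from the quoted yearly prices


def per_unit_costs(hourly_price, vcpus, memory_gib, disk_count, disk_size_gb):
    """Yearly cost per vCPU, per GiB of memory, and per GB of instance storage."""
    yearly = hourly_price * HOURS_PER_YEAR
    # Total instance storage = number of disks x capacity of each disk.
    storage_gb = disk_count * disk_size_gb
    return {
        "per_vcpu": yearly / vcpus,
        "per_memory_gib": yearly / memory_gib,
        "per_storage_gb": yearly / storage_gb,
    }


i2 = per_unit_costs(6.82, vcpus=32, memory_gib=244, disk_count=8, disk_size_gb=800)
for name, value in i2.items():
    print(f"{name}: ${value:,.2f}/year")
```

Keeping the disk count and disk size as separate arguments mirrors the "disks times capacity" simplification, and it deliberately ignores whether the disks are SSD or HDD.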

This visualization shows the relative costs for all four of these criteria, broken down by the Tier Size, Purpose, Generation, and instance type. Note that vCPU and ECU yearly costs are shown on the same dual-axis chart, while memory and storage costs each have their own separate pane on the chart. A few caveats:

  1. Some instance types have the number of ECUs marked as “Variable.” I left these as NULL in my source data because you can’t compare them to the other instances, which do have published numbers of ECUs. Tableau marks this as $0 in the chart.
  2. Some instance types are marked as “EBS Only” for storage. I left these as NULL values. SSDs and HDDs are marked differently.
  3. vCPUs are green and ECUs are red. Increasing intensity of color on any chart means a greater total amount of whatever is being measured. E.g., an m4.10xlarge has a total of 40 vCPUs, and so has a very dark green circle. Lower end-machines tend to have low quantities of memory, disk, and vCPUs/ECUs and therefore have relatively faint circles.
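If you are working from the raw numbers rather than the chart, one way to keep the NULL caveats honest is to return no value at all when the quantity is unpublished, instead of letting it collapse to $0. This is only an illustrative sketch; the function name is hypothetical, and the d2.xlarge figures are the yearly price and disk layout quoted in this post.

```python
def cost_per_unit(yearly_cost, quantity):
    """Yearly cost per unit, or None when the quantity is not published.

    None stands in for the NULLs in the source data ("Variable" ECUs or
    "EBS Only" storage), so those instances are skipped rather than
    showing up as a misleading $0.
    """
    if quantity is None:
        return None
    return yearly_cost / quantity


# An instance whose ECU count is listed as "Variable": no per-ECU cost.
print(cost_per_unit(112.32, None))

# d2.xlarge instance storage: three 2,000 GB HDDs at $5,961.60/year.
print(cost_per_unit(5961.60, 3 * 2000))
```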

A few reflections:

  1. The lower-end machines (Mediums and below) tend to be lower than average on vCPU and ECU costs. They bounce around a bit on memory costs, and aren’t that relevant for storage costs because, with a couple of exceptions, they generally use EBS.
  2. Amazon really means it when they label their instance types.
    1. For example, the average storage cost is $31.44/GB/year. Storage costs for Storage Optimized instances are much lower: as low as $0.99/GB/year for spinning disk and $9.21/GB/year for SSD. Amazon is charging a roughly 10x premium for SSD versus spinning disk. However, regular HDDs are only available in Storage Optimized Extra Large instances, which otherwise are fairly expensive. The least expensive Storage Optimized option with regular HDDs is a d2.xlarge, at $5,961.60/year. You could get 10 m3.mediums for that (although they would still have much less storage).
    2. The average vCPU cost/year is $738.22. However, you can get down to roughly 2/3 of that cost (~$450/vCPU/year) if you go with any of the Compute Optimized instances. On the other hand, if you do not choose wisely and end up running CPU-heavy workloads on an extra-large Storage Optimized instance, you will end up paying between $1,325 and $1,840 per vCPU/year.
    3. Memory costs follow the same pattern. Memory Optimized instances are as low as $99.15/GiB/year, which is roughly half of the average of $193.02/GiB/year. Choose a Compute Optimized or, even worse, a GPU-optimized instance, and you could end up paying between $240 and $375/GiB/year.
  3. Generation 3 of Compute-Optimized servers have SSDs, but Generation 4 only allows EBS connections. Seems like Amazon wants to encourage users of Compute-Optimized systems to move to EBS.
  4. vCPUs and ECUs roughly track each other by some coefficient. vCPUs vs ECUs and measurement of computing performance on any virtual machine when you are intentionally isolated from the underlying hardware is a whole other discussion — I won’t get into details here — but whoever is doing pricing and hardware allocation for Amazon is clearly trying to lay out some consistent price/performance relationship between the two.

Now, let’s look at the overall costs for EC2 instances broken down by purpose, tier size, generation, and instance type. If you are just trying to price out a few instances this is probably a more interesting chart for you.

 

I don’t think this chart has any huge surprises, although it is interesting to see how radically instance costs vary. Small and medium instances range from $112.32/year (t2.micro) to $578.88/year (m3.medium). On the other hand, high-end systems like an m4.10xlarge can set you back $21,772.80/year, and the largest storage-optimized instances can cost between $47,692 and $58,924.80/year!

Final conclusions:

  1. An instance is not an instance is not an instance. In other words, there is a huge differential between the smallest and the largest instances in terms of horsepower (in many different metrics) and cost.
  2. If you were to sit down and price out what real hardware would cost, Amazon would not always be competitive, particularly if you know you are going to keep doing what you are doing for the long run. However, you’re not just paying for hardware here. You’re also paying for the labor of setting up and configuring everything at the hardware level: racks, much (although not all) of the network installation and configuration, and redundancy if you want it. And you are getting that at a much more efficient cost than any but the largest enterprises could manage. You’re also getting quicker deployment than you could ever do in-house, and the ability to change your mind in a week or a month and pay only for what you have used.
  3. It pays to be very aware of what real processing needs your application has (CPU, disk storage, and/or memory utilization) and to choose instances carefully. Amazon costs for a more expensive extra-large instance could easily exceed $100/day.

Thoughts for a future blog post:

  1. Relative network performance. I did not factor this into instance choice. Lower-end instances have less capable network interfaces.
  2. RDS vs. base EC2 costs. What is the premium that you are paying for using RDS to provision your databases for you versus doing the work and maintenance yourself? And what are you saving?
  3. Varying costs of different regions. Not all regions are priced the same. For example, a t2.micro in the Asia Pacific (Tokyo) region is $0.02/hour, and an i2.8xlarge is $8.004/hour. Compare to U.S. West (Oregon), in which the t2.micro is $0.013/hour and the i2.8xlarge is $6.82/hour. Tokyo commands a ~54% premium for the t2.micro and a ~17% premium for the i2.8xlarge.
  4. Relative impact of OS licensing on EC2 prices. What premium is being paid to license Windows vs. RHEL vs. base Linux? I purposefully used Linux/Unix for this analysis because Amazon’s licensing cost for it is presumably zero, which lets us isolate the hardware costs.
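For the regional comparison in item 3, the premium is just the ratio of the two hourly prices minus one. A small sketch, assuming the Tokyo t2.micro price is read as $0.02/hour and that the 8xlarge storage-optimized instance is the i2.8xlarge priced at $6.82/hour in Oregon (the function name is illustrative):

```python
def premium_pct(price, baseline):
    """Percentage premium of `price` over `baseline`."""
    return (price / baseline - 1) * 100


# Tokyo vs. Oregon hourly prices quoted in item 3 above.
print(f"t2.micro:   {premium_pct(0.020, 0.013):.0f}% premium")
print(f"i2.8xlarge: {premium_pct(8.004, 6.82):.0f}% premium")
```

Because the t2.micro prices are so small, rounding on the pricing page alone can move its premium by several percentage points, so the larger instance is probably the more reliable indicator of the regional markup.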