AWS Cost Optimisation: How We Cut a Customer's Bill by $7k

Last year we saved one of our clients almost $7,000 a year on their AWS bill through a simple AWS cost optimisation audit – no infrastructure overhaul needed.

Running your systems in a cloud environment is great. Instant scaling, built-in monitoring and security tools, managed services. All that makes it the first choice for almost every company that needs to deploy something more interesting than a WordPress website 🙂

Unfortunately, the ease of use that cloud environments provide can sometimes be a trap. Today you decide to deploy a 4-CPU, 16 GB RAM Windows Server, and at the end of the month you owe AWS $630 + VAT – and most of the time, no one can really explain why.

The 6 most common AWS cost mistakes

After working with over 80 customers, both building greenfield projects and maintaining legacy ones, the most common mistakes I’ve seen are:

Not reading / not understanding your AWS bill – you connect your credit card, then the development team does its magic. The app works. AWS debits your card. But if no one is scrutinising the AWS statement at the end of the month, nobody knows whether the usage is normal or abnormal. Nobody questions whether anything can be done to pay less, or which services could simply be switched off.
Not monitoring your resources and overprovisioning – developers tend to provision more resources than needed, “just in case”. Of course, it is wise to plan for the worst-case scenario. But it’s equally important to understand how likely that scenario actually is. On top of this, AWS provides many autoscaling options that can increase available resources during peak demand.
Selecting configuration options without understanding the price impact – a good example of this is the Multi-AZ option when configuring a database. If you select it, your system will be more resilient. There will always be a standby database instance in a different Availability Zone within the same region, ready to take over if the current one fails. Sounds amazing, yes? It definitely is. However, you will pay twice as much for the database, and the standby instance will sit idle the whole time.
Not using a CDN and/or cache – without them, every request hits your services directly, so they need more resources to cope. By serving content through a CDN like CloudFront and using the caching services AWS supports, fewer requests reach your application and it needs fewer resources.
Not using AWS Savings Plans – if you don’t plan to significantly change your core infrastructure setup, you should think about AWS Savings Plans. You commit to a consistent amount of usage for one or three years in exchange for lower rates. Usually, even without an upfront payment, you’ll save around 20%.
Running non-production environments 24/7 – unless your company is an empire over which the sun never sets, your tech staff works 8-10 hours a day. So the dev and staging environments can be turned off for the night (AWS bills most compute services by the hour or by the second, so you only pay while they run).
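To put a rough number on the last point, here is a minimal sketch of how much of the week a non-production environment could be switched off. The 10-hour day comes from the paragraph above; the 5-day working week and the helper name `off_hours_share` are my assumptions for illustration, not anything AWS provides.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours


def off_hours_share(hours_per_day: float = 10, days_per_week: int = 5) -> float:
    """Return the fraction of the week an environment could be switched off.

    hours_per_day and days_per_week describe when the team actually
    needs the environment (assumed values, adjust to your own schedule).
    """
    running_hours = hours_per_day * days_per_week
    return 1 - running_hours / HOURS_PER_WEEK


share = off_hours_share()
print(f"Dev/staging could be off for {share:.0%} of the week")
```

With these assumptions the environments are idle for roughly 70% of the week. This only covers compute time: storage attached to stopped instances is still billed.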

The AWS cost optimisation audit: what we found

So, keeping these 6 most common mistakes in mind, how did we save our customer $7k a year on AWS?

1. Database: r5.large MySQL instance with Multi-AZ deployment enabled

Monthly cost: $416.64 ($208.32 x 2 for Multi-AZ)
Changed to t4g.medium without Multi-AZ: $53/month
Storage reduced from 200GB ($26.60) to 25GB ($3.33)
Total savings: $386.92/month, $4,642.98/year

2. Unused EC2 server instance: c7i.large

Monthly running cost: $78.90
Monthly storage cost: 100GB at $11.60
Total savings: $90.50/month, $1,086/year

3. Switching ECS tasks architecture from x86 to ARM64

Total cost before: $205.27/month, $2,463.24/year
Total cost after: $164.27/month, $1,971.24/year
Total savings: $41/month, $492/year

4. Disabling enhanced monitoring and setting log retention limits in CloudWatch

Total cost before: $55/month, $660/year
Total cost after: $0
Total savings: $55/month, $660/year

After applying all of the above, the total savings are 386.915 + 90.5 + 41 + 55 = 573.415 USD/month, 6,880.98 USD/year.
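The sum above is easy to re-check. Here is a small sketch that adds up the four monthly savings and annualises them; the dictionary keys are labels I made up for this example, and the amounts are the monthly figures from the sum above.

```python
from decimal import Decimal

# Monthly savings in USD, taken from the sum above.
# Decimal avoids floating-point rounding noise in money arithmetic.
MONTHLY_SAVINGS = {
    "database": Decimal("386.915"),
    "unused_ec2": Decimal("90.50"),
    "ecs_arm64": Decimal("41"),
    "cloudwatch": Decimal("55"),
}


def total_savings(items):
    """Return (monthly, yearly) totals for a mapping of monthly savings."""
    monthly = sum(items.values(), Decimal("0"))
    return monthly, monthly * 12


monthly, yearly = total_savings(MONTHLY_SAVINGS)
print(f"{monthly} USD/month, {yearly} USD/year")
```

Keeping the line items in one place like this makes it simple to rerun the numbers when prices or instance sizes change.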

Step-by-step: reduce your AWS bill

If you want to perform a similar analysis, these are the steps you should start with:

Compute resources sizing – CPU. For compute resources such as EC2 instances, ECS containers and RDS databases, check the monitoring section for the respective service. Every one of them tracks CPU usage. I usually check the charts using a 3 to 6 week time range. This gives you a good overview of the actual usage over a prolonged period of time. If I see that the average CPU usage stays below 20% most of the time, that’s a strong signal to downsize.
Compute resources sizing – memory. Memory is a bit more of a challenge, since EC2 instances don’t have a memory monitoring graph by default. ECS shows used memory and RDS shows free memory. On top of that, ECS resources offer some flexibility when adjusting CPU and memory size, while for EC2 and RDS you are limited to predefined values. In general, for ECS containers I’d target around 20% free memory on average. For RDS it is better to leave more, e.g. 40%, since memory size usually matters more for database speed than CPU power.
Compute resource architecture and instance type choice – in general, ARM64-based resources are more performant (AWS claims 20% better performance). So if your code can run on ARM (Java, Python, Ruby, Node and many more can), you should choose it by default. ARM-based EC2 and RDS instances usually have a “g” for Graviton (the name of AWS’s processor) in the instance family name, e.g. t4g.micro. As for the choice of instance family, the basic difference is that “t” instances, such as the t4g mentioned earlier, suit workloads that use little CPU most of the time with only rare “spikes”. If this is the kind of load you have, then “t” instances are generally cheaper. Other instance types are more expensive but are expected to handle significant load all the time.
Storage – the important thing to remember is that it’s easy to increase and hard to decrease, so it is better to start small. To decrease the database storage size mentioned earlier, we had to stop the system entirely, migrate the data to a new instance with a smaller storage setup, and then start it again. Stopping the system is not something you really want to do, especially since AWS RDS has an option to automatically scale the storage in small increments.
Your AWS bill – last but not least, and actually where you should start your analysis. First look at the most expensive services and go through the points above to reduce the usage. Then go through the other items on the bill and always question whether a given service is needed. Once you’ve done your cleanup, monitor whether everything meets your speed/stability/cost KPIs. When you are confident in the setup, consider AWS Savings Plans to reduce the bill by roughly another 20%.
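The CPU and memory rules of thumb from the steps above can be sketched as a small check. The thresholds (20% average CPU, 20% free memory for ECS, 40% for RDS) come from the text; the function names are hypothetical, and I am assuming you have exported the percentages from the monitoring charts yourself. Using a plain average is a simplification of "most of the time".

```python
from statistics import mean

CPU_DOWNSIZE_THRESHOLD = 20.0  # percent, averaged over 3 to 6 weeks
FREE_MEMORY_TARGET = {"ecs": 20.0, "rds": 40.0}  # percent free memory


def cpu_suggests_downsizing(cpu_samples):
    """True when average CPU usage is at or below the downsize threshold."""
    return mean(cpu_samples) <= CPU_DOWNSIZE_THRESHOLD


def memory_headroom(service, free_memory_samples):
    """Average free memory minus the target for the service.

    The result is in percentage points: positive means more free memory
    than the target (room to shrink), negative means too little.
    """
    return mean(free_memory_samples) - FREE_MEMORY_TARGET[service]


# Example with made-up samples: a quiet database instance.
print(cpu_suggests_downsizing([8.0, 12.5, 15.0]))
print(memory_headroom("rds", [60.0, 70.0]))
```

A check like this is only a first filter. I’d still look at the charts for spikes before resizing anything.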

Hopefully the above guide will help you reduce your AWS bill. It is worth remembering that these are probably the simplest things you can do, even though they are quite effective. Once you apply all of the changes, set up an AWS Budgets alert for the amount you expect to pay every month. If the costs start growing again at some point, you’ll be warned early.
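If you prefer to script the budget alert, the sketch below builds the request for the AWS Budgets `CreateBudget` call as exposed by boto3. The budget name, account ID, email address and the 80% threshold are placeholders I chose for illustration, and the field names reflect the Budgets API as I understand it, so check them against the current documentation before relying on this.

```python
def monthly_budget_request(account_id, limit_usd, email, threshold_pct=80.0):
    """Build keyword arguments for the AWS Budgets CreateBudget call."""
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": "monthly-aws-cost",  # hypothetical name
            "BudgetLimit": {"Amount": f"{limit_usd:.2f}", "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [
            {
                # Notify when the forecast crosses threshold_pct of the limit.
                "Notification": {
                    "NotificationType": "FORECASTED",
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": threshold_pct,
                    "ThresholdType": "PERCENTAGE",
                },
                "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
            }
        ],
    }


def create_monthly_budget(account_id, limit_usd, email):
    """Send the request. Needs boto3 installed and AWS credentials configured."""
    import boto3  # imported here so the payload builder has no dependencies

    request = monthly_budget_request(account_id, limit_usd, email)
    return boto3.client("budgets").create_budget(**request)


# Placeholder values only; nothing is sent to AWS here.
request = monthly_budget_request("123456789012", 500, "ops@example.com")
print(request["Budget"]["BudgetLimit"])
```

Building the payload separately from the API call keeps it easy to review before anything is created in your account.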

If you’d rather have someone else do the digging, we’re happy to help – get in touch with us!

Disclaimer – we based all calculations on official AWS pricing for the London region at the time the article was published.