5 steps to prevent overspending in the cloud


Mohammad Abulhouf, General Manager, Saudi & Bahrain at Nutanix, gives organizations 5 tips to regain control of and optimize cloud spending

As organizations mature their multicloud strategies and move more of their workloads into the cloud, they often find themselves facing budget overruns that are hard to control. This is ironic, since one of the main reasons for moving to the cloud is the cost efficiency promised by the on-demand, pay-per-use nature of the public cloud.

Cloud costs are unpredictable, unlike traditional IT infrastructure. In a Platform-as-a-Service (PaaS) public cloud, a single application stack could be made up of many different services and resources—each with its own pricing model. Costs for compute, storage, network, or other services can vary by vendor, resource type, levels of service, and usage. On top of that, provisioning of cloud services is usually distributed across the enterprise, so managing and controlling costs in a multicloud environment can quickly get out of hand.

  1. Understand Cloud Vendor Pricing and Post Purchase Analysis

Before moving forward with a cloud vendor, make sure you understand its pricing model, including all the “hidden” costs for API calls and other transactions, so you can make a fair comparison and choose the best vendor for your use cases.

Once you start using the cloud, identify those services that are the highest contributors to your monthly spend. Rationalize usage to either justify costs or identify opportunities for optimization. Over-utilization of a service could be due to a coding error in an application. Also, since cloud services are available at the click of a mouse, some services may have been provisioned and forgotten about. These orphaned or unused resources can waste thousands of dollars. Identifying them can deliver a quick cost-savings win.
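As a rough illustration of this post-purchase analysis, the sketch below ranks services by monthly cost and flags resources that look idle. The line items, field layout, and the 1% utilization cutoff are all hypothetical; a real billing export from your provider will have its own format.

```python
from collections import defaultdict

# Hypothetical billing line items:
# (service, resource_id, monthly_cost_usd, avg_utilization_pct)
LINE_ITEMS = [
    ("compute", "vm-1", 420.0, 63.0),
    ("compute", "vm-2", 380.0, 0.5),
    ("storage", "bucket-a", 120.0, 40.0),
    ("database", "db-1", 610.0, 55.0),
    ("storage", "disk-7", 45.0, 0.0),
]


def top_services(items, n=3):
    """Return the n services with the highest total monthly cost."""
    totals = defaultdict(float)
    for service, _resource_id, cost, _util in items:
        totals[service] += cost
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


def idle_resources(items, max_util_pct=1.0):
    """Return (resource_id, cost) for resources at or below the cutoff."""
    return [(rid, cost) for _svc, rid, cost, util in items if util <= max_util_pct]


if __name__ == "__main__":
    print(top_services(LINE_ITEMS))
    print(idle_resources(LINE_ITEMS))
```

A flagged resource is only a candidate for review; whether it is truly orphaned still needs to be confirmed with its owner.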

  2. Clean Up Unused Resources and Optimize

After identifying unused resources, it’s time to clean them up. Make sure key resources are backed up before deleting anything. Then, based on the usage patterns you’ve identified, scale resources down gradually, one size at a time, checking after each step that application performance has not suffered. Continue with this incremental reduction until you reach the point where the workload runs at the desired performance on the minimum infrastructure size.
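The incremental reduction described here can be sketched as a simple loop. The size names and the `meets_target` callback are assumptions made for illustration; in practice the check would be a load test or a review of monitoring data at each size.

```python
# Hypothetical size ladder, largest first.
SIZES = ["xlarge", "large", "medium", "small"]


def right_size(current, meets_target):
    """Step down one size at a time and stop at the smallest size
    that still meets the performance target.

    meets_target is a caller-supplied function (size -> bool).
    """
    idx = SIZES.index(current)
    # Only move down if the next smaller size still performs acceptably.
    while idx + 1 < len(SIZES) and meets_target(SIZES[idx + 1]):
        idx += 1
    return SIZES[idx]


if __name__ == "__main__":
    acceptable = {"xlarge", "large", "medium"}
    print(right_size("xlarge", lambda size: size in acceptable))
```

Because the loop stops at the first size that fails the check, it never skips a step, which mirrors the cautious one-size-at-a-time approach.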

Outdated data can be deleted or moved to lower-cost storage tiers. Moving data from primary to archival storage not only reduces immediate costs but also saves money on ongoing backup and disaster recovery. Putting a data lifecycle management policy in place will help ensure that the issue doesn’t come back. For non-production workloads, consider limiting access to on-demand services. Automating the starting and stopping of workloads based on your developers’ needs will help clamp down on unnecessary or unauthorized service requests, whose costs can add up quickly.
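A data lifecycle policy can be as simple as a set of age thresholds. The sketch below is one possible shape, with hypothetical tier names and cutoffs (90 and 365 days); most providers offer their own lifecycle rules that would replace this logic.

```python
from datetime import date

# Hypothetical policy: (minimum age in days, tier), checked oldest first.
LIFECYCLE_RULES = [(365, "archive"), (90, "infrequent"), (0, "primary")]


def target_tier(last_accessed, today):
    """Return the storage tier an object should live in, based on how
    long ago it was last accessed."""
    age_days = (today - last_accessed).days
    for min_age, tier in LIFECYCLE_RULES:
        if age_days >= min_age:
            return tier
    # A last-access date in the future falls through to the primary tier.
    return "primary"


if __name__ == "__main__":
    print(target_tier(date(2023, 1, 1), date(2024, 6, 1)))
```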

Lastly, since most cloud providers offer volume discounts, consolidating multiple accounts with the same provider is an easy way to qualify for larger discounts.

  3. Assign Workloads and Budget Responsibility to Respective Business Units

A good practice is to assign workloads to the business unit (BU) or other functional area that is responsible for them. This not only enables you to charge back for services; the BU can also help with more accurate forecasting. Budgets can be created based on the business knowledge and requirements of the teams that are using the services, and BU owners can be made accountable for ensuring that resources remain optimized and within budget.
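To make the chargeback idea concrete, here is a minimal sketch that totals cost per BU. It assumes each cost record already carries a BU label (or `None` when nobody owns it); the record format and the "unassigned" bucket are illustrative choices, not a feature of any particular platform.

```python
from collections import defaultdict


def chargeback(items):
    """Total cost per business unit.

    items: iterable of (bu_name_or_None, cost).
    Costs with no owner are collected under 'unassigned' so they stay visible.
    """
    totals = defaultdict(float)
    for bu, cost in items:
        totals[bu or "unassigned"] += cost
    return dict(totals)


if __name__ == "__main__":
    records = [
        ("marketing", 100.0),
        ("finance", 250.0),
        ("marketing", 50.0),
        (None, 75.0),
    ]
    print(chargeback(records))
```

Keeping unowned spend in its own bucket makes it easy to see how much of the bill still has no accountable owner.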

  4. Establish Your Baseline for Better Capacity Planning

Now that you’ve optimized existing services, it’s possible to get a baseline for your infrastructure requirements. Based on usage patterns and the purpose of each workload, segregate workloads into categories such as stable, variable, long term, and short term.
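One way to separate stable from variable workloads is to look at how much usage fluctuates around its average. The sketch below uses the coefficient of variation (standard deviation divided by mean) with an arbitrary 0.25 cutoff; both the metric and the threshold are assumptions to tune against your own data, and the extra "idle" label is added here only to avoid dividing by zero.

```python
from statistics import mean, pstdev


def classify(usage_samples, cv_threshold=0.25):
    """Label a workload 'stable', 'variable', or 'idle' from usage samples
    (for example, hourly CPU hours)."""
    avg = mean(usage_samples)
    if avg == 0:
        return "idle"  # no usage at all; not a category from the article
    cv = pstdev(usage_samples) / avg  # relative spread around the mean
    return "stable" if cv <= cv_threshold else "variable"


if __name__ == "__main__":
    print(classify([10, 10, 11, 9]))
    print(classify([1, 20, 2, 30]))
```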

Identify performance requirements within each category. You may have workloads that need to scale out to meet certain performance metrics. You can identify such workloads and their capacity requirements by having your engineering teams perform load and performance testing.

Once you’ve identified stable workloads for each of your environments you can plan for reserving the correct level of capacity. Reserving long-term capacity can usually save you substantial money. If you have workloads running across multiple accounts on-demand in different time zones, it may be worth looking into reserving some capacity for dynamic workloads.
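Whether a reservation pays off depends on how much of the time the capacity is actually used. A simplified way to reason about it, assuming a reservation is billed for every hour while on-demand is billed only for hours used, is to compute a break-even utilization. The prices below are hypothetical, and real reservation terms (upfront fees, commitments, convertibility) vary by provider.

```python
def break_even_utilization(on_demand_hourly, reserved_hourly):
    """Fraction of hours a workload must run for a reservation to cost
    the same as on-demand (simplified model, no upfront fees)."""
    if on_demand_hourly <= 0:
        raise ValueError("on_demand_hourly must be positive")
    return reserved_hourly / on_demand_hourly


def should_reserve(expected_utilization, on_demand_hourly, reserved_hourly):
    """True when expected utilization exceeds the break-even point."""
    return expected_utilization > break_even_utilization(
        on_demand_hourly, reserved_hourly
    )


if __name__ == "__main__":
    # Hypothetical prices: 1.00 per hour on-demand, 0.60 effective reserved.
    print(break_even_utilization(1.0, 0.6))
    print(should_reserve(0.9, 1.0, 0.6))
    print(should_reserve(0.3, 1.0, 0.6))
```

Under this model, a stable workload that runs nearly all the time clears the break-even point easily, while a short-lived or bursty one usually does not.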

  5. Define Your Cost Governance Policies and Re-evaluate Regularly

You’ll want to deploy a suitable cloud management platform to help automate much of this analysis and optimization activity, as well as other routine tasks that can help further reduce costs.

For example, by leveraging the tagging capabilities of a cloud management platform, you can easily identify all “non-essential” services and set them to automatically shut down during off hours.
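The decision behind such an off-hours rule might look like the sketch below. The `essential` tag name, the weekday-only schedule, and the 08:00 to 19:00 window are all assumptions for illustration; the actual stop and start calls would go through your cloud management platform or provider API.

```python
from datetime import datetime


def should_be_running(tags, now, start_hour=8, end_hour=19):
    """Decide whether a resource should be up at the given time.

    Resources are treated as essential unless explicitly tagged
    essential=false, so untagged resources are never stopped by mistake.
    """
    if tags.get("essential", "true").lower() == "true":
        return True
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start_hour <= now.hour < end_hour


if __name__ == "__main__":
    dev_box = {"essential": "false"}
    print(should_be_running(dev_box, datetime(2024, 6, 3, 10)))  # Monday morning
    print(should_be_running(dev_box, datetime(2024, 6, 3, 22)))  # Monday night
```

Defaulting to "essential" is a deliberately conservative choice: a missing tag costs some money, while a wrongly stopped production service costs much more.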

Most cloud management platforms enable you to send cost summaries and billing alerts. Set up your system so that BU owners receive a cost summary daily. Set budget alerts to notify them when spend is getting close to their allotted budget to avoid overages.
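The logic behind a budget alert is straightforward. The following sketch assumes a hypothetical 80% warning threshold and a naive straight-line projection to month end; a management platform would typically provide both out of the box with better forecasting.

```python
def budget_status(spend_to_date, budget, warn_at=0.8):
    """Return 'ok', 'warning', or 'over_budget' for a BU's spend."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = spend_to_date / budget
    if ratio >= 1.0:
        return "over_budget"
    if ratio >= warn_at:
        return "warning"
    return "ok"


def projected_month_end(spend_to_date, day_of_month, days_in_month):
    """Straight-line projection: assumes spend continues at the same daily rate."""
    if day_of_month <= 0:
        raise ValueError("day_of_month must be at least 1")
    return spend_to_date / day_of_month * days_in_month


if __name__ == "__main__":
    print(budget_status(800, 1000))
    print(projected_month_end(500, 10, 30))
```

Checking the projection as well as the current total lets a BU owner react before the budget is actually exceeded.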

Share your cost governance policies so that each BU owner knows what actions to take for proactive cost control. While each organization is different, some basic policies could include:

  • Review cost summary daily and take action on quick wins
  • Provision infrastructure only through automation and limit access to the ability to provision new services
  • Prioritize use of reserved resources before provisioning new infrastructure
  • Standardize on the tags you use to track and manage resources or create reports
  • Ensure that tags are assigned to each provisioned resource
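The last two policies lend themselves to an automated check. The sketch below reports which required tags a resource is missing; the tag names are a hypothetical standard, so substitute whatever set your organization agrees on.

```python
# Hypothetical tagging standard; replace with your organization's own.
REQUIRED_TAGS = {"owner", "business_unit", "environment"}


def missing_tags(resource_tags):
    """Return the required tags that are absent or empty, in sorted order."""
    present = {key for key, value in resource_tags.items() if value}
    return sorted(REQUIRED_TAGS - present)


if __name__ == "__main__":
    print(missing_tags({"owner": "alice", "environment": ""}))
    print(missing_tags({"owner": "bob", "business_unit": "finance", "environment": "prod"}))
```

Running a check like this at provisioning time, rather than after the bill arrives, keeps untagged resources from accumulating.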

Define all necessary usage, spend, and other customized reports required by your organization, and develop a practice of reviewing them and taking action.

Cloud cost management and control can be an overwhelming and time-consuming task. By following this “analyze, eliminate, optimize, reserve and repeat” model, you will be set up from the beginning to keep cloud services optimized for cost-effective business agility.

To truly stay ahead with the best technology at the least cost—and avoid cloud bloat syndrome—make sure to re-evaluate your architecture, technology stack, and your vendor partner relationships on a regular basis.
