60% Infrastructure Cost Reduction for a Growing SaaS Startup

Kevin Brown

Client: Series B SaaS Startup
Industry: Software / SaaS
Project Type: Cost Optimization
Duration: 3 months

Overview

We reduced a SaaS startup's monthly AWS bill from $180,000 to $72,000, a 60% reduction, without sacrificing performance or reliability. The $1.3M in annual savings extended their runway by eight months and eliminated the need for the engineering layoffs the board had been discussing.

The Challenge

The client was a Series B SaaS company with 50 engineers and a product that had found strong market fit. Growth was good. The problem was that their AWS bill was growing faster than their revenue.

When I came in, they were spending $180,000 per month on AWS. That number had doubled in the past year, and nobody could explain why. The finance team had flagged it. The board was asking hard questions. The CEO was starting to talk about cutting engineering headcount to extend runway—the opposite of what a growing company should be doing.

The engineering team knew the bill was too high but didn’t have time to investigate. They were shipping features as fast as they could. Cost optimization kept getting deprioritized. And honestly, nobody had visibility into what was actually driving the spend. The AWS bill was a wall of line items that nobody understood.

The constraints were clear: we couldn’t slow down feature development, we couldn’t sacrifice performance (customers were paying for an SLA), and we needed to show results quickly. The board wanted to see progress within 90 days.

The Results

Three months later, the numbers told the story:

  • Monthly spend dropped from $180,000 to $72,000: a 60% reduction.
  • Annual savings of $1.3M: extended runway by eight months at the current burn rate.
  • Zero degradation in performance: P99 latency actually improved slightly, since right-sizing reduced noisy-neighbor effects.
  • No reduction in reliability: uptime held at 99.95% throughout and after the optimization work.

The breakdown of where the savings came from:

[Chart: savings breakdown by optimization category]

The engineering layoffs never happened. Instead, the company used part of the savings to hire two additional engineers.

The Approach

The first two weeks were pure discovery. I ran AWS Cost Explorer reports going back 12 months, audited resource utilization across every account, and checked the state of their tagging. The tagging audit was revealing — 40% of their spend couldn’t be attributed to any team or product because the resources weren’t tagged.
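
For readers who want to run the same kind of audit, here is a minimal sketch of the attribution query using boto3's Cost Explorer client. The tag key `team` and the date range are illustrative assumptions; untagged spend shows up under an empty tag value.

```python
import boto3

# Cost Explorer is a global service; the client region is effectively ignored.
ce = boto3.client("ce")

# Monthly unblended cost for the trailing year, grouped by the (assumed) "team" tag.
response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-01-01", "End": "2025-01-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "team"}],
)

for period in response["ResultsByTime"]:
    for group in period["Groups"]:
        tag = group["Keys"][0]  # e.g. "team$payments"; "team$" means untagged
        cost = float(group["Metrics"]["UnblendedCost"]["Amount"])
        print(period["TimePeriod"]["Start"], tag, f"${cost:,.2f}")
```

Summing the "team$" rows against the total is exactly the kind of check that surfaced the 40% unattributed spend.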

The analysis surfaced clear patterns of waste: zombie resources nobody was using, instances sized on guesswork rather than utilization data, dev environments running around the clock, and nightly batch jobs paying on-demand prices.

I structured the work in three phases:

  1. Weeks 1-2: Quick wins
    Delete the zombies. This is the easiest money you will ever save. We recovered $9,000/month just by cleaning up unused resources (a sweep sketch follows this list). No risk, no code changes, immediate impact.
  2. Weeks 3-6: Reserved capacity and right-sizing
    For workloads with predictable usage patterns, we purchased 1-year reserved instances. For everything else, we right-sized based on actual utilization data. This phase required more analysis but was still low-risk.
  3. Weeks 7-12: Architecture changes
    The bigger wins required code and infrastructure changes: moving batch processing to spot instances, consolidating dev environments, and implementing auto-scaling policies that actually worked.
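
What phase 1 looked like in practice, as a minimal sketch: flag unattached EBS volumes and unassociated Elastic IPs with boto3. The resource types shown are illustrative; a real sweep would cover more categories and should report findings before deleting anything.

```python
import boto3

ec2 = boto3.client("ec2")

# Unattached EBS volumes: status "available" means no instance is using them,
# but they keep billing every month.
volumes = ec2.describe_volumes(
    Filters=[{"Name": "status", "Values": ["available"]}]
)["Volumes"]
for vol in volumes:
    print(f"unattached volume {vol['VolumeId']}: {vol['Size']} GiB")

# Elastic IPs with no association are pure waste.
for addr in ec2.describe_addresses()["Addresses"]:
    if "AssociationId" not in addr:
        print(f"unassociated EIP {addr['PublicIp']}")
```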

The resistance came from the right-sizing phase. Engineers were nervous about smaller instances. “What if we get a traffic spike?” We addressed this by load testing the right-sized configuration before making changes. In every case, the smaller instances handled the load fine. The previous sizing had been based on guesses, not data.
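
The load tests settled the argument, but the initial shortlist came from utilization data. A minimal sketch of that check, assuming a hypothetical instance ID: pull two weeks of hourly CPU maxima from CloudWatch and see how far below capacity the real peak sits. An instance whose worst hour never leaves the low double digits has obvious headroom.

```python
from datetime import datetime, timedelta, timezone

import boto3

cw = boto3.client("cloudwatch")
INSTANCE_ID = "i-0123456789abcdef0"  # hypothetical instance ID

now = datetime.now(timezone.utc)
stats = cw.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": INSTANCE_ID}],
    StartTime=now - timedelta(days=14),
    EndTime=now,
    Period=3600,  # hourly buckets
    Statistics=["Maximum"],
)

datapoints = stats["Datapoints"]
if datapoints:
    peak = max(dp["Maximum"] for dp in datapoints)
    print(f"worst hourly CPU peak over two weeks: {peak:.1f}%")
```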

The Solution

Beyond the immediate cost cuts, we built systems to prevent the problem from recurring:

  • Visibility tooling: We implemented IBM Kubecost, a Kubernetes cost monitoring tool that attributes spend to namespaces and workloads, and set up custom CloudWatch dashboards for non-containerized resources. Every team could now see exactly what their services cost. Accountability changed behavior: teams started optimizing on their own once they could see the numbers.
  • Automated scaling: Auto-scaling policies were either missing or misconfigured, so we implemented target tracking scaling for the application tier and scheduled scaling for the dev environments, scaling to zero on nights and weekends (see the target-tracking sketch after this list).
  • Spot instances for batch: Their data pipeline ran nightly batch jobs that were perfect for spot instances (fault-tolerant, flexible on timing, and not customer-facing). Moving these workloads to spot cut that portion of the bill by 70% (see the spot sketch below).
  • Governance: We established tagging standards and enforced them through AWS Service Control Policies; new resources without proper tags could not be created (see the policy sketch below). Monthly cost review meetings became part of the engineering rhythm, with each team responsible for their spend.
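
A minimal sketch of the target-tracking piece, assuming the application tier runs in an EC2 Auto Scaling group; the group name and the 50% CPU target are hypothetical, and the actual policies were tuned per service.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Track 50% average CPU across the group: the ASG scales out above the
# target and back in below it, with no step thresholds to maintain.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="app-asg",  # hypothetical ASG name
    PolicyName="app-cpu-target-tracking",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
)
```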
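
For the batch migration, the simplest way to express "run this on spot" is the instance market options on a standard launch call. A sketch with a hypothetical AMI and instance type; this is not the client's actual pipeline code, which also had to tolerate the two-minute spot interruption notice.

```python
import boto3

ec2 = boto3.client("ec2")

# Launch the nightly batch worker as a spot instance; the discount comes
# in exchange for giving up the on-demand availability guarantee.
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical batch-worker AMI
    InstanceType="c5.2xlarge",        # hypothetical instance type
    MinCount=1,
    MaxCount=1,
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {"SpotInstanceType": "one-time"},
    },
)
```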
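
The tag enforcement itself can be a Service Control Policy that denies resource creation when the required tag is missing from the request. A sketch using boto3's Organizations API, scoped to EC2 instance launches with an assumed tag key of `team`; the real standards covered more services and tag keys.

```python
import json

import boto3

org = boto3.client("organizations")

# Deny launching EC2 instances unless a "team" tag is supplied at creation.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Deny",
        "Action": "ec2:RunInstances",
        "Resource": "arn:aws:ec2:*:*:instance/*",
        "Condition": {"Null": {"aws:RequestTag/team": "true"}},
    }],
}

org.create_policy(
    Name="require-team-tag",
    Description="Block untagged EC2 instance launches",
    Type="SERVICE_CONTROL_POLICY",
    Content=json.dumps(policy),
)
```

The created policy still has to be attached to the relevant OUs with `attach_policy` before it takes effect.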

The dev environment consolidation deserves special mention. Eight separate environments became two shared environments with namespace isolation, a 75% reduction in infrastructure footprint. Developers could still work independently, but the compute, database, and networking costs dropped proportionally. Scheduled scaling meant those environments cost almost nothing outside business hours; a minimal sketch of that schedule follows.
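
Assuming the dev environments sit behind an EC2 Auto Scaling group (the name `dev-asg` and the capacities are hypothetical), the schedule is two cron-style actions in UTC: drop to zero every weekday evening, restore every weekday morning, and stay down over the weekend.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Scale the dev environment to zero every weekday evening...
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="dev-asg",  # hypothetical ASG name
    ScheduledActionName="dev-nightly-shutdown",
    Recurrence="0 20 * * 1-5",       # 20:00 UTC, Monday-Friday
    MinSize=0, MaxSize=0, DesiredCapacity=0,
)

# ...and bring it back before the workday starts. After Friday's shutdown
# it stays at zero until Monday morning.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="dev-asg",
    ScheduledActionName="dev-morning-startup",
    Recurrence="0 7 * * 1-5",        # 07:00 UTC, Monday-Friday
    MinSize=1, MaxSize=4, DesiredCapacity=2,  # hypothetical capacities
)
```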

Key Takeaways

  • Start with visibility: You can't optimize what you can't measure. The tagging audit and cost attribution work was unsexy but essential. Once teams could see their costs, optimization happened organically.
  • Quick wins build credibility: Cleaning up unused resources in the first two weeks saved $9,000/month and proved we were making progress. That credibility made the harder conversations about architecture changes easier.
  • Right-sizing fears are usually unfounded: Engineers oversize instances because they're worried about the unknown. Actual utilization data almost always shows massive headroom. Load test to prove it, then make the change.