Last month I watched a VP of Engineering present a beautiful cost management slide deck to his board. Twelve charts. Three trend lines. A waterfall showing allocation by business unit. Real pretty.
Their AWS bill had gone up 34% in the same quarter.
He wasn't managing costs. He was performing cost management. There's a difference, and it's the reason most companies burn cash while congratulating themselves on their FinOps maturity.
Here's what "cost management" looks like at most companies:
1. Someone sets up Cost Explorer or a third-party tool
2. A weekly report gets emailed to a distribution list nobody reads
3. Finance asks engineering "why did costs go up?" once a quarter
4. Engineering says "we're growing" and everyone nods
5. Repeat
That's not cost management. That's cost observation with extra steps.
Real cost management happens at the decision point — the moment an engineer chooses an instance type, an architect picks a storage tier, a PM scopes a feature that requires 40 new Lambda functions. By the time it shows up in your dashboard, the money is already spent.
I've seen this pattern at least 50 times. The companies that actually control their cloud spend aren't the ones with the fanciest dashboards. They're the ones who changed how decisions get made.
Let me break this down with numbers, because I'm tired of vague advice.
At a typical Series B startup spending $80K-$150K/month on AWS, here's where cost overruns actually originate:
Architecture decisions (45-55% of waste): Someone chose RDS Multi-AZ for a staging database. Someone deployed a Kubernetes cluster for three microservices. Someone picked io2 EBS volumes because "we might need the IOPS later." These decisions compound monthly, and nobody revisits them.
Default configurations (25-30% of waste): The Terraform module uses m5.2xlarge because that's what the template says. CloudWatch retention is set to "never delete" because nobody thought about it. The NAT Gateway is processing 4TB/month of traffic that could route through a VPC endpoint. Defaults are where money goes to hide.
Orphaned experiments (15-20% of waste): The proof-of-concept from Q3 is still running. The load test infrastructure never got torn down. Three developers each have their own EKS cluster in the dev account. Nobody's job is to clean this up, so nobody does.
Notice what's not on this list? "We need more compute because the business is growing." That's the excuse, but it's almost never the real answer.
Here's what I actually do when I start working with a team. Forget the dashboard tour. I ask three questions:
1. "Who approved your last infrastructure change, and did cost come up?"
Nine times out of ten, the answer is either "nobody approved it" or "yes, someone approved it, but cost wasn't part of the review." Both answers tell me the same thing: cost is not part of the decision-making process.
If cost isn't in the PR review, the architecture review, the sprint planning — then it doesn't exist as a factor. Full stop.
2. "When was the last time you downgraded something?"
Upgrades happen naturally. Traffic spikes, someone bumps the instance size, and the ticket gets closed. But the reverse almost never happens. Traffic normalizes, but the instance stays big. The database got upgraded during a crunch, but nobody scheduled the right-sizing afterward.
Companies that manage costs well have a regular cadence of downgrades. Not as a cost-cutting exercise — as basic hygiene. If you've never downgraded anything, you're not managing costs.
3. "Can any engineer tell me what their service costs to run?"
Not the VP. Not the FinOps team. The engineer who owns the service. If they can't answer within an order of magnitude, you have a knowledge problem, not a tooling problem.
I'm going to save you the consultant-speak. Here are the three changes I've seen move the needle more than any tool purchase:
Put cost in front of engineers at the point of change. Not as a gate, as a signal. When a PR adds a new resource, show the estimated monthly cost in the PR comment. When a deploy increases the resource footprint, flag it. Engineers don't ignore data that's right in front of them. They ignore data that's in a separate dashboard they have to go find.
Tools like Infracost do this. It's not hard. The hard part is actually setting it up instead of talking about setting it up.
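Here's a minimal sketch of what that PR signal might look like, in Python. To be clear, this is not Infracost's API or output format. `ResourceChange`, `format_cost_comment`, and the dollar figures are all made up for illustration, and the sketch assumes something upstream has already produced before and after monthly estimates for each resource.

```python
from dataclasses import dataclass


@dataclass
class ResourceChange:
    """One resource touched by a pull request (hypothetical structure)."""
    name: str
    monthly_before: float  # estimated USD/month before the change (0 if new)
    monthly_after: float   # estimated USD/month after the change (0 if removed)


def format_cost_comment(changes: list[ResourceChange]) -> str:
    """Build a plain-text PR comment summarizing the estimated monthly cost delta."""
    lines = ["Estimated monthly cost impact:"]
    total = 0.0
    for change in changes:
        delta = change.monthly_after - change.monthly_before
        total += delta
        lines.append(f"  {change.name}: {delta:+,.2f} USD/month")
    lines.append(f"  Total: {total:+,.2f} USD/month")
    return "\n".join(lines)


# Illustrative numbers only, not real AWS prices.
example_changes = [
    ResourceChange("aws_db_instance.staging", 0.0, 310.0),
    ResourceChange("aws_instance.worker", 280.0, 140.0),
]
comment = format_cost_comment(example_changes)
print(comment)
```

The point of the sketch is the placement, not the math: the number lands in the review the engineer is already reading.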
Not a quarterly initiative. Not an annual "cost optimization sprint." A monthly, 30-minute meeting where you look at the top 20 resources by spend and ask: "Is this still the right size?"
Keep a shared doc. Track what you reviewed and what you changed. Make it boring. Make it routine. That's how it becomes culture instead of a project.
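If you want to build the agenda for that meeting from a billing export, something like this would do. It's a sketch: the resource names and dollar amounts are hypothetical, and it assumes you already have spend aggregated per resource from whatever billing data you use.

```python
import heapq


def top_resources_by_spend(
    spend_by_resource: dict[str, float], n: int = 20
) -> list[tuple[str, float]]:
    """Return the n most expensive resources, highest spend first."""
    return heapq.nlargest(n, spend_by_resource.items(), key=lambda item: item[1])


# Hypothetical monthly spend in USD, keyed by resource name.
monthly_spend = {
    "rds-prod-primary": 4200.0,
    "eks-dev-cluster-3": 950.0,
    "nat-gateway-main": 1800.0,
    "ec2-batch-worker": 2600.0,
}

# n=3 keeps the example short; the review described above would use the default of 20.
agenda = top_resources_by_spend(monthly_spend, n=3)
for name, cost in agenda:
    print(f"{name}: ${cost:,.0f}/month. Is this still the right size?")
```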
Your staging environment does not need to mirror production. I will die on this hill.
I've seen companies running $15K/month staging environments that get used 8 hours a day, 5 days a week. That's 24% utilization. Schedule it. Scale it down. Use smaller instances. Nobody's running load tests on staging at 3 AM on Saturday.
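The arithmetic behind that utilization figure is worth spelling out. This sketch assumes, as a simplification, that the whole bill scales with running hours. Storage and other always-on charges don't go away when you stop instances, so treat the idle figure as an upper bound on what scheduling could save, not a promise.

```python
HOURS_PER_WEEK = 24 * 7  # 168


def utilization(hours_per_day: float, days_per_week: float) -> float:
    """Fraction of the week the environment is actually in use."""
    return hours_per_day * days_per_week / HOURS_PER_WEEK


def idle_spend(monthly_cost: float, used_fraction: float) -> float:
    """Monthly spend attributable to idle hours, assuming cost is purely hourly."""
    return monthly_cost * (1 - used_fraction)


used = utilization(hours_per_day=8, days_per_week=5)        # 40 / 168
idle = idle_spend(monthly_cost=15_000, used_fraction=used)  # spend while nobody is using it

print(f"Utilization: {used:.0%}")
print(f"Spend during idle hours: ${idle:,.0f}/month")
```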
The "but we need parity" argument falls apart when you realize most staging bugs aren't caused by instance size differences. They're caused by data differences, configuration differences, or timing issues that exist regardless of whether you're running m5.xlarge or m5.large.
Stop tracking "cost per month" as your primary FinOps metric. It's almost useless without context, and it makes cost look like a thing that happens to you instead of a thing you choose.
Track cost per decision. How much does each architectural choice add to the monthly bill? How much did that "temporary" workaround cost over the six months it stayed in production? What's the ongoing cost of the default configuration nobody questioned?
When you frame costs as decisions, people start making better decisions. When you frame costs as line items on a bill, people just feel guilty about the number and move on.
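One way to make "cost per decision" concrete is a small decision log. This is a sketch under my own assumptions: the `Decision` structure, the entries, and every dollar figure are hypothetical, and cumulative cost is just monthly cost times months in production.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """One architectural choice and what it adds to the bill (hypothetical structure)."""
    description: str
    monthly_cost: float  # USD/month this choice adds
    months_active: int   # how long it has been in production

    @property
    def cumulative_cost(self) -> float:
        return self.monthly_cost * self.months_active


# Illustrative entries, not real data.
decisions = [
    Decision("Multi-AZ on the staging database", 300.0, 9),
    Decision("'Temporary' workaround still in production", 1200.0, 6),
    Decision("io2 volumes chosen just in case", 400.0, 12),
]

# Rank by what each decision has cost so far, most expensive first.
ranked = sorted(decisions, key=lambda d: d.cumulative_cost, reverse=True)
for decision in ranked:
    print(f"{decision.description}: ${decision.cumulative_cost:,.0f} to date")
```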
Cost management isn't a dashboard. It isn't a weekly email. It isn't a quarterly review.
Cost management is what happens when an engineer is about to spin up a new service and thinks — before they write the Terraform — "what's this going to cost, and is that the right tradeoff?"
If you're not changing how decisions get made, you're not managing costs. You're just watching them.
---
Sam Greene is a FinOps practitioner and founder of FinOps Fanatics. He helps startups stop bleeding cloud cash by fixing the decisions that cause it, not just the dashboards that report it. Got a cost management horror story? Connect with Sam on LinkedIn — he's collecting them.