The meteoric rise of cloud-based infrastructure and services has ushered in an era of convenience and flexibility for companies looking to quickly stand up solutions without massive capital expenditure. Databases, in particular, have seen a major shift towards fully-managed cloud offerings like AWS RDS and Azure SQL. But behind the ease-of-use and simplicity of these on-demand databases lies an inconvenient truth - the costs can quickly spiral out of control.
In this article, we'll analyze the shift away from cloud databases and back towards on-premise installations as companies scale up. We'll look at specific examples of runaway cloud database bills, and highlight simple yet powerful on-prem database solutions like PostgreSQL. While cloud databases offer undeniable advantages, understanding their pricing model and limited customisability is key. We'll equip technical leaders and decision makers with the knowledge needed to balance convenience and control when it comes to one of a company's most critical assets - its data.
The Allure and Hidden Costs of Cloud Databases
The rise of infrastructure-as-a-service (IaaS) and platform-as-a-service (PaaS) over the past decade promised a revolution in convenience and flexibility. Suddenly, teams could spin up cloud servers, storage, and fully-managed databases with just a few clicks and a credit card. The days of procuring hardware, hiring specialized DBAs, and maintaining on-premise datacenters seemed over.
This newfound agility enabled startups and enterprises alike to quickly test ideas and stand up MVPs without massive capital expenditure. And databases saw some of the fastest adoption of cloud-based offerings. Services like Amazon RDS, Azure SQL, and Google Cloud SQL made it trivial to spin up production-grade managed databases in minutes. The benefits were undeniable:
- No infrastructure procurement and management
- Automated patches, upgrades, and redundancy
- Easy scaling and clustering
- Flexible pay-as-you-go pricing
However, convenience seldom comes for free. As engineering teams happily handed off database management to AWS and others, the bills started piling up. Under the pay-as-you-go model, cloud database costs grow in step with utilization, so every extra instance-hour and gigabyte shows up on the invoice. Many teams found themselves with monthly bills of $25,000 to $50,000 for medium-sized production databases. Cost efficiency is not the cloud providers' topmost priority.
What's worse, the limited customisability of cloud databases and their reliance on proprietary technology stacks meant there was little teams could do to optimize these costs. They were locked into the cloud provider's predefined instance types and storage/IO options. For fast-growing startups and companies nearing IPO, the perceived predictability of pay-as-you-go pricing was welcome at first. But over time, the convenience came at a quickly rising and often surprisingly unpredictable price.
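To make the pay-as-you-go arithmetic concrete, here is a minimal sketch of a usage-based bill. The function name and all three rates are hypothetical placeholders chosen for illustration; they are not actual prices from AWS or any other provider.

```python
def monthly_cost(instance_hours, storage_gb, io_millions,
                 rate_per_hour=0.50, rate_per_gb=0.12,
                 rate_per_million_io=0.20):
    """Estimate a monthly bill as the sum of metered usage times unit rates.

    All rates are hypothetical placeholders, not real provider pricing.
    """
    compute = instance_hours * rate_per_hour
    storage = storage_gb * rate_per_gb
    io = io_millions * rate_per_million_io
    return compute + storage + io


# One instance running all month (about 720 hours), 500 GB, 100M I/O requests.
baseline = monthly_cost(720, 500, 100)

# Doubling every usage figure doubles the bill: nothing gets cheaper per unit.
doubled = monthly_cost(1440, 1000, 200)
```

Under these assumptions the bill simply tracks metered usage, with no volume discount. The only levers left are using less or switching to another of the provider's predefined options.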
Horror Stories from the Cloud Database Trenches
When teams offloaded database management to AWS and other cloud providers, most didn't fully grasp the costs they were signing up for. Over time, inconvenient truths started emerging as monthly cloud database bills spiraled out of control.
Take for example Farmlogs, a startup providing crop planning and analytics solutions to farmers. They relied extensively on AWS RDS and DynamoDB to manage sensor readings and agricultural data analysis for customers. At first, the ease-of-use and automation enabled them to grow rapidly. But co-founder Jesse Vollmar recounted their gut-wrenching realization:
"Our AWS bills were getting out of hand. At one point we had 20 different RDS instances, both SQL and NoSQL databases, that were dedicated to individual microservices and data pipelines. Our monthly AWS bill peaked around $38,000 per month and was rising."
"The unpredictability kicked us hard too. One month we saw a sudden spike of $12,000 in unexpected DynamoDB overages due to a runaway process. We were locked into AWS’s pricing and instance types regardless of our actual needs."
And Farmlogs is not alone - unexpected spikes in cloud database bills ranging from $10,000 to $50,000 per month are all too common among startups as they scale. Because the bill follows usage, a single runaway process or traffic surge can make costs jump rather than grow smoothly and predictably.
Even large enterprises hit roadblocks. Ride-sharing giant Lyft shared how their AWS bills doubled year over year to over $17 million as they neared IPO. Most concerning, over 35% of this spend went towards AWS services like RDS that couldn't be optimized, giving them little control.
Perhaps the most shocking statistics come from a 2019 survey by GitLab. They found that among their enterprise customers, the median monthly spend on databases and storage was a whopping $390,000 on AWS! And 21% reported spends exceeding $1 million per month purely on backend data infrastructure in the cloud.
The turnkey convenience of cloud databases comes at a steep cost that keeps growing over time. And most teams only fully realize it once their backs are against the wall. The next section looks at how the tide is increasingly turning back towards on-premise infrastructure despite the convenience trade-offs involved.
The Revenge of On-Prem - Taking Back Ownership and Control
As runaway cloud database bills threaten profitability, more engineering teams are looking back to on-premise infrastructure with renewed appreciation. After all, the economies of scale that enable cloud providers to offer database-as-a-service can also apply internally once a company's workloads are large enough. The turning tide is driven by two main factors:
- Open source on-prem databases like PostgreSQL providing production-grade reliability and features for zero software licensing costs.
- The ability to tailor and optimize infrastructure specifically for internal workloads, without paying for generalized cloud infrastructure.
On-prem infrastructure needs some heavy lifting upfront. But the long-term payoff is control and cost-efficiency that public cloud cannot match. For example, Airbnb shared how migrating just 20% of their Redshift data warehouses back to optimized on-prem PostgreSQL clusters saved them a cool $10 million a year. That’s the cloud database tax for you.
Another great example comes from Shopify. They switched to on-prem PostgreSQL to handle 500,000 merchants and peak loads exceeding 100,000 writes per second. By combining high-performance infrastructure with open-source PostgreSQL, their storage and compute costs dropped significantly even at massive scale.
The key to success lies in leveraging simple but robust solutions rather than getting distracted by new specialized cloud databases. PostgreSQL and MySQL (and even SQLite!) have decades of proof across every scale of workload. No need to bet the farm on new exotic datastores. Combine this with infrastructure tailored for your actual data patterns and volumes, and the economies add up tremendously.
Of course, the convenience and automation of cloud databases still make them a tempting offer for small workloads. But technical leaders should be aware that bringing database management back in-house is increasingly a viable and cost-efficient path at scale. One that puts control firmly back in your hands. The savings from even partially repatriating database infrastructure quickly justify the operational overhead for most companies of mid-size and larger.
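Whether the savings justify the overhead comes down to a break-even calculation. The sketch below is one simple way to frame it, assuming a one-time upfront cost for hardware and migration and a flat monthly run cost on each side. The function name and the figures in the usage example are hypothetical, apart from the $38,000 monthly cloud bill, which is the Farmlogs number quoted earlier.

```python
import math


def break_even_months(upfront_cost, cloud_monthly, onprem_monthly):
    """Return the first whole month at which cumulative on-prem spend
    (upfront plus monthly run cost) is no higher than cumulative cloud spend.

    Returns None when on-prem is not cheaper per month, because the upfront
    cost is then never recovered.
    """
    monthly_saving = cloud_monthly - onprem_monthly
    if monthly_saving <= 0:
        return None
    return math.ceil(upfront_cost / monthly_saving)


# Hypothetical scenario: $300,000 upfront for hardware and migration, and
# $13,000 per month for hosting and staff time, against a $38,000 cloud bill.
months = break_even_months(300_000, 38_000, 13_000)
print(months)  # 12
```

This deliberately ignores hardware refresh cycles, growth in data volume, and the cost of downtime during migration, all of which push the real break-even point later. It works best as a first sanity check before a fuller analysis.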
In the end, every infrastructure choice involves complex trade-offs. As cloud data bills threaten to spiral out of control, on-prem databases are proving their mettle as battle-hardened solutions perfectly poised for a comeback!