Meta’s database engineers discovered something unexpected: to make their systems handle five times more load, they needed less code, not more. This paradox—that constraints fuel elegance—captures why the best infrastructure engineers think like minimalists. When Santosh Praneeth Banda led Meta’s parallel replication optimisation, his success didn’t come from adding complexity, but from removing it. The same mindset drives constraint-based coding competitions: limits don’t restrict innovation—they amplify it.
At scale, every inefficiency multiplies. Meta’s social graph supports 3.4 billion users, processing billions of reads and millions of writes every second through systems like TAO. When you handle trillions of operations each year, even a 1% optimisation can save millions of dollars and determine whether systems scale smoothly or collapse under load.
The Parallel Replication Breakthrough
Traditional MySQL replication worked sequentially: a single-threaded bottleneck. On the primary database, thousands of transactions could execute concurrently, but replicas replayed them one at a time, creating lag that no amount of primary-side hardware could eliminate.
Meta’s team realised that replication could be parallelised if independent transactions were applied simultaneously. The challenge lay in identifying independence without adding coordination overhead.
MySQL 5.7’s logical clock solved this elegantly. Transactions that committed together in the same group on the primary had held their locks concurrently without conflict, so they could not depend on one another. The binary log recorded these commit boundaries as logical timestamps, letting replicas reconstruct the parallelism automatically. With just four worker threads, throughput increased by 3.5×; with optimal configuration, it reached 10×.
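In MySQL 5.7’s binary log, each transaction carries two logical timestamps, `last_committed` and `sequence_number`, and a replica worker may start a transaction once everything up to its `last_committed` has been applied. A minimal sketch of that scheduling rule, with hypothetical `Txn` and `schedulable` names standing in for MySQL’s multi-threaded applier:

```python
from dataclasses import dataclass

@dataclass
class Txn:
    sequence_number: int  # position in the primary's commit order
    last_committed: int   # newest transaction this one may conflict with

def schedulable(txn: Txn, applied_up_to: int) -> bool:
    """A worker may start txn once every transaction it could conflict
    with (sequence_number <= txn.last_committed) has been applied."""
    return txn.last_committed <= applied_up_to

# Three transactions that group-committed on the primary share a
# last_committed value; once transaction 4 is applied, all three can
# run on separate workers simultaneously.
batch = [Txn(5, 4), Txn(6, 4), Txn(7, 4)]
print([schedulable(t, applied_up_to=4) for t in batch])  # [True, True, True]
```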
The real insight wasn’t the algorithm—it was recognising what not to add. Early designs tried elaborate dependency graphs and lock analyses. The breakthrough came from leveraging information already available in the commit protocol. Less code, less overhead, more performance.
Storage Engines and the Tyranny of Write Amplification
When Meta evaluated moving from InnoDB to MyRocks (a RocksDB-based engine), the decision wasn’t about speed; it was about efficiency. InnoDB’s update-in-place model caused severe write amplification: each commit triggered three fsync operations, each exceeding 1 ms even on flash storage, roughly 3 ms of synchronous I/O per transaction. Across billions of daily transactions, that overhead was unsustainable.
MyRocks, based on log-structured merge trees, flipped this paradigm. Instead of updating in place, it appended writes to logs and compacted them later. The results across Meta’s User Database replica sets were striking:
| Metric | InnoDB | MyRocks | Improvement |
|---|---|---|---|
| Instance Size | 2,187 GB | 824 GB | 62% reduction |
| Bytes Written/sec | 13.34 MB/s | 3.42 MB/s | 75% reduction |
| CPU (writes) | 0.89 s/s | 0.55 s/s | 38% reduction |
| Write Amplification | Baseline | 10× less | 90% reduction |
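The mechanics are easy to see in miniature. Below is a toy sketch of the log-structured idea, not RocksDB’s implementation: writes land in an in-memory memtable, flushes emit sorted immutable runs, and compaction merges runs in the background, trading per-commit random page updates for sequential, batched I/O:

```python
import bisect

class TinyLSM:
    """Toy log-structured store. Writes append to a memtable; a flush
    writes one sorted, immutable run; compaction merges runs. Write
    amplification is paid in batches, in the background, rather than
    with random in-place page updates on every commit."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}
        self.runs = []  # list of sorted (key, value) lists, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value  # no seek, no in-place page update
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):  # newest run wins
            keys = [k for k, _ in run]
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return run[i][1]
        return None

    def compact(self):
        merged = {}
        for run in self.runs:  # oldest first, so newer values overwrite
            merged.update(dict(run))
        self.runs = [sorted(merged.items())]

db = TinyLSM()
for i in range(10):
    db.put(f"user:{i}", i * i)
db.compact()
print(db.get("user:7"))  # 49
```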
The migration halved Meta’s database server footprint without sacrificing capacity. This wasn’t incremental tuning—it was rethinking storage from the ground up. The constraint of flash storage (fast reads, expensive writes) pushed innovation toward radical efficiency.
The Binlog Server Innovation
Santosh Praneeth Banda’s work on MySQL replication shows the mindset of true optimisation: find unnecessary work and remove it entirely.
In MySQL’s GTID-based replication, reconnecting replicas once had to scan all binary logs—hundreds of gigabytes—to find their position. Santosh’s binary search optimisation on PreviousGtidEvents (in MySQL 5.6.11) eliminated that waste.
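The idea reduces to a short sketch: every binlog file begins with a Previous_gtids event describing everything executed before that file was created, so a reconnecting replica’s position can be found by binary-searching file boundaries rather than scanning events. The model below collapses GTID sets down to single integers for illustration (real GTID sets are interval sets, and these names are hypothetical):

```python
from bisect import bisect_right

# Each file is modelled by the highest transaction number covered by
# its leading Previous_gtids event (everything executed before the
# file was created).
binlogs = [
    ("binlog.000001", 0),
    ("binlog.000002", 11_000),
    ("binlog.000003", 25_500),
    ("binlog.000004", 40_200),
]

def first_file_to_send(replica_executed: int) -> str:
    """Rightmost file whose Previous_gtids watermark is still at or
    below the replica's executed position: O(log n) file-boundary
    probes instead of a linear scan through hundreds of gigabytes."""
    boundaries = [prev for _, prev in binlogs]
    i = bisect_right(boundaries, replica_executed) - 1
    return binlogs[max(i, 0)][0]

print(first_file_to_send(26_000))  # binlog.000003
```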
His later leadership on Meta’s binary log server project embodied this philosophy. By decoupling replication from full database copies, it enabled geo-redundancy without duplicating complete datasets across regions.
He also implemented relay log recovery for multi-threaded slaves (MySQL 5.6.26, 5.7.8), making crash-safe replication compatible with parallel apply; operators no longer had to choose between reliability and performance. The best optimisations remove trade-offs entirely.
Efficiency as an Architectural Philosophy
The most powerful improvements often come from rethinking architecture, not hardware. AWS Aurora’s binlog I/O cache improved throughput over 5× simply by avoiding repeated reads of the same log entries. Its enhanced binlog further reduced storage overhead by separating transaction and binlog storage—letting each subsystem specialise.
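AWS describes the binlog I/O cache as an in-memory buffer of recent binlog events that serves replication dump threads without touching storage. The sketch below captures the general read-through idea with a small LRU keyed by hypothetical segment IDs; it is not Aurora’s actual data structure:

```python
from collections import OrderedDict

class BinlogReadCache:
    """Read-through LRU over log segments: many consumers replaying
    the same recent events hit memory instead of re-reading the log
    from storage."""

    def __init__(self, fetch, capacity=128):
        self.fetch = fetch  # segment_id -> bytes, the slow storage read
        self.capacity = capacity
        self.cache = OrderedDict()
        self.hits = self.misses = 0

    def read(self, segment_id):
        if segment_id in self.cache:
            self.cache.move_to_end(segment_id)  # mark most recently used
            self.hits += 1
            return self.cache[segment_id]
        self.misses += 1
        data = self.fetch(segment_id)
        self.cache[segment_id] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
        return data

cache = BinlogReadCache(fetch=lambda seg: f"events-{seg}".encode())
for consumer in range(5):        # five replicas replaying the same tail
    for seg in range(100, 104):
        cache.read(seg)
print(cache.hits, cache.misses)  # 16 4: one storage read per segment
```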
A similar lesson appeared in a production database that scaled from 480 million to 4.7 billion records. By vertically partitioning data—separating hot and cold paths—it cut query times from 120 seconds to 890 ms (135× improvement) and reduced cost per user by 79%.
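Vertical partitioning in miniature: keep the handful of fields every request touches in a small, heavily indexed hot store, and move everything else to a cold store joined by the primary key. The field names below are invented for illustration:

```python
# Hypothetical field split: the hot store carries only what every
# request reads; the cold store keeps the rest, joined by user_id.
HOT_FIELDS = {"user_id", "status", "last_seen"}

def partition(record: dict) -> tuple[dict, dict]:
    hot = {k: v for k, v in record.items() if k in HOT_FIELDS}
    cold = {k: v for k, v in record.items() if k not in HOT_FIELDS}
    cold["user_id"] = record["user_id"]  # shared key links the halves
    return hot, cold

hot, cold = partition({
    "user_id": 42, "status": "active", "last_seen": "2024-01-01",
    "bio": "long text", "preferences": {"theme": "dark"},
    "audit_trail": ["signup", "login"],
})
print(sorted(hot))   # ['last_seen', 'status', 'user_id']
print(sorted(cold))  # ['audit_trail', 'bio', 'preferences', 'user_id']
```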
These results echo the Unix philosophy: do one thing well. Simple, composable systems outperform complex monoliths. Martin Kleppmann’s idea of “turning the database inside out” reinforces this—stream-based, modular designs outscale tangled architectures. WhatsApp’s tiny team supported billions of daily messages; PayPal processed a billion transactions a day on just eight VMs. Simplicity scales.
The Constraint Paradox
Research from Harvard Business Review and MIT affirms what infrastructure engineers already know: constraints force breakthrough thinking. MIT’s $20 prosthetic foot only existed because the $1,000 market price made incremental improvement impossible. The Apollo 13 CO₂ filter, built from spare parts, embodied the same truth—creativity flourishes when resources are scarce.
Meta’s own “Year of Efficiency” in 2023 reflected this principle. Facing $66–72 billion in infrastructure costs, Meta restructured its systems to do more with less. The Tulip data migration reduced storage and CPU use by up to 85% and 90%, respectively—not because of new tech, but because constraints demanded discipline.
Google’s Large-Scale Optimisation Group shares the same ethos: “Make Google’s computing infrastructure do more with less.” Their innovations—like power-of-d-choices load balancing and ML-based power management—cut cooling energy by 40% while improving latency and utilisation.
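Power-of-d-choices fits in a few lines: sample d servers uniformly at random and route each task to the least loaded of the sample, capturing most of the benefit of global least-loaded routing without polling every server. A self-contained sketch:

```python
import random

def assign(loads: list[int], d: int = 2) -> int:
    """Sample d servers at random and route to the least loaded one."""
    candidates = random.sample(range(len(loads)), d)
    best = min(candidates, key=lambda i: loads[i])
    loads[best] += 1
    return best

random.seed(1)
loads = [0] * 100
for _ in range(10_000):
    assign(loads, d=2)
print(max(loads), min(loads))  # spread stays tight vs. pure random placement
```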
At hyperscale, efficiency isn’t a metric—it’s survival.
When Lines Become Resources
This philosophy directly parallels constraint-driven development competitions. A line-count limit forces the same mental discipline that resource constraints force in infrastructure: every line must earn its place through impact. You can’t add “nice-to-have” features when your budget is 100 lines. You architect for core functionality, eliminate abstraction overhead, and recognise that elegance comes from doing less, not more.
Santosh’s transition from Meta’s database infrastructure team to Technical Lead at DoorDash continued this pattern. His work on multi-tenant Kubernetes development environments emphasises fast feedback loops and developer velocity—enabling engineers to iterate rapidly without infrastructure bottlenecks. The article he authored, “Building at Production Speed,” explores how production-first development and multi-tenancy constraints drive better architectural decisions.
His deep expertise in MySQL replication bottlenecks, parallel execution strategies, and resource optimisation at Meta’s scale—managing tens of thousands of replica sets across petabytes of data—provides an ideal perspective for evaluating code written under strict constraints. The judge who optimised binlog performance at a trillion-transaction scale understands that impact isn’t measured in volume, but in efficiency per unit of resource consumed.
Engineers from Hackathon Raptors’ fellowship—spanning Google, Microsoft, Amazon, Meta, NVIDIA—evaluate projects through this lens. The organisation’s philosophy explicitly emphasises “strict scientific methods” and “top-quality software,” not feature maximalism. When you’ve scaled systems to billions of users, you’ve learned that complexity is expensive and simplicity scales.
The Efficiency Evaluation Framework
Production environments impose resource budgets just as coding challenges impose line limits. You can’t have everything—you prioritise impact. Performance budgets like Tinder’s 170KB JavaScript cap exist to protect user experience from creeping inefficiency.
Edsger Dijkstra captured this perfectly:
“If we wish to count lines of code, we should not regard them as lines produced but as lines spent.”
In infrastructure, every CPU cycle, megabyte, or watt is a line spent. Skilled engineers deliver equal functionality with fewer lines; skilled infrastructure teams deliver the same throughput with fewer servers.
McKinsey estimates that 10–20% of technology budgets (up to 40% with indirect costs) go toward managing technical debt. MIT research shows complexity cuts productivity by half and increases turnover tenfold. Complexity doesn’t just slow systems—it drives people away.
Meta combats this through cultural incentives. Its “Better Engineering” principle rewards refactoring and simplification. Efficiency On-Call engineers track real-time resource costs, converting virtual metrics into power and dollars. Every service is accountable for its footprint because, at Meta’s scale, efficiency compounds into competitive advantage.
The Universal Lens
Judge a solution by maximum impact with minimal resources. Whether evaluating a 100-line submission or a database replication strategy, the question remains: Does this eliminate waste? Does constraint breed elegance? Does less accomplish more?
Santosh’s journey, from identifying MySQL replication inefficiencies that forced systems to scan hundreds of gigabytes unnecessarily, to enabling multi-threaded execution that achieved a 5× speedup, to reducing resource consumption by orders of magnitude at Meta’s infrastructure scale, embodies this philosophy. You earn impact not through the volume of code or the complexity of architecture, but through the ruthless elimination of unnecessary work.
The best systems, like the best constrained code, share a quality: when you examine them, you cannot identify what to remove without losing essential functionality. Every component justifies its existence through necessity, not convenience. This is what constraints teach—and what infrastructure engineers at scale, like hackathon judges evaluating elegant solutions, recognise instantly: efficiency isn’t a feature, it’s the foundation of systems that endure.
