Why Facebook Never Goes Down: The Two Systems Behind 3 Billion Users
Most engineers never study LogDevice and RocksDB. Here is why they should.
Every time you open Facebook, something extraordinary happens behind the scenes.
Three billion people are doing the exact same thing at the same time. Posting. Scrolling. Messaging. Watching. And somehow the whole thing just works. No crashes. No waiting. No downtime.
The answer is not more servers. It is not a bigger database. It is a fundamental design decision that most engineers never think about until their system is already on fire.
Meta treats reads and writes as two completely separate problems.
In most systems, reads and writes share the same path. The same database handles both. Which means when one gets busy, the other suffers. Your users are trying to load their feed while your pipeline is ingesting millions of new events at the same time. They are fighting over the same resources. And under load, everybody loses.
Most engineers respond to this by adding memory, scaling horizontally, or upgrading their database tier. None of that fixes the actual problem. Because the actual problem is architectural, not operational.
Meta solved it by building two completely different systems from scratch.
LOGDEVICE: BUILT FOR WRITES ONLY
LogDevice is Meta’s distributed log storage system. It was designed with one purpose: ingest data at massive speed without ever slowing down.
Every like, every message, every video view, every backend sensor ping across three billion users - LogDevice takes it all in. It uses a log-structured approach, which means it writes data sequentially rather than jumping around the disk at random. Sequential writes are dramatically faster than random writes. That is not an accident. That is a deliberate design choice made specifically to maximize write throughput.
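The log-structured idea fits in a few lines. This toy log is my own sketch, not LogDevice's actual format: real LogDevice replicates each record across many storage nodes and shards logs across the cluster. The point it shows is that every write is a sequential append to the tail of one file.

```python
import os
import tempfile

class MiniLog:
    """Toy append-only log: every write is a sequential append to one file.

    A sketch of the log-structured idea only. Real LogDevice replicates
    records across storage nodes; none of that is modeled here.
    """
    def __init__(self, path):
        self.f = open(path, "ab+")

    def append(self, record: bytes) -> int:
        """Append a length-prefixed record; return its byte offset."""
        offset = self.f.seek(0, os.SEEK_END)  # always write at the tail
        self.f.write(len(record).to_bytes(4, "big") + record)
        self.f.flush()
        return offset

    def read(self, offset: int) -> bytes:
        """Random reads work, but they are not what this layout is tuned for."""
        self.f.seek(offset)
        size = int.from_bytes(self.f.read(4), "big")
        return self.f.read(size)

log = MiniLog(tempfile.mktemp())
off = log.append(b"like:user42:post7")
log.append(b"view:user42:video9")
print(log.read(off))  # b'like:user42:post7'
```

Notice the asymmetry baked into the design: `append` never seeks anywhere but the end, while `read` has to seek first. That is the write-optimized trade in miniature.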
LogDevice does not try to be a general-purpose read store. Readers stream the log back in order, but it was never designed for random point lookups. That is the point. By narrowing its job that far, it becomes extraordinarily good at the one thing it was built for.
Most engineers designing their first production system try to pick one database that handles everything. LogDevice is the proof that this instinct, while understandable, is wrong at scale.
ROCKSDB: BUILT FOR READS WITH SURGICAL PRECISION
RocksDB started life as Google’s LevelDB. Meta forked it, rebuilt it for server workloads, and open-sourced the result in 2013. Today it powers systems at Facebook, LinkedIn, Yahoo, Twitter, and hundreds of other companies running at scale.
The reason Meta built RocksDB instead of using an existing solution is the same reason they built LogDevice. Nothing on the market gave them the control they needed.
RocksDB is an embeddable key-value store that lets you tune read and write performance independently at the instance level. This is the part most engineers miss.
You can deploy one RocksDB instance configured entirely for fast point lookups, tuned to the read patterns of a news feed where you need a specific user’s data in milliseconds. You deploy another instance configured for high write throughput, tuned to the ingestion patterns of an analytics pipeline processing billions of events. Same underlying technology. Completely different configurations. Completely different jobs.
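To make "same engine, different configs" concrete, here is a sketch of what those two deployments might look like. The knob names echo real RocksDB options (write buffer size, block cache, bloom filter bits per key), but the dict form and the values are illustrative, not a working RocksDB setup or a tuning recommendation.

```python
# Two configs for the same storage engine. Names echo real RocksDB knobs;
# values are made up for illustration, not tuned recommendations.

feed_reads = {                       # point-lookup heavy: serve a user's data fast
    "write_buffer_size_mb": 16,      # small memtable; writes are rare here
    "block_cache_mb": 2048,          # big cache so hot keys stay in memory
    "bloom_filter_bits_per_key": 10, # skip disk entirely for keys that aren't there
}

analytics_ingest = {                 # write heavy: land billions of events
    "write_buffer_size_mb": 512,     # big memtable; batch more before each flush
    "block_cache_mb": 64,            # reads are rare; don't spend RAM on cache
    "bloom_filter_bits_per_key": 0,  # no filter; this instance rarely does point reads
}
```

Every resource decision inverts between the two: memory goes to the cache on the read side and to the write buffer on the ingest side.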
They never compete for the same resources because they were never meant to run the same workload.
RocksDB also uses a data structure called an LSM tree (Log-Structured Merge tree), which batches writes in memory and flushes them to disk in sorted order. This makes writes fast and keeps related data physically close together on disk. When you request data, the disk seeks less to find it. Less seeking means faster reads. Meta pushes this further by laying out the hottest data so it sits physically adjacent on disk. The result is a feed that loads in milliseconds regardless of how many people are using it simultaneously.
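The LSM write path is easier to see in code than in prose. This is a minimal sketch of the core idea only: real LSM engines like RocksDB add a write-ahead log, background compaction of the flushed runs, and bloom filters, none of which appear here.

```python
from bisect import bisect_left

class TinyLSM:
    """Minimal LSM sketch: batch writes in a memtable, flush sorted runs.

    Illustrative only. Real engines (RocksDB included) add a write-ahead
    log, compaction, and bloom filters on top of this core loop.
    """
    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit
        self.memtable = {}
        self.runs = []  # each run: a sorted list of (key, value), newest last

    def put(self, key, value):
        self.memtable[key] = value  # absorb the write in memory
        if len(self.memtable) >= self.memtable_limit:
            # one sequential flush writes the whole batch in sorted order
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:          # newest data first
            return self.memtable[key]
        for run in reversed(self.runs):   # then newest run back to oldest
            keys = [k for k, _ in run]
            i = bisect_left(keys, key)    # binary search: runs are sorted
            if i < len(keys) and keys[i] == key:
                return run[i][1]
        return None

db = TinyLSM(memtable_limit=3)
for i in range(7):
    db.put(f"user:{i}", f"profile-{i}")
print(db.get("user:2"))  # profile-2, found in a flushed, sorted run
```

Writes never touch disk one at a time; they land in batches, sequentially, already sorted. That is where both the write speed and the "related data sits close together" property come from.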
WHY THIS MATTERS FOR YOUR SYSTEM RIGHT NOW
You are probably not building for three billion users. But the principle applies at every scale.
If you have a system that slows down under load, the first question to ask is not "what hardware do I need?" It is "are my reads and writes competing for the same resources?"
I have seen this exact problem at companies with 50 engineers and companies with 5,000. A shared database handling both analytical queries and transactional writes. A single Kafka consumer group processing both real-time and batch workloads. One pipeline serving five different use cases with completely different performance requirements.
The symptom is always the same. Things work fine until load increases. Then everything degrades together because everything is coupled together.
The fix is always the same too. Separate the concern. Define the job. Build for that job specifically.
LogDevice does not try to be RocksDB. RocksDB does not try to be LogDevice. And Facebook never goes down.
Here is the three step framework I apply before designing any new data system:
Step one. Write down every read pattern your system needs to support. How frequently. What latency is acceptable. What the data shape looks like.
Step two. Write down every write pattern separately. How much volume. How fast does it need to land. What consistency guarantees do you need.
Step three. Ask honestly whether one system can serve both patterns without compromising either. If the answer is no, you already know what to do.
Separate the concern first. Then optimize. That is how you build something that survives contact with real traffic.
If you found this valuable, Thursday’s paid newsletter goes even deeper.
I am breaking down the exact career moves that separate engineers who understand systems from engineers who just operate them. The difference in compensation between those two groups at companies like Meta is not small.
Thursday 5:30pm. Paid subscribers only.
See you Thursday.
— Avantika


