PC#12 - Why Replication Lag Occurs in Databases?

How Notion Handles Concurrent Updates and More...

Saurabh Dashora
October 03, 2023

Hello, this is Saurabh…👋

Welcome to the 158 new subscribers who have joined us since last week.

If you aren’t subscribed yet, join 900+ curious developers looking to expand their knowledge by subscribing to this newsletter.

In this edition, I cover the following topics:

🖥 System Design Concept → Why Replication Lag Occurs in Databases?

🧰 Case Study → How Notion Handles Concurrent Updates?

🍔 Food For Thought → Concurrency vs Parallelism

So, let’s dive in.

🖥

System Design Concept

Why Replication Lag Occurs in Databases?

Building an eventually consistent system is fun and games until you run into a little problem known as Replication Lag.

But what is Replication Lag and how does it occur?

In a typical Leader-based replication setup, all writes go through a single node. However, read-only queries can be served by any replica.

This is a huge benefit for read-heavy systems, where writes make up only a small percentage of operations. This pattern is very common on the web.

In order to scale the read operations, you can create multiple followers and distribute the read requests across those followers. This reduces the load on the leader node.
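Here's a minimal sketch of that read/write split, assuming a toy in-process cluster. The `ReplicatedStore` class and its round-robin policy are hypothetical names chosen for illustration, not any particular database's API.

```python
import itertools


class ReplicatedStore:
    """Toy leader/follower cluster: writes hit the leader, reads rotate over followers."""

    def __init__(self, follower_count):
        self.leader = {}
        self.followers = [{} for _ in range(follower_count)]
        self.reads_served = [0] * follower_count
        self._next = itertools.cycle(range(follower_count))

    def write(self, key, value):
        # All writes go through the single leader node.
        self.leader[key] = value
        # Simplification: this sketch copies the change to every follower right away.
        for follower in self.followers:
            follower[key] = value

    def read(self, key):
        # Round-robin: each read goes to the next follower, never to the leader.
        index = next(self._next)
        self.reads_served[index] += 1
        return self.followers[index].get(key)


store = ReplicatedStore(follower_count=3)
store.write("title", "Replication Lag")
results = [store.read("title") for _ in range(6)]
print(store.reads_served)
```

Six reads end up spread evenly over the three followers, and the leader serves none of them.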

However, this approach works well only in the case of asynchronous replication.

Why is that the case?

Imagine performing synchronous replication where you don’t confirm a write operation to the user until all replicas give a thumbs-up.

Even a single replica going down or becoming slow would block every write. And the more followers you add, the more likely it is that one of them is having trouble.
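To make this concrete, here's a small sketch of a fully synchronous write path. The `Replica` class and `synchronous_write` function are made up for this example, and a real system would also need to retry or roll back partially applied writes.

```python
class ReplicaDown(Exception):
    """Raised when a replica cannot acknowledge a write."""


class Replica:
    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}

    def apply(self, key, value):
        if not self.up:
            raise ReplicaDown(self.name)
        self.data[key] = value


def synchronous_write(leader, replicas, key, value):
    """Confirm the write only after every replica has acknowledged it."""
    try:
        for replica in replicas:
            replica.apply(key, value)
    except ReplicaDown:
        # One missing acknowledgement is enough to reject the write.
        return False
    leader[key] = value
    return True


leader = {}
replicas = [Replica("replica-1"), Replica("replica-2")]

ok_before = synchronous_write(leader, replicas, "name", "v1")
replicas[1].up = False  # a single replica goes down
ok_after = synchronous_write(leader, replicas, "name", "v2")
print(ok_before, ok_after)
```

With both replicas up the write is confirmed. Once one replica is down, the same call is rejected even though the leader itself is healthy.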

However, asynchronous replication has its own troubles.

When your application reads data from an asynchronous replica or follower, there’s a good chance that it reads outdated information if the follower has fallen behind.

Here’s what can actually happen:

1. User A sends an update (write) request to the Primary or Leader node.
2. The Leader node sends the replication information to its replicas.
3. Replica 1 gets updated.
4. User B requests (reads) data from Replica 2 and gets outdated information.
5. Replica 2 gets updated eventually.
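The sequence above can be replayed with a toy simulation. This is only a sketch: the `Leader` and `Follower` classes and the explicit `apply_pending` call are assumptions that stand in for a real replication stream.

```python
from collections import deque


class Follower:
    def __init__(self):
        self.data = {}
        self.pending = deque()  # changes received but not yet applied

    def apply_pending(self):
        while self.pending:
            key, value = self.pending.popleft()
            self.data[key] = value


class Leader:
    def __init__(self, followers):
        self.data = {}
        self.followers = followers

    def write(self, key, value):
        # The write is confirmed as soon as the leader has stored it...
        self.data[key] = value
        # ...while followers only get the change queued (asynchronous replication).
        for follower in self.followers:
            follower.pending.append((key, value))


replica_1, replica_2 = Follower(), Follower()
leader = Leader([replica_1, replica_2])

# Starting point: every node agrees on the old value.
leader.write("status", "old")
replica_1.apply_pending()
replica_2.apply_pending()

leader.write("status", "new")             # User A writes to the leader
replica_1.apply_pending()                 # Replica 1 gets updated
stale_read = replica_2.data["status"]     # User B reads from Replica 2
replica_2.apply_pending()                 # Replica 2 gets updated eventually
later_read = replica_2.data["status"]
print(stale_read, later_read)
```

User B sees the old value first, and only gets the new one after Replica 2 has applied its pending changes.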

In normal operations, this delay between a write happening on the leader and being reflected on a follower node is known as the Replication Lag.

The lag may only be a fraction of a second (hardly noticeable). However, if the system is operating at its limit, the lag can also increase to several seconds or minutes.
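One simple way to think about measuring the lag is to compare when a write was committed on the leader with when a follower applied it. The sketch below assumes you already have both timestamps per write. The `replication_lag` helper and the sample numbers are hypothetical.

```python
def replication_lag(leader_commit_times, follower_apply_times):
    """Return the lag in seconds for each write, or None if not applied yet."""
    lags = {}
    for write_id, committed_at in leader_commit_times.items():
        applied_at = follower_apply_times.get(write_id)
        # A write the follower has not applied yet has no measurable lag so far.
        lags[write_id] = None if applied_at is None else applied_at - committed_at
    return lags


# Made-up timestamps (seconds) purely for illustration.
leader_commit_times = {"w1": 100.0, "w2": 101.0, "w3": 102.0}
follower_apply_times = {"w1": 100.25, "w2": 104.5}

lags = replication_lag(leader_commit_times, follower_apply_times)
print(lags)
```

Here the first write lags by a fraction of a second, the second by a few seconds, and the third has not reached the follower at all.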

Of course, this inconsistency is just a temporary state. If writes stop for a while, the followers will eventually catch up with the leader. Hence, this behavior is known as eventual consistency.

The trouble starts when the lag becomes too large and the inconsistencies become a real problem for applications.

There are several techniques to deal with this but more on them in the next post.

🧰

Case Study

How Notion Handles Concurrent Updates?

How do you build an application that lets you and your friend update a page together in real time?

Let’s learn from our favorite productivity tool Notion.

Notion serves millions of users across the globe and a lot of them work in collaborative teams.

It provides a concurrent interface to its users, meaning multiple users can collaborate and update a page at the same time.

Every item that you create in the Notion editor is a Block. We did speak about the incredibly flexible data model of Notion in an earlier post.

PC#8 - Making Your Database Highly Available

A look at Notion's Flexible Data Model and more...

progressivecoder.beehiiv.com/p/highly-available-database

What’s interesting, however, is that every Block goes through a 3-phase lifecycle.

1. Creating a new Block
2. Saving the Block on the server
3. Rendering the Block on the friend’s screen

Here’s what it looks like on a high level:

However, these 3 stages occur in 11 total steps full of interesting insights.

Here’s a super-detailed illustration of the entire process in a step-by-step manner.

Let’s understand what’s happening in the entire sequence:

👉 Stage 1 - Creating a New Block

The following steps occur in this stage:

Step 1 - The user creates a new Block in the UI.

Step 2 - The Block is saved to in-memory storage or a client-side store such as IndexedDB.

Step 3 - The UI is re-rendered and the Block is shown on the user’s screen.

Step 4 - The data is also saved to the TransactionQueue.
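Here's a rough sketch of Stage 1, written in Python for readability even though the real client runs in the browser. The `LocalClient` class, its field names, and the block shape are assumptions for illustration and not Notion's actual code.

```python
import uuid
from collections import deque


class LocalClient:
    def __init__(self):
        self.local_store = {}             # stand-in for in-memory storage / IndexedDB
        self.transaction_queue = deque()  # changes waiting to be sent to the server
        self.rendered = []                # stand-in for what the UI currently shows

    def create_block(self, block_type, text):
        # Step 1: the user creates a new Block.
        block = {"id": str(uuid.uuid4()), "type": block_type, "text": text}
        # Step 2: save it locally first.
        self.local_store[block["id"]] = block
        # Step 3: re-render so the user sees the Block immediately.
        self.render()
        # Step 4: queue the change for the server.
        self.transaction_queue.append({"op": "create", "block": block})
        return block

    def render(self):
        self.rendered = [b["text"] for b in self.local_store.values()]


client = LocalClient()
block = client.create_block("text", "Hello")
print(client.rendered, len(client.transaction_queue))
```

The point of this ordering is that the user sees the new Block right away, before any network call has happened.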

👉 Stage 2 - Saving the Block on the Server

Step 5 - Data is serialized and posted to the backend API.

Step 6 - The API does its thing and stores the data in the main database. This is the source-of-truth database.

Step 7 - The backend API also notifies the MessageStore service about the changes.
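Stage 2 could look something like the sketch below. A plain function call stands in for the HTTP POST, and `BackendAPI`, `save_transactions`, `flush`, and the notification shape are all hypothetical names chosen for this example.

```python
import json


class MessageStore:
    def __init__(self):
        self.notifications = []

    def publish(self, notification):
        self.notifications.append(notification)


class BackendAPI:
    def __init__(self, message_store):
        self.database = {}  # stand-in for the source-of-truth database
        self.message_store = message_store

    def save_transactions(self, payload):
        operations = json.loads(payload)
        for operation in operations:
            block = operation["block"]
            # Step 6: store the data in the main database.
            self.database[block["id"]] = block
            # Step 7: notify the MessageStore about the change.
            self.message_store.publish({"block_id": block["id"]})
        return {"saved": len(operations)}


def flush(transaction_queue, api):
    # Step 5: serialize the queued changes and "post" them to the backend.
    payload = json.dumps(list(transaction_queue))
    response = api.save_transactions(payload)
    transaction_queue.clear()
    return response


message_store = MessageStore()
api = BackendAPI(message_store)
queue = [{"op": "create", "block": {"id": "b1", "type": "text", "text": "Hello"}}]
response = flush(queue, api)
print(response, message_store.notifications)
```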

👉 Stage 3 - Rendering the Block on the Friend’s Screen

Step 8 - A client WebSocket connection subscribes to the MessageStore service.

Step 9 - As part of the subscription, the MessageStore passes the notification to the WebSocket.

Step 10 - The client receives the version update notifications.

Step 11 - Based on the notification data, the client calls the backend API to fetch the latest records from the database and render the friend’s UI.
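And here's a sketch of Stage 3, with plain callbacks standing in for the WebSocket connection and a dictionary standing in for the backend API's read path. The `FriendClient` class and the `version` field on each record are assumptions made for this example.

```python
class MessageStore:
    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        # Step 8: a client connection subscribes to the MessageStore.
        self.subscribers.append(callback)

    def publish(self, notification):
        # Step 9: the notification is passed on to every subscriber.
        for callback in self.subscribers:
            callback(notification)


class FriendClient:
    def __init__(self, database):
        self.database = database  # stand-in for fetching via the backend API
        self.known_versions = {}
        self.rendered = {}

    def on_notification(self, notification):
        # Step 10: the client receives a version update notification.
        block_id = notification["block_id"]
        if self.known_versions.get(block_id, 0) >= notification["version"]:
            return  # already up to date, nothing to fetch
        # Step 11: fetch the latest record and re-render.
        record = self.database[block_id]
        self.known_versions[block_id] = record["version"]
        self.rendered[block_id] = record["text"]


database = {"b1": {"text": "Hello", "version": 1}}
store = MessageStore()
friend = FriendClient(database)
store.subscribe(friend.on_notification)

store.publish({"block_id": "b1", "version": 1})
first_render = dict(friend.rendered)

database["b1"] = {"text": "Hello, world", "version": 2}
store.publish({"block_id": "b1", "version": 2})
print(first_render, friend.rendered)
```

Note that the notification only says that something changed. The client still goes back to the database for the actual data.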

P.S. This post is inspired by the explanation provided on the Notion Engineering Blog. However, the diagrams have been made from scratch based on the information shared. You can find the original article over here.

🍔

Food For Thought

👉 Concurrency vs Parallelism

Concurrency and Parallelism are concepts that often get mixed up and end up confusing people.

Not anymore.

Here’s a post I wrote a few days ago on X (Twitter) where I explained the difference between the two terms.

As of this moment, the post has received over 550 likes and over a hundred reposts.

Do check it out 👇

Concurrency & Parallelism are two terms that always create confusion.
Let's understand them once and for all.
👉 Concurrency means more than one task can appear to make progress over a unit of time.
The key term over here is "APPEAR".
Concurrency gives the illusion that the… twitter.com/i/web/status/1…
— Saurabh Dashora (@ProgressiveCod2)
6:36 AM • Sep 25, 2023
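To see the "appear to make progress" idea from the post in action, here's a small sketch using Python's `asyncio`. Both tasks run on a single thread and simply take turns. The `worker` and `main` names are just for illustration.

```python
import asyncio


async def worker(name, steps, log):
    for step in range(steps):
        log.append(f"{name}{step}")
        # Hand control back to the event loop so the other task can run.
        await asyncio.sleep(0)


async def main():
    log = []
    # Both workers are in progress over the same period, but only one runs at a time.
    await asyncio.gather(worker("A", 3, log), worker("B", 3, log))
    return log


log = asyncio.run(main())
print(log)
```

The log shows steps from A and B interleaved, even though nothing ever executed at the same instant.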

👉 The Importance of Command-Query Separation

We all want to create modular systems that are easy to maintain.

Following the principle of Separation of Concerns is key to realizing this goal.

And Command-Query Separation, the principle that the CQRS (Command Query Responsibility Segregation) pattern builds on, helps us move in the right direction.

Here’s a great post by Helen explaining Command-Query Separation in great detail.

Command-Query Separation is Non-Negotiable.
The separation of concerns principle helps create modular and maintainable systems.
One area where this principle manifests is in the clear delineation between commands and queries.
What are Commands and Queries?
Let's use the domain… twitter.com/i/web/status/1…
— Helen Sunshine (@Sunshine_Layer)
7:40 AM • Sep 28, 2023
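As a quick illustration of the split between commands and queries, here's a minimal sketch. The `ShoppingCart` domain is my own pick for this example and is not taken from Helen's post.

```python
class ShoppingCart:
    def __init__(self):
        self._items = {}

    # Commands: change state and return nothing.
    def add_item(self, name, price):
        self._items[name] = price

    def remove_item(self, name):
        self._items.pop(name, None)

    # Queries: return data and change nothing.
    def total(self):
        return sum(self._items.values())

    def item_names(self):
        return sorted(self._items)


cart = ShoppingCart()
command_result = cart.add_item("notebook", 5)
cart.add_item("pen", 2)
cart.remove_item("pen")
print(cart.item_names(), cart.total())
```

Because the queries have no side effects, you can call `total()` as many times as you like without worrying about changing the cart.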

That’s it for today! ☀️

Enjoyed this issue of the newsletter?

Share with your friends and colleagues

Also, send them over here to subscribe.

In case you want me to cover any specific topic in future editions, please don’t hesitate to fill out the idea-suggestion form below.

Progressive Coder Newsletter Idea Form

Collecting ideas for future newsletter editions

docs.google.com/forms/d/e/1FAIpQLSfrkSbTwT7V8xBNgPLzgv8zODwDsj5Bp_alkSirnTz-Het3cA/viewform

See you later with another value-packed edition — Saurabh.
