Linearizability, Serializability, External Consistency, Strict Serializability

 

 

This is a reference blog to explain these  terms linearizability, serializability, external consistency, strict serializability since I have used them at various places in previous blogs.



 

Linearizability 

 In distributed systems, linearizability is one of the strongest consistency models. It provides the illusion that even though data is spread across multiple machines, there is only one single copy of the data, and all operations on it happen instantaneously, it focuses on single or individual pieces of data. 

 

Serializability 

While linearizability is about the "freshness" of single pieces of data, serializability is about the "correctness" of complex transactions involving multiple pieces of data. In simple terms, Serializability ensures that even though many transactions are running at the same time, the end result is exactly the same as if you had run them one by one in some order. 

 

Strict Serializability 

  • Linearizability is a consistency property: Single object, real-time.  
 
  • Serializability is an Isolation property: Multi-object, any valid order. 

 

In distributed systems, the strongest possible guarantee is Strict Serializability. This means the transactions are serializable and they follow real-world time (linearizable). 

 

CAP Theorem 

In the context of the CAP Theorem (Consistency, Availability, Partition Tolerance), Linearizability is the "C". There is no place for Serializability in the CAP theorem. 

 

 External Consistency 

External Consistency is a term primarily used to describe the strongest possible level of consistency in a distributed database. While it is often used as a synonym for Strict Serializability, it was popularized by Google Spanner to emphasize a specific real-world guarantee. 

  • Serializability: Transactions happen "one by one" in some order, but not necessarily the order they actually happened in real life. 
 
  • Strict Serializability: Transactions happen "one by one" AND they follow real-world time. 
 
  • External Consistency: Effectively the same as Strict Serializability, but specifically designed for distributed, multi-sharded systems where maintaining a single global clock is physically impossible. Figure is from https://jepsen.io/consistency 

 


 

The above Figure is from https://jepsen.io/consistency


 

The Consistency Hierarchy 

If we imagine a spectrum of consistency, here is how the various consistency models fit in the spectrum: 

  • Strict Serializability: The ultimate level. It combines Linearizability (real-time ordering) with Serializability (transactional multi-object safety). 
 
  • Linearizability: The strongest guarantee for a single object. It ensures operations appear instantaneous and respect real-time order. 
 
  • Sequential Consistency: Strong, but slightly relaxed. It ensures all nodes see operations in the same order, but that order doesn't have to match "real-world" time. 
 
  • Causal Consistency: Only operations that are "causally related" must be seen in the same order (e.g., a reply must come after the original post). 
 
  • Eventual Consistency: The "weakest" level. There are no ordering guarantees; the system only promises that eventually data will reach all nodes. 

 

 Databases 

 

Google Spanner 

The only database which I have used and which provides external consistency or strict serializability is Google Spanner. It uses a technology called TrueTime to solve the problem of "clock drift" between servers. Because no two clocks in the world are perfectly in sync, time is represented as a time interval [earliest, latest] that is guaranteed to contain the true time. To ensure external consistency, Spanner follows following rule: 

  • When a transaction wants to commit, the leader assigns it a timestamp. 
 
  • The leader then waits until it is absolutely certain that the real-world time has passed (based on the TrueTime uncertainty window). 
 
  • This wait is generally in order of 10 milliseconds or less and all commits follow the same pattern. 

 

 CockroachDB 


CockroachDB has adopted many concepts of Google Spanner (their documentation mentions Google Spanner as inspiration). CockroachDB is Serializable by default, but it is not strictly Linearizable (and thus not Externally Consistent) over the entire database. It provides "No Stale Reads" and "Serializable" isolation, but it stops just short of the "Real-time ordering" guarantee required for External Consistency. 

 

Comments

Popular posts from this blog

Architectural Analysis of the 2021 Roblox Infrastructure Outage

Paper Review - Paxos vs Raft: Consensus on distributed consensus