1. Introduction
In system design, databases must be chosen based on the requirements of consistency, availability, and partition tolerance. Two key concepts guide this decision:
- ACID (Atomicity, Consistency, Isolation, Durability) ensures strong guarantees for transactional databases.
- CAP Theorem (Consistency, Availability, Partition Tolerance) explains the trade-offs in distributed systems.
2. ACID Properties
ACID properties define how traditional relational databases (SQL) maintain reliability and correctness.
- Atomicity: A transaction is all or nothing. If one part fails, the entire transaction is rolled back.
- Example: In a banking system, transferring money from Account A to Account B must ensure that either both debit and credit occur or neither.
- Tools: PostgreSQL, MySQL, Oracle DB, Microsoft SQL Server
- Consistency: Ensures the database moves from one valid state to another.
- Example: A transaction cannot violate constraints (e.g., no duplicate primary keys).
- Tools: PostgreSQL, MySQL, Oracle DB, IBM Db2
- Isolation: Transactions do not interfere with each other.
- Example: Two transactions reading and writing the same data should execute independently.
- Tools: PostgreSQL, MySQL, SQL Server (Supports isolation levels: Read Committed, Serializable, etc.)
- Durability: Once committed, data is permanently saved, even in a crash.
- Example: If a system crashes, completed transactions must remain intact.
- Tools: PostgreSQL, MySQL (uses WAL – Write-Ahead Logging)
3. CAP Theorem
CAP Theorem states that in a distributed system, a database can achieve at most two of the following three properties:
- Consistency (C): Every read receives the most recent write or an error.
- Availability (A): Every request receives a response, but it might not be the latest data.
- Partition Tolerance (P): The system continues to operate even if some network failures occur.
CAP Combinations and Database Types
| Category | Ensures | Compromises | Example Databases |
|---|---|---|---|
| CP (Consistency + Partition Tolerance) | Strong Consistency | Sacrifices Availability | MongoDB, HBase, Redis (when configured for CP mode) |
| AP (Availability + Partition Tolerance) | High Availability | Sacrifices Strong Consistency | Cassandra, DynamoDB, CouchDB |
| CA (Consistency + Availability) – Theoretical | Strong Consistency & Availability | No Partition Tolerance (not practical in distributed systems) | Traditional SQL databases (PostgreSQL, MySQL, Oracle in single-node setup) |
4. Choosing the Right Database for System Design
- Use ACID-compliant databases when data integrity is critical (e.g., Banking, E-commerce transactions).
- Use CAP-based distributed databases when scalability and availability are more important (e.g., Social Media, IoT data processing).
Database Categories Based on ACID & CAP
| Category | ACID or CAP? | Use Case | Example Databases |
|---|---|---|---|
| Relational (SQL) | ACID | Financial Systems, Inventory | PostgreSQL, MySQL, SQL Server, Oracle |
| NoSQL (Document, Key-Value, Column Store) | CAP (AP or CP) | Scalable Applications, Big Data | MongoDB, Cassandra, DynamoDB, CouchDB |
| NewSQL (Distributed SQL) | ACID & CAP (CP) | High-availability, Global Systems | CockroachDB, Google Spanner |
5. Conclusion
- ACID ensures strong consistency and is best for critical transactions.
- CAP Theorem explains the trade-offs in distributed databases.
- Choosing the right database depends on system design requirements (e.g., scalability, consistency, or availability).