NoSQL Database Concepts
What is NoSQL?
NoSQL (“Not Only SQL”) refers to non-relational database systems designed to handle the scale, agility, and diverse data types that traditional RDBMS (like SQL Server or MySQL) often struggle with. They move away from the rigid table-and-row model toward more flexible structures optimized for specific access patterns.
The Four Pillars of NoSQL
1. Document Stores (e.g., MongoDB, Cosmos DB)
Data is stored as semi-structured documents (JSON/BSON). Best for content management, catalogs, and applications where the data schema evolves rapidly.
- Strength: Flexible schema, natural mapping to Objects/Classes.
2. Key-Value Stores (e.g., Redis, DynamoDB)
The simplest model: a unique key points to an opaque value. Used for caching, session management, and high-speed lookups.
- Strength: Extreme performance,
lookups.
3. Wide-Column Stores (e.g., Cassandra, HBase)
Data is stored in column families rather than rows. Optimized for massive datasets (petabytes) and high write throughput across distributed clusters.
- Strength: Massive write scalability, high availability.
4. Graph Databases (e.g., Neo4j)
Data is stored as Nodes and Edges (relationships). Ideal for social networks, fraud detection, and recommendation engines.
- Strength: Querying complex relationships without expensive SQL joins.
Consistency: ACID vs. BASE
While SQL databases favor ACID (Atomicity, Consistency, Isolation, Durability) for strictly correct transactions, many NoSQL databases follow the BASE model:
- Basically Available: The system guarantees availability.
- Soft State: The state of the system may change over time, even without input.
- Eventually Consistent: The system will eventually become consistent once it stops receiving input.
The CAP Theorem
You can only have two of the following in a distributed system:
- Consistency: Every read receives the most recent write.
- Availability: Every request receives a (non-error) response.
- Partition Tolerance: The system continues to operate despite network failures.
Practice Exercise
Model a “User and Posts” system for both SQL and NoSQL (Document). Explain when you would choose one over the other.
Answer
1. Relational Model (SQL Server)
Data is normalized into two tables: Users and Posts.
SELECT u.Name, p.Content
FROM Users u
JOIN Posts p ON u.Id = p.UserId
WHERE u.Id = 123;
- When to Choose: When data integrity is paramount, relationships are complex, and you need robust ad-hoc querying.
2. Document Model (MongoDB)
Data is often denormalized (embedded) within a single document.
{
"_id": 123,
"name": "Jane Doe",
"posts": [
{ "title": "My First Post", "date": "2025-01-01" },
{ "title": "NoSQL is cool", "date": "2025-01-02" }
]
}
- When to Choose: When you need to scale horizontally across multiple servers, your data structure changes frequently, or you want to retrieve a record and all its related items in a single
disk read.
Summary
Choosing a NoSQL database isn’t about avoiding SQL—it’s about picking the right tool for the scale. If you priority is relational integrity, stay with SQL. If your priority is massive scalability or schema flexibility, NoSQL is the answer.