Databases — NoSQL

NoSQL databases are non-relational, schema-free, distributed data stores designed for scale and flexibility.

Common traits:

  • Non-relational, schema-free
  • Typically eventual consistency rather than ACID transactions: updates propagate across nodes over time
  • Distributed across commodity servers — no single point of failure
  • Developed at and optimised for “web scale” (Google, Amazon, Facebook)

CAP Theorem

A distributed system can guarantee at most two of these three properties at the same time:

| Property | Meaning |
|---|---|
| Consistency | All nodes see the same data at the same time |
| Availability | Every request gets a response |
| Partition Tolerance | System works even if network splits nodes |

NoSQL systems typically choose AP (Available + Partition Tolerant) at the cost of strong consistency. Use eventual consistency where brief staleness is acceptable (catalogue data, social feeds), not where precision is critical (bank balances, inventory).
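
The "brief staleness" of an AP system can be illustrated with a toy model. The sketch below is not any real database's replication protocol: the Replica and Cluster classes, the last-write-wins rule, and the explicit sync() step are all assumptions made for illustration.

```javascript
// Toy model of an AP system: a write is acknowledged by one replica and
// copied to the others later. All class and method names are hypothetical.
class Replica {
  constructor(name) {
    this.name = name;
    this.data = new Map(); // key -> { value, version }
  }

  // Last-write-wins: keep the entry with the higher version number.
  apply(key, value, version) {
    const current = this.data.get(key);
    if (!current || version > current.version) {
      this.data.set(key, { value, version });
    }
  }

  read(key) {
    const entry = this.data.get(key);
    return entry ? entry.value : undefined;
  }
}

class Cluster {
  constructor(replicas) {
    this.replicas = replicas;
    this.pending = []; // replication messages not yet delivered
    this.clock = 0;    // simple counter used as a version number
  }

  // Acknowledge the write after updating one replica; queue the rest.
  write(replicaIndex, key, value) {
    const version = ++this.clock;
    this.replicas[replicaIndex].apply(key, value, version);
    this.replicas.forEach((replica, i) => {
      if (i !== replicaIndex) this.pending.push({ replica, key, value, version });
    });
  }

  // Deliver queued updates (what happens "over time" in a real system).
  sync() {
    for (const { replica, key, value, version } of this.pending) {
      replica.apply(key, value, version);
    }
    this.pending = [];
  }
}

const cluster = new Cluster([new Replica('a'), new Replica('b')]);
cluster.write(0, 'price', 10);

const staleRead = cluster.replicas[1].read('price'); // before replication
cluster.sync();
const freshRead = cluster.replicas[1].read('price'); // after replication

console.log(staleRead, freshRead); // undefined 10
```

A reader that hits replica b before sync() runs sees no value at all, which is tolerable for a catalogue page but not for a bank balance.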

Compromises in NoSQL:

  • Eventual consistency (not immediate)
  • Write buffering
  • Only primary keys can be indexed (in many systems)
  • Queries may require custom programming (no ad-hoc SQL)
  • Tooling is less mature than relational
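
The indexing and query limitations above usually mean the application maintains its own lookup structures. A minimal sketch, assuming a store that only supports get/set by primary key (a Map stands in for it here, and putUser and findUsersByCity are hypothetical helper names):

```javascript
// A Map stands in for a key-value store that only supports lookup by
// primary key. The store layout and helper names are hypothetical.
const users = new Map();        // primary key (id) -> user record
const usersByCity = new Map();  // hand-maintained index: city -> Set of ids

function putUser(user) {
  const previous = users.get(user.id);
  // Remove the old index entry first so the index never points at stale data.
  if (previous) {
    const ids = usersByCity.get(previous.city);
    ids.delete(user.id);
    if (ids.size === 0) usersByCity.delete(previous.city);
  }
  users.set(user.id, user);
  if (!usersByCity.has(user.city)) usersByCity.set(user.city, new Set());
  usersByCity.get(user.city).add(user.id);
}

// "Query" by a non-key attribute: read the index, then fetch each record.
function findUsersByCity(city) {
  const ids = usersByCity.get(city) || new Set();
  return [...ids].map((id) => users.get(id));
}

putUser({ id: 1, name: 'Ken', city: 'Leeds' });
putUser({ id: 2, name: 'Alice', city: 'Leeds' });
putUser({ id: 1, name: 'Ken', city: 'York' }); // Ken moves; index is updated

const inLeeds = findUsersByCity('Leeds').map((u) => u.name);
const inYork = findUsersByCity('York').map((u) => u.name);
console.log(inLeeds, inYork); // [ 'Alice' ] [ 'Ken' ]
```

Every write now touches two structures, which is the kind of custom programming a relational database would handle with a single CREATE INDEX.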

NoSQL types

| Type | Examples | Best for |
|---|---|---|
| Document | MongoDB, CouchDB, Couchbase | JSON documents, flexible schema |
| Key-Value | Redis, DynamoDB | Caching, sessions, simple lookups |
| Column Family | Cassandra, DataStax Enterprise | Wide rows, time-series, analytics |
| Graph | Cosmos DB (Gremlin), Neo4j | Highly connected data, social graphs |
| Search | Elasticsearch | Full-text search, log analytics |

Redis

REmote DIctionary Server: an open-source, in-memory key-value store. Extremely fast because data lives in RAM. Optionally persisted to disk.

Supported value types: Strings, Lists, Sets, Sorted Sets, Hashes, Streams

# Start server
redis-server
 
# CLI
redis-cli
 
# Basic commands (entered at the redis-cli prompt)
SET greeting "Hello"
GET greeting
DEL greeting
 
# Expiry (TTL): the key expires after 3600 seconds (1 hour)
SET session:abc123 "user:42" EX 3600
TTL session:abc123
 
# List
LPUSH queue job1 job2
RPOP queue
 
# Hash
HSET user:1 name "Ken" email "ken@example.com"
HGET user:1 name
HGETALL user:1

Use cases: Session storage, caching, rate limiting, pub/sub messaging, job queues, leaderboards.
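
Rate limiting is commonly built from two Redis commands, INCR and EXPIRE. The sketch below shows the fixed-window variant against an in-memory stand-in, so it runs without a server. FakeRedis and allowRequest are hypothetical names, and the injected clock is an assumption that keeps the example deterministic.

```javascript
// In-memory stand-in for the two Redis commands this pattern relies on
// (INCR and EXPIRE). FakeRedis and allowRequest are hypothetical names;
// a real application would call a Redis client instead.
class FakeRedis {
  constructor(now) {
    this.now = now;          // function returning the current time in ms
    this.values = new Map(); // key -> integer counter
    this.expiry = new Map(); // key -> absolute expiry time in ms
  }

  // Drop the key if its TTL has passed (checked lazily on access).
  evictIfExpired(key) {
    const at = this.expiry.get(key);
    if (at !== undefined && this.now() >= at) {
      this.values.delete(key);
      this.expiry.delete(key);
    }
  }

  incr(key) {
    this.evictIfExpired(key);
    const next = (this.values.get(key) || 0) + 1;
    this.values.set(key, next);
    return next;
  }

  expire(key, seconds) {
    this.expiry.set(key, this.now() + seconds * 1000);
  }
}

// Fixed-window limiter: at most `limit` requests per `windowSeconds`.
function allowRequest(redis, clientId, limit, windowSeconds) {
  const key = `rate:${clientId}`;
  const count = redis.incr(key);
  if (count === 1) redis.expire(key, windowSeconds); // first hit starts the window
  return count <= limit;
}

let fakeTime = 0; // controllable clock so the example is deterministic
const redis = new FakeRedis(() => fakeTime);

const results = [];
for (let i = 0; i < 4; i++) results.push(allowRequest(redis, 'user:42', 3, 60));
fakeTime += 60000; // one minute later the counter key has expired
results.push(allowRequest(redis, 'user:42', 3, 60));

console.log(results); // [ true, true, true, false, true ]
```

Against a real server, INCR and EXPIRE are two separate commands, so a client that crashes between them could leave a counter with no TTL. Wrapping both in a Lua script or a MULTI/EXEC transaction is one way to make the pair atomic.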

Replication: master-replica setup.

Scripting: built-in Lua scripting (similar to stored procedures).

Note: No official Windows version — use WSL or Docker on Windows.


Elasticsearch

Distributed, full-text search engine built on Apache Lucene. NoSQL document store with powerful querying.

Hierarchy: Cluster → Nodes → Indexes → Shards → Documents

// A document
{
  "id": 1,
  "name": "Server Alpha",
  "color": "blue",
  "hostname": "default"
}

Why Elasticsearch:

  • Index millions of documents and search in milliseconds
  • Full-text search across entire document content (not just exact fields)
  • Aggregations and trend analysis
  • Multi-tenant capable
  • RESTful HTTP interface

# Index a document
PUT /products/_doc/1
{ "name": "Widget", "description": "A useful widget" }
 
# Search
GET /products/_search
{ "query": { "match": { "description": "useful" } } }
 
# Aggregation
GET /orders/_search
{ "aggs": { "by_category": { "terms": { "field": "category" } } } }
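
Full-text queries like the match search above are fast because Lucene keeps an inverted index, a map from each term to the documents that contain it. The sketch below illustrates the idea only. The tokenizer is a naive assumption, and tokenize, buildIndex, and search are hypothetical names rather than Elasticsearch APIs.

```javascript
// Minimal inverted index: term -> Set of document ids. The tokenizer is a
// deliberately naive assumption (lowercase, split on non-alphanumerics);
// real analyzers also handle stemming, stop words, and scoring.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

function buildIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    for (const term of tokenize(doc[field])) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(doc.id);
    }
  }
  return index;
}

// Return ids of documents containing any query term (OR semantics).
function search(index, query) {
  const hits = new Set();
  for (const term of tokenize(query)) {
    for (const id of index.get(term) || []) hits.add(id);
  }
  return [...hits].sort((a, b) => a - b);
}

const products = [
  { id: 1, description: 'A useful widget' },
  { id: 2, description: 'A decorative gadget' },
  { id: 3, description: 'Useful, durable gadget' },
];

const index = buildIndex(products, 'description');
console.log(search(index, 'useful'));        // [ 1, 3 ]
console.log(search(index, 'Useful gadget')); // [ 1, 2, 3 ]
```

Lookup cost depends on the number of query terms and matching documents, not on the total size of the collection, which is why searches stay fast as the index grows.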

Architecture: Documents → Shards (distributed across nodes) → Replicas (redundancy)
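
The shard and replica layout can be sketched as a hash of the document id modulo the number of primary shards, with each replica placed on a different node from its primary. The hash function (FNV-1a) and the round-robin allocation below are simplifying assumptions, not Elasticsearch's actual implementation, and shardFor and allocate are hypothetical names.

```javascript
// Route a document to a primary shard by hashing its id. The hash below
// (FNV-1a, 32-bit) is an assumption chosen for brevity; it is not the
// function Elasticsearch uses internally.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep as unsigned 32-bit
  }
  return hash;
}

function shardFor(docId, primaryShardCount) {
  return fnv1a(String(docId)) % primaryShardCount;
}

// Place each primary shard on one node and its replica on the next node,
// so losing a single node never loses both copies of a shard.
// Assumes at least two nodes.
function allocate(primaryShardCount, nodes) {
  const layout = [];
  for (let shard = 0; shard < primaryShardCount; shard++) {
    layout.push({
      shard,
      primary: nodes[shard % nodes.length],
      replica: nodes[(shard + 1) % nodes.length],
    });
  }
  return layout;
}

const nodes = ['node-1', 'node-2', 'node-3'];
const layout = allocate(3, nodes);
const shard = shardFor('product-1', 3);

console.log(`product-1 -> shard ${shard} on ${layout[shard].primary}`);
console.log(layout.every((s) => s.primary !== s.replica)); // true
```

Because the shard is derived from the id and the shard count, changing the number of primary shards would move documents around, which is one reason the primary shard count is normally fixed when an index is created.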


Cosmos DB (Azure)

Microsoft’s globally distributed, multi-model NoSQL database. Supports multiple APIs:

| API | Data model | Use when |
|---|---|---|
| Core (SQL) | JSON documents | General document store |
| MongoDB API | BSON documents | Migrate from MongoDB |
| Gremlin API | Graph | Highly connected data |
| Table API | Key-value | Replace Azure Table Storage |
| Cassandra API | Column family | Wide-row data |

Cosmos DB is designed for global distribution: regions can be added to an account from the Azure portal, and data is then replicated to them automatically.


Graph databases

Graph databases model data as vertices (nodes) and edges (relationships), each with properties.

Vertex: Person { name: "Ken", age: 30 }
Edge:   KNOWS { since: 2020 }  →  Vertex: Person { name: "Alice" }

Gremlin — Apache TinkerPop graph traversal language (used by Cosmos DB, JanusGraph):

// Find all people Ken knows
g.V().has('name', 'Ken').out('KNOWS')
 
// Traversal methods
.in()    // vertices with edges pointing IN to current vertex
.out()   // vertices with edges pointing OUT from current vertex
.both()  // both directions
.inE()   // edges coming in
.outE()  // edges going out
.count() // count traversal results
 
// Example: friends of friends
g.V().has('name', 'Ken').out('KNOWS').out('KNOWS').values('name')

Use cases: Social networks, recommendation engines, fraud detection, knowledge graphs.
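
To make the traversal steps concrete, the sketch below mimics out(), in(), and has('name', ...) over a tiny in-memory graph. It is not how a graph database stores or executes traversals. The data, the extra people (Bob, Carol), and the helper names out, into, and hasName are all hypothetical.

```javascript
// Tiny in-memory property graph. Vertex ids, names, and helpers are
// hypothetical and exist only to mirror the Gremlin steps shown above.
const vertices = new Map([
  [1, { label: 'Person', name: 'Ken' }],
  [2, { label: 'Person', name: 'Alice' }],
  [3, { label: 'Person', name: 'Bob' }],
  [4, { label: 'Person', name: 'Carol' }],
]);

const edges = [
  { from: 1, to: 2, label: 'KNOWS', since: 2020 },
  { from: 2, to: 3, label: 'KNOWS', since: 2021 },
  { from: 2, to: 4, label: 'KNOWS', since: 2022 },
];

// Equivalent of .out(label): follow outgoing edges from each current vertex.
function out(ids, label) {
  return ids.flatMap((id) =>
    edges.filter((e) => e.from === id && e.label === label).map((e) => e.to)
  );
}

// Equivalent of .in(label): follow incoming edges back to their sources.
function into(ids, label) {
  return ids.flatMap((id) =>
    edges.filter((e) => e.to === id && e.label === label).map((e) => e.from)
  );
}

// Equivalent of g.V().has('name', value)
function hasName(value) {
  return [...vertices].filter(([, v]) => v.name === value).map(([id]) => id);
}

const names = (ids) => ids.map((id) => vertices.get(id).name);

const ken = hasName('Ken');
console.log(names(out(ken, 'KNOWS')));               // [ 'Alice' ]
console.log(names(out(out(ken, 'KNOWS'), 'KNOWS'))); // [ 'Bob', 'Carol' ]
console.log(names(into(hasName('Alice'), 'KNOWS'))); // [ 'Ken' ]
```

Each chained out() call is one more hop along the edges, which is why friends-of-friends is a two-step traversal rather than the self-join a relational schema would need.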


IndexedDB (Browser)

Client-side NoSQL database built into browsers. Stores structured data with indexes, supports transactions.

const request = indexedDB.open('MyDB', 1);
 
request.onupgradeneeded = (e) => {
    const db = e.target.result;
    db.createObjectStore('users', { keyPath: 'id' });
};
 
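request.onsuccess = (e) => {
    const db = e.target.result;
    const tx = db.transaction('users', 'readwrite');
    // put() inserts or overwrites; add() would fail on reload because id 1 already exists
    tx.objectStore('users').put({ id: 1, name: 'Ken' });
    // Close the connection once the transaction has committed
    tx.oncomplete = () => db.close();
};

// Opening can fail, for example when storage is disabled or blocked
request.onerror = () => console.error(request.error);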

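Larger capacity than localStorage: typically 50–250 MB or more, depending on the browser and free disk space, versus roughly 5 MB for localStorage. Use for offline-first apps.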


See also

  • Databases-SQL — relational alternative; use when data is highly structured and relational
  • Node.js + MongoDB — to be ingested (see sync list)
  • Cloud-AWS-Azure — DynamoDB (AWS), Cosmos DB (Azure)
  • CSharp: StackExchange.Redis for Redis in .NET