Building resilient infrastructure with CouchDB
Tim Perry
Tech Lead & Open-Source Champion at Softwire
Tim Perry
Going to talk about CouchDB
Not focusing too much on CouchDB as a general tool
More interested in CouchDB's niche, and how it fills that
Who am I: I work for Softwire
Bespoke dev on a huge variety of projects
One of which I want to talk about today
Softwire are hiring!
If you want to do things like this, apply, or talk to me afterwards
Not affiliated with CouchDB at all
Back to CouchDB though
Who's heard of CouchDB? Used? In Production?
Apache project
Tagline is 'relax'
Simple to use: schema-free, HTTP/RESTful, fairly simple model
Works anywhere: a simple Erlang process, no real setup required, works on mobile/JS
Reliable: it just doesn't break.
Not focusing on CouchDB generally, but some of the basics:
Document Store
{
  "_id": "my-document-example",
  "_rev": "21-qwe123asd",
  "some-content": {
    "a": 1,
    "b": 2
  },
  "a list!": [3, 4, 5]
}
Document store: many DBs, of many docs, each with an id
Docs are JSON
_id field is the id
_rev is the revision number, updated on every change
HTTP API
$ curl -X GET http://couchdb:5984/my-db/a-doc-id
{"_id": "a-doc-id"
"_rev": "4-9812eojawd"
"data": [1, 2, 3]}
Typical RESTful API
Access by DB + doc name
_id is name
HTTP API
$ curl -X PUT http://couchdb:5984/my-db/another-id \
-H 'Content-Type: application/json' \
-d '{ "other data": 4 }'
{"ok":true,
"id":"another-id",
"rev":"1-2902191555"}
Revision numbers are automatic
The numeric prefix increments on each change
The suffix is a pseudorandom hash
PUT to create specific docs and do updates
POST to create docs with random ids
Can also do updates, all revisions are kept
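Updates are just another PUT, quoting the current rev; a quick sketch reusing the doc above (the returned rev here is illustrative):
$ curl -X PUT http://couchdb:5984/my-db/another-id \
  -H 'Content-Type: application/json' \
  -d '{ "_rev": "1-2902191555", "other data": 5 }'
{"ok":true,
 "id":"another-id",
 "rev":"2-83f2a871"}
# Omitting _rev, or sending a stale one, gets a 409 conflict instead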
Replication
# Pull from B -> A
$ curl -X POST http://couchdb-A:5984/_replicator \
-H 'Content-Type: application/json' \
-d '{ "source": "http://couchdb-B:5984/demo-db",
"target": "demo-db",
"continuous": true }'
# Pull from A -> B
$ curl -X POST http://couchdb-B:5984/_replicator \
-H 'Content-Type: application/json' \
-d '{ "source": "http://couchdb-A:5984/demo-db",
"target": "demo-db",
"continuous": true }'
That's the basics out of the way
The critical feature is trivial replication
Unidirectional database synchronization mechanism
We'll look at this in more detail later, but let's walk through this
Documents used to manage it (see the example below)
Updated with replication process metadata + state
Deleted or changed to manipulate the process itself
Continuous
Incremental
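Reading a replication document back shows the state CouchDB attaches (doc id and values here are illustrative):
$ curl -X GET http://couchdb-A:5984/_replicator/my-replication
{"_id": "my-replication",
 "_rev": "2-abc123",
 "source": "http://couchdb-B:5984/demo-db",
 "target": "demo-db",
 "continuous": true,
 "_replication_state": "triggered",
 "_replication_id": "f21a86..."}
# Deleting the document cancels the replication (needs the current rev)
$ curl -X DELETE 'http://couchdb-A:5984/_replicator/my-replication?rev=2-abc123'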
Indexed Views
Incremental Map/Reduce
ACID (locally)
Erlang-based
Web UI
Show Functions
Filters
Validation
All irrelevant for today's topic
But CouchDB is quite cool
Resilient Infrastructure
What I really want to talk about
Fine as a document store
Shines when providing resilient infrastructure though
= reliable workable data storage through *anything*
Total network failure
clusters of rarely-connected machines
Intermittent power
Not relevant everywhere, but sometimes amazing
Let's take a look at what this can do
Let's break everything!
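# Writer loop: POST a timestamped doc to each node as fast as possible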
while true
do
curl -X POST 'http://couchdb-A:5984/demo-db' \
-H "content-type: application/json" \
-d '{ "created_at": "'"`date +%s%N`"'" }' \
--max-time 0.1
curl -X POST 'http://couchdb-B:5984/demo-db' \
-H "content-type: application/json" \
-d '{ "created_at": "'"`date +%s%N`"'" }' \
--max-time 0.1
done
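# Killer loop: hard power-off and restart each VM in turn, 30 seconds apart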
while true
do
vagrant halt couchdb-a --force
sleep 30
vagrant up couchdb-a --no-provision
sleep 30
vagrant halt couchdb-b --force
sleep 30
vagrant up couchdb-b --no-provision
sleep 30
done
(Some console logging omitted)
What does this look like?
2 CouchDB VMs (A & B)
Replicating together
Left: write new docs as fast as possible
Vagrant
Right: kill server, just yank the power out
Servers that are in active use, with power outages every 2 minutes
Pathological case
Show CouchDBs
Start writing
Show CouchDBs
Start killing
Show CouchDBs
Is this useful?
Hopefully not
If your situation is this bad, you're in real trouble, and should probably be at work fixing it
But many organisations and environments do have real troubles
Or need to be able to handle potential trouble
Large systems have to deal with external dependencies, unpredictability, critical SLAs
Cloud computing and more widely-distributed systems increase unreliability
Mobile devices and networks are unreliable
3rd world is unreliable
My hotel wifi next door is extremely unreliable
Uptime is *important*, to everybody
We need to work with this (Netflix has its Chaos Monkey, for example)
Real World Example
(Anonymized)
B2B SaaS product, with strict SLAs
Millions of paying daily users
3,000 servers across 25 datacentres
50,000 requests per second, average
Highly latency sensitive
Every request needs the (readonly) user session
Unfortunately anonymized, but it's a real project from last year
Large structure of Java-based services
Read availability is their highest priority
Write availability a close second
Bonus Challenges
Struggling network infrastructure
Frequent loss of connection to datacentres
Occasional power outages in datacentres
Users can and do roam, worldwide
Server failover is always to a different datacentre
Datacentres have hub & spoke connectivity only (through London)
Previous Solution
Hold all user sessions on every server
Announce new sessions to every server with a central message queue
Canonical store kept in a single RDBMS (for server initialisation)
Real World Problems
Memory usage doesn't scale
Network and server failures are big problems
Message queue failures are catastrophic problems
Were expecting to run out of memory in 12 months
Had already built a somewhat complex paging system to move things to disk
So ailing infrastructure, high availability requirements: we needed a datastore that'd solve that
CouchDB Solution
Small LRU cache in every server
CouchDB in every datacentre
CouchDB in the central datacentre
Hub & spoke replication
Servers query local CouchDB by default, or fall back to central CouchDB
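A sketch of the hub & spoke wiring, with hypothetical hostnames and DB names:
# In each datacentre: continuously pull the session DB from the London hub
$ curl -X POST http://couchdb-dc1:5984/_replicator \
  -H 'Content-Type: application/json' \
  -d '{ "source": "http://couchdb-hub:5984/sessions",
        "target": "sessions",
        "continuous": true }'
# A matching replication on the hub pulls each spoke's writes back the other way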
Real World Improvements
No single point of failure
Scales horizontally easily
Major memory savings
Much cleaner design
Any CouchDB can fail and we don't lose read availability
Can lose writes, but that requires us to be very, very unlucky, and can only affect half a second's worth of people
If we need more capacity or reliability, just add more CouchDBs. Replication batches updates, which helps (up to a point)
Configurable memory savings; can tweak cache size per-server to suit
Enormously simplified the codebase, and encapsulated reliability (mostly)
Some Challenges
Ops ramp-up
Support service setup
Disk usage
Hoodie
Little bit of a tangent, but stick with me
Web framework
Developer preview!
But lots of interesting features
Again not too much depth, more interested in the concepts
Hoodie
Two driving principles
Backend dev is, almost by definition, far from user value
Focus on building features
Offline first like mobile first
Mobile first: assume minimal devices, scale up
Assume minimal connectivity, scale up to network features
Works by default
Offline first means performance feels great
Downtime has minimal impact
Scales with the number of clients
Let's take a very quick look
Hoodie
Save data
$('.addTask .submit').click(function () {
  var desc = $('.addTask .desc').val();
  hoodie.store.add('task', { desc: desc });
});
Task list
Add item on submit
store.add with item type, and the item
Persisted to local storage
Hoodie
Handle new data
hoodie.store.on('add:task', function (task) {
  $('.taskList').append('<li>' + task.desc + '</li>');
});
Can subscribe to item additions
Here we add item to the list in the UI
Will be run if we add items, or if they come from elsewhere
Hoodie
Log in users
$('.login').click(function () {
  var username = $(".username").val();
  var password = $(".password").val();
  hoodie.account.signIn(username, password)
    .done(loginSuccessful);
});
You need long-term persistence, and for that you need some sort of users
Provides user auth mechanisms, with a very simple API
All data is then stored for this user, and loaded again when they login later
But that needs a backend, right?
Hoodie
Changes are local, synced with CouchDB in the background
Each user gets a database, only that is replicated
Operations are documents
CouchDB, with node for extra functionality if required
Core stuff is all in by default
Operations updated server side when they complete
Built their own syncing replication client
Don't build your own sync! It's hard
Our project example was essentially home-made sync
Hoodie are actually looking to move to PouchDB currently
Essentially CouchDB entirely in the browser, and fully compatible
All together, gives you web applications that don't need servers
Hold all their own data client-side
Persist asynchronously
Don't depend on server being present
Why does any of this work?
What's going on under the hood to give us eventual consistency here
Despite network errors or sudden server deaths
Reliable Replication
The Changes Feed
$ curl -X GET http://couchdb:5984/my-db/_changes?since=1
{ "results": [
{"seq":2,"id":"my-doc","changes":[{"rev":"1-128qw99"}]},
{"seq":3,"id":"my-doc","changes":[{"rev":"2-98s9123"}]},
], last_seq: 3}
The DB actually keeps an overall sequence number
Incremented for every change
This means we can query for changes since a certain point
This feed just gives us revisions, by default
Powerful feature in itself
Also supports long polling to wait for changes, continuous streaming feed
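Both modes are just a query parameter on _changes:
# Long polling: blocks until there's a change after seq 3, then returns
$ curl 'http://couchdb:5984/my-db/_changes?feed=longpoll&since=3'
# Continuous: holds the connection open, streaming each change as a line of JSON
$ curl 'http://couchdb:5984/my-db/_changes?feed=continuous&since=3'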
Reliable Replication
Replication Process
1. Track the source's sequence number in a local-only metadata document in the target DB, unique to this replication, set to 0 initially
2. Read the changes from the source, since the sequence number stored in the local document in the target
3. Read any missing document revisions from the source DB
4. Write these updates to the target DB
5. Update the sequence number tracked in the target
6. Go to 2
(Paraphrased from http://replication.io)
Can use this feed to incrementally work out what changes we need
Essentially pull the required changes to get to the current sequence number
Update our current sequence number only once all complete
If it fails beforehand, we'll still have pulled in some of the revisions
Which is why this 3rd step exists: can resume at any point
Sequence numbers just let us do that with more efficiency
Note that this is potentially going to be separate from the local DB's own sequence number
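Those checkpoints are kept in _local documents, which never replicate themselves. A sketch (doc id and values here are illustrative):
$ curl http://couchdb-A:5984/demo-db/_local/af3c21b...
{"_id": "_local/af3c21b...",
 "session_id": "...",
 "source_last_seq": 42,
 "history": [...]}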
Asynchronous
Handles network failures with retries
Starts at 1/4 of a second, with exponential backoff, up to a 5-minute max
Pull preferred, as it can run immediately on restart
New nodes need data locally more than they need to provide data
Multiversion
Concurrency
Control
(or MVCC)
Before we can look at this in too much detail, we need to understand CouchDB revisions
Walk through normal create update cycle with revisions
Gives ACID locally
Distributed, so concurrent changes
Handled by versioning all changes
We'll look at sync in a sec
Optimistic -> no locking!
Remote conflicts pick an arbitrary but deterministic winning revision
Differs from Riak
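A sketch of a synced conflict (?conflicts=true is a standard option; revision ids here are illustrative):
# Both nodes updated the same doc from the same base rev, then synced;
# every node deterministically picks the same winner, and the loser stays retrievable
$ curl 'http://couchdb:5984/my-db/a-doc-id?conflicts=true'
{"_id": "a-doc-id",
 "_rev": "2-bbb987",
 "data": [1, 2, 3],
 "_conflicts": ["2-aaa123"]}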
Append-Only B+ Trees
The second impressive trick
CouchDB avoids data corruption, whatever you do
Does this by never updating data on disk, only appending
Same sort of technical benefits as immutability generally
This is going to get a little complicated
B+ trees recap, for those that didn't revise
Indexes in most DBs you use
Lets you find data quickly in large sets, and still insert fast too
Like binary trees, but with 2+ values per node; typically hundreds, here 4.
Width means they don't become too deep (typically 3 or 4 layers max), which has caching benefits
Pointer into top, and up to n+1 pointers between nodes
Leaf node pointers point to actual data
Internal pointers allow search down to the data; left is less-than, right is greater-than-or-equal
Walk through search
Walk through insert
Gains height only when the root splits
On disk, looks like this: all contiguous, with pointers between nodes, and empty spaces for growth
We then update it in place
Makes normal inserts a little risky, makes leaf split/join very risky
This is how B+ trees work generally, but not in CouchDB
CouchDB uses an append-only immutable B+ tree model
It starts looking the same
But then we replace changed nodes wholesale instead, and so on up the tree
Reusing nodes that haven't changed
Sounds expensive, but remember these are very wide and very shallow
We have to allocate a new root pointer, this purple element
Hopefully you're still following
This gets more complicated, but it's where the magic happens
Each new bit of tree is allocated on the end of the memory
Along with the footer
To find the root, you just read from the end
Immutable so as many concurrent reads as you like
Never update, so crashes can't corrupt
Any prefix of a CouchDB DB is a valid previous DB, after the first write
All sounds suboptimal, but partly because I've oversimplified
Don't need these empty spaces, for example
Footer actually written and flushed twice, with checksums, for error correction
Batch writes provide some efficiency
Compaction engine in the background, basically copies live data only
Overall though, very neat, unbreakable data structure that's still very quick
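A hedged way to see that prefix property for yourself (hypothetical paths; only ever on a copy, never live data):
# Chop an arbitrary tail off a copy of the database file
$ cp /var/lib/couchdb/demo-db.couch snapshot.couch
$ truncate -s -8192 snapshot.couch
# On open, CouchDB scans back from the end of the file for the last intact,
# checksummed footer, so the truncated copy reads as an older,
# still-consistent version of the database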
Let's see if it worked
Did we break everything?
while true
do
curl -X POST 'http://couchdb-A:5984/demo-db' \
-H "content-type: application/json" \
-d '{ "created_at": "'"`date +%s%N`"'" }' \
--max-time 0.1
curl -X POST 'http://couchdb-B:5984/demo-db' \
-H "content-type: application/json" \
-d '{ "created_at": "'"`date +%s%N`"'" }' \
--max-time 0.1
done
while true
do
vagrant halt couchdb-a --force
sleep 30
vagrant up couchdb-a --no-provision
sleep 30
vagrant halt couchdb-b --force
sleep 30
vagrant up couchdb-b --no-provision
sleep 30
done
(Some console logging omitted)
Stop the writer
Stop the server killer
Bring servers up
Show numbers are now back in sync
CouchDB is not perfect
One or two last points
Firstly, CouchDB is not perfect
Missing functionality you might well expect:
Arbitrary queries
Sharding
Joins, obviously
Not many tools around for monitoring etc
Immutability means heavy disk usage
The one big hole in 'Relax', but disks are cheap now
Not really made for heavy writes
But 'always available' is a great superpower
May not be perfect, but unique and very cool
NoSQL really isn't all about big data, it's about better data
Extra features, that enable and simplify powerful products
By moving away from traditional mindset
CouchDB is quite different
Simplicity makes it easy to build monitoring etc on top
Makes testing and management simple and trivially automatable
Reliability means you can build cool things on the worst of foundations
Any questions?
Tim Perry
Tech Lead & Open-Source Champion at Softwire