Dominic Tarr: The database of the future: leveldb

Published by Orde Saunders

Dominic Tarr (@dominictarr) talked about LevelDB at Scotland.js; these are my notes from his talk.

Databases of the past

  • Navigational databases
  • SQL relational databases
  • NoSQL key value pairs

All have a request/response API and run as a monolithic service. It is cheaper to take the computation to the data than to move the data to the processing, so each has an extension language (e.g. SQL).

Database of the future

  • Modular
  • Realtime
  • Replication

LevelDB

Written by the people who wrote BigTable. Data is stored in key order, so you can read it out in sequence very quickly.

  • Unix: build many small tools that work together
  • Emacs: build a tight core in C, build extensions in other languages

Node.js is a combination of these. No matter how smart you are, you don't know whether an idea is good until you try it. A big userland will work through ideas very quickly.

LevelDB was used to build IndexedDB in Chrome.

Basic Operations

Five operations: get, put, delete, batch, readStream (range of keys)
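To make the five operations concrete, here is a minimal in-memory sketch. TinyStore, Op and readRange are hypothetical names made up for illustration; this is not the LevelDB or leveldown API, and readRange returns an array rather than a stream to keep the example short.

```typescript
// Hypothetical in-memory stand-in for a LevelDB-style store; not the real API.
type Op =
  | { type: "put"; key: string; value: string }
  | { type: "del"; key: string };

class TinyStore {
  private data: Record<string, string> = Object.create(null);

  get(key: string): string | undefined {
    return this.data[key];
  }
  put(key: string, value: string): void {
    this.data[key] = value;
  }
  del(key: string): void {
    delete this.data[key];
  }
  // Apply several puts and deletes in one call.
  batch(ops: Op[]): void {
    for (const op of ops) {
      if (op.type === "put") this.put(op.key, op.value);
      else this.del(op.key);
    }
  }
  // Entries with gte <= key < lt, returned in key order.
  readRange(gte: string, lt: string): Array<[string, string]> {
    const keys = Object.keys(this.data)
      .filter((k) => k >= gte && k < lt)
      .sort();
    return keys.map((k): [string, string] => [k, this.data[k]]);
  }
}

const store = new TinyStore();
store.put("fruit!banana", "yellow");
store.put("fruit!apple", "green");
store.put("veg!carrot", "orange");
store.batch([
  { type: "del", key: "fruit!banana" },
  { type: "put", key: "fruit!cherry", value: "red" },
]);
// Only the "fruit!" keys come back, sorted by key.
const fruit = store.readRange("fruit!", "fruit!~");
```

Because keys are kept in order, "everything under fruit" is just a contiguous range between two bounds.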

Range queries can be kept open so that they show new items matching the query as they arrive (like tail -f).
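One way to picture an open range query is a store that first replays the existing matches and then notifies the listener on every later write inside the range. LiveStore and liveRange are hypothetical names for this sketch, not part of any real library.

```typescript
// Hypothetical sketch: a store whose range queries stay open.
type Listener = (key: string, value: string) => void;

class LiveStore {
  private data: Record<string, string> = Object.create(null);
  private watchers: Array<{ gte: string; lt: string; fn: Listener }> = [];

  put(key: string, value: string): void {
    this.data[key] = value;
    // Notify every open query whose range covers the new key.
    for (const w of this.watchers) {
      if (key >= w.gte && key < w.lt) w.fn(key, value);
    }
  }

  // Emit existing matches in key order, then keep the query open.
  liveRange(gte: string, lt: string, fn: Listener): void {
    const keys = Object.keys(this.data)
      .filter((k) => k >= gte && k < lt)
      .sort();
    for (const k of keys) fn(k, this.data[k]);
    this.watchers.push({ gte, lt, fn });
  }
}

const live = new LiveStore();
live.put("log!002", "second");
live.put("log!001", "first");
const seen: string[] = [];
live.liveRange("log!", "log!~", (key) => seen.push(key));
live.put("log!003", "third"); // arrives after the query was opened
live.put("other!001", "ignored"); // outside the range, never delivered
```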

MapReduce - good for range queries and group by.
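As a rough illustration of a group by expressed as map then reduce, the sketch below totals amounts per region. The Sale records and field names are invented for the example.

```typescript
// Hypothetical sketch of a map-reduce "group by" over key-ordered records.
interface Sale {
  key: string;
  region: string;
  amount: number;
}

const sales: Sale[] = [
  { key: "sale!001", region: "north", amount: 10 },
  { key: "sale!002", region: "south", amount: 5 },
  { key: "sale!003", region: "north", amount: 7 },
];

// Map: emit a (group, value) pair for each record.
const mapped = sales.map((s): [string, number] => [s.region, s.amount]);

// Reduce: fold the values of each group into one total.
const totals: Record<string, number> = {};
for (const [region, amount] of mapped) {
  totals[region] = (totals[region] || 0) + amount;
}
```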

Partition the data: create sublevels, which are new DB objects that map onto subsections of the main database. Multiple modules can then work together without conflicting.
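A sublevel can be thought of as a view that silently prefixes every key before passing it to the parent. The sketch below assumes that design; KV, RootDb and SubLevel are hypothetical names and the "!" separator is an arbitrary choice for the example.

```typescript
// Hypothetical sketch: a sublevel is a view that prefixes every key.
interface KV {
  put(key: string, value: string): void;
  get(key: string): string | undefined;
}

class RootDb implements KV {
  data: Record<string, string> = Object.create(null);
  put(key: string, value: string): void {
    this.data[key] = value;
  }
  get(key: string): string | undefined {
    return this.data[key];
  }
}

class SubLevel implements KV {
  constructor(private parent: KV, private prefix: string) {}
  put(key: string, value: string): void {
    this.parent.put(this.prefix + key, value);
  }
  get(key: string): string | undefined {
    return this.parent.get(this.prefix + key);
  }
}

const root = new RootDb();
const users = new SubLevel(root, "users!");
const posts = new SubLevel(root, "posts!");
users.put("1", "alice");
posts.put("1", "hello world"); // same local key, no conflict
const rootKeys = Object.keys(root.data).sort();
```

Each module only sees its own keys, while everything still lives in one ordered keyspace.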

"JavaScript has some quirks but SQL's change delimiter is insane."

You can add functions that run before an insert, like SQL's triggers, and attach them to different sublevels. Batches can span multiple subsections, which you can't do across multiple databases.
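A minimal sketch of the trigger idea, assuming a hook may append extra writes to the batch that carries the original write. HookedDb, pre and the key layout are hypothetical and only illustrate the shape of the mechanism.

```typescript
// Hypothetical sketch: a hook runs before each write and may add operations
// to the same batch, much like a SQL trigger.
type Put = { key: string; value: string };
type PreHook = (op: Put, add: (extra: Put) => void) => void;

class HookedDb {
  data: Record<string, string> = Object.create(null);
  private hooks: Array<{ prefix: string; fn: PreHook }> = [];

  pre(prefix: string, fn: PreHook): void {
    this.hooks.push({ prefix, fn });
  }

  put(key: string, value: string): void {
    const batch: Put[] = [{ key, value }];
    for (const h of this.hooks) {
      // Only fire hooks registered for this key's section.
      if (key.indexOf(h.prefix) === 0) {
        h.fn(batch[0], (extra) => batch.push(extra));
      }
    }
    // The original write and the hook-generated writes are applied together.
    for (const op of batch) this.data[op.key] = op.value;
  }
}

const hooked = new HookedDb();
// Keep an index by email in a separate section whenever a user is written.
hooked.pre("users!", (op, add) =>
  add({ key: "byEmail!" + op.value, value: op.key })
);
hooked.put("users!1", "alice@example.com");
hooked.put("posts!1", "no hook fires for this section");
```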

Don't use : to separate keys - it sorts after the digits but before the letters in the ASCII sequence. Use a very low separator or a very high separator instead.
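The ordering problem is easy to see by sorting a few composite keys. In this sketch "!" stands in for a low separator; it is one possible choice, picked because it sorts below the digits and letters.

```typescript
// Compare how composite keys sort with ":" versus a low separator "!".
const names = ["ab", "a", "a0"];

// Default sort() compares strings by character code, like a byte-ordered store.
const plainOrder = names.slice().sort();
const withColon = names.map((n) => n + ":1").sort();
const withBang = names.map((n) => n + "!1").sort();

// Strip the suffix again to see which first component each key belongs to.
const colonOrder = withColon.map((k) => k.split(":")[0]);
const bangOrder = withBang.map((k) => k.split("!")[0]);
```

With ":" the keys for "a0" sort ahead of the keys for "a", because "0" comes before ":" in ASCII. With "!" the composite keys keep the same order as the names themselves.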

Batches allow us to update multiple keys atomically - either all the updates succeed or all of them fail.
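The all-or-nothing behaviour can be sketched by applying the batch to a copy and only swapping it in if every operation is valid. AtomicStore is a hypothetical in-memory model; the "empty key" rule is just an invented way to force a failure, and a real store would use a write-ahead log rather than copying.

```typescript
// Hypothetical sketch of an all-or-nothing batch over an in-memory store.
type BatchOp =
  | { type: "put"; key: string; value: string }
  | { type: "del"; key: string };

class AtomicStore {
  data: Record<string, string> = Object.create(null);

  batch(ops: BatchOp[]): void {
    // Work on a copy so a failure part-way through leaves the store untouched.
    const next: Record<string, string> = Object.create(null);
    for (const k of Object.keys(this.data)) next[k] = this.data[k];
    for (const op of ops) {
      if (op.key === "") throw new Error("empty key");
      if (op.type === "put") next[op.key] = op.value;
      else delete next[op.key];
    }
    this.data = next; // commit: every operation becomes visible at once
  }
}

const bank = new AtomicStore();
bank.batch([
  { type: "put", key: "account!a", value: "100" },
  { type: "put", key: "account!b", value: "0" },
]);

let failed = false;
try {
  bank.batch([
    { type: "put", key: "account!a", value: "50" },
    { type: "put", key: "", value: "50" }, // invalid, so the whole batch fails
  ]);
} catch (e) {
  failed = true;
}
```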

Joins

Relations are something all data has. Structure the data into ranges and keep a many-to-many join table. Disks are really cheap, so store the data in multiple places and you don't have to join.
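A small sketch of storing a relation twice so that each direction becomes a plain range read. The key layout ("postTags!" and "tagPosts!") and the helper names are assumptions made for this example.

```typescript
// Hypothetical sketch: write the relation under two key ranges so each
// direction is a plain range read and no join is needed.
const db: Record<string, string> = Object.create(null);

function tag(post: string, label: string): void {
  db["postTags!" + post + "!" + label] = "";
  db["tagPosts!" + label + "!" + post] = "";
}

// All keys starting with the prefix, in key order, with the prefix removed.
function range(prefix: string): string[] {
  return Object.keys(db)
    .filter((k) => k.indexOf(prefix) === 0)
    .sort()
    .map((k) => k.slice(prefix.length));
}

tag("p1", "node");
tag("p1", "leveldb");
tag("p2", "node");

const tagsOfP1 = range("postTags!p1!");
const postsForNode = range("tagPosts!node!");
```

The cost is two writes per relation, which in practice you would want to issue as one batch so the two copies cannot drift apart.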

Do nothing

Nothing is faster than fast - do nothing as much as possible. Either be eager (do it first, then use it at leisure) or be lazy (don't do it until you need it). When the zombie apocalypse comes you want to already have your chainsaw (eager); you won't be able to go down to the shop and buy one after it has started (lazy).

Replication: get the data to where it needs to be just before it needs to be there. Master -> Slave: when a slave disconnects and then reconnects, the master first pushes the changes it missed before continuing to push real-time updates. Master-Master: you reconcile writes made to either master, but your data has to be structured to support that (a delete must be recorded as an update).
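The master -> slave catch-up can be sketched with a sequence-numbered change log: the slave remembers the last sequence it applied and the master replays everything after it on reconnect. Master, Slave, Change and the null tombstone are hypothetical choices for this illustration, not how any particular replication module works.

```typescript
// Hypothetical sketch of master -> slave replication with catch-up.
interface Change {
  seq: number;
  key: string;
  value: string | null; // null marks a delete
}

class Master {
  private log: Change[] = [];
  private subscribers: Array<(c: Change) => void> = [];

  write(key: string, value: string | null): void {
    const change: Change = { seq: this.log.length + 1, key, value };
    this.log.push(change);
    for (const push of this.subscribers) push(change);
  }

  // Send everything after `since`, then keep pushing live changes.
  // Returns a function that disconnects the subscriber.
  connect(since: number, push: (c: Change) => void): () => void {
    for (const c of this.log) {
      if (c.seq > since) push(c);
    }
    this.subscribers.push(push);
    return () => {
      this.subscribers = this.subscribers.filter((s) => s !== push);
    };
  }
}

class Slave {
  data: Record<string, string | null> = Object.create(null);
  lastSeq = 0;
  apply = (c: Change): void => {
    this.data[c.key] = c.value; // a delete is stored as an update to null
    this.lastSeq = c.seq;
  };
}

const master = new Master();
const slave = new Slave();
let disconnect = master.connect(slave.lastSeq, slave.apply);
master.write("a", "1");
disconnect();
master.write("b", "2"); // missed while offline
master.write("a", null); // delete recorded as an update
disconnect = master.connect(slave.lastSeq, slave.apply); // catch up first
master.write("c", "3"); // then real-time again
```

Recording the delete as a change with a null value is what lets a replica that was offline learn that the key went away.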

JavaScript bindings

The tests for the C modules of leveldown are written in JS, which means we can build against them to polyfill leveldown in the browser. This means we can use our server-side (Node.js) extensions in the browser.

