memdb

¶

memdb

Simple in-memory database, with multiple indices, range query, snapshot and (nested) transactions

https://git.io/memdb
https://github.com/santazhang/memdb

Creating tables

#include <vector>
#include <map>
using namespace std;

Include the memdb library.

/* Memory table */
#include "memdb/table.h"

/* Transaction support */
#include "memdb/txn.h"

using namespace mdb;

int main() {

¶

First, you will need to create the table schema. For this demo, we create a table with only 2 columns: “id”, and “name”.
```
    Schema schema;
```
¶

“id” is the primary key, it is 32-bit integer.
```
    schema.add_key_column("id", Value::I32);
```
¶

“name” is string. There’s no limit on string length.
```
    schema.add_column("name", Value::STR);
```
¶

The possible column types are: I32, I64, DOUBLE, and STR.
¶

Now you can create the table. There’s SortedTable, UnosrtedTable and SnapshotTable at your choice.
¶

SortedTable allows range query and scanning on the primary key. UnsortedTable only allows point query on primary key, but it’s faster than SortedTable. SnapshotTable is a specialized SortedTable with the ability to cheaply create read-only snapshots. Under most cases, you only need to use either SortedTable or UnsortedTable.
```
    UnsortedTable table(&schema);
```
¶

Inserting data

Let’s first create some data.

    Row* row1 = CoarseLockedRow::create(&schema, vector<Value>{ Value((i32) 1), Value("alice") });

¶

This seems long. Let me explain it.

We want to create a Row * pointer. There’s plain Row object, but it does not support transactions. The other alternatives are CoarseLockedRow and FineLockedRow, they can be used to provide two phase locking. There’s also a VersionedRow, which can be used to provide optimistic concurrency control. Since we will be writing transactions later, I choose to use CoarseLockedRow.

Value class is used to represent all the possible types of value in memdb. vector<Value>{ Value((i32) 1), Value("alice") } specifies data in the two columns. Their order should match the order you add columns into Schema.

You can also store column data in an STL map.

    map<string, Value> row2_data;
    row2_data["id"] = Value((i32) 2);
    row2_data["name"] = "bob";
    Row* row2 = CoarseLockedRow::create(&schema, row2_data);

¶

Now, insert the rows into table. The rows CAN have duplicated primary keys.
```
    table.insert(row1);
    table.insert(row2);
```
¶

Running queries

Query on primary key only. The results are returned as a cursor, which could be used to traverse all the matched rows.

You should see:

1 row(s) returned.
 * id=2, name=bob

For SortedTable, there’s also query_gt, query_lt and query_in. See memdb/table.h for full API.

    auto cursor = table.query(Value((i32) 2));
    printf("%d row(s) returned.\n", cursor.count());
    while (cursor.has_next()) {
        Row* r = cursor.next();
        printf(" * id=%d, name=%s\n", r->get_column("id").get_i32(), r->get_column("name").get_str().c_str());
    }

¶

Updating rows

You can directly update the rows returned from a query.

    {
        auto cursor = table.query(Value((i32) 2));
        Row* r = cursor.next();
        r->update("name", "catherine");

¶

You CAN even update the primary key!
```
        r->update("id", (i32) 3);
    }
```

Let’s check the table contents. You should see:

2 row(s) in table.
 * id=3, name=catherine
 * id=1, name=alice

(Note that row ordering doesn’t matter, since we are using UnsortedTable)

    cursor = table.all();
    printf("%d row(s) in table.\n", cursor.count());
    while (cursor.has_next()) {
        Row* r = cursor.next();
        printf(" * id=%d, name=%s\n", r->get_column("id").get_i32(), r->get_column("name").get_str().c_str());
    }

Removing rows from a table

    {
        auto cursor = table.query(Value((i32) 1));
        Row* r = cursor.next();

¶

You can remove(Row *) a specific row.
```
        table.remove(r);
    }
```

Let’s check the table contents. You should see:

1 row(s) in table.
 * id=3, name=catherine

    cursor = table.all();
    printf("%d row(s) in table.\n", cursor.count());
    while (cursor.has_next()) {
        Row* r = cursor.next();
        printf(" * id=%d, name=%s\n", r->get_column("id").get_i32(), r->get_column("name").get_str().c_str());
    }

¶

For UnsortedTable, you can use remove(Value) to remove all rows with certain primary key.
¶

For SortedTable, you can use remove(Cursor) to remove all rows based on a query result.
¶

Running transactions
¶

You need to #include "memdb/txn.h" for transaction support.
¶

First, decide what policy to be used for concurrency control. Two phase locking (2PL) and Optimistic Concurrency Control (OCC) are supported. You will need to use either TxnMgr2PL or TxnMgrOCC to launch transactions.

Remember to use CoarseLockedRow or FineLockedRow with TxnMgr2PL, and VersionedRow for TxnMgrOCC.
```
    TxnMgr2PL txn_mgr;
```
¶

Lets start transaction T₁.
```
    Txn* txn1 = txn_mgr.start(1);
```
¶

And transaction T₂.
```
    Txn* txn2 = txn_mgr.start(2);
```

In transaction T₁, we insert a new row. Note that we need to provide transaction context when doing insert.

    Row* row4 = CoarseLockedRow::create(&schema, vector<Value>{ Value((i32) 4), Value("david") });
    txn1->insert_row(&table, row4);

We should be able to see the newly inserted row in the context of T₁. Note that we need to provide transaction context when doing query.

table content (transaction T1)
 * id=3, name=catherine
 * id=4, name=david

    auto result_set = txn1->all(&table);
    printf("table content (transaction T1)\n");
    while (result_set.has_next()) {
        Row* r = result_set.next();
        Value id;

We also need to provide transaction context when doing column access.

        txn1->read_column(r, 0, &id);
        Value name;
        txn1->read_column(r, 1, &name);
        printf(" * id=%d, name=%s\n", id.get_i32(), name.get_str().c_str());
    }

But this new row should not be visible in T₂.

table content (transaction T1)
 * id=3, name=catherine

    result_set = txn2->all(&table);
    printf("table content (transaction T2)\n");
    while (result_set.has_next()) {
        Row* r = result_set.next();
        Value id;
        txn2->read_column(r, 0, &id);
        Value name;
        txn2->read_column(r, 1, &name);
        printf(" * id=%d, name=%s\n", id.get_i32(), name.get_str().c_str());
    }

¶

Now we drop transaction T₂ and start T₃ instead. This is due to locking issues (explained later).
```
    txn2->abort();
    Txn* txn3 = txn_mgr.start(3);
```

Let’s update name “catherine” to “cathy” in T₁.

    {
        Row* r = txn1->query(&table, Value((i32) 3)).next();
        if (txn1->write_column(r, 1, Value("cathy"))) {
            printf("updated catherine -> cathy in T1\n");
        } else {
            printf("failed to update catherine -> cathy in T1\n");
        }
    }

¶

And try to update name “catherine” to “katherine” in T₃.

We will see:
```
updated catherine -> cathy in T1
failed to update catherine -> katherine in T3
```
This is because T₁ tried to update “catherine” first. It will succeed, and hold a writer lock. When T₃ tries to update “catherine” later, it cannot acquire a writer lock since the locks is already taken by T₁, so the write operation in T₃ fails. We see this because we are using two phase locking for transaction concurrency control.

The reason we abort T₂ earlier is that T₂ acquired reader lock on all rows when traversing table content. This will prevent T₁ acquiring writer lock.

Now I can tell you the difference between CoarseLockedRow and FineLockedRow: CoarseLockedRow has lock granularity at row level, while FineLockedRow has lock granularity at column level. CoarseLockedRow uses less memory, yet FineLockedRow provides better performance under high load.
```
    {
        Row* r = txn3->query(&table, Value((i32) 3)).next();
        if (txn3->write_column(r, 1, Value("katherine"))) {
            printf("updated catherine -> katherine in T3\n");
        } else {
            printf("failed to update catherine -> katherine in T3\n");
        }
    }
```

Let’s now try to commit transaction T₁, or abort it if we can’t commit. For this example, it will commit.

    if (txn1->commit_or_abort()) {
        printf("transaction T1 COMMIT\n");
    } else {
        printf("transaction T1 ABORT\n");
    }

¶

And ABORT T₃.
```
    txn3->abort();
```
¶

You HAVE to either abort or commit a transaction to cleanup its resource.

Now lets check the final table contents. You should see:

2 row(s) in table.
 * id=3, name=cathy
 * id=4, name=david

    cursor = table.all();
    printf("%d row(s) in table.\n", cursor.count());
    while (cursor.has_next()) {
        Row* r = cursor.next();
        printf(" * id=%d, name=%s\n", r->get_column("id").get_i32(), r->get_column("name").get_str().c_str());
    }

    return 0;
}

¶

More features to be covered
¶

memdb also supports the following features, which will be documented soon. You can also checkout the code in test source code folder for real world usages.
- Secondary index using IndexedSchema and IndexedTable.
- Compound primary key in Schema, and compound index key in IndexedSchema.
- Query result ordering (ascending or descending) in SortedTable.
- Snapshotting using SnapshotTable.
- Nested transactions.
- “Eager” and “lazy” policy for Optimistic Concurrency Control.

memdb

Creating tables

Inserting data

Running queries

Updating rows

Removing rows from a table

Running transactions

More features to be covered