#include <vector>
#include <map>
using namespace std;
Simple in-memory database, with multiple indices, range query, snapshot and (nested) transactions
#include <vector>
#include <map>
using namespace std;
Include the memdb
library.
/* Memory table */
#include "memdb/table.h"
/* Transaction support */
#include "memdb/txn.h"
using namespace mdb;
int main() {
First, you will need to create the table schema. For this demo, we create a table with only 2 columns: “id”, and “name”.
Schema schema;
“id” is the primary key, it is 32-bit integer.
schema.add_key_column("id", Value::I32);
“name” is string. There’s no limit on string length.
schema.add_column("name", Value::STR);
The possible column types are: I32
, I64
, DOUBLE
, and STR
.
Now you can create the table. There’s SortedTable
, UnosrtedTable
and SnapshotTable
at your choice.
SortedTable
allows range query and scanning on the primary key. UnsortedTable
only allows
point query on primary key, but it’s faster than SortedTable
. SnapshotTable
is a specialized
SortedTable
with the ability to cheaply create read-only snapshots. Under most cases, you only need to
use either SortedTable
or UnsortedTable
.
UnsortedTable table(&schema);
Let’s first create some data.
Row* row1 = CoarseLockedRow::create(&schema, vector<Value>{ Value((i32) 1), Value("alice") });
This seems long. Let me explain it.
We want to create a Row *
pointer. There’s plain Row
object, but it does not support transactions.
The other alternatives are CoarseLockedRow
and FineLockedRow
, they can be used to provide
two phase locking. There’s also a VersionedRow
, which can be used to provide optimistic concurrency control.
Since we will be writing transactions later, I choose to use CoarseLockedRow
.
Value
class is used to represent all the possible types of value in memdb.
vector<Value>{ Value((i32) 1), Value("alice") }
specifies data in the two columns. Their order should match
the order you add columns into Schema
.
You can also store column data in an STL map
.
map<string, Value> row2_data;
row2_data["id"] = Value((i32) 2);
row2_data["name"] = "bob";
Row* row2 = CoarseLockedRow::create(&schema, row2_data);
Now, insert the rows into table. The rows CAN have duplicated primary keys.
table.insert(row1);
table.insert(row2);
Query on primary key only. The results are returned as a cursor, which could be used to traverse all the matched rows.
You should see:
1 row(s) returned.
* id=2, name=bob
For SortedTable
, there’s also query_gt
, query_lt
and query_in
. See memdb/table.h
for full API.
auto cursor = table.query(Value((i32) 2));
printf("%d row(s) returned.\n", cursor.count());
while (cursor.has_next()) {
Row* r = cursor.next();
printf(" * id=%d, name=%s\n", r->get_column("id").get_i32(), r->get_column("name").get_str().c_str());
}
You can directly update the rows returned from a query.
{
auto cursor = table.query(Value((i32) 2));
Row* r = cursor.next();
r->update("name", "catherine");
You CAN even update the primary key!
r->update("id", (i32) 3);
}
Let’s check the table contents. You should see:
2 row(s) in table.
* id=3, name=catherine
* id=1, name=alice
(Note that row ordering doesn’t matter, since we are using UnsortedTable
)
cursor = table.all();
printf("%d row(s) in table.\n", cursor.count());
while (cursor.has_next()) {
Row* r = cursor.next();
printf(" * id=%d, name=%s\n", r->get_column("id").get_i32(), r->get_column("name").get_str().c_str());
}
{
auto cursor = table.query(Value((i32) 1));
Row* r = cursor.next();
You can remove(Row *)
a specific row.
table.remove(r);
}
cursor = table.all();
printf("%d row(s) in table.\n", cursor.count());
while (cursor.has_next()) {
Row* r = cursor.next();
printf(" * id=%d, name=%s\n", r->get_column("id").get_i32(), r->get_column("name").get_str().c_str());
}
For UnsortedTable
, you can use remove(Value)
to remove all rows with certain primary key.
For SortedTable
, you can use remove(Cursor)
to remove all rows based on a query result.
You need to #include "memdb/txn.h"
for transaction support.
First, decide what policy to be used for concurrency control. Two phase locking (2PL) and Optimistic Concurrency Control (OCC)
are supported. You will need to use either TxnMgr2PL
or TxnMgrOCC
to launch transactions.
Remember to use CoarseLockedRow
or FineLockedRow
with TxnMgr2PL
, and VersionedRow
for TxnMgrOCC
.
TxnMgr2PL txn_mgr;
Lets start transaction T1.
Txn* txn1 = txn_mgr.start(1);
And transaction T2.
Txn* txn2 = txn_mgr.start(2);
In transaction T1, we insert a new row. Note that we need to provide transaction context when doing insert.
Row* row4 = CoarseLockedRow::create(&schema, vector<Value>{ Value((i32) 4), Value("david") });
txn1->insert_row(&table, row4);
We should be able to see the newly inserted row in the context of T1. Note that we need to provide transaction context when doing query.
table content (transaction T1)
* id=3, name=catherine
* id=4, name=david
auto result_set = txn1->all(&table);
printf("table content (transaction T1)\n");
while (result_set.has_next()) {
Row* r = result_set.next();
Value id;
We also need to provide transaction context when doing column access.
txn1->read_column(r, 0, &id);
Value name;
txn1->read_column(r, 1, &name);
printf(" * id=%d, name=%s\n", id.get_i32(), name.get_str().c_str());
}
But this new row should not be visible in T2.
table content (transaction T1)
* id=3, name=catherine
result_set = txn2->all(&table);
printf("table content (transaction T2)\n");
while (result_set.has_next()) {
Row* r = result_set.next();
Value id;
txn2->read_column(r, 0, &id);
Value name;
txn2->read_column(r, 1, &name);
printf(" * id=%d, name=%s\n", id.get_i32(), name.get_str().c_str());
}
Now we drop transaction T2 and start T3 instead. This is due to locking issues (explained later).
txn2->abort();
Txn* txn3 = txn_mgr.start(3);
Let’s update name “catherine” to “cathy” in T1.
{
Row* r = txn1->query(&table, Value((i32) 3)).next();
if (txn1->write_column(r, 1, Value("cathy"))) {
printf("updated catherine -> cathy in T1\n");
} else {
printf("failed to update catherine -> cathy in T1\n");
}
}
And try to update name “catherine” to “katherine” in T3.
We will see:
updated catherine -> cathy in T1
failed to update catherine -> katherine in T3
This is because T1 tried to update “catherine” first. It will succeed, and hold a writer lock. When T3 tries to update “catherine” later, it cannot acquire a writer lock since the locks is already taken by T1, so the write operation in T3 fails. We see this because we are using two phase locking for transaction concurrency control.
The reason we abort T2 earlier is that T2 acquired reader lock on all rows when traversing table content. This will prevent T1 acquiring writer lock.
Now I can tell you the difference between CoarseLockedRow
and FineLockedRow
:
CoarseLockedRow
has lock granularity at row level, while FineLockedRow
has lock
granularity at column level. CoarseLockedRow
uses less memory, yet FineLockedRow
provides better performance under high load.
{
Row* r = txn3->query(&table, Value((i32) 3)).next();
if (txn3->write_column(r, 1, Value("katherine"))) {
printf("updated catherine -> katherine in T3\n");
} else {
printf("failed to update catherine -> katherine in T3\n");
}
}
Let’s now try to commit transaction T1, or abort it if we can’t commit. For this example, it will commit.
if (txn1->commit_or_abort()) {
printf("transaction T1 COMMIT\n");
} else {
printf("transaction T1 ABORT\n");
}
And ABORT T3.
txn3->abort();
You HAVE to either abort or commit a transaction to cleanup its resource.
Now lets check the final table contents. You should see:
2 row(s) in table.
* id=3, name=cathy
* id=4, name=david
cursor = table.all();
printf("%d row(s) in table.\n", cursor.count());
while (cursor.has_next()) {
Row* r = cursor.next();
printf(" * id=%d, name=%s\n", r->get_column("id").get_i32(), r->get_column("name").get_str().c_str());
}
return 0;
}
memdb also supports the following features, which will be documented soon.
You can also checkout the code in test
source code folder for real world usages.
IndexedSchema
and IndexedTable
.Schema
, and compound index key in IndexedSchema
.SortedTable
.SnapshotTable
.