Terms and Techniques
This page explains some terms used in MongoMVCC and the related implementation techniques.
A collection in MongoMVCC is similar to a collection in MongoDB. It is a group of documents. The difference is that each branch in the MongoMVCC tree may contain its own collections. You can obtain a collection as follows:
VDatabase db = ...
VBranch master = db.checkout(VConstants.MASTER);
VCollection coll = master.getCollection("persons");
In order to uniquely identify objects in the database, MongoMVCC automatically assigns new _id and uid attributes to each document inserted into a collection. The _id attribute identifies a certain version of a document. The uid attribute identifies all versions of the same document.
Please do not assign your own identifiers for _id and uid. The library currently does not prevent you from doing so, but you might accidentally corrupt your data. If you need a special identifier, please use another attribute.
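The relationship between the two attributes can be illustrated with a small in-memory model. This is only a sketch under assumptions: the class VersionedIdSketch, its methods, and the use of simple counters are hypothetical and say nothing about how MongoMVCC actually generates identifiers. It shows that every stored version gets a fresh _id, while all versions of one document share a uid.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model; not part of the MongoMVCC API.
class VersionedIdSketch {
    private long nextId = 1;   // assumed counter for per-version identifiers (_id)
    private long nextUid = 1;  // assumed counter for per-document identifiers (uid)
    private final List<Map<String, Object>> stored = new ArrayList<>();

    // Stores the first version of a new document: fresh _id and fresh uid.
    Map<String, Object> insert(Map<String, ?> doc) {
        Map<String, Object> copy = new HashMap<>(doc);
        copy.put("_id", nextId++);
        copy.put("uid", nextUid++);
        stored.add(copy);
        return copy;
    }

    // Stores a new version of an existing document: fresh _id, unchanged uid.
    Map<String, Object> update(Map<String, Object> previous, Map<String, ?> changes) {
        Map<String, Object> copy = new HashMap<>(previous);
        copy.putAll(changes);
        copy.put("_id", nextId++);
        copy.put("uid", previous.get("uid")); // keep the document identity
        stored.add(copy);
        return copy;
    }

    // Returns all stored versions that belong to the same document.
    List<Map<String, Object>> versionsOf(long uid) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> d : stored) {
            if (d.get("uid").equals(uid)) {
                result.add(d);
            }
        }
        return result;
    }
}
```

In this model, updating a document never overwrites the old version; it appends a new one, which is why a separate uid is needed to tie the versions together.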
You may add as many documents to the database as you like, but they are not visible until you also make a commit. A commit is a single document which points to all documents added or changed since the last commit. As long as this commit document does not exist, MongoMVCC will skip all unreferenced documents. This is how full isolation is implemented.
Each commit has a special identifier, the so-called CID. The CID can be used to uniquely identify the commit in the current branch or in the database's history. CIDs are assigned in strictly ascending order.
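The visibility rule described above can be sketched with a toy model. Everything here is an assumption made for illustration: the class CommitVisibilitySketch and its methods are hypothetical, documents are reduced to bare identifiers, and CIDs come from a simple counter. The point is only that written documents stay invisible until a commit references them.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of commit-based visibility; not the MongoMVCC implementation.
class CommitVisibilitySketch {
    private final List<Long> storedIds = new ArrayList<>();  // every document written so far
    private final Set<Long> referenced = new HashSet<>();    // documents referenced by a commit
    private final List<Long> pending = new ArrayList<>();    // written since the last commit
    private long nextId = 1;
    private long nextCid = 1; // assumed: CIDs come from an ascending counter

    // Writes a document. It is stored immediately but not yet referenced.
    long insert() {
        long id = nextId++;
        storedIds.add(id);
        pending.add(id);
        return id;
    }

    // Creates a commit that references everything written since the last commit.
    long commit() {
        referenced.addAll(pending);
        pending.clear();
        return nextCid++;
    }

    // Readers skip every stored document that no commit references.
    List<Long> visible() {
        List<Long> result = new ArrayList<>();
        for (long id : storedIds) {
            if (referenced.contains(id)) {
                result.add(id);
            }
        }
        return result;
    }
}
```

Because a reader filters by what commits reference, a half-finished batch of inserts has no effect on other readers until commit() is called.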
A branch is a sequence of consecutive commits. You can obtain existing branches from the database object. There is always a master branch:
VDatabase db = ...
VBranch master = db.checkout(VConstants.MASTER);
You can also create new branches. To do so, you have to provide a unique name and the CID of a commit that should be used as the branch's head, i.e. the newest commit in the branch.
long cid = master.commit();
VBranch newBranch = db.createBranch("new-branch", cid);
It is also possible to check out unnamed branches by specifying the commit that should be used as the branch's root. In this case you can make commits and later create a named branch:
VBranch unnamed = master.checkout(cid);
unnamed.commit();
long newHead = unnamed.commit();
db.createBranch("new-branch2", newHead);
All commits and all branches in the database are part of the so-called Tree. The tree has a root commit which is already there when you create the MVCC database.
You can browse the tree using a history object which you can obtain from the database. The history contains information about the parents and the children of each commit.
VHistory history = db.getHistory();
long cid = ...
while (cid != 0) {
// do something with the commit
...
// get the commit's parent (may be
// 0 if the commit is the root commit)
cid = history.getParent(cid);
}
public void processRecursively(long cid) {
// do something with the commit
...
// process the commit's children
long[] children = history.getChildren(cid);
for (long c : children) {
processRecursively(c);
}
}
Processing children recursively may lead to a StackOverflowError if there are many commits. You may want to process them iteratively instead:
// requires java.util.LinkedList and java.util.Queue
Queue<Long> cids = new LinkedList<Long>();
cids.add(firstCid);
while (!cids.isEmpty()) {
long cid = cids.poll();
// do something with the commit
...
// process the commit's children
for (long c : history.getChildren(cid)) {
cids.add(c);
}
}
The index aggregates all commits up to the commit currently checked out. It references objects which are visible in this commit.
The index does not reference documents that have already been deleted, that have been replaced by newer versions, or that are added only in a subsequent commit rather than in this commit or any of its parents. MongoMVCC uses the index to filter out all documents that do not belong to the version currently checked out.
The index is also used to store information about dirty documents. Documents become dirty if they have been added or changed since the last commit. As soon as you call the commit() method on the branch, dirty documents will be consolidated into a commit.
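One way to picture how an index could aggregate commits is to replay the changes of each commit from the root down to the commit that is checked out. The following is a sketch under assumptions, not the library's data structure: the class IndexSketch, the Commit type, and the convention that a null value marks a deletion are all hypothetical. As in the history examples, a parent CID of 0 marks the root commit. Dirty documents are left out to keep the sketch short.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of an aggregated index; not the MongoMVCC implementation.
class IndexSketch {
    // One commit: its parent CID (0 for the root) and the changes it records,
    // mapping uid -> _id of the new version, or uid -> null for a deletion.
    static final class Commit {
        final long parent;
        final Map<Long, Long> changes;

        Commit(long parent, Map<Long, Long> changes) {
            this.parent = parent;
            this.changes = changes;
        }
    }

    final Map<Long, Commit> commits = new HashMap<>(); // CID -> commit

    // Builds the index (uid -> visible _id) for the commit with the given CID.
    Map<Long, Long> buildIndex(long cid) {
        // Walk from the given commit up to the root; pushing onto the front
        // of the deque leaves the root first in iteration order.
        Deque<Commit> chain = new ArrayDeque<>();
        for (long c = cid; c != 0; c = commits.get(c).parent) {
            chain.push(commits.get(c));
        }
        // Replay the changes oldest first so that newer versions win.
        Map<Long, Long> index = new HashMap<>();
        for (Commit commit : chain) {
            for (Map.Entry<Long, Long> e : commit.changes.entrySet()) {
                if (e.getValue() == null) {
                    index.remove(e.getKey());            // deleted document
                } else {
                    index.put(e.getKey(), e.getValue()); // added or replaced version
                }
            }
        }
        return index;
    }
}
```

Commits that come after the checked-out commit are never visited, so documents added later do not appear in the resulting index.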