
Transactions #689 (Draft)

wants to merge 7 commits into master

Conversation

alecgibson (Collaborator)

No description provided.

@@ -1106,4 +1118,25 @@ Doc.prototype._clearInflightOp = function(err) {
if (err && !called) return this.emit('error', err);
};

Doc.prototype._setTransaction = function(op, transaction) {
alecgibson (Collaborator, Author)

I'm not sure I really like how much the Doc "knows" about the transaction concept, but maybe this is a necessary evil given that we have to pass a transaction as part of the op submit?

I'd toyed with having some sort of class TransactionConnection extends Connection {} architecture, where you get a "special" connection that has its own Docs, and everything in there is assumed to be part of a transaction. It seemed overly complicated, though, and passing a transaction as part of the submission seemed a bit more flexible. It has its own downsides: a Doc that is part of a transaction cannot simultaneously be part of another transaction (or submit outside a transaction); the workaround is to create a second Connection, which would give you similar behaviour to the extended class approach (which is also why it feels a bit unnecessary).
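To make that constraint concrete, here's a minimal sketch of the kind of guard a _setTransaction helper could enforce. Everything here (FakeDoc, the field names) is an illustrative stand-in, not the PR's actual implementation:

```javascript
// Sketch: a Doc may belong to at most one transaction at a time.
class FakeDoc {
  constructor(id) {
    this.id = id;
    this._transaction = null;
  }

  // Roughly what a _setTransaction guard could look like:
  // re-registering the same transaction is fine, but joining
  // a second transaction is rejected.
  _setTransaction(transaction) {
    if (this._transaction && this._transaction !== transaction) {
      throw new Error('Doc ' + this.id + ' is already part of another transaction');
    }
    this._transaction = transaction;
  }
}

const doc = new FakeDoc('doc1');
const tx1 = {id: 'tx1'};
const tx2 = {id: 'tx2'};

doc._setTransaction(tx1);
doc._setTransaction(tx1); // same transaction again: fine

let error;
try {
  doc._setTransaction(tx2); // a second transaction: rejected
} catch (e) {
  error = e;
}
```

The workaround mentioned above maps onto this sketch as: create a second Connection, fetch a second Doc instance for the same id, and attach the other transaction to that instance instead.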

Comment on lines +173 to +174
var docs = JSON.stringify(this.docs);
var ops = JSON.stringify(this.ops);
alecgibson (Collaborator, Author)

I don't know if this is technically breaking: we'd clobber any references to db.docs or db.ops by doing this, but I think I consider these objects private implementation details of this in-memory DB?

var snapshotOptions = {};
snapshotOptions.agentCustom = request.agent.custom;
backend.db.getSnapshot(collection, id, fields, snapshotOptions, function(err, snapshot) {
if (err) return callback(err);

var transactionOps = [];
if (request.transaction) {
// Get other ops in the same transaction that have not yet been committed to the DB
alecgibson (Collaborator, Author)

Admittedly things could be simplified a bit if we didn't allow multiple ops from a single Doc in a given transaction: there's an argument to be made that this should be what type.compose() is for.

However, my whole reason for putting together this change is because I want to be able to delete-then-create as a single transaction, which is basically the only combination of ops that can't be composed:

doc.preventCompose = false; // Doesn't do anything for delete or create

await doc.del();
// if the server dies here, I don't want the doc to be left in its deleted state
await doc.create();
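To illustrate the all-or-nothing behaviour that delete-then-create needs, here's a toy in-memory model. Nothing here is ShareDB's real API; it only shows the intent: queued ops are staged against a copy, so a failure mid-way never leaves the doc stuck in its deleted state.

```javascript
// Toy model of delete-then-create as one atomic transaction.
class ToyTransaction {
  constructor(store) {
    this.store = store; // a Map of id -> snapshot
    this.ops = [];      // queued mutations, applied only on commit
  }
  del(id) {
    this.ops.push(function(store) { store.delete(id); });
  }
  create(id, data) {
    this.ops.push(function(store) { store.set(id, data); });
  }
  commit() {
    // Apply to a staged copy first; only a fully successful
    // batch is copied back into the live store.
    const staged = new Map(this.store);
    for (const op of this.ops) op(staged);
    for (const [id, data] of staged) this.store.set(id, data);
    for (const id of [...this.store.keys()]) {
      if (!staged.has(id)) this.store.delete(id);
    }
  }
}

const store = new Map([['doc1', {v: 1}]]);
const tx = new ToyTransaction(store);
tx.del('doc1');
tx.create('doc1', {v: 2});
// Before commit, the live store is untouched:
const beforeCommit = store.get('doc1').v; // still 1
tx.commit();
const afterCommit = store.get('doc1').v;  // now 2
```

A crash between del and create in this model leaves the live store at its original state, which is exactly the property the composed-ops workaround can't provide for delete-then-create.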

@@ -225,30 +255,40 @@ SubmitRequest.prototype.commit = function(callback) {
request.snapshot,
request.options,
function(err, succeeded) {
// TODO: Should this callback be called before committing a transaction?
alecgibson (Collaborator, Author)

This is well worth discussing. At the moment, the API is written like this:

const transaction = connection.startTransaction();
await doc1.submitOp([...], {transaction});
await doc2.submitOp([...], {transaction});
// must actively await acknowledgement from the server before committing
// a transaction
await transaction.commit();

In other words, this has undefined behaviour:

const transaction = connection.startTransaction();
-await doc1.submitOp([...], {transaction});
+doc1.submitOp([...], {transaction});
-await doc2.submitOp([...], {transaction});
+doc2.submitOp([...], {transaction});
await transaction.commit();

The API for this is largely ripped off from MongoDB.

According to their source code:

IMPORTANT: Running operations in parallel is not supported during a transaction. The use of Promise.all, Promise.allSettled, Promise.race, etc to parallelize operations inside a transaction is undefined behaviour.

The benefit to us (as ShareDB) is that I think forcing consumers to wait for an acknowledgement feels like a clearer, better-defined set of behaviours.

The major downside is that the Doc starts to drift away from canonical, both locally and on the server, where we need to track other ops made as part of the same commit and do extra work to transform the snapshot to the transaction version.

However, this isn't completely crazy: already when calling submitOp, the local Doc optimistically updates its state as if the remote ops have been accepted. The main difference here is that we also optimistically bump the version (which submitOp doesn't do). Is this semantically valid?

It also changes the meaning of the "ack": instead of meaning "this op has been committed to the database", it now means "this op has been committed to the transaction". If users need to know when the op has been "truly" committed, they'll need to wait for the callback from transaction.commit().

Contributor

When I was thinking about this pre-holidays, I was thinking of:

  • startTransaction creates a random session id locally, and causes subsequent submitOp with the transaction to queue locally
  • transaction.commit() causes all the locally queued ops to be sent to the server. Then the driver does whatever it needs to do with the batch of ops, based on the database.
  • Any errors are reported back in transaction.commit()

Per discussion today, we do think the approach of local op queueing does make the implementation, middleware, and testing easier, so we're going with that.
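The queueing approach described above can be sketched roughly like this. The names here are hypothetical; the real implementation will queue ShareDB ops and hand them to the DB driver as a batch, with send standing in for whatever actually ships the batch to the server:

```javascript
// Sketch of client-side op queueing: submitOp with a transaction only
// queues locally; commit() flushes the whole batch at once and is the
// single place errors are reported.
class QueuedTransaction {
  constructor(send) {
    this.id = Math.random().toString(36).slice(2); // random local session id
    this.queue = [];
    this.send = send;
  }
  submitOp(docId, op) {
    this.queue.push({docId: docId, op: op}); // nothing hits the server yet
  }
  commit(callback) {
    // All queued ops go out together; any error comes back here.
    this.send(this.id, this.queue, callback);
  }
}

const sent = [];
const tx = new QueuedTransaction(function(id, batch, callback) {
  sent.push(batch.length);
  callback(null);
});
tx.submitOp('doc1', [{p: ['x'], oi: 1}]);
tx.submitOp('doc2', [{p: ['y'], oi: 2}]);
// Nothing sent before commit:
const sentBeforeCommit = sent.length; // 0
tx.commit(function(error) {
  if (error) throw error;
});
const batchSize = sent[0]; // 2
```

Note how this sidesteps the ack-semantics question above: submitOp never talks to the server, so there is no intermediate "committed to the transaction" acknowledgement to define.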


PendingTransaction.prototype.registerSubmitRequest = function(request) {
if (!this._transaction._requests.length) {
this._transaction.backend.db.startTransaction(this._transaction.id, function(error) {
alecgibson (Collaborator, Author)

Some of this complication may be removed if we ignore the possibility of MongoDB deadlock.

Comment on lines +28 to +31
// TODO: Discuss if this is an acceptable API? Doc will always emit error on
// a failed transaction, since the ops may have been successfully acked for this Doc, and
// we force a hard rollback with no callback, which causes an 'error' event
doc.on('error', function() {});
alecgibson (Collaborator, Author)

Worth discussing, and related to the discussion around whether a submitOp callback should be fired when the op is committed, or when it's accepted as part of the transaction.

In the current design (the callback fires when the op is accepted into the transaction, but not yet committed), if the transaction fails at a point away from any doc.submit() or transaction.commit() call (eg timeout; DB dies; etc.), there's no "sensible" place to emit the error that's generated as we hard-rollback the Doc to recover.

], done);
});

it('remote submits after but commits first', function(done) {
alecgibson (Collaborator, Author)

This is the test case that managed to force my MongoDB driver update into deadlock.

I added a lot of complexity into the sharedb-mongo update to get this test case to pass, but I wonder if we need to.

The main issue is that we do this:

  1. Wait for doc to submit an op to a transaction
  2. remoteDoc tries to update the same document outside of this transaction
  3. We try to wait for remoteDoc.submitOp to resolve
  4. However, it seems that MongoDB locks the snapshot document until doc's transaction is committed, so remoteDoc's submission stalls, introducing deadlock

We can trivially rewrite this test case to remove the deadlock by relaxing this strict series of awaits: if we run the two transactions in parallel with no awareness of one another (which is how all transactions will run in practice in the real world), then no deadlock happens, because doc will commit its transaction, and remoteDoc will finally return with a failed write, retry, and eventually succeed.

The reason I didn't write it like this originally is that I was trying to force us into an edge case where both transactions attempt to touch the database, and running these operations in parallel removes a lot of the control I have over logical paths we take through ShareDB's recovery/retry machinery. However, is it worth enforcing this as part of the spec just for this? I'm not sure it is...?
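For reference, the fail-and-retry shape described above can be modelled without MongoDB at all. This toy sketch (just a versioned in-memory snapshot, nothing from ShareDB) shows why a non-blocking writer cannot deadlock: a stale write fails immediately, the writer re-reads the version, and retries.

```javascript
// Toy model: writer B retries on version conflict instead of blocking,
// so it cannot deadlock against writer A's committed transaction.
const snapshot = {v: 1, data: {}};

// Optimistic write: succeeds only if the snapshot version matches
// what the writer last read; otherwise reports a failed write.
function tryWrite(expectedVersion, mutate) {
  if (snapshot.v !== expectedVersion) return false;
  mutate(snapshot.data);
  snapshot.v++;
  return true;
}

// Writer A (the "transaction") commits first:
tryWrite(1, function(data) { data.a = true; });

// Writer B also read at version 1, fails, re-reads, and retries:
let attempts = 0;
let ok = false;
let readVersion = 1;
while (!ok) {
  attempts++;
  ok = tryWrite(readVersion, function(data) { data.b = true; });
  if (!ok) readVersion = snapshot.v; // re-fetch the snapshot and retry
}
```

The deadlocking version of the test, by contrast, corresponds to writer B holding its position in a strict await chain while the database blocks its write until A commits, so neither side can make progress.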

Comment on lines +180 to +181
this.docs = JSON.parse(docs);
this.ops = JSON.parse(ops);
Contributor

With potentially large sets of docs/ops in tests, using a structured clone would be better for performance/memory. You mentioned rfdc is working well for you in other situations; I've looked at it in the past and it seems good.
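A minimal sketch of that swap, using Node's built-in structuredClone here as a stand-in (rfdc has the same call-a-clone-function shape: const clone = require('rfdc')(), then clone(docs)):

```javascript
// Deep-clone docs/ops without the JSON.stringify/JSON.parse round trip.
// structuredClone is a global in Node 17+; rfdc would be a drop-in
// replacement with the same usage.
const docs = {collection: {doc1: {v: 1, data: {nested: [1, 2, 3]}}}};

const cloned = structuredClone(docs);

// The clone is deep: mutating it leaves the original untouched.
cloned.collection.doc1.data.nested.push(4);
const originalLength = docs.collection.doc1.data.nested.length; // 3
const clonedLength = cloned.collection.doc1.data.nested.length; // 4
```

Unlike the JSON round trip, this also avoids double-serializing large nested structures, and rfdc additionally handles values JSON drops (undefined, Dates as Dates, etc.).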
