Transactions #689
base: master
Conversation
```diff
@@ -1106,4 +1118,25 @@ Doc.prototype._clearInflightOp = function(err) {
   if (err && !called) return this.emit('error', err);
 };

Doc.prototype._setTransaction = function(op, transaction) {
```
I'm not sure I really like how much the `Doc` "knows" about the transaction concept, but maybe this is a necessary evil given that we have to pass a transaction as part of the op submit?

I'd toyed with having some sort of `class TransactionConnection extends Connection {}` architecture, where you get a "special" connection that has its own `Doc`s, and everything in there is assumed to be part of a transaction. It seemed overly complicated, though, and passing a `transaction` as part of the submission seemed a bit more flexible. It has its own downsides, though: a `Doc` that is part of a transaction cannot be part of another transaction (or non-transaction); the workaround is to create a second `Connection`, which would give you similar behaviour to the extended-class approach (which is also why it feels a bit unnecessary).
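For concreteness, a minimal sketch of the two shapes; the second is the rejected alternative and purely hypothetical (`TransactionConnection` doesn't exist in this PR), and `op` stands in for any valid op:

```js
// Chosen shape: the transaction travels with each submit
const transaction = connection.startTransaction();
await doc.submitOp(op, {transaction});

// Rejected alternative (hypothetical): a Connection subclass whose Docs
// are all implicitly transactional
// const txnConnection = new TransactionConnection(connection);
// const txnDoc = txnConnection.get('dogs', 'fido');
// await txnDoc.submitOp(op); // implicitly part of the transaction
```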
```js
var docs = JSON.stringify(this.docs);
var ops = JSON.stringify(this.ops);
```
I don't know if this is technically breaking: we'd clobber any references to `db.docs` or `db.ops` by doing this, but I think I consider these objects private implementation details of this in-memory DB?
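A tiny illustration of the clobbering concern (`ref` is a hypothetical external reference into the in-memory DB's state):

```js
var db = new MemoryDB();
var ref = db.docs;                              // someone holds a reference
db.docs = JSON.parse(JSON.stringify(db.docs));  // restore by reassignment
console.log(ref === db.docs);                   // false: the held reference is now stale
```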
```js
var snapshotOptions = {};
snapshotOptions.agentCustom = request.agent.custom;
backend.db.getSnapshot(collection, id, fields, snapshotOptions, function(err, snapshot) {
  if (err) return callback(err);

  var transactionOps = [];
  if (request.transaction) {
    // Get other ops in the same transaction that have not yet been committed to the DB
```
Admittedly, things could be simplified a bit if we didn't allow multiple ops from a single `Doc` in a given transaction: there's an argument to be made that this should be what `type.compose()` is for.

However, my whole reason for putting together this change is that I want to be able to delete-then-create as a single transaction, which is basically the only combination of ops that can't be composed:

```js
doc.preventCompose = false; // Doesn't do anything for delete or create
await doc.del();
// If the server dies here, I don't want the doc to be left in its deleted state
await doc.create();
```
```diff
@@ -225,30 +255,40 @@ SubmitRequest.prototype.commit = function(callback) {
     request.snapshot,
     request.options,
     function(err, succeeded) {
       // TODO: Should this callback be called before committing a transaction?
```
This is well worth discussing. At the moment, the API is written like this:

```js
const transaction = connection.startTransaction();
await doc1.submitOp([...], {transaction});
await doc2.submitOp([...], {transaction});
// must actively await acknowledgement from the server before committing
// a transaction
await transaction.commit();
```

In other words, this has undefined behaviour:

```diff
 const transaction = connection.startTransaction();
-await doc1.submitOp([...], {transaction});
+doc1.submitOp([...], {transaction});
-await doc2.submitOp([...], {transaction});
+doc2.submitOp([...], {transaction});
 await transaction.commit();
```

The API for this is largely ripped off MongoDB. According to their source code:

> IMPORTANT: Running operations in parallel is not supported during a transaction. The use of `Promise.all`, `Promise.allSettled`, `Promise.race`, etc to parallelize operations inside a transaction is undefined behaviour.
The benefit to us (as ShareDB) is that I think forcing consumers to wait for an acknowledgement feels like a clearer, better-defined set of behaviours.

The major downside is that the `Doc` starts to drift away from canonical, both locally and even on the server, where we need to track other ops made as part of the same commit and do extra work to transform the snapshot to the transaction version. However, this isn't completely crazy: already when calling `submitOp`, the local `Doc` optimistically updates its state as if the remote ops have been accepted. The main difference here is that we also optimistically bump the version (which `submitOp` doesn't do). Is this semantically valid? Also, it changes the meaning of the "ack": instead of meaning "this op has been committed to the database", it now means "this op has been committed to the transaction". If users need to know when the op has been "truly" committed, they'll need to wait for the callback from `transaction.commit()`.
When I was thinking about this pre-holidays, I was thinking of:

- `startTransaction` creates a random session id locally, and causes subsequent `submitOp` calls with the transaction to queue locally
- `transaction.commit()` causes all the locally queued ops to be sent to the server. Then the driver does whatever it needs to do with the batch of ops, based on the database.
- Any errors are reported back in `transaction.commit()`

Per discussion today, we do think the approach of local op queueing does make the implementation, middleware, and testing easier, so we're going with that.
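A standalone sketch of that queueing shape, with hypothetical names (`sendBatch` stands in for however the real implementation ships the batch to the server; this is not the PR's actual code):

```js
class Transaction {
  constructor(sendBatch) {
    // Random session id created locally, before anything hits the server
    this.id = Math.random().toString(36).slice(2);
    this._sendBatch = sendBatch; // function(id, ops, callback)
    this._queuedOps = [];
  }

  submitOp(collection, id, op) {
    // Nothing goes over the wire yet; the op just queues locally
    this._queuedOps.push({collection: collection, id: id, op: op});
  }

  commit(callback) {
    // The whole batch is sent at once; the server-side driver applies it
    // however the database requires, and any error surfaces here
    this._sendBatch(this.id, this._queuedOps, callback);
  }
}
```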
```js
PendingTransaction.prototype.registerSubmitRequest = function(request) {
  if (!this._transaction._requests.length) {
    this._transaction.backend.db.startTransaction(this._transaction.id, function(error) {
```
Some of this complication may be removed if we ignore the possibility of MongoDB deadlock.
```js
// TODO: Discuss if this is an acceptable API? Doc will always emit error on
// a failed transaction, since the ops may have been successfully acked for this Doc, and
// we force a hard rollback with no callback, which causes an 'error' event
doc.on('error', function() {});
```
Worth discussing, and related to the discussion around whether a `submitOp` callback should be fired when the op is committed, or when it's accepted as part of the transaction.

In the current design (callback fires when accepted into the transaction but not yet committed), it means that if the transaction fails for some reason away from `doc.submit()` and `transaction.commit()` (eg timeout; DB dies; etc.), there's no "sensible" place to emit the error that's generated as we hard rollback the `Doc` to recover.
```js
  ], done);
});

it('remote submits after but commits first', function(done) {
```
This is the test case that managed to force my MongoDB driver update into deadlock. I added a lot of complexity into the `sharedb-mongo` update to get this test case to pass, but I wonder if we need to.

The main issue is that we do this:

1. Wait for `doc` to submit an op to a transaction
2. `remoteDoc` tries to update the same document outside of this transaction
3. We try to wait for `remoteDoc.submitOp` to resolve
4. However, it seems that MongoDB locks the snapshot document until `doc`'s transaction is committed, so `remoteDoc`'s submission stalls, introducing deadlock
We can trivially rewrite this test case to remove the deadlock by not having such a strict series of awaiting: if we run the two submissions in parallel with no awareness of one another (which is how all transactions will run in practice in the real world), then no deadlock happens, because `doc` will commit its transaction, and `remoteDoc` will finally return with a failed write, retry, and eventually succeed.

The reason I didn't write it like this originally is that I was trying to force us into an edge case where both transactions attempt to touch the database, and running these operations in parallel removes a lot of the control I have over the logical paths we take through ShareDB's recovery/retry machinery. However, is it worth enforcing this as part of the spec just for this? I'm not sure it is...?
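For illustration, a hypothetical rewrite of the test in that parallel shape (document contents and op paths are made up; `async` is the helper already used by these tests):

```js
it('remote submits after but commits first (parallel)', function(done) {
  var transaction = connection.startTransaction();
  async.parallel([
    function(next) {
      doc.submitOp([{p: ['tricks', 0], li: 'fetch'}], {transaction: transaction}, function(error) {
        if (error) return next(error);
        transaction.commit(next);
      });
    },
    function(next) {
      // Outside the transaction: this write may stall or fail while the
      // snapshot is locked, then ShareDB's retry machinery lands it once
      // doc's transaction commits
      remoteDoc.submitOp([{p: ['tricks', 0], li: 'sit'}], next);
    }
  ], done);
});
```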
```js
this.docs = JSON.parse(docs);
this.ops = JSON.parse(ops);
```
With potentially large sets of docs/ops in tests, using a structured clone would be better for performance/memory. You mentioned rfdc is working well for you in other situations, and I've looked at it in the past and it seems good.
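For reference, a sketch of what that swap might look like (rfdc's documented usage; the surrounding lines mirror this PR's stringify/parse pair):

```js
var clone = require('rfdc')();

// Where the PR currently does: var docs = JSON.stringify(this.docs);
var docs = clone(this.docs);
var ops = clone(this.ops);

// ...and later, instead of this.docs = JSON.parse(docs):
this.docs = docs;
this.ops = ops;
```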