-
Notifications
You must be signed in to change notification settings - Fork 751
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GOBBLIN-1859] Multi-active Unit Test for Multi-Participant state #3721
[GOBBLIN-1859] Multi-active Unit Test for Multi-Participant state #3721
Conversation
+ "SELECT ?, ?, ?, ?, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP WHERE NOT EXISTS (SELECT * FROM %s " | ||
+ WHERE_CLAUSE_TO_MATCH_KEY + ")"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than the NOT EXISTS
can we simply rely on the uniqueness constraint for the composite PK (fg, fn, fexid, flow_action)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if our DB settings for MySQL will throw an error if insert fails or UPSERT by default or ignore the insert if PK already exists so i wanted to use a statement that I can predict the behavior from
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you use the IGNORE modifier, ignorable errors that occur while executing the [INSERT](https://dev.mysql.com/doc/refman/8.0/en/insert.html) statement are ignored. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted. With IGNORE, the row is discarded and no error occurs. Ignored errors generate warnings instead.
https://dev.mysql.com/doc/refman/8.0/en/insert.html suggests that an error will be thrown if PK already exists so need to use IGNORE or WHERE NOT EXISTS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO "where not exists" approaches obfuscation, especially in the midst of both the prepare statement ?
placeholders and the %s
formats for the table name. given the baseline cost of a remote DB call, an exception is not terribly expensive. thus, I'd recommend INSERT
w/o IGNORE
and then to be prepared to catch it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to catch the error
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/api/MysqlMultiActiveLeaseArbiter.java
Outdated
Show resolved
Hide resolved
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/api/MysqlMultiActiveLeaseArbiter.java
Outdated
Show resolved
Hide resolved
...n-runtime/src/test/java/org/apache/gobblin/runtime/api/MysqlMultiActiveLeaseArbiterTest.java
Outdated
Show resolved
Hide resolved
public void testConditionallyAcquireLeaseIfNewRow() throws IOException { | ||
// Inserting the first time should update 1 row | ||
int numRowsUpdated = this.mysqlMultiActiveLeaseArbiter.withPreparedStatement(formattedAcquireLeaseNewRowStatement, | ||
insertStatement -> { | ||
completeInsertPreparedStatement(insertStatement, resumeDagAction); | ||
return insertStatement.executeUpdate(); | ||
}, true); | ||
Assert.assertEquals(numRowsUpdated, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please explain what's being tested here ITO the class under test. I would expect the test to call one encapsulated method of MysqlMALeaseArbiter
's API, but here is seems like you're combining bits and pieces of it (like withPreparedStatement
), which arguably should be private
/protected
.
generally the public interface is up for testing... maybe with verification for a SMALL number of @VisibleForTesting
methods.
perhaps I'm just not understanding the connection between this test and the overall public API...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I want to test cases of trying to acquire a lease after processing the result of the initial GET query where we find that either no such lease exists in the store, lease has expired, or lease was completed for a distinct event, where another participant has altered the table state before the current participant tries the insert. The public API tested here is tryAcquireLease
called by multiple participants at once but I'm isolating the behavior I want to test by verifying the INSERTION SQL statements execute only if table state has not changed from the initial read. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall, the tryAcquireLease
impl deserves more abstraction. it's presently 110 LoC... with SO MANY details spread throughout.
naturally, one could test either the SQL or the logic flow, but from what you describe, it sounds like your aim is to test the logic. for instance, if you created abstractions to distill the first 40 lines of tryAcquireLease
to this:
// Check table for an existing entry for this flow action and event time
Optional<GetEventInfoResult> getResult = getExistingEventInfo(flowAction);
try {
if (!getResult.isPresent()) {
log.debug("tryAcquireLease for [{}, eventTimestamp: {}] - CASE 1: no existing row for this flow action, then go"
+ " ahead and insert", flowAction, eventTimeMillis);
boolean didAcquireLease = initializeToAcquireLease(flowAction);
return evaluateStatusAfterLeaseAttempt(didAcquireLease, flowAction, Optional.absent());
}
// [continues...]
couldn't you merely test evaluateStatusAfterLeaseAttempt
to see what it does (perhaps via mocking) when the first param is false
?
meanwhile, when it comes to testing SQL itself, use the systematic use of withPreparedStatement
to your advantage. i.e. exercise various methods and intercept withPreparedStatement
to validate that the expected SQL is passed as the first arg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the PR to contain refactoring discussed above where I've created abstractions to simplify the tryAcquireLease
function and make it easier to test each of the SQL statements in isolation.
54f4a03
to
f6ad3d7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice work--the code and tests read much more clearly now!
Codecov Report
@@ Coverage Diff @@
## master #3721 +/- ##
============================================
+ Coverage 47.08% 47.11% +0.02%
- Complexity 10862 10865 +3
============================================
Files 2146 2146
Lines 84781 84789 +8
Branches 9408 9409 +1
============================================
+ Hits 39922 39949 +27
+ Misses 41238 41216 -22
- Partials 3621 3624 +3
... and 6 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
…ache#3721) * new unit tests passing * clean up * change upsert to insert * catch sql exception & update test * Refactor to create better api for testing --------- Co-authored-by: Urmi Mustafi <[email protected]>
Dear Gobblin maintainers,
Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!
JIRA
Description
In multi-active mode, multiple participants will be reading/writing to MySQL table. We want to make sure each participant acts on the table in a manner that takes into account that the state may have changed while the MultiActiveLeaseArbiter is processing the result of a READ. This PR adds unit tests that validate SQL Insertion statements are conditional upon the state of the particular row corresponding to the flow action event being unchanged from the read made by this participant.
Tests
The PR adds 3 unit tests to test each of the cases of acquiring a lease to ensure the following insertions do not complete if the state has changed from our initial read.
Commits