[GOBBLIN-1859] Multi-active Unit Test for Multi-Participant state #3721

umustafi · 2023-07-21T22:08:54Z

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

My PR addresses the following Gobblin JIRA issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
- https://issues.apache.org/jira/browse/GOBBLIN-1859

Description

Here are some details about my PR, including screenshots (if applicable):
In multi-active mode, multiple participants will be reading/writing to MySQL table. We want to make sure each participant acts on the table in a manner that takes into account that the state may have changed while the MultiActiveLeaseArbiter is processing the result of a READ. This PR adds unit tests that validate SQL Insertion statements are conditional upon the state of the particular row corresponding to the flow action event being unchanged from the read made by this participant.

Tests

My PR adds the following unit tests OR does not need testing for this extremely good reason:
The PR adds 3 unit tests to test each of the cases of acquiring a lease to ensure the following insertions do not complete if the state has changed from our initial read.
insert when there is no matching primary key in the table
insert if a lease has expired
insert if a lease has completed

Commits

My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not "adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"

phet · 2023-07-21T22:21:16Z

gobblin-runtime/src/main/java/org/apache/gobblin/runtime/api/MysqlMultiActiveLeaseArbiter.java

+      + "SELECT ?, ?, ?, ?, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP WHERE NOT EXISTS (SELECT * FROM %s "
+      + WHERE_CLAUSE_TO_MATCH_KEY + ")";


rather than the NOT EXISTS can we simply rely on the uniqueness constraint for the composite PK (fg, fn, fexid, flow_action)?

I am not sure if our DB settings for MySQL will throw an error if insert fails or UPSERT by default or ignore the insert if PK already exists so i wanted to use a statement that I can predict the behavior from

If you use the IGNORE modifier, ignorable errors that occur while executing the [INSERT](https://dev.mysql.com/doc/refman/8.0/en/insert.html) statement are ignored. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted. With IGNORE, the row is discarded and no error occurs. Ignored errors generate warnings instead. https://dev.mysql.com/doc/refman/8.0/en/insert.html suggests that an error will be thrown if PK already exists so need to use IGNORE or WHERE NOT EXISTS

IMO "where not exists" approaches obfuscation, especially in the midst of both the prepare statement ? placeholders and the %s formats for the table name. given the baseline cost of a remote DB call, an exception is not terribly expensive. thus, I'd recommend INSERT w/o IGNORE and then to be prepared to catch it

Updated to catch the error

gobblin-runtime/src/main/java/org/apache/gobblin/runtime/api/MysqlMultiActiveLeaseArbiter.java

...n-runtime/src/test/java/org/apache/gobblin/runtime/api/MysqlMultiActiveLeaseArbiterTest.java

phet · 2023-07-24T21:00:02Z

...n-runtime/src/test/java/org/apache/gobblin/runtime/api/MysqlMultiActiveLeaseArbiterTest.java

+  public void testConditionallyAcquireLeaseIfNewRow() throws IOException {
+    // Inserting the first time should update 1 row
+    int numRowsUpdated = this.mysqlMultiActiveLeaseArbiter.withPreparedStatement(formattedAcquireLeaseNewRowStatement,
+        insertStatement -> {
+          completeInsertPreparedStatement(insertStatement, resumeDagAction);
+          return insertStatement.executeUpdate();
+        }, true);
+    Assert.assertEquals(numRowsUpdated, 1);


please explain what's being tested here ITO the class under test. I would expect the test to call one encapsulated method of MysqlMALeaseArbiter's API, but here is seems like you're combining bits and pieces of it (like withPreparedStatement), which arguably should be private/protected.

generally the public interface is up for testing... maybe with verification for a SMALL number of @VisibleForTesting methods.

perhaps I'm just not understanding the connection between this test and the overall public API...

Here I want to test cases of trying to acquire a lease after processing the result of the initial GET query where we find that either no such lease exists in the store, lease has expired, or lease was completed for a distinct event, where another participant has altered the table state before the current participant tries the insert. The public API tested here is tryAcquireLease called by multiple participants at once but I'm isolating the behavior I want to test by verifying the INSERTION SQL statements execute only if table state has not changed from the initial read. Does that make sense?

overall, the tryAcquireLease impl deserves more abstraction. it's presently 110 LoC... with SO MANY details spread throughout.

naturally, one could test either the SQL or the logic flow, but from what you describe, it sounds like your aim is to test the logic. for instance, if you created abstractions to distill the first 40 lines of tryAcquireLease to this:

// Check table for an existing entry for this flow action and event time Optional<GetEventInfoResult> getResult = getExistingEventInfo(flowAction); try { if (!getResult.isPresent()) { log.debug("tryAcquireLease for [{}, eventTimestamp: {}] - CASE 1: no existing row for this flow action, then go" + " ahead and insert", flowAction, eventTimeMillis); boolean didAcquireLease = initializeToAcquireLease(flowAction); return evaluateStatusAfterLeaseAttempt(didAcquireLease, flowAction, Optional.absent()); } // [continues...]

couldn't you merely test evaluateStatusAfterLeaseAttempt to see what it does (perhaps via mocking) when the first param is false?

meanwhile, when it comes to testing SQL itself, use the systematic use of withPreparedStatement to your advantage. i.e. exercise various methods and intercept withPreparedStatement to validate that the expected SQL is passed as the first arg.

Updated the PR to contain refactoring discussed above where I've created abstractions to simplify the tryAcquireLease function and make it easier to test each of the SQL statements in isolation.

phet

nice work--the code and tests read much more clearly now!

codecov-commenter · 2023-08-10T20:28:53Z

Codecov Report

Merging #3721 (614d27d) into master (af48b31) will increase coverage by 0.02%.
The diff coverage is 85.36%.

@@             Coverage Diff              @@
##             master    #3721      +/-   ##
============================================
+ Coverage     47.08%   47.11%   +0.02%     
- Complexity    10862    10865       +3     
============================================
  Files          2146     2146              
  Lines         84781    84789       +8     
  Branches       9408     9409       +1     
============================================
+ Hits          39922    39949      +27     
+ Misses        41238    41216      -22     
- Partials       3621     3624       +3

Files Changed	Coverage Δ
...blin/runtime/api/MysqlMultiActiveLeaseArbiter.java	`78.24% <85.36%> (-0.61%)`	⬇️

... and 6 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

ZihanLi58

+1

…ache#3721) * new unit tests passing * clean up * change upsert to insert * catch sql exception & update test * Refactor to create better api for testing --------- Co-authored-by: Urmi Mustafi <[email protected]>

Urmi Mustafi added 2 commits July 21, 2023 11:54

new unit tests passing

e335a80

clean up

7fcacd2

phet reviewed Jul 21, 2023

View reviewed changes

Urmi Mustafi added 2 commits July 24, 2023 10:24

change upsert to insert

1ecaafa

catch sql exception & update test

6244592

phet reviewed Jul 24, 2023

View reviewed changes

Refactor to create better api for testing

f6ad3d7

umustafi force-pushed the umustafi/multiParticipantUnitTest branch from 54f4a03 to f6ad3d7 Compare August 4, 2023 18:46

phet approved these changes Aug 8, 2023

View reviewed changes

Merge branch 'master' into umustafi/multiParticipantUnitTest

614d27d

ZihanLi58 approved these changes Aug 10, 2023

View reviewed changes

ZihanLi58 merged commit 01c433c into apache:master Aug 10, 2023
6 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[GOBBLIN-1859] Multi-active Unit Test for Multi-Participant state #3721

[GOBBLIN-1859] Multi-active Unit Test for Multi-Participant state #3721

umustafi commented Jul 21, 2023

phet Jul 21, 2023

umustafi Jul 21, 2023

umustafi Jul 21, 2023

phet Jul 24, 2023

umustafi Jul 24, 2023

phet Jul 24, 2023

umustafi Jul 25, 2023

phet Jul 26, 2023

umustafi Aug 4, 2023

phet left a comment

codecov-commenter commented Aug 10, 2023 •

edited

Loading

ZihanLi58 left a comment

		+ "SELECT ?, ?, ?, ?, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP WHERE NOT EXISTS (SELECT * FROM %s "
		+ WHERE_CLAUSE_TO_MATCH_KEY + ")";

[GOBBLIN-1859] Multi-active Unit Test for Multi-Participant state #3721

[GOBBLIN-1859] Multi-active Unit Test for Multi-Participant state #3721

Conversation

umustafi commented Jul 21, 2023

JIRA

Description

Tests

Commits

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

phet left a comment

Choose a reason for hiding this comment

codecov-commenter commented Aug 10, 2023 • edited Loading

Codecov Report

ZihanLi58 left a comment

Choose a reason for hiding this comment

codecov-commenter commented Aug 10, 2023 •

edited

Loading