Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GOBBLIN-1864] Catch exceptions if specific watermark calculation fails #3728

Merged
merged 2 commits into from
Aug 1, 2023

Conversation

jack-moseley
Copy link
Contributor

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

Description

  • Here are some details about my PR, including screenshots (if applicable):
    If one type of watermark calculation fails (e.g. due to missing tier, or count being 0) it causes all watermarks not to get updated. This PR allows watermarks to get updated individually to avoid blocking other ones if they are working correctly.

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
    Added unit test which reproduces the issue

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

try {
result.put(type, calculateCompleteness(datasetName, beginInMillis, endInMillis, type, countsByTier) > threshold);
} catch (IOException e) {
log.error("Failed to calculate completeness for type " + type, e);
Copy link
Contributor

@wsarecv wsarecv Jul 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we set a value to false if its computation fails? So that the returned map always contains two elements.
Just to avoid null when map.get(), ClassicWatermarkUpdater & TotalCountWatermarkUpdater have similar logic like:

protected void computeAndUpdateInternal(Map<KafkaAuditCountVerifier.CompletenessType, Boolean> results,
ZonedDateTime timestampDT) {
if (!results.get(KafkaAuditCountVerifier.CompletenessType.TotalCountCompleteness)) {
return;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes good point, I missed that it would cause a null failure there. Instead of always returning false, I changed the checks themselves to handle null properly, that seems safer in case the map is incomplete for some reason.
if (!results.getOrDefault(KafkaAuditCountVerifier.CompletenessType.ClassicCompleteness, false))

@codecov-commenter
Copy link

codecov-commenter commented Jul 31, 2023

Codecov Report

Merging #3728 (7fa4fab) into master (8198404) will increase coverage by 3.49%.
The diff coverage is n/a.

@@             Coverage Diff              @@
##             master    #3728      +/-   ##
============================================
+ Coverage     47.45%   50.95%   +3.49%     
+ Complexity     8687     7215    -1472     
============================================
  Files          1731     1349     -382     
  Lines         66736    51126   -15610     
  Branches       7219     5686    -1533     
============================================
- Hits          31671    26051    -5620     
+ Misses        32306    22769    -9537     
+ Partials       2759     2306     -453     

see 383 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Contributor

@ZihanLi58 ZihanLi58 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@ZihanLi58 ZihanLi58 merged commit 951809d into apache:master Aug 1, 2023
6 checks passed
phet pushed a commit to phet/gobblin that referenced this pull request Aug 15, 2023
…ls (apache#3728)

* Catch exceptions if specific watermark calculation fails

* Add null handling for watermark map
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants