Add notifications for nightly build failures #296

Closed
SajidAlamQB opened this issue Aug 9, 2023 · 14 comments · Fixed by #322

@SajidAlamQB
Contributor

SajidAlamQB commented Aug 9, 2023

Description

Nightly builds can fail silently without notifying the team. We should add a way, through either Slack or email integration, to let the team know.

Example failure: https://github.com/kedro-org/kedro-plugins/actions/runs/5803202341
Possible solution: https://github.com/slackapi/slack-github-action

Context

Why is this change important to you? How would you use it? How can it benefit other users?

@astrojuanlu
Member

At the moment there are 10/10 apps installed in the Kedro workspace, so new ones can't be added. Happy to chat about this when I'm back.

@merelcht
Member

Discussed in backlog grooming: set this up to send an email notification to the Kedro Framework team email address. Ideally, set it up so that it only sends an email on failure.

@merelcht merelcht moved this to To Do in Kedro Framework Aug 18, 2023
@DimedS DimedS moved this from To Do to In Progress in Kedro Framework Aug 21, 2023
@DimedS
Contributor

DimedS commented Aug 22, 2023

I see three potential solutions to this problem:

  1. Anyone interested in notifications from the repository can 'watch' it on GitHub. In the GitHub settings, they can opt to receive notifications only for failed workflows (this is the default selection).
  2. Similar to the first option, we can set up a separate technical GitHub account tied to our team's email ([email protected]). Within this account's settings, we can enable notifications from GitHub Actions only.
  3. An alternative approach is to utilize a third-party action, such as dawidd6/action-send-mail, to integrate the notification system directly within the workflow. However, I have reservations about the reliability of this method:
    # Runs only when an earlier step in the job has failed.
    - name: Notify failure
      if: failure()
      uses: dawidd6/action-send-mail@v2
      with:
        server_address: ${{secrets.EMAIL_SERVER}}
        server_port: ${{secrets.EMAIL_PORT}}
        username: ${{secrets.EMAIL_USERNAME}}
        password: ${{secrets.EMAIL_PASSWORD}}
        subject: GitHub Actions build failed
        body: Build failed!
        to: [email protected]
        from: ${{secrets.EMAIL_USERNAME}}

We need to create the following repository secrets:

EMAIL_SERVER: Your SMTP server. E.g., smtp.gmail.com for Gmail.
EMAIL_PORT: Your SMTP port. E.g., 465 for Gmail with SSL.
EMAIL_USERNAME: The email address you're sending from.
EMAIL_PASSWORD: The password or app-specific password for the email.

@SajidAlamQB, what do you think?

@astrojuanlu
Member

Quick comments: About (1) I didn't know this was possible

(https://github.com/settings/notifications)

About (2) possibly we'd use a non-McK email, such as the one we set up for shared social media accounts

@SajidAlamQB
Contributor Author

I like approach (1): just using GitHub's built-in functionality is pretty straightforward, with not much maintenance required. Plus, we don't need to worry about external dependencies or changes in third-party tools. It does put more of the responsibility on individuals.

We could make a Confluence page guide on how the team can set this up, and get everyone who wants to receive notifications about build failures to follow it. Additionally, it could be part of our onboarding docs for new engineers on framework.

What do others feel? @noklam @ankatiyar

@ankatiyar
Contributor

ankatiyar commented Aug 22, 2023

I have an idea, but I'm not sure how feasible it is. I can lend a hand with figuring it out!
Basically, have a GitHub Actions-based solution that creates an issue on the kedro-plugins repo when there is a nightly build failure? Maybe even configure it to check for repeated failures, say 2-3 days in a row? Do you think this is possible @SajidAlamQB @DimedS? We could then bring it into the following sprint. This would cut out the middleman: nobody would need to monitor the email/notifications and create an issue by hand.
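
The "failures 2-3 days in a row" check could be sketched roughly as follows. This is a minimal illustration, not something proposed in the thread: it assumes the conclusions of recent nightly runs are already available as a newest-first list of strings such as "success" or "failure", and the function names and the default threshold of 3 are hypothetical.

```python
from typing import Sequence


def consecutive_failures(conclusions: Sequence[str]) -> int:
    """Count how many of the most recent runs failed in a row.

    `conclusions` is ordered newest first,
    e.g. ["failure", "failure", "success"].
    """
    streak = 0
    for conclusion in conclusions:
        # Stop at the first run that did not fail.
        if conclusion != "failure":
            break
        streak += 1
    return streak


def should_open_issue(conclusions: Sequence[str], threshold: int = 3) -> bool:
    """Return True once the nightly build has failed `threshold` times in a row.

    The threshold is an assumed, configurable value.
    """
    return consecutive_failures(conclusions) >= threshold
```

A threshold above 1 would also keep a single flaky run from opening an issue.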

@ankatiyar
Contributor

Something like this -

  • Move the schedule: out of individual kedro-datasets/airflow/docker/telemetry.yml workflows.
  • Separate nightly-build.yml to call all the tests on the plugins
  • job 1: run the test
  • job 2: create an issue if there's a failure on job1

@ankatiyar
Contributor

Thinking more about it - perhaps configuring nightly build to create an issue every time it fails might not be the best idea. It'll create a duplicate issue every day for failures we haven't fixed yet.

  • How about adding a separate weekly build that creates an issue if there are test failures?
  • If it's nightly, configuring the job to create an issue only if one doesn't already exist, OR to add a comment to the existing issue.

@SajidAlamQB
Contributor Author

Also, what if some failures are intermittent and just need to be re-run? Would that not result in unnecessary issues being created?

@ankatiyar
Contributor

@SajidAlamQB If it's just a weekly build, we can run on Friday and close unnecessary issues on Monday during backlog grooming or sprint planning?

@merelcht
Member

I'm not quite sure option 1 here solves the problem. I think that just sends you emails for your own failing builds, not for the scheduled nightly builds. I like @ankatiyar's idea of creating an issue or something else very visible that makes sure we don't miss build failures.

@astrojuanlu
Member

Maybe it can be one single issue that gets reopened every time there's a failure. That way we avoid creating more and more issues that we then need to keep closing. And if the issue is already open and there's a new failure, nothing happens (so it's idempotent). Obviously, the first time it would need to be created, and from then on the job would look up the issue by title, ID, label, or some other mechanism. How does that sound?
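
The idempotent behaviour described above can be sketched as a small decision function. This is an illustration under assumptions, not the implementation that was eventually merged: the `Issue` type, the `decide_action` name, and the fixed title used for lookup are all hypothetical, and fetching or modifying issues through the GitHub API is deliberately left out.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical fixed title used to find the single tracking issue.
TRACKING_TITLE = "Nightly build failure"


@dataclass(frozen=True)
class Issue:
    """Minimal stand-in for an issue record (assumed shape)."""
    number: int
    title: str
    state: str  # "open" or "closed"


def find_tracking_issue(issues: Iterable[Issue]) -> Optional[Issue]:
    """Return the first issue whose title matches the tracking title."""
    for issue in issues:
        if issue.title == TRACKING_TITLE:
            return issue
    return None


def decide_action(build_failed: bool, issues: Iterable[Issue]) -> str:
    """Return "create", "reopen", or "none" for the latest nightly run."""
    if not build_failed:
        return "none"
    tracking = find_tracking_issue(issues)
    if tracking is None:
        return "create"  # first failure ever: the issue must be created
    if tracking.state == "closed":
        return "reopen"  # new failure after a fix: reopen the same issue
    return "none"  # already open: repeated failures change nothing
```

Because an already open issue leads to no action, running the job several nights in a row against the same unfixed failure leaves the tracker unchanged.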

@ankatiyar
Contributor

Sounds good @astrojuanlu. Can we park this until I finish refactoring the kedro-plugins CI?

@merelcht merelcht assigned ankatiyar and unassigned DimedS and SajidAlamQB Sep 4, 2023
@merelcht merelcht moved this from In Progress to In Review in Kedro Framework Sep 4, 2023
@github-project-automation github-project-automation bot moved this from In Review to Done in Kedro Framework Sep 4, 2023