-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add common workflow for taps and targets #33
base: main
Are you sure you want to change the base?
Conversation
@kgpayne and @WillDaSilva - This is a quick proof of concept of where I'd like to go with common (aka "required" workflows) across MeltanoLabs taps and targets. In a single place, I'd like to have workflows which manage the basic functions of PR validation, release drafting, and release publishing. Note: per each workflow and repo, the behavior is 'opt in'. I've given you both Admin access so you should be able to maintain the settings via this screen. While I understand this sacrifices control, it reduces overall surface area to maintain by at least one or two orders of magnitude. With specific expectations and assumptions established ahead of time, we should be able to iterate on improvements from a single location, and benefit all "opted-in" repos simultaneously. The savings of applying and maintaining changes across 20-40 repos' workflows independently is well worth the tradeoff (pending unseen blockers/issues), and will keep our practices uniform and streamlined. Knowing at the same time that there may be sharp edges and/or learnings to have along the way, I'd like to start this with 4-5 repos that each have similar 'status quo'; for instance, we could start with those which are already configured for dynamic versioning. Wdyt? cc @pnadolny13 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Knowing at the same time that there may be sharp edges and/or learnings to have along the way, I'd like to start this with 4-5 repos that each have similar 'status quo'; for instance, we could start with those which are already configured for dynamic versioning. Wdyt?
Makes sense, and I'm happy to see progress being made on this front.
I considered placing these workflows in https://github.com/MeltanoLabs/.github, but it looks like that would make them active by default, rather than opt-in.
# | ||
# Assumptions: | ||
# | ||
# 1. The repo has a tox.ini file, which defines a 'lint' command. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really want to be promoting Tox? It's a rather painful tool in my experience. Most of the time when I run it for the first time in a new repository it fails in some obscure way because my Python installation doesn't match that of the one who set up Tox for the repository.
That's not really Tox's fault, it's more of an indictment against Python packaging and version management as a whole, but it's still a poor experience for contributors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WillDaSilva Is there an alternative you prefer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For linting? I've been finding pre-commit
pretty good for that job, but we don't need a workflow for that since we could use https://pre-commit.ci instead.
In either case, you need to ensure you're running on the right Python version, whatever that may be for a given project. Here we assume that's Python 3.10.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I'm 100% in favor of replacing Tox with pre-commit for all lint checks & fixes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love not having a lint job defined in CI, but that's a separate matter.
Tox is the only thing we had that could run all 4 or 5 lint tools from a single comment, the same way on local as in the CI cloud service.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's a way to manually invoke the equivalent of what the cloud runs with pre-commit.ci, yes?
I'm a bit behind on that invocation, and might need to do a deeper dive / refresher.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, I think I'm in favor of committing this lint workflow since it pairs with our cookiecutter. I've already named the file in a way that it is specific to this invocation pattern.
We can just not enable it for repos where it isn't useful, but in the meanwhile, we still have the option to do so for repos where the tox-based lint rules exist and the pre-commit.ci set does not (yet).
What do you think of that, at least for short-term?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's a way to manually invoke the equivalent of what the cloud runs with pre-commit.ci, yes?
pre-commit run --all-files
n.b. the default pre-commit
/ pre-commit run
command will only run on the staged changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can just not enable it for repos where it isn't useful, but in the meanwhile, we still have the option to do so for repos where the tox-based lint rules exist and the pre-commit.ci set does not (yet).
What do you think of that, at least for short-term?
Make sense 👍
1. Automated CI testing on currently-supported Python versions. | ||
1. Automated verification of Semantic PR logic. | ||
1. Release management using dynamic versioning. | ||
1. Automated Release Notes drafting using Semantic PRs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Automated Release Notes drafting using Semantic PRs
Do you intend to add a workflow for this before merging this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is how we do it currently for Meltano and the SDK: https://github.com/meltano/sdk/blob/1ee4c4ab32bdb3b7c7e88711df2e455a2826c6ed/.github/workflows/version_bump.yml#L52-L74
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WillDaSilva - If it is ready to go, we can certainly add it. I wasn't planning on blocking for it though, since we can add/iterate afterwards too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aaronsteers I just don't think it should be documented if it isn't present. I agree that we can add it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second review, I think as the workflow is defined today, it must be triggered manually.
We probably would want to make this idempotent (if not already so) and likely trigger on all pushes to main
, so that each main
update automatically updates/creates the release notes draft.
As such, probably this should wait for a subsequent PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
name: Python Pytest Check (MeltanoLabs) | ||
|
||
on: [push] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on: [push] | |
on: | |
pull_request: {} | |
push: | |
branches: [main] | |
workflow_dispatch: | |
inputs: {} | |
# Cancel in-progress jobs if a newer job supersedes it | |
concurrency: | |
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }} | |
cancel-in-progress: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if I follow the distinctions here.
- Workflow dispatches to allow these to be run manually - I think that makes sense.
- What's the benefit of
pull_request: {}, push: { branches: [main] }
vs juston: [push]
? - I could guess but I'm not confident I know what the
concurrency
block does. An inline comment would be helpful for clarity+maintainability I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on: push
triggers a workflow run for each push, even if there is no associated PR.
The suggested on
block triggers a workflow run for each update to the PR (e.g. each time new commits are pushed), and will additionally run the workflow when main
is pushed to. It also support running the workflow manually, as you said.
The concurrency
block groups running jobs if they have the same group
, which is ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
, e.g. common_pytest-1234-branch-name
, or common_pytest-12341251346132346
. When a new job named ABC is started that matches that group, and an old job ABC within the group is still in-progress, the old one is cancelled. This is so that if someone pushes multiple times to a PR in quick succession, only the most recent jobs will run to completion, since the old ones are obsolete. We use this same block of code in several Meltano repositories.
I'll update it with a comment explaining it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on: push
triggers a workflow run for each push, even if there is no associated PR.
I think this was my desired behavior, to run on all commits even if a PR doesn't (yet) exist and including main
branch along with all other branches. Any reason not to run if a PR doesn't exist yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW this may trigger duplicate runs once the PR is created, one for the push and another for the PR sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There isn't much visibility when automatically running workflows without a PR, and it may waste compute (which we don't pay for for public repos, but it still feels wasteful).
I have no strong objection to using on: push
(plus workflow_dispatch
) if you think running on every push is valuable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is so that if someone pushes multiple times to a PR in quick succession, only the most recent jobs will run to completion, since the old ones are obsolete. We use this same block of code in several Meltano repositories.
Okay, this makes sense. I don't personally mind those jobs running to completion for taps/targets where the duration is rather short. In the past I have pushed a commit followed by a second one and maybe a third, specifically to see how the variations will perform in CI, essentially parallelizing the work by offloading it.
I'm happy to use the dedupe / short-circuit logic you propose. I slightly prefer not to dedupe, but I do not feel strongly enough to push back if that's your preferred behavior.
|
||
name: Python Lint Checks from Tox (MeltanoLabs) | ||
|
||
on: [push] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on: [push] | |
on: | |
pull_request: {} | |
push: | |
branches: [main] | |
workflow_dispatch: | |
inputs: {} | |
# Cancel in-progress jobs if a newer job supersedes it | |
concurrency: | |
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }} | |
cancel-in-progress: true |
Deleting my suggestion had an unintended side-effect @edgarrmondragon |
Co-authored-by: Will Da Silva <[email protected]>
Co-authored-by: Will Da Silva <[email protected]>
Co-authored-by: Will Da Silva <[email protected]>
Co-authored-by: Will Da Silva <[email protected]>
This takes the dynamic versioning publish workflow from target-snowflake and makes it available as a shared workflow for other repositories.