Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds linting via pre-commit #62

Open
wants to merge 14 commits into
base: main
Choose a base branch
from
2 changes: 2 additions & 0 deletions .git-blame-ignore-revs
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
64168e2e0e9d3ef2fe93fcb6adb8e5d2b020eda1
57b9446ff233660e8668e76b9c27f995ba5eb5f8
76 changes: 76 additions & 0 deletions .github/workflows/pre-commit.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
name: Lint Check (pre-commit)

on:
# We have to use pull_request_target here as pull_request does not grant
# enough permission for reviewdog
pull_request_target:
push:

jobs:
pre-commit-push:
name: Pre-Commit check on Push
runs-on: ubuntu-latest
if: ${{ github.event_name == 'push' }}

steps:
- name: Checkout repository
uses: actions/checkout@v4

- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: 3.13

# We wish to run pre-commit on all files instead of the changes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wonder if this is really what we want? Aside from the obvious inefficiencies involved in checking all files all the time this will make it potentially difficult to override a lint/format setting. Let me give you a concrete case of this. In my day-job-project we using clang-tidy on code, but it struggles in any setup we've used to nicely format monadic function chains. So if we had are setup run all files every checkin it would be annoying (note there's likely ways to turn off the changes we don't know about). Looking for other opinions on this point.

Copy link
Member Author

@wusatosi wusatosi Nov 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upstream for this CI config is in exemplar, so the context is this is meant to be run along side some huge unit test complex.

The main idea is if we have a linting rule (or any enforceable rule) then if some part of the system is broken, the whole system is broken. Linting potentially causes sweeping changes due to reformatting, so its best the reformat is as close to lint error as possible (best in the same commit).

Explicitly, we are trying to avoid this:

  • commit 0 feature [x diff breaks lint a.cpp]
  • commit 1 feature [✓ diff pass lint b.cpp]
  • commit 2 ready to ship [✓ diff pass lint b.cpp]

Here the lint failure in 0 may still not be fixed because no diff later touches a.cpp .

  • This is inefficient: Yes this is inefficient, but all the linters are light weight and often doesn't even involve the need to construct an AST. In comparison to pulling the packages to prepare linting, the unit test of cross platform + cross compiler matrix/ later the long clang-tidy job (in exemplar/ C++ project context), the actual lint cost here is minimal. Arguing lint cost here is in efficient.
  • Override linting: This is not the concern of CI, overriding linting should be the concern of configuring the linter. e.g. codespell:ignore, markdownlint-disable-next-line, gersemi: ignore. Linting error should be explicitly suppressed (best in-line), but not to be ignored. Running lint check only on diff doesn't guarantee error propagation unless we do complicated trickery. Think this as unit test, you would ideally run the entire unit-test set, and the solution to flaky unit tests are, ideally to fix them first, if it doesn't work then skip, but not to ignore them from CI perspective out-right.

Copy link
Member

@JeffGarland JeffGarland Nov 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the long explanation. And while I agree there are overrides possible many of these involve actual source modifications which are sometimes unpleasant for authors, etc. And, if I'm not mistaken the standard we're setting here will require the revisiting of those standards on each commit for unchanged source.

Upstream for this CI config is in exemplar, so the context is this is meant to be run along side some huge unit test complex.

Sure. I'll note however, that none of this is explained in exemplar at the moment.

The main idea is if we have a linting rule (or any enforceable rule) then if some part of the system is broken, the whole system is broken.

To me there's a large difference between a broken test and some code that doesn't pass formatting checks. So maybe that makes these 'less enforceable rules'.

I guess at a minimum I'd like some others to chime in on this.

@steve-downey @neatudarius @camio - thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @wusatosi that enforcement of linter checks at CI time is the most desirable option and this is what we're doing with every other repository in beman so far. We "push left" by allowing developers to get this feedback with the pre-commit tool, but if they don't use this, there's review dog to suggest the changes needed at PR time. I've worked in several repositories that enforce linters and it's been a positive experience.

# only made in the push commit.
#
# So linting error persists when there's formatting problem.
- uses: pre-commit/[email protected]

pre-commit-pr:
name: Pre-Commit check on PR
runs-on: ubuntu-latest
if: ${{ github.event_name == 'pull_request_target' }}

permissions:
contents: read
checks: write
issues: write
pull-requests: write

steps:
- name: Checkout repository
uses: actions/checkout@v4

# pull_request_target checkout the base of the repo
# We need to checkout the actual pr to lint the changes.
- name: Checkout pr
run: gh pr checkout ${{ github.event.number }}
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: 3.13

# we only lint on the changed file in PR.
- name: Get Changed Files
id: changed-files
uses: tj-actions/changed-files@v45

# See:
# https://github.com/tj-actions/changed-files?tab=readme-ov-file#using-local-git-directory-
- uses: pre-commit/[email protected]
id: run-pre-commit
with:
extra_args: --files ${{ steps.changed-files.outputs.all_changed_files }}

# Review dog posts the suggested change from pre-commit to the pr.
- name: suggester / pre-commit
uses: reviewdog/action-suggester@v1
if: ${{ failure() && steps.run-pre-commit.conclusion == 'failure' }}
with:
tool_name: pre-commit
level: warning
reviewdog_flags: "-fail-level=error"
8 changes: 8 additions & 0 deletions .markdownlint.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
# MD033/no-inline-html : Inline HTML : https://github.com/DavidAnson/markdownlint/blob/v0.35.0/doc/md033.md
# Disable inline html linter is needed for <details> <summary>
MD033: false

# MD013/line-length : Line length : https://github.com/DavidAnson/markdownlint/blob/v0.35.0/doc/md013.md
# General Column Limit for code is 119.
MD013:
code_block_line_length: 119
31 changes: 31 additions & 0 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
- id: check-added-large-files

# CMake linting and formatting
- repo: https://github.com/BlankSpruce/gersemi
rev: 0.17.1
hooks:
- id: gersemi
name: CMake linting

# Markdown linting
# Config file: .markdownlint.yaml
- repo: https://github.com/igorshubovych/markdownlint-cli
rev: v0.43.0
hooks:
- id: markdownlint

# Code spelling check, note auto-correction is not enabled
- repo: https://github.com/codespell-project/codespell
rev: v2.2.4
hooks:
- id: codespell
exclude: ^build|presentations
146 changes: 90 additions & 56 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -7,80 +7,114 @@ include(CTest)
include(FetchContent)

option(
BEMAN_USE_MAIN_BRANCHES
"Instead of using pinned versions of beman libraries, use each from its main branch."
OFF
BEMAN_USE_MAIN_BRANCHES
"Instead of using pinned versions of beman libraries, use each from its main branch."
OFF
)

# jsonfile = ./libraries.json
set(Beman_jsonfile "${CMAKE_CURRENT_LIST_DIR}/libraries.json")

# force CMake to reconfigure if the JSON file changes
set_property(
DIRECTORY "${CMAKE_CURRENT_LIST_DIR}"
APPEND
PROPERTY
CMAKE_CONFIGURE_DEPENDS "${Beman_jsonfile}"
DIRECTORY "${CMAKE_CURRENT_LIST_DIR}"
APPEND
PROPERTY CMAKE_CONFIGURE_DEPENDS "${Beman_jsonfile}"
)

# Read the JSON file
file(READ "${Beman_jsonfile}" Beman_rootobj)

# Get the libraries array and store it in Beman_librariesobj
string(JSON Beman_librariesobj ERROR_VARIABLE Beman_error GET "${Beman_rootobj}" "libraries")
if (Beman_error)
message(FATAL_ERROR "${Beman_error}")
endif ()
string(
JSON
Beman_librariesobj
ERROR_VARIABLE Beman_error
GET "${Beman_rootobj}"
"libraries"
)
if(Beman_error)
message(FATAL_ERROR "${Beman_error}")
endif()

# Get the length of the libraries array
string(JSON Beman_numlibraries ERROR_VARIABLE Beman_error LENGTH "${Beman_librariesobj}")
if (Beman_error)
message(FATAL_ERROR "${Beman_error}")
endif ()
string(
JSON
Beman_numlibraries
ERROR_VARIABLE Beman_error
LENGTH "${Beman_librariesobj}"
)
if(Beman_error)
message(FATAL_ERROR "${Beman_error}")
endif()

# Loop over each library object
math(EXPR Beman_maxindex "${Beman_numlibraries} - 1")
foreach (Beman_index RANGE "${Beman_maxindex}")
# libraryobj = libraryobjs[index]
string(JSON Beman_libraryobj ERROR_VARIABLE Beman_error GET "${Beman_librariesobj}" "${Beman_index}")
if (Beman_error)
message(FATAL_ERROR "${Beman_error}")
endif ()

# name = libraryobj["name"]
string(JSON Beman_name ERROR_VARIABLE Beman_error GET "${Beman_libraryobj}" "name")
if (Beman_error)
message(FATAL_ERROR "${Beman_error}")
endif ()

# repo = libraryobj["git_repository"]
string(JSON Beman_repo ERROR_VARIABLE Beman_error GET "${Beman_libraryobj}" "git_repository")
if (Beman_error)
message(FATAL_ERROR "${Beman_error}")
endif ()

# Select the field to use to fetch the library based on the value of
# the BEMAN_USE_MAIN_BRANCHES option
if (BEMAN_USE_MAIN_BRANCHES)
set(Beman_tagfieldkey "default_branch")
else ()
set(Beman_tagfieldkey "git_tag")
endif ()

# tag = libraryobj[Beman_tagfieldkey]
string(JSON Beman_tag ERROR_VARIABLE Beman_error GET "${Beman_libraryobj}" "${Beman_tagfieldkey}")
if (Beman_error)
message(FATAL_ERROR "${Beman_error}")
endif ()

message(DEBUG "Fetching ${Beman_name} from ${Beman_repo} at ${Beman_tag}")
FetchContent_Declare(
"${Beman_name}"
GIT_REPOSITORY "${Beman_repo}"
GIT_TAG "${Beman_tag}"
)
list(APPEND Beman_names "${Beman_name}")
endforeach ()
foreach(Beman_index RANGE "${Beman_maxindex}")
# libraryobj = libraryobjs[index]
string(
JSON
Beman_libraryobj
ERROR_VARIABLE Beman_error
GET "${Beman_librariesobj}"
"${Beman_index}"
)
if(Beman_error)
message(FATAL_ERROR "${Beman_error}")
endif()

# name = libraryobj["name"]
string(
JSON
Beman_name
ERROR_VARIABLE Beman_error
GET "${Beman_libraryobj}"
"name"
)
if(Beman_error)
message(FATAL_ERROR "${Beman_error}")
endif()

# repo = libraryobj["git_repository"]
string(
JSON
Beman_repo
ERROR_VARIABLE Beman_error
GET "${Beman_libraryobj}"
"git_repository"
)
if(Beman_error)
message(FATAL_ERROR "${Beman_error}")
endif()

# Select the field to use to fetch the library based on the value of
# the BEMAN_USE_MAIN_BRANCHES option
if(BEMAN_USE_MAIN_BRANCHES)
set(Beman_tagfieldkey "default_branch")
else()
set(Beman_tagfieldkey "git_tag")
endif()

# tag = libraryobj[Beman_tagfieldkey]
string(
JSON
Beman_tag
ERROR_VARIABLE Beman_error
GET "${Beman_libraryobj}"
"${Beman_tagfieldkey}"
)
if(Beman_error)
message(FATAL_ERROR "${Beman_error}")
endif()

message(DEBUG "Fetching ${Beman_name} from ${Beman_repo} at ${Beman_tag}")
FetchContent_Declare(
"${Beman_name}"
GIT_REPOSITORY "${Beman_repo}"
GIT_TAG "${Beman_tag}"
)
list(APPEND Beman_names "${Beman_name}")
endforeach()

FetchContent_MakeAvailable(${Beman_names})

Expand Down
Loading