Read the Paper: Automated_Code_Style_Enforcement_in_C_Programming_Courses.pdf
eastwood-tidy
is a port of the Eastwood C language linter to the clang-tidy check system, part of the LLVM infrastructure.
Initial work was done by Connor McMillin. The project is now maintained by Rowan Hart.
Users of eastwood-tidy do not need to follow the developer documentation to download and install LLVM, and should instead prefer to obtain a binary distribution of the linter.
Note that you will need to have llvm-dev
installed on your system if you want
to avoid spurious errors for missing header files. Otherwise, there are no other
dependencies and the distributed binary is statically linked.
You can obtain the latest release of eastwood-tidy
by downloading the latest binary
from the releases page on
GitHub. eastwood-tidy
currently only supports Linux, and there are no plans to create
releases for other platforms. You are, of course, welcome to create these yourself.
The binary can be installed by simply copying it to a $PATH
location, or it can be run with a full
path.
The recommended way to run eastwood-tidy
is through the linter script
to avoid needing to pass large numbers of arguments to the program. You may also choose
to write your own wrapper script, or you may modify the path given in ours and use it.
The program can be run in several modes:
- With no compile database:
clang-tidy
will attempt to guess compile options. This isn't ideal but will probably
work in most cases.
clang-tidy -checks "-*,eastwood*" path-to-file.c
- With a compile database:
clang-tidy
will use compile options from a compilation database. This can be made by
hand or output by cmake.
To generate the database with cmake, call cmake with -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
To generate the database by hand, simply follow the below format and create
compile_commands.json
in the same directory (or a parent directory) of your .c
files.
[
{
"directory": "/absolute/path/to/the/output/directory/of/cmake/or/other/build/system/",
"command": "gcc -g -ggdb -O0 -any -other -compile -options -here -o relative/or/absolute/path/to/outputfile.o -c /absolute/path/to/the/c/file.c",
"file": "/absolute/path/to/the/c/file.c"
},
{
...
},
...
]
In other words, the database is a list of entries, each containing the directory of the output (i.e. the directory to build from), the compile command (gcc -whatever -o thing thing.c), and the file to compile.
The compile database can be manually specified with clang-tidy -p <database>.json
, but
it is probably easier to just put it in the same place as your source (or 1 directory
above).
The command clang-tidy -checks "-*,eastwood*" path-to-file.c
is the same.
- Options:
The clang-tidy linter has a few command line options, outlined below. Unfortunately, they are taken in a rather nasty format. To pass options you can do:
$ clang-tidy -checks "-*,eastwood*" -config="{CheckOptions: [{key: a, value: b}, {key: x, value: y}]}" path-to-file.c
The options we provide (and an example usage) are below:
Option | Default | Type | Example | Description |
---|---|---|---|---|
eastwood-Rule1bCheck.dump | false | bool | -config="{CheckOptions: [{key: eastwood-Rule1bCheck.dump, value: true}]}" | Dump Names |
eastwood-Rule11dCheck.dump | false | bool | -config="{CheckOptions: [{key: eastwood-Rule11dCheck.dump, value: true}]}" | Dump Embedded Constants |
The recommended IDE for developing eastwood-tidy
is VSCode. You can
develop it in VIM or Eclipse or anything you want, but rest assured it
will be a less user friendly experience.
Recommended plugin configuration for this project:
- Install
llvm-vs-code-extensions.vscode-clangd
- Install
ms-vscode.cpptools
- Use the repo .vscode settings, which will disable C_Cpp intellisense and use clangd instead.
The initial build instructions below include a step to add a
compile_commands.json
symlink for clangd. Do this!
It lets clangd detect all the header information needed to view methods,
get autocompletion, and get error detection.
Before contributing by opening a PR or pushing code, install the pre-commit hook for
clang-format
by installing pre-commit
and the hook:
$ python3 -m pip install pre-commit # Skip this if you have already installed pre-commit
$ pre-commit install
You should now have a check/lint pass before every commit.
The only build dependencies are LLVM, LLVM tools, a compiler, and cmake. On Linux:
$ sudo apt install build-essential cmake git
$ wget -O - https://apt.llvm.org/llvm.sh | sudo bash -s -- 15
$ sudo apt install clangd-15 clang-15 llvm-15-dev llvm-15-tools \
clang-tools-15 clang-format-15 clang-tidy-15 lld-15
$ sudo update-alternatives --install /usr/bin/clang clang $(which clang-15) 100
$ sudo update-alternatives --install /usr/bin/clang++ clang++ $(which clang++-15) 100
$ sudo update-alternatives --install /usr/bin/clang-tidy clang-tidy $(which clang-tidy-15) 100
$ sudo update-alternatives --install /usr/bin/clang-format clang-format $(which clang-format-15) 100
$ sudo update-alternatives --install /usr/bin/lld lld $(which lld-15) 100
- Get the llvm github repo:
$ git clone https://github.com/llvm/llvm-project.git
- Clone this repo:
$ git clone https://github.com/novafacing/eastwood-tidy.git
- Link files from eastwood-tidy/setup to the proper places in llvm-project/clang-tools-extra/clang-tidy/
$ rm llvm-project/clang-tools-extra/clang-tidy/CMakeLists.txt llvm-project/clang-tools-extra/clang-tidy/ClangTidyForceLinker.h
$ ln -s $(pwd)/eastwood-tidy/setup/CMakeLists.txt $(pwd)/llvm-project/clang-tools-extra/clang-tidy/CMakeLists.txt
$ ln -s $(pwd)/eastwood-tidy/setup/ClangTidyForceLinker.h $(pwd)/llvm-project/clang-tools-extra/clang-tidy/ClangTidyForceLinker.h
$ ln -s $(pwd)/eastwood-tidy/ $(pwd)/llvm-project/clang-tools-extra/clang-tidy/
- Run patches to remove extraneous un-toggleable error reports from clang-tidy about missing compilation databases.
$ $(pwd)/eastwood-tidy/patch/patch.sh $(pwd)/llvm-project
- Use CMake + Ninja to build the new clang-tidy
cd llvm-project/llvm
mkdir build
cd build
cmake -GNinja -DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra" -DCMAKE_CXX_COMPILER="clang++" -DCMAKE_C_COMPILER="clang" -DLLVM_BUILD_TESTS="OFF" -DCMAKE_BUILD_TYPE="Debug" -DBUILD_SHARED_LIBS="OFF" -DCMAKE_EXPORT_COMPILE_COMMANDS=1 ..
cmake --build . -j NN # where NN is the number of cores in your machine + 1
- If you are using VSCode, you will want to symlink the compile_commands.json into your development directory to allow clangd to work:
ln -s $(pwd)/compile_commands.json /path/to/eastwood-tidy/compile_commands.json
- The binary will be located at llvm-project/llvm/build/bin/clang-tidy
- If developing for use at [REDACTED] University, the binary can be updated by using the update script which will upload both the new binary and the necessary include directories to avoid spurious errors.
There is a .clang-format
file provided in the root directory. Ensure that your
clang-format
utility uses this format file.
To facilitate linting, a pre-commit
configuration is provided. You can run
pre-commit install
in the root of the repository to install the hook to format
the src
and include
directories on commit.
Note that improperly formatted code in Pull Requests will not be accepted, and you will be gently asked to run the formatter and update your request. I recognize that everyone has their preferred style and this may not be yours, but this is the source code for a linter, after all! Let's keep it consistent.
Tests have been updated once again to use a bespoke pytest-based testing framework.
To run tests, you will need to install the virtual environment for the tests:
cd test
poetry install
If you do not have poetry
, you can get it
here.
To run the tests, first make sure you are in the virtual environment. You can either run
poetry shell
(recommended) to work on the environment in your shell, or you can
prepend poetry run
to any test commands you run.
Specific instructions on writing and running tests in various forms can be found
in the testing readme, but in general you can run all of the tests
by running pytest
. There are currently a large number of failing and non-implemented
tests due to the in-progress overhaul of the codebase for the project.
To find your glibc include directory, run echo "$(nix eval nixpkgs.glibc.dev.outPath | tr -d '\n' | tr -d '"')/include/"
, where glibc.dev
can be replaced by any library whose headers you need.
You can then run clang-tidy <args> -- -I"$(nix eval nixpkgs.glibc.dev.outPath | tr -d '\n' | tr -d '"')/include"
.
Note for include directories and files: using clang-tidy <regular args> -- <clang args>
can be done to specify include directories. For example:
./clang-tidy -checks "-*,eastwood*" /home/novafacing/hub/llvm-project/clang-tools-extra/clang-tidy/eastwood/test/I/test_I_D_fail.c -- -I/nix/store/lqn6r231ifgs2z66vvaav5zmrywlllzf-glibc-2.31-dev/include/
The following rules must be applied to all functions, structures, typedefs, unions, variables, etc.
If the name is composed of more than one word, then underscores must be used to separate them.
Example: int temperature = 0;
Example: int room_temperature = 0;
Example: A variable name such as "room_temperature" is descriptive and meaningful, but "i" is not. An exception can be made if "i" is used for loop counting, array indexing, etc.
An exception can also be made if the variable name is something commonly used in a mathematical equation, and the code is implementing that equation.
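As a minimal sketch of the equation exception, the function below computes the discriminant of a quadratic equation. The function name quadratic_discriminant and the choice of equation are illustrative assumptions, not part of the standard; the point is only that a, b, and c are acceptable because they mirror the formula being implemented.

```c
/*
 * Hypothetical example: computes the discriminant
 * b^2 - 4ac of the quadratic equation ax^2 + bx + c = 0.
 * The short names a, b, and c are kept because they match
 * the symbols used in the equation itself.
 */

double quadratic_discriminant(double a, double b, double c) {
  return (b * b) - (4.0 * a * c);
} /* quadratic_discriminant() */
```

Calling quadratic_discriminant(1.0, 2.0, 1.0) evaluates 4 - 4, so a repeated root is reported as a discriminant of zero.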
Constants must be declared using #define. A constant numeric value assigned must be enclosed in parentheses.
String constants need to be placed in quotes but do not have surrounding parentheses.
Example: #define TEMPERATURE_OF_THE_ROOM (10)
Example: #define FILE_NAME "Data_File"
Declarations/definitions should be at the top of the file.
Example: int g_temperature = 0;
Global variable use should be avoided unless absolutely necessary.
A. Each line must be kept within 80 columns in order to make sure the entire line will fit on printouts. If the line is too long, then it must be broken up into readable segments. The indentation of the code on the following lines needs to be at least 2 spaces.
Example:
room_temperature = list_head->left_node->
                   left_node->
                   left_node->
                   left_node->
                   left_node->
                   temperature;
Example:
fread(&value, sizeof(double),
      1, special_fp);
B. Each function should be kept small for modularity purposes. The suggested size is less than two pages. An exception can be made if the logic of the function requires it to be longer than two pages. Common sense needs to be followed.
Example: If a function contains more than two pages of printf or switch statements, then it would be illogical to break the function into smaller functions.
One space must also be present between the closing parenthesis and opening brace.
Example: if (temperature == room_temperature) {
Example: while (temperature < room_temperature) {
Exception for unary and data reference operators (i.e. [], ., &, *, ->).
Example: temperature = room_temperature + offset;
Example: temperature = node->data;
Example: if (-temperature == room_temperature)
Example: for (i = 0; i < limit; ++i)
Example: *value = head->data;
Example: for (i = 0; i < limit; ++i)
Example: printf("%f %f %f\n", temperature, volume, area);
They need to have a blank line above and below.
Example:
int whatever(int some_value) {
#define FUNCTION_NAME "Whatever"
#define UPPER_LIMIT (56)
. . .
} /* whatever() */
This two space indentation rule must be applied to the entire program.
Note that the opening brace must be placed on the same line as the structure, control, or flow command. The closing brace must be placed on the line after the structure, control, or flow commands. The closing brace must also be alone on the line. Even if only one statement is to be executed it is necessary to use braces.
Example:
for (i = 0; i <= size; ++i) {
  average += data[i];
}
Example:
while (temperature < MAX_TEMPERATURE) {
  average += data[i];
  temperature += offset;
}
Bad Example:
if (x < 7) {
  . . .
} else { /* NO: else { should be on the next line */
  . . .
}
B. Parameters for functions with more than one parameter should be on the same line, unless the line length is exceeded.
In that case, parameters on the next line should begin at the same column position as the parameters on the first line. The example below uses fewer than 80 characters just for demonstration purposes.
Example:
double average(double *data, int size, char *name,
               int temperature) {
  . . .
} /* average() */
Example:
int main(void) {
  . . .
} /* main() */
Example:
do {
  . . .
} while (size < LIMIT);
Comments are intended to alert people to the intention of the code.
Example: This is a bad comment.
/* Variable to store the temperature */
double temperature = 0.0;
Example: This is a bad comment.
/* Increment i by one */
++i;
Example: This is a good comment.
/* Temperature is measured in Celsius */
/* and it ranges from 0 - 150 degrees */
double temperature = 0.0;
Exceptions can be made for short comments placed beside declarations, else, and switch commands.
Example:
if (temperature == room_temperature) {
  statement;
  statement;
}
else { /* comment for else statement */
  statement;
  statement;
}
Example:
switch (key) {
  case DO_THIS: { /* comment for DO_THIS */
    statement;
    statement;
    break;
  }
  case DO_THAT: { /* comment for DO_THAT */
    statement;
    statement;
    break;
  }
  default: { /* comment for defaults */
    statement;
    statement;
    break;
  }
}
A blank line is not required above the comment if it is the first line following an opening brace.
Example:
if (temperature == room_temperature) {
  statement;
  /* Comment */
  statement;
}
else {
  /* Comment for action when temperature is */
  /* not equal room_temperature */
  statement;
}
The comment should be the name of the function indented one space after the closing brace and include left and right parentheses.
Example:
double average(double *data, int size) {
  . . .
} /* average() */
See section VII for an example.
If multiple logical expressions are used, sub-expressions must be parenthesized. Note the spacing and format below.
Example:
if ((volume_box_a == volume_box_b) &&
    (volume_box_b == volume_box_c)) {
  . . .
}
Example:
if (((side_a < side_b) && (time < max_time)) ||
    ((value < data) && (limit > MIN_VALUE))) {
  . . .
}
A header must be placed at the beginning of each function (including the main program). A header must contain detailed information, which describes the purpose of the function. The format is defined below: The header comment block must be at the left edge.
Example: Header at left edge.
/*
 * This function computes the average of the passed data,
 * which is stored in an array pointed to by data.
 * The parameter size is the index of the last array
 * element used. The array uses items 0 to size.
 */
double average(double *data, int size) {
  int i = 0;
  double average = 0.0;
  for (i = 0; i <= size; ++i) {
    average += data[i];
  }
  return (average / (size + 1));
} /* average() */
C. All header files should have #define guards to prevent multiple inclusions. The format of the symbol name should be <FILENAME>_H.
Example:
#ifndef BAZ_H
#define BAZ_H
...
#endif // BAZ_H
UNIX directory shortcuts (e.g., . and ..) are forbidden.
Example: src/base/logging.h should be included as:
#include "base/logging.h"
Suppose dir/foo.c implements things in dir2/foo.h. Your includes should be ordered as follows:
#include "dir2/foo.h"

#include <sys/types.h>
#include <unistd.h>

#include "base/basictypes.h"
#include "base/commandflags.h"
#include "foo/server/bar.h"
Note the spaces. Any adjacent blank lines should be collapsed.
Do not assume that just because foo.h currently includes bar.h you do not need to explicitly include bar.h.
Double quotes should only be used for include files that exist in the local directory structure.
A. Return values of functions such as malloc, calloc, fopen, fread, fwrite, and system must be checked or returned whenever a possible error condition exists.
Example:
if ((data_fp = fopen(file_name, "r")) == NULL) {
  fprintf(stderr, "Error: Cannot open file ");
  fprintf(stderr, "%s.\n", file_name);
  exit(TRUE);
}
It is important to remember that fclose does not explicitly set the file pointer back to NULL. Therefore, it is necessary to set the file pointer to NULL.
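A minimal sketch of the open, check, close, and reset sequence described above. The function name open_and_close and the constants OPEN_OK and OPEN_FAILED are hypothetical names introduced only for illustration.

```c
#include <stdio.h>

/* Hypothetical status codes, not part of the standard */

#define OPEN_OK (0)
#define OPEN_FAILED (-1)

/*
 * Opens the named file for reading, closes it again, and
 * resets the file pointer. Returns OPEN_OK on success and
 * OPEN_FAILED if the file cannot be opened.
 */

int open_and_close(const char *file_name) {
  FILE *data_fp = NULL;

  if ((data_fp = fopen(file_name, "r")) == NULL) {
    fprintf(stderr, "Error: Cannot open file %s.\n", file_name);
    return OPEN_FAILED;
  }

  /* fclose() does not change data_fp, so reset it by hand */

  fclose(data_fp);
  data_fp = NULL;

  return OPEN_OK;
} /* open_and_close() */
```

Resetting the pointer means a later accidental use can be caught with a simple NULL check instead of touching a closed stream.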
D. Appropriate range checking must be performed to make sure received parameters are within expected range.
Example:
if ((temperature < LOWER_BOUND) ||
    (temperature > HIGHER_BOUND)) {
  fprintf(stderr, "Error: Temperature out of range.\n");
  exit(TRUE);
}
Example:
assert((temperature > LOWER_BOUND) &&
       (temperature <= HIGHER_BOUND));
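The range check can also be wrapped in a function that reports failure to its caller instead of exiting. The sketch below is one possible shape. The function name set_temperature, the status constants, and the bound values (borrowed from the 0 - 150 degree comment earlier in this standard) are assumptions for illustration.

```c
#include <stdio.h>

/* Assumed bounds and hypothetical status codes */

#define LOWER_BOUND (0.0)
#define HIGHER_BOUND (150.0)
#define RANGE_OK (0)
#define RANGE_ERROR (1)

/*
 * Stores temperature in *result if it lies between
 * LOWER_BOUND and HIGHER_BOUND inclusive. Returns RANGE_OK
 * on success and RANGE_ERROR otherwise, leaving *result
 * untouched on failure.
 */

int set_temperature(double temperature, double *result) {
  if (result == NULL) {
    fprintf(stderr, "Error: Result pointer is NULL.\n");
    return RANGE_ERROR;
  }

  if ((temperature < LOWER_BOUND) ||
      (temperature > HIGHER_BOUND)) {
    fprintf(stderr, "Error: Temperature out of range.\n");
    return RANGE_ERROR;
  }

  *result = temperature;
  return RANGE_OK;
} /* set_temperature() */
```

Returning a status code keeps the decision about whether to abort with the caller, which is useful when the function lives in a library.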
Integers: Use 0
Reals: Use 0.0
Pointers: Use NULL
Characters: Use '\0'
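The four default values listed above can be seen together in the short sketch below. The reading_t type and the make_empty_reading function are hypothetical and exist only to show one member of each kind.

```c
#include <stddef.h>

/*
 * Hypothetical record with one member per kind of value
 * covered by the initialization rule.
 */

typedef struct reading {
  int count;
  double average;
  char *label;
  char grade;
} reading_t;

/*
 * Returns a reading whose members all hold the default
 * initial value for their type.
 */

reading_t make_empty_reading(void) {
  reading_t reading = {
    .count = 0,       /* integer */
    .average = 0.0,   /* real */
    .label = NULL,    /* pointer */
    .grade = '\0'     /* character */
  };

  return reading;
} /* make_empty_reading() */
```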
All error messages must be directed to standard error.
Example: fprintf(stderr, "Error has occurred.\n");
XI. FORBIDDEN STATEMENTS
Bad Example: volume_box_a = volume_box_b = volume_box_c;
Bad Example:
if ((volume_box_a = compute_volume_box_a()) +
    (volume_box_b = compute_volume_box_b()))
D. Do not use embedded constants; except for general initialization purposes and values that lack intrinsic meaning.
Common sense needs to apply.
Bad Example: return 1; /* DO NOT USE */
In this case, the return value represents a failure. Create a #defined constant at the beginning of your program instead and use that...
#define HAILSTONE_ERROR (1)
Bad Example: if (temperature > 10) /* DO NOT USE */
Again, in this case the value means something, and you
should define a constant instead...
#define MAX_TEMPERATURE (10)
Bad Example: int temperature = 10; /* DO NOT USE */
Same idea...
#define START_TEMP (10)
Bad Examples:
#define ONE (1)
#define FIRST_INDEX (0)
#define FIRST_ITERATION (0)
If the value has intrinsic meaning, that meaning should be conveyed in the constant's name.
Example:
int temperature = 0; /* OK if 0 has no
                      * specific meaning
                      */
Some exceptions:
Example: somefile_fp = fopen(file_name, "rb");
The use of the embedded constant "rb" is ok.
Example: if (fread(&data, sizeof(int), 2, somefile_fp) == 2)
The embedded constants 2 are ok here.
Example: if (fscanf(fp, "%d %s %f", &my_int, my_str, &my_float) == 3)
The embedded constant 3 is okay here.
Example: some_string[MAX_SIZE - 1] = '\0';
Subtracting or adding one from/to the size for the NULL terminator is fine.
Example: if (strchr(some_buf, 's')) {
You do not need to define a constant when referring to single, human-readable characters.
Example: malloc(sizeof(something) * 4); /* Allocate space for 4 somethings */
The constant 4 here is okay.
Again, please use common sense.
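Putting the pieces together, the sketch below rewrites the temperature comparison from the bad examples with named constants. MAX_TEMPERATURE comes from the text above. The function name check_temperature and the constants TEMPERATURE_OK and TOO_HOT are hypothetical.

```c
#define MAX_TEMPERATURE (10)

/* Hypothetical return codes that give the values a meaning */

#define TEMPERATURE_OK (0)
#define TOO_HOT (1)

/*
 * Returns TOO_HOT if temperature exceeds MAX_TEMPERATURE
 * and TEMPERATURE_OK otherwise. No bare numbers appear in
 * the logic, so each value's meaning is visible by name.
 */

int check_temperature(int temperature) {
  if (temperature > MAX_TEMPERATURE) {
    return TOO_HOT;
  }

  return TEMPERATURE_OK;
} /* check_temperature() */
```

If the limit changes later, only the #define needs to be edited, and every comparison that uses it stays consistent.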
DON'T DO THIS:
int side_a, side_b, side_c = 0;
Do it this way:
int side_a = 0;
int side_b = 0;
int side_c = 0;
C. Variables should be placed in as local a scope as possible, as close to the first use as possible.
Example:
while (const char *p = strchr(str, '/')) {
  str = p + 1;
}
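Note that declaring a variable inside a while condition, as in the snippet above, is accepted by C++ but not by C. One way to keep the same narrow scope in C99 is a for loop, sketched below. The function name base_name is a hypothetical helper and is not part of the standard.

```c
#include <string.h>

/*
 * Hypothetical helper: returns a pointer to the part of
 * path that follows the last '/' character, or path itself
 * if it contains no '/'.
 */

const char *base_name(const char *path) {
  const char *name = path;

  /* p exists only for the duration of the loop */

  for (const char *p = strchr(name, '/'); p != NULL;
       p = strchr(name, '/')) {
    name = p + 1;
  }

  return name;
} /* base_name() */
```

Because p is declared in the for statement, it cannot be read by accident after the loop ends, which is the intent of keeping variables in the narrowest scope.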