Types of Data to Collect
Ben Klein edited this page Nov 13, 2019 · 2 revisions
- LOC count (excluding comments)
- CCN (cyclomatic complexity number)
- Token count per function
- Parameter count per function
- last modified
- size
  - lines
  - bytes
- encoding
- mode
- executable? (does it have a shebang?)
- symlink?
- non-text?
- linker stats
- compiler stats
- architecture
- path
- project path length
- filename length
- casing (snake, camel, etc.)
- extension (and does it match contents?)
- number of contributors to file
- per-line ownership of file contents
- per-token ownership of contents
- line/token "heat" (frecency of changes)
- meta files
- .gitkeep usage
- LFS objects
- binary objects
- other git extensions in use
- number of issues
- tags
- number of comments (frecency)
- contributor heat index
- commits vs hub activity
  - see GitHub's little diagram of issues, PRs, reviews, commits
  - do that per repo per user
- all the same data the Insights tab could provide
- relating stargazers/watchers to activity in the repo
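
As a rough illustration of the per-file items above (size in lines and bytes, LOC without comments, mode, last modified, shebang, symlink, non-text, filename length, casing, extension), here is a minimal Python sketch. The names `collect_file_stats` and `classify_casing` are hypothetical, the comment detection assumes single-line comments starting with `#`, and the non-text check (a NUL byte in the first 8 KiB) is only a heuristic. Git- and GitHub-level data such as ownership, heat, or issues are out of scope here.

```python
import re
import stat
from pathlib import Path


def classify_casing(stem: str) -> str:
    """Guess the naming convention of a filename stem (heuristic, hypothetical helper)."""
    if re.fullmatch(r"[a-z0-9]+(_[a-z0-9]+)+", stem):
        return "snake"
    if re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)+", stem):
        return "kebab"
    if re.fullmatch(r"[a-z][a-z0-9]*([A-Z][a-z0-9]*)+", stem):
        return "camel"
    if re.fullmatch(r"([A-Z][a-z0-9]+)+", stem):
        return "pascal"
    if re.fullmatch(r"[a-z0-9]+", stem):
        return "lower"
    return "other"


def collect_file_stats(path: Path, comment_prefix: str = "#") -> dict:
    """Collect simple per-file metrics for a regular file or symlink.

    Assumption: comments are whole lines starting with `comment_prefix`,
    so a shebang line also counts as a comment when the prefix is "#".
    """
    st = path.lstat()  # lstat reports on the link itself instead of following it
    is_symlink = stat.S_ISLNK(st.st_mode)
    data = b"" if is_symlink else path.read_bytes()
    prefix = comment_prefix.encode()
    lines = data.splitlines()
    # Count lines that are neither blank nor whole-line comments.
    loc = sum(
        1 for line in lines
        if line.strip() and not line.strip().startswith(prefix)
    )
    return {
        "path": str(path),
        "filename_length": len(path.name),
        "casing": classify_casing(path.stem),
        "extension": path.suffix,
        "bytes": st.st_size,
        "lines": len(lines),
        "loc": loc,
        "mode": stat.filemode(st.st_mode),
        "last_modified": st.st_mtime,
        "symlink": is_symlink,
        "non_text": b"\0" in data[:8192],  # heuristic: NUL byte suggests binary
        "has_shebang": data.startswith(b"#!"),
    }
```

Calling `collect_file_stats(Path("some_file.py"))` returns a plain dict, which keeps the result easy to dump as JSON or load into a table; other languages would need their own comment rules.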