PyMiner - Python Feature Counter

PyMiner (Python Feature Counter) analyzes Python repositories and counts occurrences of language features introduced in different Python versions. It processes the repositories named in an input list and writes the analysis results to detailed CSV reports.

Features

Analyzes Git repositories for occurrences of modern Python features.
Supports multithreaded processing for better performance.
Filters commits by date range.
Outputs results in CSV format for easy reporting and visualization.

Requirements

Python: Version 3.12 or newer.
Git: Must be installed and accessible in the system's PATH.
Dependencies: Installable via pip.
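
The requirements above can be checked before running the tool. The following is a minimal sketch, not part of PyMiner itself; the function name `check_environment` is hypothetical, and it relies only on the standard library.

```python
import shutil
import sys

MIN_PYTHON = (3, 12)  # minimum version taken from the Requirements list


def check_environment(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means OK.

    `version_info` and `which` are parameters only so the check can be
    exercised without changing the real interpreter or PATH.
    """
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, "
            f"found {version_info[0]}.{version_info[1]}"
        )
    if which("git") is None:
        problems.append("git was not found on PATH")
    return problems
```

Calling `check_environment()` with no arguments inspects the running interpreter and the current `PATH`.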

Installation

Clone the repository:

```
git clone https://github.com/your-username/feature-counter.git
cd feature-counter
```

Install dependencies:
```
pip install -r requirements.txt
```

Directory Structure

PyMiner/
├── visitors/            # Feature-specific visitor modules
├── results/             # Output directory for CSV files
├── main.py              # Main script
├── feature_counter.py   # Core processing logic
├── requirements.txt     # Dependency file
└── README.md            # Documentation

Usage

Prepare the Input CSV File

The application expects a CSV file with a single column named `name`, listing the repositories to analyze in the format `<owner>/<repo>`. Example:

name
owner1/repo1
owner2/repo2

Save the file as python-projects.csv or any name of your choice.
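
As a rough sketch of how such a file can be read, the snippet below uses `csv.DictReader` from the standard library. The function name `read_repositories` is hypothetical, and this is not necessarily how `main.py` loads its input.

```python
import csv
import io


def read_repositories(csv_file) -> list[str]:
    """Return the `name` column of an open CSV file, skipping blank entries."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or "name" not in reader.fieldnames:
        raise ValueError("input CSV must have a 'name' column")
    names = []
    for row in reader:
        value = (row["name"] or "").strip()
        if value:
            names.append(value)
    return names


# In-memory stand-in for python-projects.csv.
example = io.StringIO("name\nowner1/repo1\nowner2/repo2\n")
print(read_repositories(example))  # ['owner1/repo1', 'owner2/repo2']
```

For a file on disk, open it with `open("python-projects.csv", newline="")` and pass the file object to the function.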

Run the Application

Run the script with the path to your CSV file as a command-line argument:

```
python3 main.py python-projects.csv
```
Results

Processed results are saved in the results/ directory as CSV files, named <owner>_<repo>.csv. Each file includes:

 Repository details.
 Date range of commits analyzed.
 Count of specific Python feature occurrences.
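
The naming scheme above can be expressed as a small helper. This is a sketch under the assumption that the file name is simply the owner and repository joined by an underscore; the function name `result_path` is hypothetical.

```python
from pathlib import Path

RESULTS_DIR = Path("results")  # output directory named in this README


def result_path(repo_name: str, results_dir: Path = RESULTS_DIR) -> Path:
    """Map '<owner>/<repo>' to '<results_dir>/<owner>_<repo>.csv'."""
    owner, sep, repo = repo_name.partition("/")
    if not sep or not owner or not repo or "/" in repo:
        raise ValueError(f"expected '<owner>/<repo>', got {repo_name!r}")
    return results_dir / f"{owner}_{repo}.csv"


print(result_path("owner1/repo1").as_posix())  # results/owner1_repo1.csv
```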

Configuration

The application can be customized directly in the script:

 start_date: Defines the earliest commit date to analyze. Default is 2012-01-01.
 max_threads: Sets the number of threads for parallel processing. Default is 4.
 steps: Specifies the number of days between commit analyses. Default is 30.
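
To make the three settings concrete, the sketch below shows one plausible reading of them: `steps` as a fixed interval between sampled dates starting at `start_date`, and `max_threads` as the size of a thread pool. The helper names `sample_dates` and `analyze_all` are hypothetical, and the actual script may sample commits differently.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta

# Defaults as documented above.
START_DATE = date(2012, 1, 1)
MAX_THREADS = 4
STEPS = 30


def sample_dates(start: date, end: date, step_days: int) -> list[date]:
    """Return dates from `start` to `end` (inclusive), `step_days` apart."""
    if step_days <= 0:
        raise ValueError("step_days must be positive")
    dates = []
    current = start
    while current <= end:
        dates.append(current)
        current += timedelta(days=step_days)
    return dates


def analyze_all(repos, analyze, max_threads: int = MAX_THREADS) -> list:
    """Apply `analyze` to each repository using at most `max_threads` threads.

    Results are returned in the same order as `repos`.
    """
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(analyze, repos))


for d in sample_dates(START_DATE, date(2012, 3, 31), STEPS):
    print(d.isoformat())
# 2012-01-01
# 2012-01-31
# 2012-03-01
# 2012-03-31
```

A smaller `steps` value yields more sample points per repository and therefore more work per run; `max_threads` bounds how many repositories are processed at once.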
