V9.3 - Repository Reprocessing Automation and Enhanced Repository List Management
Summary
This release automates the detection of already processed repositories by checking the worked-example-miner-candidates GitHub directory, so that previously analyzed candidates are not reprocessed. Better tracking and management of processed repositories makes the Worked-Example-Miner (WEM) project more efficient.
Key Improvements
Repository Reprocessing Automation
- Introduced a mechanism to check the worked-example-miner-candidates GitHub directory for already processed repositories, preventing reprocessing of previously analyzed candidates.
- Updated the repository JSON list to track repositories that have been processed, ensuring smooth candidate management and eliminating redundant work.
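As an illustration of the check described above, here is a minimal sketch. It is not the project's actual implementation. It assumes a local clone of the worked-example-miner-candidates directory containing one subdirectory per processed repository, list entries with a hypothetical "name" key, and a boolean Candidates_Generated value. The function name mark_processed is also hypothetical.

```python
from pathlib import Path


def mark_processed(repositories, candidates_dir):
    """Flag each repository entry according to the candidates directory.

    Assumptions (not confirmed by the release notes): `candidates_dir` is a
    local clone holding one subdirectory per processed repository, and each
    entry is a dict with a "name" key.
    """
    candidates_dir = Path(candidates_dir)
    if candidates_dir.is_dir():
        # Names of the subdirectories already present in the candidates directory.
        processed = {p.name for p in candidates_dir.iterdir() if p.is_dir()}
    else:
        # A missing directory means nothing has been processed yet.
        processed = set()
    for repo in repositories:
        repo["Candidates_Generated"] = repo["name"] in processed
    return repositories
```

The updated entries could then be written back to the repository JSON list so that later runs skip the flagged repositories.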
Repository List Enhancements
- Enhanced the handling of repository data in PyDriller/repositories_picker.py to store and update the Candidates_Generated attribute, ensuring accurate tracking of processed repositories.
- Improved the filtering in PyDriller/code_metrics.py to include only repositories that have not yet been processed.
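The filtering step can be pictured roughly as follows. This is a sketch under stated assumptions, not the code in PyDriller/code_metrics.py: it assumes Candidates_Generated is stored as a boolean, treats entries without the attribute as unprocessed, and uses a hypothetical function name and a hypothetical "name" key.

```python
def unprocessed_repositories(repositories):
    """Return only the entries not yet marked as processed.

    Assumption: Candidates_Generated is a boolean; entries that lack the
    attribute are treated as unprocessed.
    """
    return [
        repo for repo in repositories
        if not repo.get("Candidates_Generated", False)
    ]


# Example with made-up entries.
repos = [
    {"name": "repo-a", "Candidates_Generated": True},
    {"name": "repo-b", "Candidates_Generated": False},
    {"name": "repo-c"},
]
pending = [repo["name"] for repo in unprocessed_repositories(repos)]
print(pending)  # ['repo-b', 'repo-c']
```

Defaulting a missing attribute to unprocessed errs on the side of redoing work rather than silently skipping a repository.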
Refactorings and Code Quality
- Refined import statements in multiple files for better code structure and maintainability (PyDriller/metrics_changes.py).
- Renamed JSON attributes for improved clarity and consistency across repository data handling (PyDriller/repositories_picker.py).
Bug Fixes
- Fixed issues with attribute initialization and repository list management to ensure correct marking of processed repositories.
Important Notes
- Users should ensure the updated repository list is properly synced with the worked-example-miner-candidates GitHub directory to take full advantage of the reprocessing automation.
- The enhanced filtering mechanism will ensure that only unprocessed repositories are considered, reducing unnecessary computation.
Full Changelog: v9.2...v9.3