Releases: AI-team-UoA/pyJedAI
0.0.9
⚒️Fixed:
- FAISS euclidean distance
- Workflow methods
- Removed whoosh
- Removed SCANN
➕Added:
- 3 New workflow methods
- Export pairs in each step
- Tfidf weights in matching options
- Website:
  - code API
  - new tutorials
⚠️ Issues:
- None
0.0.8
Fixed:
- Word grams tokenization
- Code architecture in entity matching
- py_stringmatching dependencies
- Pypi readme
Added:
- Boolean/Tfidf/Tf weights
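
For readers unfamiliar with the three weighting schemes named above, the following is a minimal, library-independent sketch of how Boolean, TF, and TF-IDF token weights are commonly computed. It is not pyJedAI's implementation; the function name `token_weights`, the `scheme` values, and the sample records are assumptions made for illustration only.

```python
import math
from collections import Counter


def token_weights(docs, scheme="tfidf"):
    """Return one {token: weight} dict per document.

    `scheme` is "boolean", "tf", or "tfidf" (names are hypothetical,
    chosen for this sketch and not taken from pyJedAI).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each token appears.
    df = Counter(tok for toks in tokenized for tok in set(toks))

    weights = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks)
        row = {}
        for tok, count in counts.items():
            if scheme == "boolean":
                row[tok] = 1.0  # token is present, frequency is ignored
            elif scheme == "tf":
                row[tok] = count / total  # relative frequency in the document
            elif scheme == "tfidf":
                # Rare tokens get a larger weight than common ones.
                row[tok] = (count / total) * math.log(n_docs / df[tok])
            else:
                raise ValueError(f"unknown scheme: {scheme}")
        weights.append(row)
    return weights


# Toy records, invented for this example.
docs = ["apple iphone 12", "apple iphone 13 pro", "samsung galaxy s21"]
boolean_w = token_weights(docs, "boolean")
tf_w = token_weights(docs, "tf")
tfidf_w = token_weights(docs, "tfidf")
```

In this toy data, "samsung" appears in a single record, so its TF-IDF weight is higher than that of "apple", which appears in two.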
0.0.7
Fixed:
- Issues in block filtering
- Issues in vector based blocking
- Data model set types
- EJoin wrong naming
Added:
- Prioritization algorithms
- Tf-Idf functionality
- More metrics on entity matching
- Optional data cleaning functionalities
- New visualizations
- New stats for the blocking workflows
v0.0.6
Fixed an issue in vector based blocking (VB).
v0.0.5
Added:
- New evaluation module
- Matching metrics
- Vector based blocking techniques
- Data process methods
- Entity matching plots
- sphinx website
- New tests
Fixed:
- Architecture, abstract data types
- Data bugs in block building
- Bugs in vector based blocking
- Using workflows without ground truth (gt)
- Code runtime
v0.0.4
Python 3.7 and 3.8 are now supported!
New dependencies. pyJedAI now supports older Python versions.
Total supported versions:
- 3.7
- 3.8
- 3.9
- 3.10
Also added tests for all supported Python versions and macOS.
v0.0.3
First official release on PyPI
Contains:
- Tutorials and demos
- Fixed issues
v0.0.2
Optimizations, User-friendly Approach Updates
This is the second release. The project is still under development. In this release we:
- Added the WorkFlow module: a high-level, user-friendly method that simplifies the whole process.
- Added comments in the basic methods.
- Performed time optimizations by making the most of Python.
- Created automatic tests.
- Created a new block building method that uses pre-trained embeddings and Gensim, with similarity search via the FAISS framework.
- Uploaded to PyPI.
- Added visualization techniques for performance checks.
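
The embedding-based block building item above can be pictured with a small sketch: each entity is represented by a vector, and its candidate matches are its nearest neighbours in the other dataset. This is an illustration under stated assumptions, not pyJedAI code: a brute-force cosine search stands in for FAISS, hand-written toy vectors stand in for pre-trained embeddings, and `top_k_blocks` is a hypothetical name.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_blocks(queries, index, k=1):
    """For each query entity id, return the ids of the k most similar
    indexed entities (brute force, standing in for an ANN library)."""
    blocks = {}
    for qid, qvec in queries.items():
        ranked = sorted(
            index, key=lambda eid: cosine(qvec, index[eid]), reverse=True
        )
        blocks[qid] = ranked[:k]
    return blocks


# Toy 3-dimensional "embeddings", invented for this example.
index = {
    "a1": [1.0, 0.0, 0.2],
    "a2": [0.0, 1.0, 0.1],
    "a3": [0.5, 0.5, 0.9],
}
queries = {
    "b1": [0.9, 0.1, 0.1],
    "b2": [0.1, 0.8, 0.0],
}

candidate_blocks = top_k_blocks(queries, index, k=2)
```

Each query entity ends up in a block with only its `k` nearest neighbours, so later matching steps compare far fewer pairs than the full cross product.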
v0.0.1
First pyJedAI release: This release brings the basic structure of the well-known JedAI toolkit to the Python environment. Contains:
- Data reading techniques: RDF/OWL, SPARQL, CSV, JSON, DB
- Block building: Standard Blocking, QGrams & Extended, SuffixArray & Extended
- Block cleaning: Block purging, Block filtering
- Comparison cleaning: Weighted edge/node pruning, Cardinality edge/node pruning, BLAST, etc.
- Entity matching: strsimpy
- Entity clustering: Connected component clustering
- Similarity Joins: SchemaAgnosticΕJoin, TopKSchemaAgnosticJoin
- Evaluation through Jupyter notebook
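
As a rough picture of two of the steps listed above, the sketch below shows the general idea behind Standard Blocking (entities sharing a token land in the same block) and connected component clustering (matched pairs are merged transitively). It is a plain-Python illustration, not the pyJedAI API; the function names and sample data are assumptions.

```python
from collections import defaultdict


def standard_blocking(entities):
    """Group entity ids by every whitespace token in their text."""
    blocks = defaultdict(set)
    for eid, text in entities.items():
        for token in set(text.lower().split()):
            blocks[token].add(eid)
    # A block with a single entity yields no comparisons, so drop it.
    return {tok: ids for tok, ids in blocks.items() if len(ids) > 1}


def connected_components(pairs):
    """Cluster ids linked by matched pairs, using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for x in list(parent):
        clusters[find(x)].add(x)
    return sorted(sorted(c) for c in clusters.values())


# Toy records and matches, invented for this example.
entities = {
    "e1": "Apple iPhone 12",
    "e2": "iphone 12 apple",
    "e3": "Samsung Galaxy",
    "e4": "galaxy s21 samsung",
}
blocks = standard_blocking(entities)

matched_pairs = [("e1", "e2"), ("e3", "e4"), ("e2", "e5")]
clusters = connected_components(matched_pairs)
```

Here `e5` joins the cluster of `e1` through its match with `e2`, which is the transitive behaviour that connected component clustering relies on.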