Releases: AI-team-UoA/pyJedAI
0.1.9
⚒️ Fixed
- Issue #25
➕ Added
- Optimized export runtime by removing pandas concat
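The note above does not say how the export code changed. As a rough sketch of the general idea, assuming the older code grew a DataFrame with `pd.concat` inside a loop, collecting plain rows in a list and building the frame once avoids copying the frame on every iteration. The function names, column names, and sample pairs below are hypothetical and are not taken from pyJedAI.

```python
import pandas as pd


def export_with_concat(pairs):
    """Slower pattern: every concat copies the whole frame built so far."""
    df = None
    for id1, id2 in pairs:
        row = pd.DataFrame([{"id1": id1, "id2": id2}])
        df = row if df is None else pd.concat([df, row], ignore_index=True)
    # Return an empty frame with the same columns when there are no pairs.
    return df if df is not None else pd.DataFrame(columns=["id1", "id2"])


def export_with_list(pairs):
    """Faster pattern: accumulate plain rows, then build the frame once."""
    rows = [{"id1": id1, "id2": id2} for id1, id2 in pairs]
    return pd.DataFrame(rows, columns=["id1", "id2"])


# Hypothetical matched pairs (entity id in dataset 1, entity id in dataset 2).
pairs = [(0, 10), (1, 11), (2, 12)]
slow = export_with_concat(pairs)
fast = export_with_list(pairs)
```

Both functions produce the same rows; the second does a single allocation instead of one per pair, which is where the runtime difference usually comes from on large exports.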
⚠️ Issues
- None
Full Changelog: 0.1.8...0.1.9
Authored by @Nikoletos-K
0.1.8
⚒️ Fixed
- Issues #22 and #23.
- NNs save/load embeddings issue [ @JacobMaciejewski ].
- NN unused print.
- Matching issues.
➕ Added
- New visualizations (PCA and tSNE)
⚠️ Issues
- None
Full Changelog: 0.1.7...0.1.8
Authored by @Nikoletos-K
0.1.7
⚒️ Fixed
- Issues #19, #20, #21
- Removed FALCONN and SCANN
- Refined dependencies
- Removed Optuna injection
- Fixed typos
- Reports
➕ Added
- New utilities to docs
⚠️ Issues
- None
Full Changelog: 0.1.6...0.1.7
Authored by @Nikoletos-K
0.1.6
⚒️ Fixed
- Issue #16
- Typos in clustering.py
- Datamodel gt initialization
- Imports in utils
- Bugs in NN-workflow
- Bugs and evaluation of simple Schema Clustering
➕ Added
- Dataframe memory consumption
- New Schema Clustering method for RDF data [Not final implementation - alpha version]
⚠️ Issues
- SCANN and FALCONN produce warnings
Full Changelog: 0.1.5...0.1.6
Authored by @Nikoletos-K
0.1.5
⚒️ Fixed
- Schema Matching structure [ @Nikoletos-K ]
➕ Added
- First working version of Schema Clustering [ @Nikoletos-K ]
- vector_based_blocking component: SCANN/FAISS full functionality on Linux OS only! [ @JacobMaciejewski ]
- RowColumnClustering: new clustering algorithm [ @JacobMaciejewski ]
⚠️ Issues
- Minor changes in ProgressiveWorkFlow(PYJEDAIWorkFlow). [ @JacobMaciejewski ]
0.1.4
⚒️ Fixed
- Correlation Clustering method.
- nltk.download('stopwords') now runs only when needed.
- Schema Matching component to align with the latest version of Valentine.
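The "download only when needed" fix for the stopwords corpus can be sketched as below. This is a minimal illustration, not pyJedAI's actual code: `ensure_stopwords` and its `find`/`download` parameters are hypothetical names, added only so the logic can be exercised without NLTK or network access. By default they fall back to the real `nltk.data.find` and `nltk.download`.

```python
def ensure_stopwords(find=None, download=None):
    """Fetch the NLTK stopwords corpus only if it is not already present.

    Returns True when a download was triggered, False otherwise.
    """
    if find is None or download is None:
        import nltk  # imported lazily, so callers without NLTK can inject fakes

        find = find or nltk.data.find
        download = download or nltk.download
    try:
        # nltk.data.find raises LookupError when the resource is missing.
        find("corpora/stopwords")
        return False
    except LookupError:
        download("stopwords")
        return True
```

Calling `ensure_stopwords()` at the point where stopwords are first used, rather than at import time, keeps package import free of network access.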
➕ Added
- datamodel.py: SchemaData for Schema Matching Component
- ‼️ New Component: pyJedAI Spatial, for interlinking geospatial RDF data. [ @IordanisT ]
- SCANN functionality, only available for Linux OS. [ @JacobMaciejewski ]
⚠️ Issues
- None
0.1.3
⚒️ Fixed
- None
➕ Added
- Clustering algorithms: [ Author: @JacobMaciejewski 📌 ]
  - EquivalenceCluster
  - ExtendedSimilarityEdge
  - Vertex
  - RicochetCluster
  - ExactClustering
  - CenterClustering
  - BestMatchClustering
  - MergeCenterClustering
  - CorrelationClustering
  - CutClustering
  - MarkovClustering
  - KiralyMSMApproximateClustering
  - RicochetSRClustering
- Blocking:
  - Statistics
⚠️ Issues
- None
0.1.2
⚒️ Fixed
- Export methods for the use case where no ground truth is provided
- Vectorization time, reduced by saving and retrieving the distance matrix
- Bug resolution in PER indexing, Dirty ER
- Speed/Memory optimizations in NN Blocking & Join PER
➕ Added
- 'sqeuclidean' metric in matching step
- Valentine as a Schema Matching plugin
- Frequency Evaluator compatible with base ER matching
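For readers unfamiliar with the metric name: 'sqeuclidean' is the identifier SciPy uses for squared Euclidean distance, and the sketch below assumes the matching step uses it in the same sense. The helper is a plain-Python illustration with hypothetical input vectors, not the library's implementation.

```python
import math


def sqeuclidean(u, v):
    """Squared Euclidean distance: sum of squared coordinate differences."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(u, v))


# Hypothetical 2-d vectors forming a 3-4-5 right triangle.
d2 = sqeuclidean([0.0, 0.0], [3.0, 4.0])  # 25.0
d = math.sqrt(d2)                         # 5.0, the ordinary Euclidean distance
```

Skipping the square root preserves the ordering of distances, so it is a common choice when only ranking or thresholding candidate pairs matters.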
⚠️ Issues
- Vectorizers (TF-IDF, etc.) don't support Dirty ER. This will be fixed in the next release.
0.1.1
⚒️ Fixed
- Removed deprecated whoosh imports from prioritization file
➕ Added
- None
⚠️ Issues
- None
0.1.0
⚒️ Fixed
- Restructured Matching Module - vectorizer, tokenizer, and qgrams as arguments (not inferred)
- Clustering step randomization bug
➕ Added
- PER notebook tutorials
- PER grid-search pipeline (config files, search scripts, storage)
- PER workflows visualization and comparison through:
  - feature configuration budget-centric metric progress plots
  - feature configuration dataset-centric sorting and comparison
⚠️ Issues
- None