This folder contains the code used to assess the performance of the two fine-tuned similarity join algorithms, as well as that of the baseline algorithm.
The folder data contains all datasets as JedAI's Java serialized objects. The synthetic datasets for the scalability analysis are available in the same format here.
The folder lib contains the libraries necessary for running the experiments (in the form of jar files).
The folder src contains the source code.
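
Since the datasets in the data folder are Java serialized objects, they can in principle be read with the standard java.io.ObjectInputStream. The following is a minimal sketch under that assumption, not the loading code used in src. The class name SerializedObjectLoader and its load method are hypothetical, and the concrete type of the returned object depends on JedAI's data model, so the jar files in lib presumably have to be on the classpath.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;

// Hypothetical helper for illustration; not part of this repository or of JedAI.
class SerializedObjectLoader {

    // Reads a single Java-serialized object from the file at the given path.
    // The classes of the stored object must be on the classpath (assumption:
    // for the datasets here, these are the JedAI classes shipped in lib);
    // otherwise readObject() throws ClassNotFoundException.
    static Object load(String path) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(path)))) {
            return in.readObject();
        }
    }
}
```

A caller would pass the path of one file from the data folder to SerializedObjectLoader.load and cast the result to the type that JedAI used when writing it. Only deserialize files from a trusted source, because Java deserialization can execute code from the classes it instantiates.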