Package for uploading Weka experiments to OpenML. It works in combination with the OpenML Apiconnector (available on Maven Central; version >= 1.0.14) and Weka (available on Maven Central; version >= 3.9.0).
The following code example downloads a specific set of OpenML datasets (the OpenML100 benchmark set) and loads each one into the Weka data format (weka.core.Instances), which can then be used for offline development and experimentation.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.Reader;
import org.openml.apiconnector.io.OpenmlConnector;
import org.openml.apiconnector.xml.DataSetDescription;
import org.openml.apiconnector.xml.Study;
import weka.core.Instances;

public static void downloadData() throws Exception {
    // Fill in the API key (obtainable from your OpenML profile)
    String apikey = "<FILL_IN_OPENML_API_KEY>";
    // Instantiate the OpenmlConnector object
    // (requires artifact org.openml.apiconnector, version >= 1.0.14, from Maven Central)
    OpenmlConnector openml = new OpenmlConnector(apikey);
    // Download the OpenML study object describing the 'OpenML100' benchmark set
    Study s = openml.studyGet("OpenML100", "data");
    // Loop over all the datasets
    for (Integer dataId : s.getDataset()) {
        // DataSetDescription is an OpenML object containing meta-information about the dataset
        DataSetDescription dsd = openml.dataGet(dataId);
        // getDataset downloads the raw dataset file (ARFF) from OpenML
        File datasetFile = dsd.getDataset(apikey);
        // Parse the file into the Weka format; try-with-resources closes the reader afterwards
        Instances dataset;
        try (Reader reader = new BufferedReader(new FileReader(datasetFile))) {
            dataset = new Instances(reader);
        }
        System.out.println("Downloaded " + dsd.getName());
        System.out.println("numObservations = " + dataset.numInstances() + "; numFeatures = " + dataset.numAttributes());
    }
}
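For offline development it can help to keep your own copy of each downloaded file, so that repeated experiments do not depend on the network. The following is a minimal sketch of such a cache using only the Java standard library. The class `DatasetCache` and its `Fetcher` interface are hypothetical names introduced for illustration; they are not part of the OpenML Apiconnector or Weka, and the fetcher is only a stand-in for the actual download call.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for illustration; not part of the OpenML Apiconnector or Weka. */
final class DatasetCache {

    /** Stand-in for the real download call (assumption: it yields a local file path). */
    @FunctionalInterface
    interface Fetcher {
        Path fetch(int dataId) throws Exception;
    }

    private final Path cacheDir;

    DatasetCache(Path cacheDir) throws IOException {
        // Create the cache directory (and any missing parents) if needed
        this.cacheDir = Files.createDirectories(cacheDir);
    }

    /** Returns the cached file for dataId, calling the fetcher only on a cache miss. */
    Path get(int dataId, Fetcher fetcher) throws Exception {
        Path cached = cacheDir.resolve(dataId + ".arff");
        if (Files.isRegularFile(cached)) {
            return cached; // cache hit: no download needed
        }
        Path downloaded = fetcher.fetch(dataId);
        // Copy to a temporary file first, then move it into place, so that an
        // interrupted copy does not leave a partial file under the final name
        Path tmp = Files.createTempFile(cacheDir, dataId + "-", ".part");
        Files.copy(downloaded, tmp, StandardCopyOption.REPLACE_EXISTING);
        Files.move(tmp, cached, StandardCopyOption.REPLACE_EXISTING);
        return cached;
    }
}
```

Inside the dataset loop above, this could be used along the lines of `cache.get(dataId, id -> openml.dataGet(id).getDataset(apikey).toPath())`, assuming `getDataset` returns a `java.io.File` as in the example.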
The following code example downloads a specific set of OpenML tasks (dubbed the OpenML100), runs a NaiveBayes classifier on each task, and uploads the resulting runs to the OpenML server.
import org.openml.apiconnector.io.OpenmlConnector;
import org.openml.apiconnector.xml.Run;
import org.openml.apiconnector.xml.Study;
import org.openml.weka.algorithm.WekaConfig;
import org.openml.weka.experiment.RunOpenmlJob;
import weka.classifiers.Classifier;
import weka.classifiers.bayes.NaiveBayes;

public static void runTasksAndUpload() throws Exception {
    // Fill in the API key (obtainable from your OpenML profile)
    String apikey = "<FILL_IN_APIKEY>";
    // WekaConfig lets us enable or disable various Weka-specific options
    WekaConfig config = new WekaConfig();
    // Instantiate the OpenmlConnector object
    // (requires artifact org.openml.apiconnector, version >= 1.0.14, from Maven Central)
    OpenmlConnector openml = new OpenmlConnector(apikey);
    // Download the OpenML study object describing the 'OpenML100' benchmark set
    Study s = openml.studyGet("OpenML100", "tasks");
    // Loop over all the tasks
    for (Integer taskId : s.getTasks()) {
        // Create a fresh Weka classifier to run on the task
        Classifier classifier = new NaiveBayes();
        // Execute the task (can take a while, depending on the classifier / dataset combination)
        int runId = RunOpenmlJob.executeTask(openml, config, taskId, classifier);
        // After several minutes, the evaluation measures will be available on the server
        System.out.println("Available on " + openml.getApiUrl() + "run/" + runId);
        // Download the run from the server
        Run run = openml.runGet(runId);
    }
}
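In the loop above, an exception thrown for a single task ends the whole batch. When running a large set of tasks, it may be preferable to record the failure and continue with the next task. The following is a minimal sketch of that pattern using only the Java standard library. `TaskBatch`, `TaskRunner` and `Result` are hypothetical names introduced for illustration and are not part of this package; the runner is a stand-in for the call that executes one task and returns its run id.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the OpenML Weka package. */
final class TaskBatch {

    /** Stand-in for the call that runs one task and returns the id of the uploaded run. */
    @FunctionalInterface
    interface TaskRunner {
        int run(int taskId) throws Exception;
    }

    /** Outcome of a batch: run ids of successful tasks, error messages of failed ones. */
    static final class Result {
        final Map<Integer, Integer> runIds = new LinkedHashMap<>();
        final Map<Integer, String> failures = new LinkedHashMap<>();
    }

    /** Runs every task in order; a failing task is recorded and does not stop the batch. */
    static Result runAll(List<Integer> taskIds, TaskRunner runner) {
        Result result = new Result();
        for (int taskId : taskIds) {
            try {
                result.runIds.put(taskId, runner.run(taskId));
            } catch (Exception e) {
                // Record the failure and continue with the next task
                result.failures.put(taskId, String.valueOf(e.getMessage()));
            }
        }
        return result;
    }
}
```

With the task ids collected into a list, the runner could be a lambda such as `id -> RunOpenmlJob.executeTask(openml, config, id, new NaiveBayes())`, mirroring the call in the example above. Afterwards, `result.failures` lists the tasks that need to be retried or inspected.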
OpenML hosts a large number of experiment results that are freely available to everyone. To download and analyse these results, the OpenML Apiconnector can be used; please follow the demonstration on its GitHub page.
If you found this package useful, please cite: J. N. van Rijn, Massively Collaborative Machine Learning, Leiden University, 2016. If you used OpenML in a scientific publication, please check out the OpenML citation policy.