Update dataset template to be compatible with version 1.0 (#8)
Co-authored-by: Jonathan de Bruin <[email protected]>
jteijema and J535D165 authored Aug 23, 2022
1 parent f6ab5a8 commit 88afcd6
Showing 6 changed files with 75 additions and 47 deletions.
1 change: 1 addition & 0 deletions MANIFEST.in
@@ -0,0 +1 @@
recursive-include asreviewcontrib/*/data *.*
43 changes: 27 additions & 16 deletions README.md
@@ -1,16 +1,19 @@
# Template for extending ASReview with a new dataset

ASReview has support for extensions, which enable you to seemlessly integrate
![Badge](https://img.shields.io/badge/ASReview-v1.0-%23ffcb05)

ASReview has support for extensions, which enable you to seamlessly integrate
your own programs with the ASReview framework. This template can extend ASReview
with new data.

See the section [Extensions](https://asreview.readthedocs.io/en/latest/extensions_dev.html)
See the section
[Extensions](https://asreview.readthedocs.io/en/latest/extensions_dev.html#dataset-extensions)
on ReadTheDocs for more information on writing extensions.

## Getting started

Click the `Use this template` button and add/modify the algorithms. Install
your new dataset with
Click the `Use this template` button and add/modify the algorithms. Install your
new dataset with

```bash
pip install .
@@ -22,24 +25,32 @@ or
pip install git+https://github.com/{USER_NAME}/{REPO_NAME}.git
```

and replace `{USER_NAME}` and `{REPO_NAME}` by your own details.

and replace `{USER_NAME}` and `{REPO_NAME}` by your own details.

## Usage

The new dataset is defined in
[`asreviewcontrib/dataset_name/your_dataset.py`](asreviewcontrib\dataset_name\your_dataset.py)
and can be used as a new dataset.
Adding a dataset to ASReview is done by extending the
[`BaseDataSet`](https://asreview.readthedocs.io/en/latest/reference.html#BaseDataSet)
class, adding it to a `BaseDataGroup` and finally, adding it to ASReview. To use
this template, fork it and modify the following files:

- A `BaseDataSet` object and a `BaseDataGroup` are defined in
[`asreviewcontrib/dataset_name/your_dataset.py`](asreviewcontrib/dataset_name/your_dataset.py).
Modify this file to add your own datasets. The `BaseDataSet` class should
always be added to a `BaseDataGroup` object.

- Adding your `BaseDataGroup` object to ASReview is done via the
[`asreviewcontrib/dataset_name/__init__.py`](asreviewcontrib/dataset_name/__init__.py)
file. This file should import your `BaseDataGroup`.

By supplying the `from_config()` method with a
config object, a new DataSet object is created, and integrated to ASReview. See
[asreview.datasets.BaseDataSet.from_config](https://asreview.readthedocs.io/en/latest/API/generated/asreview.datasets.BaseDataSet.html#asreview.datasets.BaseDataSet.from_config)
for more information on this function.
- Adjust [`setup.py`](setup.py) with information about your dataset, and define
the dataset entrypoint by adding your `BaseDataGroup`.

[`setup.py`](setup.py) contains the code needed for integration into ASReview.
- Add your dataset to the `data` folder of the template.

[`asreviewcontrib/dataset_name/__init__.py`](asreviewcontrib/dataset_name/__init__.py)
contains directions for loading the dataset module.
For advanced usage, check out the
[`BaseDataGroup`](https://asreview.readthedocs.io/en/latest/reference.html#asreview.datasets.BaseDataGroup)
in the example and the documentation.

## License

2 changes: 1 addition & 1 deletion asreviewcontrib/dataset_name/__init__.py
@@ -1 +1 @@
from asreviewcontrib.dataset_name.your_dataset import YourDataGroup
from asreviewcontrib.dataset_name.your_dataset import ExampleDatasetGroup
File renamed without changes.
63 changes: 40 additions & 23 deletions asreviewcontrib/dataset_name/your_dataset.py
@@ -1,29 +1,46 @@
"""This module shows example dataset classes for creating your own dataset."""

from pathlib import Path

from asreview.datasets import BaseDataSet
from asreview.datasets import BaseDataGroup

class YourDataGroup(BaseDataGroup):
    group_id = "your_data_group"
    description = "A new data group with my awesome datasets."

class ExampleDatasetGroup(BaseDataGroup):
    """This is an example dataset group."""

    group_id = "example_group"
    description = "Example dataset group"

    def __init__(self):
        """Initialize the dataset group."""

        example_dataset_local = BaseDataSet(
            dataset_id="example_dataset_local",
            filepath=str(Path(Path(__file__).parent, 'data', 'your_dataset.csv')), # noqa
            title="Example dataset (local)",
            description="This is an example dataset that is stored locally.",
            authors='Teijema, J.J. (2022)',
            topic='example datasets',
            link='ASReview.ai',
            reference=None,
            img_url=None,
            license='MIT',
            year='2022'
        )

        example_dataset_remote = BaseDataSet(
            dataset_id="example_dataset_remote",
            filepath='https://raw.githubusercontent.com/asreview/systematic-review-datasets/master/datasets/van_de_Schoot_2017/output/van_de_Schoot_2017.csv', # noqa
            title="Example dataset (remote)",
            description="This is an example dataset that is stored remotely.",
            authors='Teijema, J.J. (2022)',
            topic=None,
            link=None,
            reference=None,
            img_url=None,
            license=None,
            year=None
        )

        dataset = BaseDataSet.from_config({
            "dataset_id": "your_data_id",
            "url": "",
            "reference": "",
            "link": "",
            "license": "",
            "title": "Your Data",
            "authors": [
                "Jane Doe",
                "John Doe"
            ],
            "year": 2021,
            "topic": "Your topic",
            "final_inclusions": True,
            "title_abstract_inclusions": False
        }
        )

        super(YourDataGroup, self).__init__(dataset)
        # pass multiple datasets to init if there are more datasets
        super().__init__(example_dataset_local, example_dataset_remote)
13 changes: 6 additions & 7 deletions setup.py
@@ -3,7 +3,7 @@

setup(
    name='asreview-template-dataset-extension',
    version='0.1',
    version='1.0',
    description='Example dataset extension',
    url='https://github.com/asreview/template-extension-new-dataset',
    author='ASReview team',
@@ -17,20 +17,19 @@
    ],
    keywords='systematic review',
    packages=find_namespace_packages(include=['asreviewcontrib.*']),
    include_package_data=True,
    python_requires='~=3.6',
    install_requires=[
        'asreview>=0.16',
        'asreview>=1.0',
    ],

    entry_points={
        "asreview.datasets": [
            "newDataset = asreviewcontrib.dataset_name.your_dataset:YourDataGroup"
            "example_group = asreviewcontrib.dataset_name:ExampleDatasetGroup", # noqa
        ]

    },

    project_urls={
        'Bug Reports': 'https://github.com/asreview/template-extension-new-dataset/issues',
        'Source': 'https://github.com/asreview/template-extension-new-dataset/',
        'Bug Reports': 'https://github.com/asreview/template-extension-new-dataset/issues', # noqa
        'Source': 'https://github.com/asreview/template-extension-new-dataset/', # noqa
    },
)
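
Note (not part of this commit): one way to check that the `asreview.datasets` entry point declared in `setup.py` is visible after `pip install .` is to list that entry-point group with the standard library. The sketch below assumes Python 3.8 or later (where `importlib.metadata` exists, although `setup.py` still allows 3.6). The helper name `list_entry_points` is hypothetical. It only reads installed package metadata and does not import ASReview or load the dataset group.

```python
from importlib import metadata


def list_entry_points(group="asreview.datasets"):
    """Return sorted 'name = value' strings for every entry point in `group`.

    Hypothetical helper for illustration; it is not part of ASReview.
    """
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: the result supports select(group=...)
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: the result is a dict mapping group name -> entry points
        selected = eps.get(group, [])
    return sorted(f"{ep.name} = {ep.value}" for ep in selected)


if __name__ == "__main__":
    # Print one line per registered dataset entry point (nothing if none are installed)
    for line in list_entry_points():
        print(line)
```

If the template is installed unchanged, the output would be expected to include a line starting with `example_group`; an empty result suggests the package was not installed or the entry point name was changed.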