Feature/149-Add-benchmark-datasets #157

Merged 12 commits on Oct 1, 2024

Conversation

Collaborator

@NoB0 NoB0 commented May 14, 2024

What's changed?

  • Reorganize the data folder: dialogues in the DialogueKit (DK) format are now saved under data/datasets/<dataset_name>
  • Add a script to download the ReDial dataset from source and process it to extract items, ratings, and dialogues formatted for DialogueKit
  • Add a script to artificially augment ReDial dialogues with dialogue acts and information needs. This is needed for training TUS
  • Item collections (incl. items and ratings) are now stored in data/item_collections
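
For reference, the folder layout described above can be sketched as a pair of path helpers. This is only an illustration, not code from this PR: the helper names `dataset_dir` and `item_collection_dir` are hypothetical, and the per-collection subfolder is an assumption (the description only states that item collections live under data/item_collections).

```python
from pathlib import Path

# Root folders as described in the PR text.
DATA_ROOT = Path("data")
DATASETS_DIR = DATA_ROOT / "datasets"
ITEM_COLLECTIONS_DIR = DATA_ROOT / "item_collections"


def dataset_dir(dataset_name: str) -> Path:
    """Folder for DialogueKit-formatted dialogues of one dataset (hypothetical helper)."""
    return DATASETS_DIR / dataset_name


def item_collection_dir(collection_name: str) -> Path:
    """Folder for one item collection (hypothetical helper).

    Assumption: one subfolder per collection; the PR only names the parent folder.
    """
    return ITEM_COLLECTIONS_DIR / collection_name


if __name__ == "__main__":
    print(dataset_dir("redial").as_posix())
    print(item_collection_dir("redial").as_posix())
```

Keeping dialogues and item collections in separate top-level folders lets several datasets share one item collection without duplicating it.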

Part of #149

@NoB0 NoB0 marked this pull request as ready for review October 1, 2024 09:45
@NoB0 NoB0 requested a review from kbalog October 1, 2024 10:22
Contributor

@kbalog kbalog left a comment

LGTM with some comments.

data/datasets/README.md (outdated, resolved)
scripts/datasets/redial/format_redial.py (outdated, resolved)
scripts/datasets/redial/format_redial.py (outdated, resolved)
@NoB0 NoB0 merged commit f5bfd46 into main Oct 1, 2024
5 checks passed
@NoB0 NoB0 deleted the feature/149-Add-benchmark-datasets branch October 1, 2024 14:14
@NoB0 NoB0 mentioned this pull request Oct 1, 2024
2 participants