data source csvs #17

Open
davecampbell opened this issue Nov 28, 2020 · 2 comments

Comments

davecampbell commented Nov 28, 2020

do the csvs happen to be available anywhere, e.g. as a public dataset somewhere?
https://github.com/mhjabreel/CharCNN/tree/master/data/ag_news_csv

they are mentioned here:
https://github.com/johnb30/py_crepe

i could reassemble them given the text files in /data, but would hope to not introduce some oddity by mishandling double-quotes or something like that.
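
On the double-quote concern, here is a minimal sketch of a round-trip check, assuming a three-column layout (class index, title, description) with every field quoted. The function names and the sample rows are hypothetical and not taken from this repo. Python's `csv` module doubles embedded quotes on write and undoes that on read, so comparing the rows before and after shows whether anything was mangled.

```python
import csv
import io

def write_rows(rows, handle):
    """Write (class_index, title, description) rows with every field quoted."""
    writer = csv.writer(handle, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for class_index, title, description in rows:
        writer.writerow([class_index, title, description])

def read_rows(handle):
    """Read the rows back as (class_index, title, description) tuples."""
    return [(int(c), t, d) for c, t, d in csv.reader(handle)]

# Hypothetical sample rows, including embedded quotes and commas.
sample = [
    (3, 'Firm says "no comment"', "Shares fell, then recovered."),
    (1, "Plain title", 'He said "hello", twice.'),
]

buffer = io.StringIO()
write_rows(sample, buffer)
buffer.seek(0)
round_trip = read_rows(buffer)  # should equal `sample` if quoting survived
```

When writing to a real file instead of an in-memory buffer, open it with `newline=""` so the `csv` module controls line endings itself.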

@mhjabreel
Owner

Hi,
You can find the datasets in HuggingFace datasets. Please check this url.

https://huggingface.co/docs/datasets/loading_datasets.html
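
A minimal sketch of pulling AG News through that library and writing it back out as CSV. It assumes the `datasets` package is installed, that the dataset is published under the id `ag_news`, and that each record has a `text` field and a 0-based `label` field. The helper names, the output filenames, and the shift to 1-based class indexes are assumptions for illustration, not something confirmed in this thread.

```python
import csv

def records_to_rows(records):
    """Turn {'text', 'label'} records into (class_index, text) rows.

    Assumption: labels are 0-based and the target CSV uses 1-based classes.
    """
    return [(record["label"] + 1, record["text"]) for record in records]

def write_csv(rows, path):
    """Write rows to `path` with every field quoted."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle, quoting=csv.QUOTE_ALL).writerows(rows)

def export_ag_news(out_prefix="ag_news"):
    """Download AG News and write one CSV per split (needs network access)."""
    from datasets import load_dataset  # third-party: pip install datasets

    dataset = load_dataset("ag_news")  # assumed dataset id
    for split_name, split in dataset.items():
        # Hypothetical output names, e.g. ag_news_train.csv
        write_csv(records_to_rows(split), f"{out_prefix}_{split_name}.csv")
```

Calling `export_ag_news()` would write one file per split into the current directory. Each row has two columns (class index, text), so any loader that expects a separate title column would need adjusting.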

@davecampbell
Author

thank you SO much - what a great resource!
looks like the py_crepe project reads the older data files, from before the title and description were combined into one field, so i can adjust that helper code to use the more recent format.
i'm trying to see if i can get that to run for me - and then progress to this repo to fully appreciate what you have here.

i hope to use it on a non-nlp project related to gene sequences, but i am just an ML beginner so i have to take everything step-by-step.
