make_datafiles.py issue #34

Open
zixiliuUSC opened this issue Feb 24, 2020 · 0 comments


@zixiliuUSC

I ran make_datafiles.py to generate the raw text files for BART preprocessing, but I hit the following error:

python make_datafiles.py ./cnn/stories ./dailymail/stories/
Making bin file for URLs listed in url_lists/all_test.txt...
Traceback (most recent call last):
  File "make_datafiles.py", line 138, in <module>
    write_to_bin(all_test_urls, os.path.join(finished_files_dir, "test"))
  File "make_datafiles.py", line 84, in write_to_bin
    url_list = read_text_file(url_file)
  File "make_datafiles.py", line 26, in read_text_file
    with open(text_file, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'url_lists/all_test.txt'

At first I assumed this was because all_test_urls does not point to the URL file that ships with the dataset, i.e., wayback_test_urls.txt. So I renamed that file to all_test.txt and put it in the folder ./cnn/url_lists. The script still gave the same error. I then checked the source again and found that the problem is in the following line, which opens the path relative to the current working directory:
url_list = read_text_file(url_file)
I changed it to:
url_list = read_text_file(os.path.join('./cnn', url_file))
With this change, I think all the source and target files are generated from the CNN dataset only. Am I right?
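
For reference, a slightly more defensive variant of the change above could look like the sketch below. It is only an illustration under assumptions: `resolve_url_list` and its `search_roots` default are hypothetical names that do not exist in make_datafiles.py, and the `./dailymail` root is a guess at where a second copy of the URL lists might live. The helper tries each root in order and reports every path it tried, so a missing file is easier to diagnose than the bare FileNotFoundError in the traceback.

```python
import os


def resolve_url_list(url_file, search_roots=(".", "./cnn", "./dailymail")):
    """Return the first existing path for url_file under the given roots.

    Hypothetical helper, not part of make_datafiles.py. The default roots
    are assumptions about the local directory layout.
    """
    tried = []
    for root in search_roots:
        candidate = os.path.join(root, url_file)
        if os.path.isfile(candidate):
            return candidate
        tried.append(candidate)
    # Nothing matched: list every path that was checked.
    raise FileNotFoundError(
        "could not find %r; tried: %s" % (url_file, ", ".join(tried))
    )
```

With such a helper, the edited line would become `url_list = read_text_file(resolve_url_list(url_file))`. Note that it returns only the first match, so it does not by itself combine URL lists from more than one dataset.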
