What is the recommended way to access the simpleQA dataset? #25

Open
ZhangYiqun018 opened this issue Oct 31, 2024 · 1 comment

Comments

ZhangYiqun018 commented Oct 31, 2024

Hi,

I'm encountering an "Access Failure" error (ResourceNotFound) when trying to read the simple_qa_test_set.csv file from the Azure path az://openaipublic/simple-evals/simple_qa_test_set.csv using pandas and blobfile.

Is the simple-evals container publicly accessible? If not, how can I obtain the necessary credentials? Also, what is the recommended way to access the simpleQA dataset?

Thanks.

@ZhangYiqun018
Author

I found a solution to access the simpleQA dataset. Instead of using blobfile with the az:// path, you can directly use pandas to read the CSV file from the HTTPS URL:

import pandas
df = pandas.read_csv(
    "https://openaipublic.blob.core.windows.net/simple-evals/simple_qa_test_set.csv"
)

The key is to use the HTTPS URL format (https://openaipublic.blob.core.windows.net/...) rather than the Azure blob storage path (az://...). This approach works without requiring any authentication credentials.
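
For anyone with other az:// paths, here is a minimal sketch of the same rewrite done in code. It assumes the path has the form az://<account>/<container>/<blob> and that the container allows anonymous reads, as simple-evals appears to. The helper name az_to_https is hypothetical, not part of blobfile or pandas.

```python
from urllib.parse import urlsplit


def az_to_https(az_path: str) -> str:
    """Rewrite az://<account>/<container>/<blob> as a public HTTPS blob URL.

    Hypothetical helper: it only rewrites the string and does not check
    that the blob exists or that the container is publicly readable.
    """
    parts = urlsplit(az_path)
    if parts.scheme != "az" or not parts.netloc:
        raise ValueError(f"not an az:// path: {az_path!r}")
    # The storage account name becomes the host prefix; the
    # container and blob name carry over unchanged as the URL path.
    return f"https://{parts.netloc}.blob.core.windows.net{parts.path}"


url = az_to_https("az://openaipublic/simple-evals/simple_qa_test_set.csv")
print(url)
```

The resulting URL is the one passed to pandas.read_csv above, so no credentials are needed as long as the container stays public.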

Hope this helps anyone facing a similar issue!
