Any extra dataset prep needed? #72

Answered by akolesnikoff
sayakpaul asked this question in Q&A

You are on the right track, but one step is still missing.

After manually downloading the dataset, you need to run tfds once to reformat the data. We provide a script for this: big_vision/tools/download_tfds_datasets.py.

As indicated in the README, you can launch the data formatting on a TPU machine by running:

gcloud alpha compute tpus tpu-vm ssh $NAME --zone=$ZONE --worker=0 --command "TFDS_DATA_DIR=gs://imagenet-1k/tensorflow_datasets bash big_vision/run_tpu.sh big_vision.tools.download_tfds_datasets imagenet2012"

Alternatively, you can do it on your local machine by running the tool directly, provided the machine has access to the cloud bucket.
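
For the local route, a minimal sketch could look like the following. It assumes you are in the root of a big_vision checkout with its dependencies installed, that the machine is already authenticated for the bucket, and that the tool can be started as a Python module (which is what the run_tpu.sh command above appears to do). The `DRY_RUN` switch and the `prepare_cmd` helper are hypothetical names added for illustration only; by default the script only prints the command so you can check it first.

```shell
#!/bin/sh
set -eu

# Bucket path copied from the TPU command above; replace it with your own.
export TFDS_DATA_DIR="${TFDS_DATA_DIR:-gs://imagenet-1k/tensorflow_datasets}"
DATASET="${DATASET:-imagenet2012}"

# Assumption: the tool is runnable as a module from the repository root.
prepare_cmd() {
  echo "python3 -m big_vision.tools.download_tfds_datasets ${DATASET}"
}

# DRY_RUN=1 (the default here) only prints the command;
# set DRY_RUN=0 to actually start the data preparation.
if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "TFDS_DATA_DIR=${TFDS_DATA_DIR} $(prepare_cmd)"
else
  $(prepare_cmd)
fi
```

Once the printed command looks right, rerun with `DRY_RUN=0`. Preparing imagenet2012 can take a while and needs enough local disk and network bandwidth, since the data is read from and written back to the bucket.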

Let us know whether it…

Replies: 10 comments

Answer selected by lucasb-eyer
Converted from issue

This discussion was converted from issue #2 on November 07, 2023 13:13.