Any extra dataset prep needed? #72
-
I have followed the instructions from the README. I have set up a TPU v3-8 machine, which can be confirmed below. I have hosted the ImageNet-1k (

While launching training, I am using the following command:

```shell
gcloud alpha compute tpus tpu-vm ssh $NAME --zone=$ZONE --worker=all --command "TFDS_DATA_DIR=gs://imagenet-1k/tensorflow_datasets bash big_vision/run_tpu.sh big_vision.train --config big_vision/configs/vit_s16_i1k.py --workdir gs://$GS_BUCKET_NAME/big_vision/workdir/`date '+%m-%d_%H%M'`"
```

It results in the following:

Is there anything I'm missing here?
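One detail worth noting in the command above is the `TFDS_DATA_DIR=...` prefix: TFDS honors this environment variable when resolving where prepared datasets live, falling back to a local default otherwise. The helper below is a simplified, stdlib-only sketch of that lookup (the function name is mine, not part of TFDS or big_vision):

```python
import os

def resolve_tfds_data_dir(default="~/tensorflow_datasets"):
    """Simplified sketch of how TFDS picks its data directory:
    the TFDS_DATA_DIR environment variable wins over the default."""
    return os.environ.get("TFDS_DATA_DIR", os.path.expanduser(default))

# With the variable set (as in the gcloud command), prepared datasets
# are looked up in the cloud bucket rather than on the local disk.
os.environ["TFDS_DATA_DIR"] = "gs://imagenet-1k/tensorflow_datasets"
print(resolve_tfds_data_dir())  # gs://imagenet-1k/tensorflow_datasets
```

This is why the variable is passed on every worker: without it, `tfds` would look for the prepared data under `~/tensorflow_datasets` on each TPU host and fail.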
Replies: 10 comments
-
You are on the right track, but there is still one step missing. After manually downloading the dataset, you need to run TFDS once to reformat the data. We provide a script for doing this: `big_vision/tools/download_tfds_datasets.py`. As indicated in the README, to launch data formatting on a TPU machine you could run

Alternatively, you can do it on your local machine by directly running the util, assuming the local machine has access to the cloud bucket. Let us know whether it works for you. Leaving the issue open for now.
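To make the "reformat" step concrete: preparing a dataset with TFDS converts the raw downloaded archives into record files stored under `<data_dir>/<dataset_name>/<version>/`, and training can only start once that directory exists. A minimal sketch of that path layout, assuming an illustrative dataset name and version (neither is taken from the thread):

```python
def prepared_dataset_dir(data_dir, name, version):
    """Build the directory where TFDS stores a prepared dataset:
    <data_dir>/<dataset_name>/<version>/ holds the record files."""
    return "/".join([data_dir.rstrip("/"), name, version])

# "imagenet2012" and "5.1.0" are illustrative values, not confirmed
# by the thread; check your local TFDS catalog for the real version.
path = prepared_dataset_dir(
    "gs://imagenet-1k/tensorflow_datasets", "imagenet2012", "5.1.0")
print(path)  # gs://imagenet-1k/tensorflow_datasets/imagenet2012/5.1.0
```

If that directory is absent from the bucket, the formatting step has not run (or wrote somewhere else), which matches the failure described above.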
-
Thank you! Giving it a try right now.
-
leads to:
-
Will running the following help?
-
I think the error in #2 (comment) is expected, since
-
Yeah, sorry, you likely need to manually override that variable as you suggested. Let me know if you eventually succeed. In any case, once I have time, I will update the README with well-tested instructions for getting the ImageNet data to work.
-
Sure! I am currently running this:
-
Update. This is the current error (I faced one regarding

Currently doing (after installing

```python
import tensorflow_datasets as tfds

data_dir = "gs://imagenet-1k/tensorflow_datasets"
ds = tfds.load("imagenet_v2", data_dir=data_dir, download=True)
```

It seems to be taking longer than expected, but I will keep updating anyway. I am maintaining a log here: https://gist.github.com/sayakpaul/9544d3ba935805bd47d71fd8596e7bc0 (not yet complete).
-
Looks like I was able to get things up and running. I have also updated the gist I mentioned in #2 (comment). Keeping it open until the training completes.
-
I was able to reproduce everything (76.23% on the ImageNet-1k validation set) within 90 epochs of pre-training on a TPU v3-8 (which took 7 hours 22 minutes in total). The following repository contains everything, including the updated instructions, training logs, and checkpoints:
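For anyone budgeting a run, the numbers above imply a rough per-epoch cost; a quick back-of-the-envelope check:

```python
# Rough per-epoch time implied by the figures reported in the thread:
# 90 epochs completed in 7 h 22 min on a TPU v3-8.
total_minutes = 7 * 60 + 22
epochs = 90
per_epoch = round(total_minutes / epochs, 2)
print(per_epoch, "minutes per epoch")  # 4.91 minutes per epoch
```

That is just under five minutes per epoch, which is a useful sanity check when watching the early logs of your own run.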