inflate() failed with error -4: incorrect header check #49
Comments
Integrity of the downloaded data is probably a good first thing to check. Can you confirm how you downloaded the data? I can provide a checksum for the tarball, though given its small size it may be simplest to just download it again.
Also, what's your software stack? That code is from September 2020, so there may be issues if you're trying to run it with newer versions of TF, etc.
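For anyone following along, here is a minimal sketch of that kind of integrity check, not something taken from the repository. It assumes Python 3 with only the standard library. The helper names `sha256sum` and `looks_gzipped` are made up for illustration, and the choice of SHA-256 is an assumption, since the thread does not say which checksum type would be provided. The second helper only tests for the two gzip magic bytes; whether the data files are expected to be gzip-compressed depends on how the training code is configured to read them.

```python
import hashlib

# First two bytes of every gzip stream (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def sha256sum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def looks_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC
```

Calling `sha256sum` on the downloaded tarball and comparing the result with a published digest would show whether the download is intact. Calling `looks_gzipped` on one of the extracted data files gives a rough hint as to whether the file's compression matches what the reader expects.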
I used wget to retrieve the data from portal.nersc.gov. Should I be retrieving the data another way?
The TensorFlow version I am using is 2.9.1. Is there a specific version of TensorFlow that you would recommend? I am currently trying to get this to run on my local machine before I run single-node or multi-node on our HPC platform. On the HPC platform it will run in a Docker image (Ubuntu base with the required packages installed).
I also tried downloading the data again but ran into the same issue. Here is the image I am using on my local system:
Any thoughts? I'm still running into the same problem, @sparticlesteve.
Background:
I am trying to test whether GPU-intensive ML programs can run faster or cheaper than on an A100 by spreading the job over multiple CPU nodes of a distributed HPC platform. I am running the CosmoFlow TensorFlow Keras benchmark implementation on the v0.7 small dataset.
Command used to run the job locally:
python3 train.py --data-dir cosmoUniverse_2019_05_4parE_tf_small --n-train 32 --n-valid 32 --batch-size 2
Error Log: