Md5 mismatch #214

Open
BrendaLee1 opened this issue Nov 2, 2023 · 1 comment
Labels
bug Something isn't working

Comments


BrendaLee1 commented Nov 2, 2023

Hi,
I tried to download dataset EGAD00001009109, which is about 1 TB. The download speed is very low (~2M/s), and the following error always occurs:
Traceback (most recent call last):
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/urllib3/response.py", line 710, in _error_catcher
    yield
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/urllib3/response.py", line 835, in _raw_read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
urllib3.exceptions.IncompleteRead: IncompleteRead(5921036 bytes read, 98936564 more expected)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/requests/models.py", line 816, in generate
    yield from self.raw.stream(chunk_size, decode_content=True)
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/urllib3/response.py", line 940, in stream
    data = self.read(amt=amt, decode_content=decode_content)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/urllib3/response.py", line 911, in read
    data = self._raw_read(amt)
           ^^^^^^^^^^^^^^^^^^^
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/urllib3/response.py", line 813, in _raw_read
    with self._error_catcher():
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/urllib3/response.py", line 727, in _error_catcher
    raise ProtocolError(f"Connection broken: {e!r}", e) from e
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(5921036 bytes read, 98936564 more expected)', IncompleteRead(5921036 bytes read, 98936564 more expected))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/pyega3/libs/data_file.py", line 323, in download_file_retry
    self.download_file(output_file, num_connections, max_slice_size)
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/pyega3/libs/data_file.py", line 159, in download_file
    for part_file_name in executor.map(self.download_file_slice_, params):
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/concurrent/futures/_base.py", line 619, in result_iterator
    yield _result_or_cancel(fs.pop())
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/concurrent/futures/_base.py", line 317, in _result_or_cancel
    return fut.result(timeout)
           ^^^^^^^^^^^^^^^^^^^
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/pyega3/libs/data_file.py", line 189, in download_file_slice_
    return self.download_file_slice(*args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/pyega3/libs/data_file.py", line 224, in download_file_slice
    for chunk in r.iter_content(DOWNLOAD_FILE_MEMORY_BUFFER_SIZE):
  File "/rd1/laixh/soft/anaconda2/envs/pyega3/lib/python3.11/site-packages/requests/models.py", line 818, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(5921036 bytes read, 98936564 more expected)', IncompleteRead(5921036 bytes read, 98936564 more expected))
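
For context, the numbers in the error add up to one 100 MiB slice (5921036 + 98936564 = 104857600 bytes), so the connection dropped after about 5.9 MB of that slice. A common general approach to this kind of failure is to resume from the last byte received instead of restarting the slice. The sketch below only illustrates that idea and is not how pyega3 is implemented: `download_with_resume` and `fetch_range` are hypothetical names, and `fetch_range(start, end)` is assumed to yield chunks for an inclusive byte range (for example via an HTTP `Range` header) and to raise an `OSError` subclass when the stream breaks, which is the case for `requests.exceptions.ChunkedEncodingError`.

```python
import time
from typing import Callable, Iterable


def download_with_resume(
    fetch_range: Callable[[int, int], Iterable[bytes]],
    total_size: int,
    out_path: str,
    max_retries: int = 5,
    backoff_seconds: float = 1.0,
) -> int:
    """Write bytes [0, total_size) to out_path, resuming after broken reads.

    fetch_range is a hypothetical callable supplied by the caller; it is
    not part of pyega3 or requests.
    """
    written = 0   # bytes safely written so far
    failures = 0  # incomplete passes seen so far
    with open(out_path, "wb") as out:
        while written < total_size:
            try:
                # Ask only for the bytes that are still missing.
                for chunk in fetch_range(written, total_size - 1):
                    out.write(chunk)
                    written += len(chunk)
            except OSError:
                # Broken stream: keep what was written and retry below.
                pass
            if written < total_size:
                failures += 1
                if failures > max_retries:
                    raise OSError(
                        f"gave up after {max_retries} retries at byte {written}"
                    )
                # Linear backoff before the next attempt.
                time.sleep(backoff_seconds * failures)
    return written
```

Whether the server honours range requests for a given file is an assumption here; the result should still be checked against the published checksum afterwards.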

The download eventually finishes after several rounds of retries, but the MD5 checksum of the downloaded file does not match. I also tried older versions of pyega3 (4.0.2 and 5.0.1), but they did not work either.
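
To rule out a problem in the client's own check, the checksum can be recomputed locally and compared with the expected value. This is a minimal sketch using only Python's standard library; the function names are illustrative, and the expected MD5 string is assumed to come from whatever the archive publishes for the file.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex MD5 of a file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected_md5(path: str, expected_hex: str) -> bool:
    """Compare the local MD5 with an expected hex string, ignoring case and whitespace."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

If the locally computed value differs from the expected one on every attempt, the file on disk is most likely incomplete or corrupted rather than the comparison being at fault.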

I wonder whether there are alternative ways to download data from EGA, or any suggestions for fixing these problems.
Any help will be appreciated.

BrendaLee1 added the bug label on Nov 2, 2023
@Jungal10

Did you find a solution for this? I am stuck on the same problem.
