disk full error #57

Open
dandaii opened this issue Jun 7, 2023 · 4 comments

dandaii commented Jun 7, 2023

Hi there,

I have a large social media dataset (~44 GB after preprocessing into a SQLite db). When I run the package in my terminal, I consistently hit the error "sqlite3.OperationalError: database or disk is full". I assume this is because large temporary files are generated in the background, which use up all of my available RAM. Any thoughts on how to solve this? I'm using a VM with 128 GB of RAM and what I believe is sufficient free disk space.
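
A quick way to sanity-check that is to look at how much free space there is on the volume holding the database and in SQLite's temporary directory, since that is where the intermediate files usually land. A minimal check on the VM (assuming the Linux defaults, run from the directory that contains the db):

df -h .                    # free space on the filesystem holding the SQLite database
df -h "${TMPDIR:-/tmp}"    # where SQLite usually spills temporary tables/indices on Linux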

Here's the complete error message I had:
"
compute_networks weibocov2_20230603_file.db compute co_retweet --time_window 60
Calculating a co_retweet network on weibocov2_20230603_file.db with the following settings:
time_window: 60 seconds
min_edge_weight: 2 co-occurring messages
n_cpus: 32 processors
output_file: None
Ensure the indexes exist to drive the join.
Calculating the co-retweet network
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/coordination_network_toolkit/compute_networks.py", line 748, in _run_query
    db.execute(
sqlite3.OperationalError: database or disk is full
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ubuntu/.local/bin/compute_networks", line 8, in <module>
    sys.exit(main())
  File "/home/ubuntu/.local/lib/python3.8/site-packages/coordination_network_toolkit/main.py", line 281, in main
    compute_co_retweet_parallel(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/coordination_network_toolkit/compute_networks.py", line 703, in compute_co_retweet_parallel
    return parallise_query_by_user_id(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/coordination_network_toolkit/compute_networks.py", line 147, in parallise_query_by_user_id
    d.result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
sqlite3.OperationalError: database or disk is full
"

Here's the RAM usage on the VM at the time of the above error:
"
free -h
              total        used        free      shared  buff/cache   available
Mem:          125Gi       1.7Gi        11Gi       0.0Ki       112Gi       122Gi
Swap:          92Mi       5.0Mi        87Mi
"

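If the temporary files do turn out to be the problem (for example, a small /tmp partition filling up rather than RAM), one workaround might be to point SQLite's temporary directory at the large data volume before running the tool. SQLITE_TMPDIR is honoured by SQLite on Linux; the directory below is only a placeholder:

export SQLITE_TMPDIR=/data/sqlite_tmp    # placeholder: any directory on the large volume
mkdir -p "$SQLITE_TMPDIR"
compute_networks weibocov2_20230603_file.db compute co_retweet --time_window 60
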
Thanks.
Dan (HDR from DMRC)

deimosnz commented Jun 8, 2023

Hi Dan,

This sounds a bit like a VM issue to me. Did you go to QUT hacky hour the other day?

Rob

dandaii commented Jun 8, 2023 via email

deimosnz commented Jun 8, 2023 via email

dandaii commented Jun 8, 2023 via email
