Performance question #36
This may be related to Docker on OSX. I need to do further investigation (like running it natively on OSX and/or Linux) to get a comparison. I still wonder how much data you can plug into EliasDB. Is it realistic to have millions of nodes and edges?
Hey there, you raise a very valid point. I did some soak testing yesterday and have the following timings (this ran on a Linux VM on a server of mine): Time / API response time (measured in a browser). So response time definitely goes up as time passes. Now let me add some of my thoughts:
At the moment the underlying datastore is more or less a giant hash map. It uses a so-called HTree (https://en.wikipedia.org/wiki/HTree), see here: https://github.com/krotik/eliasdb/blob/master/hash/htree.go
I also tested some more. The major speed degradation I saw was caused by persistent volumes I had "somehow" created for the docker-compose setup. After changing some code in the collector (URL), I wondered why I still saw old URLs in the results. I removed all the containers and even rebuilt all the images, but the data was still there. Then I found the unnamed volumes; after deleting them, everything was fine and much faster. This is expected, because the Docker implementation on OSX "sucks" when using volumes: it makes everything file-system related up to 60 times slower. There are some workarounds; one of them is running Docker inside a virtual machine on the same Mac, which is fast. Yeah... computers.

Getting "the last" entries of something (indexed) seems like a pretty common requirement, probably also for a graph database. Sadly, nothing we do will have less than thousands, if not millions, of nodes, and there will be many searches for date ranges or last entries. I don't know who is using EliasDB in anything real and how that works out.

I plan to use it for a small project which connects multiple SPAs to a PHP server, where the SPAs and the server use GraphQL over WebSockets to implement a notification system between groups of running SPAs. For example, with SPAs A, B, C and D, where A/B and C/D form groups: A sends new data to the server, the server tells B to update, and the same for the C/D pair. (Currently, A, B, C and D connect to a Go-based WebSocket server, and the Go server polls the DB for changes.) But I would love to build bigger systems with much more complex business logic, and EliasDB seems to have the right feature set for this. I guess this would really need an extension for sequential traversal of the DB, though. Not sure if one could keep something like a "last20" edge between the last elements and update it accordingly. Just something that came to mind a second ago.
I let the data-miner demo run for some time (roughly 2 hours) and get ever-increasing response times (4-5 seconds now) when reloading the frontend. As this is just a simple query with a last:50 clause, I wonder how the database will perform when used for something "real"?