Any chance to make the UR have no requirement on a single executor machine's memory? #37
Spark needs to have all data in memory, spread across the cluster, and some data structures have to be copied to each machine, so it is a memory hog. The good news is that if you use AWS you can start large machines, train, and shut them down afterward. Spark is only needed while the UR is training, which means a permanent Spark cluster sits idle most of the time unless you share it, and sharing is not recommended if you want guaranteed model update times.
So yes, memory is required; our solution is big but temporary machines in AWS.
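For concreteness, here is a minimal sketch of how that looks in practice. The UR trains through PredictionIO, and `pio train` forwards everything after `--` to `spark-submit`, so driver and executor memory can be raised for a one-off run on temporary large machines. The memory sizes and master URL below are illustrative assumptions, not recommendations from this thread:

```bash
# Hedged sketch: point training at a temporary high-memory Spark cluster.
# Everything after `--` is passed through to spark-submit.
pio train -- \
  --master spark://your-spark-master:7077 \
  --driver-memory 32g \
  --executor-memory 32g
```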
BTW, please join the Google group for questions so others can benefit: https://groups.google.com/forum/#!forum/actionml-user
OK, I see. But is there a way to make the training quicker? I'll also post this to the group.
I'm researching this model and it is really awesome for small companies like us.
I've trained a model easily with 10 million trading orders. However, when I increase the number to 100 million, the model cannot be trained.
Actually we have a cluster with 1TB of memory in total, but this model's requirement is on a single machine's memory. My cluster has 20 nodes and each has 64GB of memory, which is obviously not enough for 100 million orders. I'm wondering if there is any chance for this model to place no requirement on a single machine's memory. I think 1TB is quite enough; the bottleneck is the memory of a single machine.
The driver is OK: I can find a temporary machine with 128GB or 256GB for a day. But I can't do that for the executor machines, because they are permanent and I might have to upgrade all of them.
Or is there any way to make the executors run on high-memory machines?
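This question isn't answered in the thread, but one possible approach, sketched here as an assumption rather than a confirmed recommendation: on a YARN-managed cluster, Spark can be asked to place executors only on nodes carrying a given node label via `spark.yarn.executor.nodeLabelExpression`. The label name `highmem` below is hypothetical and must first be created and assigned in YARN:

```bash
# Hedged sketch, assuming Spark on YARN with node labels enabled.
# First, label the high-memory nodes (done once by the cluster admin):
yarn rmadmin -addToClusterNodeLabels "highmem"
yarn rmadmin -replaceLabelsOnNode "bignode1=highmem"

# Then ask Spark to schedule executors only on those labeled nodes:
pio train -- \
  --master yarn \
  --executor-memory 48g \
  --conf spark.yarn.executor.nodeLabelExpression=highmem
```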