Optimizing Performance of Internal Horde Worker? #555
-
Why do you say it runs at 0.5 T/s? How long does each job typically take for you? 170 kudos/hr is quite low. You are probably running a suboptimal setup.
-
I should have said 0.5 T/s is the slowest generation I’ve seen. The T/s figure is actually all over the map, as shown below. I’m pretty sure my settings are not optimal, but I guess that’s my question: I’m not clear on which settings to tweak.
ContextLimit: 405/1024, Processing:7.32s (24.1ms/T), Generation:13.04s (130.4ms/T), Total:20.36s (4.91T/s)
ContextLimit: 791/1024, Processing:18.15s (24.0ms/T), Generation:5.01s (147.5ms/T), Total:23.16s (1.47T/s)
ContextLimit: 890/1024, Processing:18.15s (21.5ms/T), Generation:5.92s (131.6ms/T), Total:24.07s (1.87T/s)
ContextLimit: 661/1024, Processing:16.73s (26.1ms/T), Generation:2.47s (137.0ms/T), Total:19.20s (0.94T/s)
ContextLimit: 1024/1024, Processing:21.27s (22.6ms/T), Generation:12.63s (157.9ms/T), Total:33.90s (2.36T/s)
ContextLimit: 395/512, Processing:7.54s (22.3ms/T), Generation:7.94s (141.8ms/T), Total:15.48s (3.62T/s)
ContextLimit: 401/512, Processing:7.02s (23.4ms/T), Generation:13.82s (138.2ms/T), Total:20.85s (4.80T/s)
ContextLimit: 810/1024, Processing:18.03s (23.3ms/T), Generation:4.88s (139.3ms/T), Total:22.90s (1.53T/s)
ContextLimit: 948/1024, Processing:22.62s (24.5ms/T), Generation:3.53s (147.0ms/T), Total:26.15s (0.92T/s)
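To see why the T/s figure swings so much, it can help to split each timing line into prompt tokens and generated tokens. The sketch below is only an illustration: the regular expression is inferred from the lines quoted above rather than from any documented log format, and `breakdown` is a hypothetical helper name. Because the ms/T values in the log are rounded, the token counts it recovers are approximate.

```python
import re

# Pattern inferred from the timing lines quoted in this thread (an assumption,
# not a documented format).
LINE_RE = re.compile(
    r"ContextLimit:\s*(?P<ctx>\d+)/(?P<max_ctx>\d+),\s*"
    r"Processing:(?P<proc_s>[\d.]+)s\s*\((?P<proc_ms>[\d.]+)ms/T\),\s*"
    r"Generation:(?P<gen_s>[\d.]+)s\s*\((?P<gen_ms>[\d.]+)ms/T\),\s*"
    r"Total:(?P<total_s>[\d.]+)s\s*\((?P<tps>[\d.]+)T/s\)"
)


def breakdown(line):
    """Split one timing line into token counts and time shares (hypothetical helper)."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError("line does not match the assumed timing format")
    proc_s = float(m["proc_s"])
    gen_s = float(m["gen_s"])
    total_s = float(m["total_s"])
    # seconds / (milliseconds per token) gives an approximate token count
    prompt_tokens = round(proc_s * 1000 / float(m["proc_ms"]))
    gen_tokens = round(gen_s * 1000 / float(m["gen_ms"]))
    return {
        "prompt_tokens": prompt_tokens,
        "gen_tokens": gen_tokens,
        # share of the job spent reading the prompt rather than generating
        "processing_share": proc_s / total_s,
        # generated tokens over total time, for comparison with the logged T/s
        "recomputed_tps": gen_tokens / total_s,
        "reported_tps": float(m["tps"]),
    }


if __name__ == "__main__":
    sample = (
        "ContextLimit: 405/1024, Processing:7.32s (24.1ms/T), "
        "Generation:13.04s (130.4ms/T), Total:20.36s (4.91T/s)"
    )
    for key, value in breakdown(sample).items():
        print(f"{key}: {value}")
```

For the first line above this gives roughly 304 prompt tokens and 100 generated tokens, and 100 tokens over 20.36 s matches the logged 4.91 T/s. In other words, the logged T/s appears to be generated tokens divided by total time, so a job with a long prompt and a short reply reports a low figure even though the per-token generation speed (about 130 to 160 ms/T) barely changes.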
-
A few results when running locally:
ContextLimit: 222/1024, Processing:4.06s (184.5ms/T), Generation:26.03s (130.2ms/T), Total:30.09s (6.65T/s)
ContextLimit: 319/1024, Processing:4.20s (182.6ms/T), Generation:14.71s (130.1ms/T), Total:18.90s (5.98T/s)
Again, the system includes an i9-12900T, 64 gigs of DDR5 RAM, and an NVIDIA T1000 with 8 gigs of VRAM.
I don’t know if this machine is just underpowered, or if I’m not getting the best out of it for some reason. Not sure how to figure that out.
-
Looks about the same. It's a bit faster because you are using smaller prompts than those on Horde. I guess your setup just isn't well suited to Horde. Maybe make sure you're using a white listed model to increase the kudos earned.
-
Thanks for your time. One last question if I may: What are white listed models, and where do I find that white list?
-
Hi all,
Just wondering if there are any tricks to optimizing the performance of Koboldcpp’s internal horde worker, or for that matter, whether there is any documentation to help me set realistic expectations for hourly EarnRate?
For example, my system is running an i9-12900T, 64 gigs of memory, and an NVIDIA T1000 with 8 gigs of VRAM.
I’m using Debian, and am currently working with a 7B GGUF model.
If I launch koboldcpp with the following parameters: modelname, --gpulayers 40 --usecublas 0 --highpriority, I can run locally and generate anywhere from 12-15 T/s. However, if I add the --hordeconfig values as shown in the Wiki, each job only seems to run at about 0.5 T/s, and I get an earn rate similar to the one below:
[Total:4081 kudos, Time:024h:03m:58s, Jobs:1860, EarnRate:170 kudos/hr].
Is this typical/acceptable? If not, are there any tweaks I can try to improve performance?
Thanks,
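As a sanity check on the status line quoted in this post, the reported EarnRate can be reproduced from the totals, which also yields average kudos per job and the average wall-clock interval per job. This is a minimal sketch under the assumption that EarnRate is simply total kudos divided by elapsed time; `earn_stats` is a hypothetical helper, not part of koboldcpp.

```python
def earn_stats(total_kudos, hours, minutes, seconds, jobs):
    """Derive simple averages from a worker status line (hypothetical helper)."""
    elapsed_s = hours * 3600 + minutes * 60 + seconds
    return {
        # assumed definition: total kudos over elapsed wall-clock hours
        "kudos_per_hour": total_kudos / (elapsed_s / 3600),
        "kudos_per_job": total_kudos / jobs,
        # wall-clock interval per job, including any idle time between jobs
        "seconds_per_job": elapsed_s / jobs,
    }


if __name__ == "__main__":
    # Values from: [Total:4081 kudos, Time:024h:03m:58s, Jobs:1860, EarnRate:170 kudos/hr]
    stats = earn_stats(4081, 24, 3, 58, 1860)
    for key, value in stats.items():
        print(f"{key}: {value:.2f}")
```

With the numbers from the status line this comes to about 169.6 kudos/hr (matching the reported 170), about 2.2 kudos per job, and one job roughly every 47 seconds.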