Optimizing Performance of Internal Horde Worker? #555

CallMeAl1973 · 2023-12-10T00:46:35Z


Hi all,
Just wondering if there are any tricks to optimizing the performance of Koboldcpp’s internal horde worker, or for that matter, whether there is any documentation to help me set realistic expectations for hourly EarnRate?
For example, my system is running an i9-12900T, 64 gigs of memory, and an NVIDIA T1000 with 8 gigs of VRAM.
I’m using Debian, and am currently working with a 7B GGUF model.
If I launch koboldcpp with the following parameters: modelname --gpulayers 40 --usecublas 0 --highpriority, I can run locally and generate anywhere from 12-15 T/s. However, if I add the --hordeconfig values as shown in the Wiki, each job only seems to run at about 0.5 T/s, and I get an earn rate similar to the one below:
[Total:4081 kudos, Time:024h:03m:58s, Jobs:1860, EarnRate:170 kudos/hr].
Is this typical/acceptable? If not, are there any tweaks I can try to improve performance?
Thanks,
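
Editor's note: to put the status line above in per-job terms, here is a minimal Python sketch. It is not part of koboldcpp; the `summarize` helper and its regex are illustrative and assume the status line keeps the `Total/Time/Jobs` layout quoted in the post.

```python
import re

# Status line copied from the post above.
STATUS = "[Total:4081 kudos, Time:024h:03m:58s, Jobs:1860, EarnRate:170 kudos/hr]"


def summarize(status: str) -> dict:
    """Derive hourly and per-job figures from a worker status line.

    Assumes the 'Total:.. kudos, Time:..h:..m:..s, Jobs:..' layout shown above.
    """
    m = re.search(
        r"Total:(\d+) kudos, Time:(\d+)h:(\d+)m:(\d+)s, Jobs:(\d+)", status
    )
    if m is None:
        raise ValueError("unrecognised status line")
    kudos, hrs, mins, secs, jobs = map(int, m.groups())
    hours = hrs + mins / 60 + secs / 3600
    return {
        "hours": hours,
        "kudos_per_hour": kudos / hours,
        "jobs_per_hour": jobs / hours,
        "kudos_per_job": kudos / jobs,
        # Average wall-clock time per job, idle time between jobs included.
        "seconds_per_job": hours * 3600 / jobs,
    }


if __name__ == "__main__":
    for key, value in summarize(STATUS).items():
        print(f"{key}: {value:.2f}")
```

For this status line the sketch gives about 169.6 kudos/hr (consistent with the reported 170), roughly 77 jobs per hour, and a little over 2 kudos per job.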

LostRuins · 2023-12-14T13:11:07Z

Maintainer

Why do you say it runs at .5T/s? How long does each job typically take for you?

170 kudos/hr is quite low. You are probably running a suboptimal setup.

CallMeAl1973 · 2023-12-14T14:48:02Z

Author

I should have said .5T/s is the slowest gen I’ve seen. The T/s is actually all over the map as shown below. I’m pretty sure my settings are not optimal, but I guess that’s my question. I’m not clear on which settings to tweak.

ContextLimit: 405/1024, Processing:7.32s (24.1ms/T), Generation:13.04s (130.4ms/T), Total:20.36s (4.91T/s)
ContextLimit: 791/1024, Processing:18.15s (24.0ms/T), Generation:5.01s (147.5ms/T), Total:23.16s (1.47T/s)
ContextLimit: 890/1024, Processing:18.15s (21.5ms/T), Generation:5.92s (131.6ms/T), Total:24.07s (1.87T/s)
ContextLimit: 661/1024, Processing:16.73s (26.1ms/T), Generation:2.47s (137.0ms/T), Total:19.20s (0.94T/s)
ContextLimit: 1024/1024, Processing:21.27s (22.6ms/T), Generation:12.63s (157.9ms/T), Total:33.90s (2.36T/s)
ContextLimit: 395/512, Processing:7.54s (22.3ms/T), Generation:7.94s (141.8ms/T), Total:15.48s (3.62T/s)
ContextLimit: 401/512, Processing:7.02s (23.4ms/T), Generation:13.82s (138.2ms/T), Total:20.85s (4.80T/s)
ContextLimit: 810/1024, Processing:18.03s (23.3ms/T), Generation:4.88s (139.3ms/T), Total:22.90s (1.53T/s)
ContextLimit: 948/1024, Processing:22.62s (24.5ms/T), Generation:3.53s (147.0ms/T), Total:26.15s (0.92T/s)
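
Editor's note: a rough way to see where the time goes in these logs is to back out token counts from the timings. The Python sketch below is not part of koboldcpp. It assumes the reported Total T/s is generated tokens divided by total wall time (prompt processing included), which is inferred from the numbers in the log rather than taken from documentation; the `PATTERN` regex and `breakdown` helper are illustrative names.

```python
import re

# Three of the log lines quoted above.
LOG = """\
ContextLimit: 405/1024, Processing:7.32s (24.1ms/T), Generation:13.04s (130.4ms/T), Total:20.36s (4.91T/s)
ContextLimit: 791/1024, Processing:18.15s (24.0ms/T), Generation:5.01s (147.5ms/T), Total:23.16s (1.47T/s)
ContextLimit: 948/1024, Processing:22.62s (24.5ms/T), Generation:3.53s (147.0ms/T), Total:26.15s (0.92T/s)
"""

PATTERN = re.compile(
    r"Processing:([\d.]+)s \(([\d.]+)ms/T\), "
    r"Generation:([\d.]+)s \(([\d.]+)ms/T\), "
    r"Total:([\d.]+)s \(([\d.]+)T/s\)"
)


def breakdown(line: str) -> dict:
    """Recover approximate token counts and time shares from one log line."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("unrecognised log line")
    proc_s, proc_ms, gen_s, gen_ms, total_s, reported = map(float, m.groups())
    prompt_tokens = round(proc_s * 1000 / proc_ms)  # tokens in the prompt
    gen_tokens = round(gen_s * 1000 / gen_ms)       # tokens generated
    return {
        "prompt_tokens": prompt_tokens,
        "gen_tokens": gen_tokens,
        "prompt_share": proc_s / total_s,      # fraction of time on the prompt
        "effective_tps": gen_tokens / total_s, # assumed meaning of Total T/s
        "reported_tps": reported,
    }


if __name__ == "__main__":
    for line in LOG.splitlines():
        b = breakdown(line)
        print(
            f"prompt={b['prompt_tokens']:4d} gen={b['gen_tokens']:3d} "
            f"prompt_share={b['prompt_share']:.0%} "
            f"effective={b['effective_tps']:.2f} reported={b['reported_tps']:.2f}"
        )
```

Under that assumption the recomputed figures match the reported ones. On the 948-token job, more than 85% of the time goes to prompt processing and only about 24 tokens are generated, which is why the headline figure drops below 1 T/s even though generation itself runs at roughly 7 T/s.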

2 replies

LostRuins Dec 14, 2023
Maintainer

And what results do you get when running locally without horde? Try a few generations with about 1024 context and maybe 100-200 tokens.

LostRuins Dec 14, 2023
Maintainer

At first glance, your processing speeds are a bit slow for horde. Horde jobs involve large prompts that change constantly, so they can't be cached. That means prompt processing needs to be much faster.
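
Editor's note: the point about uncached prompts can be made concrete with a simple cost model. This Python sketch is an illustration only, not how koboldcpp or the horde measures anything. It assumes job time is prompt tokens times a per-token processing cost plus generated tokens times a per-token generation cost. The 24 ms/T and 140 ms/T figures approximate the logs earlier in the thread; the 5 ms/T figure is a hypothetical faster setup chosen purely for comparison.

```python
def effective_tps(
    prompt_tokens: int,
    gen_tokens: int,
    proc_ms_per_token: float,
    gen_ms_per_token: float,
) -> float:
    """Generated tokens per second of total job time, with no prompt caching."""
    total_s = (
        prompt_tokens * proc_ms_per_token + gen_tokens * gen_ms_per_token
    ) / 1000
    return gen_tokens / total_s


if __name__ == "__main__":
    GEN_MS = 140.0  # approximate generation cost seen in the thread's logs
    for label, proc_ms in [("as logged (~24 ms/T)", 24.0),
                           ("hypothetical (5 ms/T)", 5.0)]:
        for prompt in (300, 900):
            tps = effective_tps(prompt, 100, proc_ms, GEN_MS)
            print(f"{label:22s} prompt={prompt:4d} gen=100 -> {tps:.2f} T/s")
```

With a 900-token prompt and 100 generated tokens, this model gives about 2.8 T/s at 24 ms/T and about 5.4 T/s at the hypothetical 5 ms/T, so on long prompts the processing speed, not the generation speed, sets the ceiling.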

CallMeAl1973 · 2023-12-14T16:07:22Z

Author

A few results when running locally:

ContextLimit: 222/1024, Processing:4.06s (184.5ms/T), Generation:26.03s (130.2ms/T), Total:30.09s (6.65T/s)
ContextLimit: 319/1024, Processing:4.20s (182.6ms/T), Generation:14.71s (130.1ms/T), Total:18.90s (5.98T/s)

Again, the system includes an i9-12900T, 64 gigs of DDR5 RAM, and an NVIDIA T1000 with 8 gigs of VRAM. I don’t know if this machine is just underpowered, or if I’m not getting the best out of it for some reason. Not sure how to figure that out.

LostRuins · 2023-12-14T17:26:58Z

Maintainer

Looks about the same. It's a bit faster because you are using smaller prompts than those on horde. I guess your setup just isn't optimal for horde. Maybe make sure you're using a whitelisted model to increase the kudos earned.

CallMeAl1973 · 2023-12-14T17:38:58Z

Author

Thanks for your time. One last question, if I may: what are whitelisted models, and where do I find that whitelist?

1 reply

LostRuins Dec 15, 2023
Maintainer

Check this list: https://github.com/Haidra-Org/AI-Horde-text-model-reference
