Breaking changes: increase performance #5
This PR significantly increases training and winner-lookup speed. The speedup comes primarily from reducing heap allocations. The cost is that it introduces breaking API changes.
The methodology was to first add benchmarks to the current code base and then measure the performance delta after every small tweak to the code. In some cases, removing intermediate heap allocations led to performance regressions. In those cases, I kept the allocations and left comments explaining why they're necessary.
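To illustrate the kind of change described above, here is a minimal sketch of a winner lookup with and without an intermediate heap allocation. It is not the code from this PR: the function names, the `Vec<Vec<f64>>` weight layout, and the use of squared Euclidean distance are all assumptions made for the example.

```rust
/// Squared Euclidean distance computed in a single pass.
/// No intermediate Vec is created.
fn squared_distance(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Hypothetical "before" version: allocates a Vec of differences
/// for every neuron that is compared against the sample.
fn winner_allocating(weights: &[Vec<f64>], sample: &[f64]) -> Option<usize> {
    let mut best: Option<(usize, f64)> = None;
    for (i, w) in weights.iter().enumerate() {
        // One heap allocation per neuron.
        let diffs: Vec<f64> = w.iter().zip(sample).map(|(x, y)| x - y).collect();
        let d: f64 = diffs.iter().map(|v| v * v).sum();
        if best.map_or(true, |(_, best_d)| d < best_d) {
            best = Some((i, d));
        }
    }
    best.map(|(i, _)| i)
}

/// Hypothetical "after" version: same result, no per-neuron allocation.
fn winner(weights: &[Vec<f64>], sample: &[f64]) -> Option<usize> {
    let mut best: Option<(usize, f64)> = None;
    for (i, w) in weights.iter().enumerate() {
        let d = squared_distance(w, sample);
        if best.map_or(true, |(_, best_d)| d < best_d) {
            best = Some((i, d));
        }
    }
    best.map(|(i, _)| i)
}
```

Both functions return the index of the closest weight vector, or `None` when there are no weights. Whether the allocation-free form is actually faster in a given code path is exactly what the benchmarks are for; as noted above, removing an allocation does not always help.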
Because this PR introduces breaking changes, I added another breaking change: the non-default feature serde-1. This helps with build times for people not interested in serialization. Building with serde-1 enables this crate's old [to, from]_json support. I believe that those functions are out of this crate's scope, but as long as they are disabled by default, I see no harm.
Benchmarks
I first ran cargo bench without any library modifications, and the output below is after rerunning it on the tip of this branch.
Notes
I'm fairly certain that the lackluster speedup of random training is explained in this comment.
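For reference, the serde-1 feature gate described earlier could be sketched roughly as follows. The `Model` type, its field, and the use of serde_json as the JSON backend are assumptions for illustration; the crate's real types and dependencies are not shown in this description. The sketch also assumes Cargo.toml declares serde-1 as a feature that turns on optional serde (with its derive feature) and serde_json dependencies.

```rust
// Hypothetical model type standing in for the crate's real one.
// The serde derives are only applied when the feature is enabled,
// so a default build never compiles serde at all.
#[cfg_attr(feature = "serde-1", derive(serde::Serialize, serde::Deserialize))]
#[derive(Debug, Clone, PartialEq)]
pub struct Model {
    pub weights: Vec<f64>,
}

// The JSON helpers exist only in builds that opt in to `serde-1`.
#[cfg(feature = "serde-1")]
impl Model {
    pub fn to_json(&self) -> serde_json::Result<String> {
        serde_json::to_string(self)
    }

    pub fn from_json(s: &str) -> serde_json::Result<Model> {
        serde_json::from_str(s)
    }
}

/// Reports whether this build was compiled with the `serde-1` feature.
pub fn serialization_enabled() -> bool {
    cfg!(feature = "serde-1")
}
```

With a layout like this, downstream users who need the JSON helpers would enable the feature explicitly, for example by building with `cargo build --features serde-1`.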