Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cigar parsing speedup #47

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

bkille
Copy link
Contributor

@bkille bkille commented Feb 3, 2025

According to perf, a good portion of runtime was spent converting slices of str to i32 in the parse_cigar_to_delta function.

Note that this PR gets rid of the check which makes sure that the number parses correctly (num_buf.parse::<i32>().map_err(|_| ParseErr::InvalidCigarFormat)?). As we are already checking if each character is a digit, I don't think there's any way that num_buf would throw anyways (unless parse throws if there are leading zeros, I am not sure...).

Benchmarks
Input:

> du -sh $PAF
78G     test-data/primates16.20231205_wfmash-v0.12.5.paf

Command (the .impg index was precomputed).

./target/release/impg partition --paf-file $PAF --sequence-prefix chm13#1 --window-size 1000000 -t 1

Main branch:

User time (seconds): 1081.02
System time (seconds): 61.03
Percent of CPU this job got: 97%
Elapsed (wall clock) time (h:mm:ss or m:ss): 19:29.24

This PR:

User time (seconds): 598.06
System time (seconds): 46.71
Percent of CPU this job got: 95%
Elapsed (wall clock) time (h:mm:ss or m:ss): 11:12.87

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant