
How to create the dataset in the plot? #3

Open · smu160 opened this issue Jul 16, 2024 · 6 comments

smu160 (Contributor) commented Jul 16, 2024

Hi,

Thank you for your hard work. This is an awesome project! I'd be happy to contribute a plotting script to help automate the process. Do you just collect the dataset via criterion? If so, are there any other flags/options I should enable during benchmarks?

Thank you!

LaihoE (Owner) commented Jul 16, 2024

Yes, the dataset was created via criterion.

I produced the output half manually, half with a Python script, so a script to automate it all would be awesome.

Maybe the output could be something like this?

    dtype     n      scalar        simd operation
0      u8     0    0.660681    1.395930  contains
1      u8    10    2.833248    3.381811  contains
2      u8    20    5.068225    5.804822  contains
3      u8    30    7.793730    8.344281  contains
4      u8    40    9.950634    3.197494  contains
..    ...   ...         ...         ...       ...
795   u64  1950  423.881469  143.787244  contains
796   u64  1960  438.771648  153.698608  contains
797   u64  1970  441.591915  158.481679  contains
798   u64  1980  440.676417  157.837033  contains
799   u64  1990  448.819562  160.604921  contains
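
A rough sketch of how that table could be pulled out of Criterion's output (untested; it assumes the default target/criterion/.../new/estimates.json layout and serde_json, and splitting the benchmark id into the dtype/n/operation columns would depend on how the benches are named):

// Walk target/criterion and dump each benchmark's mean estimate as CSV.
// Assumes the default Criterion on-disk layout, where every benchmark run
// writes .../new/estimates.json with a `mean.point_estimate` field
// (should be in nanoseconds). Requires serde_json as a dev-dependency.
use std::fs;
use std::path::Path;

fn collect(dir: &Path, rows: &mut Vec<(String, f64)>) -> std::io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            collect(&path, rows)?;
        } else if path.ends_with("new/estimates.json") {
            let text = fs::read_to_string(&path)?;
            let json: serde_json::Value = serde_json::from_str(&text).unwrap();
            if let Some(mean) = json["mean"]["point_estimate"].as_f64() {
                // The benchmark id is encoded in the directory structure,
                // e.g. target/criterion/<id>/new/estimates.json.
                let id = path
                    .parent()
                    .and_then(|p| p.parent())
                    .map(|p| p.display().to_string())
                    .unwrap_or_default();
                rows.push((id, mean));
            }
        }
    }
    Ok(())
}

fn main() -> std::io::Result<()> {
    let mut rows = Vec::new();
    collect(Path::new("target/criterion"), &mut rows)?;
    println!("benchmark_id,mean_ns");
    for (id, mean) in rows {
        println!("{id},{mean}");
    }
    Ok(())
}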

And nope, I didn't use anything special; just make sure target-cpu is native, for example:

RUSTFLAGS='-C target-cpu=native' cargo bench

I just fixed up the benches to take the input length as a parameter. Currently the sampling is every 10th length:

for n in (0..200).map(|x| x * 10) {

This is not perfect as it misses the aligned sizes (32, 64, 128, ...), but I'm not sure what a good strategy would be. Do you have a suggestion? Anyway, the benches take forever even with this setup 🤔
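
One option (just a sketch, not what the benches currently do) would be to keep the every-10th sampling but union in the powers of two up to the max length:

// Every 10th length, as in the current benches, plus the powers of two
// (32, 64, 128, ...) so the SIMD-friendly "aligned" sizes are measured too.
fn sample_lengths(max: usize) -> Vec<usize> {
    let mut lens: Vec<usize> = (0..=max / 10).map(|x| x * 10).collect();
    let mut p = 1;
    while p <= max {
        lens.push(p);
        p *= 2;
    }
    lens.sort_unstable();
    lens.dedup();
    lens
}

// usage: for n in sample_lengths(1990) { ... }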

LaihoE (Owner) commented Jul 16, 2024

Maybe we could drop the signed versions?

benchmark_contains::<u8>(c, "u8", n);
benchmark_contains::<i8>(c, "i8", n);
benchmark_contains::<u16>(c, "u16", n);
benchmark_contains::<i16>(c, "i16", n);
benchmark_contains::<u32>(c, "u32", n);
benchmark_contains::<i32>(c, "i32", n);
benchmark_contains::<u64>(c, "u64", n);
benchmark_contains::<i64>(c, "i64", n);
benchmark_contains::<isize>(c, "isize", n);
benchmark_contains::<usize>(c, "usize", n);
benchmark_contains_floats::<f32>(c, "f32", n);
benchmark_contains_floats::<f64>(c, "f64", n);

to

benchmark_contains::<u8>(c, "u8", n);
benchmark_contains::<u16>(c, "u16", n);
benchmark_contains::<u32>(c, "u32", n);
benchmark_contains::<u64>(c, "u64", n);
benchmark_contains::<usize>(c, "usize", n);
benchmark_contains_floats::<f32>(c, "f32", n);
benchmark_contains_floats::<f64>(c, "f64", n);
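
For reference, the helper itself could look roughly like this (simplified sketch; the actual benchmark_contains in the repo may differ, e.g. it also measures the SIMD path, not just the scalar contains):

use criterion::Criterion;
use std::hint::black_box;

// Generic Criterion benchmark over the element type, with the input
// length `n` encoded in the benchmark id so the dataset can be rebuilt
// from Criterion's output later.
fn benchmark_contains<T: Default + Clone + PartialEq>(c: &mut Criterion, name: &str, n: usize) {
    let haystack = vec![T::default(); n];
    let needle = T::default();
    c.bench_function(&format!("contains_{name}_{n}"), |b| {
        b.iter(|| black_box(&haystack).contains(black_box(&needle)))
    });
}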

smu160 (Contributor, Author) commented Jul 17, 2024

@LaihoE Thank you for getting back to me!

In my view, more benchmarks are always better. I also think we can use Criterion to take care of the plotting, which would be nice since it would not require Python. I'm running the benchmarks on a server right now to see how it looks; I'll post the results here once they're complete.

LaihoE (Owner) commented Jul 17, 2024

sounds good!

smu160 (Contributor, Author) commented Jul 21, 2024

Since we need to use a Python script anyway, I wanted to bring up divan. It would let you remove the repeated benchmark function calls that account for the different types. Take a look at how convenient it is here.

The downside is that divan currently doesn't support output to JSON/CSV, so we'd have to parse its output manually. Just an option to consider.
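
For example, something roughly like this (untested; based on divan's types/args options, so the exact attribute syntax may differ):

fn main() {
    divan::main();
}

// One generic benchmark registered for every (type, length) combination,
// instead of a separate call per element type.
#[divan::bench(
    types = [u8, u16, u32, u64, usize],
    args = [10, 100, 1000],
)]
fn contains<T: Default + Clone + PartialEq>(n: usize) -> bool {
    let haystack = vec![T::default(); n];
    divan::black_box(haystack).contains(&T::default())
}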

LaihoE (Owner) commented Jul 28, 2024

> Since we need to use a Python script anyway, I wanted to bring up divan. It would let you remove the repeated benchmark function calls that account for the different types. Take a look at how convenient it is here.
>
> The downside is that divan currently doesn't support output to JSON/CSV, so we'd have to parse its output manually. Just an option to consider.

Divan certainly looks convenient, but since Criterion is the go-to benchmarking tool, I feel like people may find the results less trustworthy. Divan also seems to be quite a young project.
