As promised, I worked out an implementation in C for 1brc. The goal is to provide some benchmarks with C code and challenge other implementations. I am specially interested in what performance can be achieved by implementations in different languages for this particular application.
This implementation shows, by using some simple and comprehensive optimization techniques (no SIMD and no dependencies), the performance can be considerably increased.
To know more about 1BRC, here are some pointers:
- What is 1BRC?: https://www.morling.dev/blog/one-billion-row-challenge/
- Git repository: https://github.com/gunnarmorling/1brc
git clone https://github.com/tieliao/1brc
cd 1brc
You may need to install jdk 21.
git clone https://github.com/gunnarmorling/1brc
cd 1brc
./mvnw clean verify
./create_measurements.sh 100000000
This will create a file, namely measurements.txt, with 100 million rows. This file contains 413 different city names.
To create a file (measurements3.txt) with 10k different city names:
./create_measurements3.sh 100000000
To test the baseline implementation:
time ./calculate_average_baseline.sh > output.txt
make
Or run it manually:
gcc -O3 -DNDEBUG -o summarize 1brc.c -lpthread
time ./summarize 1brc/measurements.txt $(nproc)
I was limited by the resources of my laptop which is an Intel 4-core i7-9750H CPU @ 2.60GHz/6GB RAM. So I can only test with 100 million rows. The first version of my code includes already some basic optimizations and gives an execution time of 3.6s. I did further optimizations with multi-threads, on hashing and parsing. The final version gives an execution time of 0.4s.
For comparaison, I ran the top 3 java best-performers and another C implementation as well.
Result (m:s.ms) | Implementation | Language | Author |
---|---|---|---|
00:00.390 | link | C | Tie Liao |
00:00.813 | link | C | Danny van Kooten |
00:00.880 | link | 21.0.1-graalce | Artsiom Korzun |
00:00.945 | link | 21.0.1-graalce | Thomas Wuerthinger, Quan Anh Mai, Alfonso² Peterssen |
00:00.961 | link | 21.0.1-graalce | Jaromir Hamala |
00:27.131 | link | 21.0.1-graalce | Baseline Gunnar Morling |
The above results are using 413 different city names. I also tested with 10k different city names and the results are:
Result (m:s.ms) | Implementation | Language | Author |
---|---|---|---|
00:00.677 | link | C | Tie Liao |
00:01.406 | link | 21.0.1-graalce | Artsiom Korzun |
00:01.440 | link | 21.0.1-graalce | Jaromir Hamala |
00:01.609 | link | 21.0.1-graalce | Thomas Wuerthinger, Quan Anh Mai, Alfonso² Peterssen |
00:02.345 | link | C | Danny van Kooten |
00:33.134 | link | 21.0.1-graalce | Baseline Gunnar Morling |
The accuracy of output was successfully checked against that of baseline in both cases.
For anyone who is interested in running the code with 1 billion rows, please let me know.