Merge pull request #11 from Orange-OpenSource/pre-release-3.4
Cool-chic 3.4: 30% less complex!
theoladune authored Nov 8, 2024
2 parents f60ebbe + 520c5e7 commit 8a88b4e
Showing 575 changed files with 5,026 additions and 2,842 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/static-pages.yml
@@ -20,7 +20,7 @@ jobs:
uses: actions/checkout@v4
- name: Install dependencies
run: |
pip install -U torch fvcore einops psutil torchvision sphinx shibuya sphinx-autodoc-typehints sphinx-copybutton
pip install -U torch fvcore einops psutil torchvision sphinx shibuya sphinx-autodoc-typehints sphinx-copybutton sphinx-design
- name: Sphinx build
run: |
PYTORCH_JIT=0 sphinx-build docs/source/ docs/build
139 changes: 94 additions & 45 deletions README.md
@@ -27,44 +27,48 @@
<a href="https://orange-opensource.github.io/Cool-Chic/"><strong>Explore the docs »</strong></a>
<br />
<br />
<a href="https://orange-opensource.github.io/Cool-Chic/getting_started/results.html">Decode provided bitstreams</a>
<a href="https://orange-opensource.github.io/Cool-Chic/getting_started/new_stuff.html">What's new in 3.4?</a>
·
<a href="https://orange-opensource.github.io/Cool-Chic/getting_started/results.html#clic20-pro-valid">Compression performance</a>
<a href="https://orange-opensource.github.io/Cool-Chic/getting_started/results.html">Decode some bitstreams</a>
·
<a href="https://orange-opensource.github.io/Cool-Chic/getting_started/results.html#clic20-pro-valid">Coding performance</a>
</p>
</div>

<!-- # What's Cool-chic? -->

Cool-chic (pronounced <span class="ipa">/kul ʃik/</span> as in French 🥖🧀🍷) is
is a low-complexity neural image codec based on overfitting. It offers image coding
performance competitive with *H.266/VVC for 2000 multiplications* per decoded
a low-complexity neural image codec based on overfitting. It offers image coding
performance competitive with **H.266/VVC for 1000 multiplications** per decoded
pixel.



<div align="center">

#### 🏆 **Coding performance**: Cool-chic compresses images as well as H.266/VVC 🏆
#### 🚀 **Fast CPU-only decoder**: Decode a 1280x720 image in 100 ms on CPU with our decoder written in C 🚀
#### 🔥 **Fixed-point decoder**: Fixed-point arithmetic at the decoder for bit-exact results on different hardware 🔥
#### 🖼️ **I/O format**: Encode PNG, PPM and YUV files with a bitdepth of 8 to 16 bits 🖼️

</div>

#

### Current & future features

- Coding performance
- ✅ On par with VVC for image coding
- ❌ Upcoming improved Cool-chic video
- I/O format
- ✅ PPM for 8-bit RGB images, yuv420 8-bit and 10-bit
- ❌ yuv444
- ❌ Additional output precisions (12, 14 and 16-bit)
- ❌ Output PNG instead of PPM for the decoded images
- Decoder
- ✅ Fast C implementation
- ✅ Integer computation for the ARM
- ✅ Complete integerization
- ✅ Decrease memory footprint & faster decoding

### Latest release: 🚀 __Cool-chic 3.3: An even faster decoder!__ 🚀

- Make the **CPU-only decoder** even faster.
- Decode a 720p image in **100 ms**, **2x faster** than Cool-chic 3.2
- Full **integerization** of the decoder for replicability
- Reduce decoder **memory footprint**
- **Optimized** implementation of 3x3 convolutions & fusion of successive 1x1 convolutions

<div align="center">

### Latest release: 🎉 __Cool-chic 3.4: 30% less complex!__ 🎉

</div>

- New and improved latent **upsampling module**
- Leverage symmetric and separable convolution kernels to reduce complexity & parameters count
- Learn two filters per upsampling step instead of one for all upsampling steps
- 1% to 5% **rate reduction** for the same image quality
- **30% complexity reduction** using a smaller Auto-Regressive Module
- From 2000 MAC / decoded pixel to 1300 MAC / decoded pixel
- **10% faster** decoding speed
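
To illustrate why symmetric and separable kernels reduce both the parameter count and the number of multiplications, here is a minimal pure-Python sketch. It is not the Cool-chic upsampling module: the helper names (`symmetric_kernel`, `separable_conv2d`, `full_conv2d`) and the toy 4-tap kernel are assumptions made for this example only. It checks that two 1D passes with a mirrored kernel give the same output as one 2D pass with the equivalent outer-product kernel.

```python
def symmetric_kernel(half):
    """Mirror the learned half to obtain a full, symmetric 1D kernel."""
    return list(half) + list(reversed(half))


def conv1d_rows(img, k):
    """Valid 1D correlation of every row of img with kernel k."""
    n = len(k)
    return [[sum(row[j + t] * k[t] for t in range(n))
             for j in range(len(row) - n + 1)] for row in img]


def transpose(img):
    return [list(col) for col in zip(*img)]


def separable_conv2d(img, k):
    """Horizontal pass, then vertical pass, with the same 1D kernel."""
    horizontal = conv1d_rows(img, k)
    return transpose(conv1d_rows(transpose(horizontal), k))


def full_conv2d(img, k):
    """Reference: valid 2D correlation with the outer-product kernel k k^T."""
    n = len(k)
    h, w = len(img), len(img[0])
    return [[sum(img[i + a][j + b] * k[a] * k[b]
                 for a in range(n) for b in range(n))
             for j in range(w - n + 1)] for i in range(h - n + 1)]


half = [1, 2]                    # hypothetical learned parameters
kernel = symmetric_kernel(half)  # 4-tap symmetric kernel
img = [[(3 * i + 5 * j) % 7 for j in range(6)] for i in range(6)]

# Integer inputs, so both paths must agree exactly.
assert separable_conv2d(img, kernel) == full_conv2d(img, kernel)

k = len(kernel)
print("parameters: full 2D =", k * k, "| separable =", k,
      "| separable + symmetric =", len(half))
print("multiplications per output: full 2D =", k * k, "| separable =", 2 * k)
```

For a kernel of size k, the full 2D filter costs k² multiplications per output sample while the separable version costs 2k, and symmetry halves the number of values to learn and transmit.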

Check out the [release history](https://github.com/Orange-OpenSource/Cool-Chic/releases) to see previous versions of Cool-chic.

@@ -97,42 +101,88 @@ You're good to go!
The Cool-chic page provides [comprehensive rate-distortion results and compressed bitstreams](https://orange-opensource.github.io/Cool-Chic/getting_started/results.html), which make it possible
to reproduce the results inside the ```results/``` directory.

| Dataset | Vs. Cool-chic 3.1 | Vs. [_C3_, Kim et al.](https://arxiv.org/abs/2312.02753) | Vs. HEVC (HM 16.20) | Vs. VVC (VTM 19.1) | Avg decoder MAC / pixel | Avg decoding time [ms] |
|------------------|----------------------------------------------|----------------------------------------------------------|----------------------------------------------|----------------------------------------------|----------------------------------|----------------------------------|
| kodak | <span style="color:green" > - 1.9 % </span> | <span style="color:green"> - 3.4 % </span> | <span style="color:green" > - 16.4 % </span> | <span style="color:#f50" > + 4.5 % </span> | 1880 | 96 |
| clic20-pro-valid | <span style="color:green" > - 4.2 % </span> | <span style="color:green"> - 1.0 % </span> | <span style="color:green" > - 24.8 % </span> | <span style="color:green"> - 1.9 % </span> | 1907 | 364 |
| jvet class B | <span style="color:green" > - 7.2 % </span> | <span style="color:gray"> / </span> | <span style="color:green" > - 10.8 % </span> | <span style="color:#f50"> + 19.5 % </span> | 1803 | 260 |
<table class="tg"><thead>
<tr>
<th class="tg-86ol" rowspan="2"></th>
<th class="tg-86ol" colspan="6">BD-rate of Cool-chic 3.4 vs. [%]</th>
<th class="tg-86ol" colspan="2">Avg. decoder complexity</th>
</tr>
<tr>
<th class="tg-86ol"><a href="https://arxiv.org/abs/2001.01568" target="_blank" rel="noopener noreferrer">Cheng</a></th>
<th class="tg-86ol"><a href="https://arxiv.org/abs/2203.10886" target="_blank" rel="noopener noreferrer">ELIC</a></th>
<th class="tg-dfl2"><span style="font-weight:bold">Cool-chic 3.3</span></th>
<th class="tg-86ol"><a href="https://arxiv.org/abs/2312.02753" target="_blank" rel="noopener noreferrer">C3</a></th>
<th class="tg-86ol">HEVC (HM 16)</th>
<th class="tg-86ol">VVC (VTM 19)</th>
<th class="tg-86ol">MAC / pixel</th>
<th class="tg-86ol">CPU Time [ms]</th>
</tr></thead>
<tbody>
<tr>
<td class="tg-86ol">kodak</td>
<td class="tg-qch7"><span style="color:green" > -4.2 % </span></td>
<td class="tg-xd3r"><span style="color:red" > +7.5 % </span></td>
<td class="tg-qch7"><span style="color:green" > -0.9 % </span></td>
<td class="tg-qch7"><span style="color:green" > -4.3 % </span></td>
<td class="tg-qch7"><span style="color:green" > -17.2 % </span></td>
<td class="tg-xd3r"><span style="color:red" > +3.4 % </span></td>
<td class="tg-dfl2">1303</td>
<td class="tg-dfl2">74</td>
</tr>
<tr>
<td class="tg-86ol">clic20-pro-valid</td>
<td class="tg-qch7"><span style="color:green" > -13.2 % </span></td>
<td class="tg-qch7"><span style="color:green" > -0.2 % </span></td>
<td class="tg-qch7"><span style="color:green" > -0.3 % </span></td>
<td class="tg-qch7"><span style="color:green" > -1.3 % </span></td>
<td class="tg-qch7"><span style="color:green" > -25.1 % </span></td>
<td class="tg-qch7"><span style="color:green" > -2.3 % </span></td>
<td class="tg-dfl2">1357</td>
<td class="tg-dfl2">354</td>
</tr>
<tr>
<td class="tg-86ol">jvet </td>
<td class="tg-5niz"><span style="color:gray" >/</span></td>
<td class="tg-5niz"><span style="color:gray" >/</span></td>
<td class="tg-qch7"><span style="color:green" >-0.2 %</span></td>
<td class="tg-5niz"><span style="color:gray" >/</span></td>
<td class="tg-qch7"><span style="color:green" >-18.3 %</span></td>
<td class="tg-xd3r"><span style="color:red" >+18.6 %</span></td>
<td class="tg-dfl2">1249</td>
<td class="tg-dfl2">143</td>
</tr>
</tbody></table>

<br/>

_Decoding times are obtained on a single CPU core of an AMD EPYC 7282 16-Core Processor_

_PSNR is computed in the RGB domain for kodak and CLIC20, in the YUV420 domain for jvet_
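
As a quick sanity check of the "30% less complex" headline, the sketch below recomputes the relative drop in MAC per decoded pixel from the two results tables in this diff. It assumes that the Markdown table reports Cool-chic 3.3, that the HTML table reports Cool-chic 3.4, and that the `jvet` row covers the same sequences as the earlier `jvet class B` row. The names `mac_per_pixel` and `relative_reduction` are introduced for this example only.

```python
# MAC per decoded pixel, copied from the two results tables of this README diff.
# Assumption: first value = Cool-chic 3.3 table, second value = Cool-chic 3.4 table.
mac_per_pixel = {
    "kodak": (1880, 1303),
    "clic20-pro-valid": (1907, 1357),
    "jvet": (1803, 1249),
}


def relative_reduction(before, after):
    """Relative reduction in percent when going from `before` to `after`."""
    return 100.0 * (before - after) / before


reductions = {
    name: relative_reduction(before, after)
    for name, (before, after) in mac_per_pixel.items()
}

for name, value in reductions.items():
    print(f"{name}: {value:.1f} % fewer MAC / pixel")
```

Under these assumptions, the per-dataset reductions all land close to the advertised 30%.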


### Kodak

<div style="text-align: center;">
<!-- <img src="./results/image/kodak/rd.png" alt="Kodak rd results" width="800" style="centered"/> -->
<img src="./docs/source/assets/kodak/rd.png" alt="Kodak rd results" width="750" style="centered"/>
<img src="./docs/source/assets/kodak/perf_complexity.png" alt="Kodak performance complexity" width="750" style="centered"/>
<!-- <img src="./results/image/jvet/rd.png" alt="CLIC rd results" width="800" style="centered"/> -->
<img src="./docs/source/assets/kodak/concat_img.png" alt="Kodak rd results" width="90%" style="centered"/>
</div>
<br/>

### CLIC20 Pro Valid

<div style="text-align: center;">
<!-- <img src="./results/image/kodak/rd.png" alt="Kodak rd results" width="800" style="centered"/> -->
<img src="./docs/source/assets/clic20-pro-valid/rd.png" alt="CLIC20 rd results" width="750" style="centered"/>
<img src="./docs/source/assets/clic20-pro-valid/perf_complexity.png" alt="CLIC20 performance complexity" width="750" style="centered"/>
<!-- <img src="./results/image/jvet/rd.png" alt="CLIC rd results" width="800" style="centered"/> -->
<img src="./docs/source/assets/clic20-pro-valid/concat_img.png" alt="CLIC20 rd results" width="90%" style="centered"/>
</div>
<br/>

### JVET Class B

<div style="text-align: center;">
<!-- <img src="./results/image/kodak/rd.png" alt="Kodak rd results" width="800" style="centered"/> -->
<img src="./docs/source/assets/jvet/rd_classB.png" alt="JVET class B rd results" width="750" style="centered"/>
<img src="./docs/source/assets/jvet/perf_complexity_classB.png" alt="JVET class B performance complexity" width="750" style="centered"/>
<!-- <img src="./results/image/jvet/rd.png" alt="CLIC rd results" width="800" style="centered"/> -->
<img src="./docs/source/assets/jvet/concat_img_classB.png" alt="JVET class B rd results" width="90%" style="centered"/>
</div>
<br/>

</br>

# Thanks

Special thanks go to Hyunjik Kim, Matthias Bauer, Lucas Theis, Jonathan Richard Schwarz and Emilien Dupont for their great work enhancing Cool-chic: [_C3: High-performance and low-complexity neural compression from a single image or video_, Kim et al.](https://arxiv.org/abs/2312.02753)
@@ -154,7 +204,6 @@ Special thanks go to Hyunjik Kim, Matthias Bauer, Lucas Theis, Jonathan Richard

<div align="center">

</br>

#

9 changes: 4 additions & 5 deletions cfg/dec/hop.cfg
@@ -1,6 +1,5 @@
arm = 24,2
layers_synthesis = 40-1-linear-relu,3-1-linear-none,3-3-residual-relu,3-3-residual-none
arm = 16,2
layers_synthesis = 48-1-linear-relu,X-1-linear-none,X-3-residual-relu,X-3-residual-none
n_ft_per_res = 1,1,1,1,1,1,1
upsampling_kernel_size = 8
static_upsampling_kernel = False

ups_k_size = 8
ups_preconcat_k_size = 7
6 changes: 3 additions & 3 deletions cfg/dec/lop.cfg
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
arm = 8,2
layers_synthesis = 16-1-linear-relu,3-1-linear-none,3-3-residual-relu,3-3-residual-none
layers_synthesis = 16-1-linear-relu,X-1-linear-none,X-3-residual-relu,X-3-residual-none
n_ft_per_res = 1,1,1,1,1,1,1
upsampling_kernel_size = 4
static_upsampling_kernel = False
ups_k_size = 8
ups_preconcat_k_size = 7
6 changes: 3 additions & 3 deletions cfg/dec/mop.cfg
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
arm = 16,2
layers_synthesis = 16-1-linear-relu,3-1-linear-none,3-3-residual-relu,3-3-residual-none
layers_synthesis = 16-1-linear-relu,X-1-linear-none,X-3-residual-relu,X-3-residual-none
n_ft_per_res = 1,1,1,1,1,1,1
upsampling_kernel_size = 4
static_upsampling_kernel = False
ups_k_size = 8
ups_preconcat_k_size = 7
6 changes: 3 additions & 3 deletions cfg/dec/vlop.cfg
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
arm = 8,1
layers_synthesis = 8-1-linear-relu,3-1-linear-none,3-3-residual-none
layers_synthesis = 8-1-linear-relu,X-1-linear-none,X-3-residual-none
n_ft_per_res = 1,1,1,1,1,1,1
upsampling_kernel_size = 4
static_upsampling_kernel = False
ups_k_size = 8
ups_preconcat_k_size = 7
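
The `layers_synthesis` strings in the decoder configurations above pack one layer per comma-separated item. The sketch below shows one way such a string could be split into fields. It is a hypothetical helper, not the parser shipped with Cool-chic, and the field meanings (output features, kernel size, layer type, non-linearity) as well as the reading of `X` as "number of output channels" are assumptions inferred from the old values (`3-1-linear-none`) being replaced by `X-1-linear-none`.

```python
def parse_layers_synthesis(spec, n_output_channels=3):
    """Split a layers_synthesis string into one dict per layer.

    Assumed item layout: <out_ft>-<k_size>-<type>-<non_linearity>.
    Assumption: "X" stands for the number of output channels.
    """
    layers = []
    for item in spec.split(","):
        out_ft, k_size, layer_type, non_linearity = item.strip().split("-")
        layers.append({
            "out_ft": n_output_channels if out_ft == "X" else int(out_ft),
            "k_size": int(k_size),
            "type": layer_type,
            "non_linearity": non_linearity,
        })
    return layers


# Value copied from the new cfg/dec/hop.cfg in this diff.
hop = "48-1-linear-relu,X-1-linear-none,X-3-residual-relu,X-3-residual-none"
hop_layers = parse_layers_synthesis(hop)
for layer in hop_layers:
    print(layer)
```

Writing `X` instead of a hard-coded `3` would let the same configuration file serve inputs with a different number of channels, which is presumably the motivation for the change.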
57 changes: 57 additions & 0 deletions coolchic/cpp/CMakeLists.txt
@@ -0,0 +1,57 @@

add_executable(ccdec)

set(CMAKE_INSTALL_PREFIX ${PROJECT_SOURCE_DIR})

if(WIN32)

message(STATUS "[ERROR] Cool-chic decoder not yet implemented for Windows...")

# Check Apple first, then UNIX (Apple + Linux) so that if we enter the UNIX if
# it means that we're on Linux.
elseif(APPLE)

if(CMAKE_SYSTEM_PROCESSOR STREQUAL "arm64")

# Changes when compiling for arm64 Apple Mac:
# - Remove all *_avx2.cpp and *_avx512.cpp files
# - Remove the -mfma from the compilation options
# - Remove all the target_link_options... what is this for??
#
# It only compiles using g++/gcc, not clang which defaults to
# an older version apparently?
# cmake -DCMAKE_C_COMPILER=/opt/homebrew/bin/gcc-13 -DCMAKE_CXX_COMPILER=/opt/homebrew/bin/g++-13 ..

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O3 -g -Wall -Winline")

target_sources(ccdec PRIVATE ccdecapi.cpp cc-bitstream.cpp cc-contexts.cpp arm_cpu.cpp syn_cpu.cpp BitStream.cpp TDecBinCoderCABAC.cpp Contexts.cpp)

else()

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O3 -g -mfma -Winline")

# For now, we compile the *_avx2.cpp files, but they are
# excluded from ccdec.cpp using quick & dirty #ifdef __APPLE__
target_sources(ccdec PRIVATE ccdecapi.cpp cc-bitstream.cpp cc-contexts.cpp arm_cpu.cpp arm_avx2.cpp ups_cpu.cpp ups_avx2.cpp syn_cpu.cpp syn_avx2.cpp BitStream.cpp TDecBinCoderCABAC.cpp Contexts.cpp)

set_source_files_properties(arm_avx2.cpp PROPERTIES COMPILE_FLAGS "-mavx2")
set_source_files_properties(ups_avx2.cpp PROPERTIES COMPILE_FLAGS "-mavx2")
set_source_files_properties(syn_avx2.cpp PROPERTIES COMPILE_FLAGS "-mavx2")

endif()

elseif(UNIX)

message(STATUS "Architecture: Linux")

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O3 -g -mfma -Wall -Winline -DCCDEC_EXE -DCCDECAPI_AVX2_OPTIONAL")

target_sources(ccdec PRIVATE ccdecapi.cpp cc-bitstream.cpp cc-contexts.cpp cc-frame-decoder.cpp frame-memory.cpp arm_cpu.cpp arm_avx2.cpp ups_cpu.cpp ups_avx2.cpp syn_cpu.cpp syn_avx2.cpp BitStream.cpp TDecBinCoderCABAC.cpp Contexts.cpp)
set(CMAKE_EXE_LINKER_FLAGS "-static")

set_source_files_properties(arm_avx2.cpp PROPERTIES COMPILE_FLAGS "-mavx2")
set_source_files_properties(ups_avx2.cpp PROPERTIES COMPILE_FLAGS "-mavx2")
set_source_files_properties(syn_avx2.cpp PROPERTIES COMPILE_FLAGS "-mavx2")

endif()
