Padding issues in video compression model (SSF: scale-space flow) #154

herok97 · 2022-07-13T14:37:54Z

Hello.
First of all, thank you for the good work.

For the image compression models, given the resolution of the hyperprior latents, I understand that the input image's spatial resolution should be a multiple of 64 (Kodak24 satisfies this condition).

For the video compression model (SSF), the hyperprior encoder downscales the latent y three times, so I think the input frames' spatial resolution should be a multiple of 128.
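The size constraint can be written as a small helper. This is only a sketch: `round_up_to_multiple` and `padded_size` are hypothetical names, and the multiples 64 and 128 are the values assumed in this post, not values read from the model.

```python
from typing import Tuple


def round_up_to_multiple(value: int, multiple: int) -> int:
    """Return the smallest multiple of `multiple` that is >= `value`."""
    if value <= 0 or multiple <= 0:
        raise ValueError("value and multiple must be positive")
    return ((value + multiple - 1) // multiple) * multiple


def padded_size(height: int, width: int, multiple: int) -> Tuple[int, int]:
    """Return (height, width) rounded up so both are divisible by `multiple`."""
    return (
        round_up_to_multiple(height, multiple),
        round_up_to_multiple(width, multiple),
    )


# UVG frames are 1080x1920 (height x width).
print(padded_size(1080, 1920, 128))  # (1152, 1920)
print(padded_size(1080, 1920, 64))   # (1088, 1920)
```

Only the height needs padding here, since 1920 is already divisible by both 64 and 128.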

But the UVG dataset's spatial resolution is 1080x1920.

So I simply applied reflection padding to the input frames so that their spatial resolution is a multiple of 128 (1152x1920).
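For reference, the padding step and the matching crop could look roughly like the sketch below. It uses NumPy arrays rather than the tensors the model consumes, the helper names `pad_to_multiple` and `crop_to_original` are hypothetical, and splitting the padding evenly between both sides is an assumption (the padding could equally be placed on one side only). The crop is there on the assumption that PSNR should be measured on the original 1080x1920 area only.

```python
import numpy as np


def pad_to_multiple(frame, multiple=128, mode="reflect"):
    """Pad an (H, W, C) frame so that H and W are divisible by `multiple`.

    Returns the padded frame and the (top, bottom, left, right) pad widths.
    """
    h, w = frame.shape[:2]
    pad_h = (-h) % multiple  # rows missing to reach the next multiple
    pad_w = (-w) % multiple  # columns missing to reach the next multiple
    top, left = pad_h // 2, pad_w // 2
    bottom, right = pad_h - top, pad_w - left
    padded = np.pad(frame, ((top, bottom), (left, right), (0, 0)), mode=mode)
    return padded, (top, bottom, left, right)


def crop_to_original(frame, pads):
    """Undo `pad_to_multiple` by slicing off the padded borders."""
    top, bottom, left, right = pads
    h, w = frame.shape[:2]
    return frame[top:h - bottom, left:w - right]


# Dummy 1080x1920 RGB frame standing in for a UVG frame.
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
padded, pads = pad_to_multiple(frame, multiple=128)
print(padded.shape, pads)  # (1152, 1920, 3) (36, 36, 0, 0)

restored = crop_to_original(padded, pads)
assert restored.shape == frame.shape
```

With PyTorch tensors the same idea should carry over to `torch.nn.functional.pad` with `mode="reflect"`, applied before `compress` and undone by slicing after decoding.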

However, the results show that P-frame compression performance is not much better than I-frame compression performance, as shown below (quality 4).

Video name: Beauty (GOP 12)
frame#, Bpp, PSNR, | Res_Bpp, Motion_Bpp
Beauty 000, 0.096373, 33.829868, | 0.096373, 0.000000
Beauty 001, 0.113426, 33.837170, | 0.107114, 0.006312
Beauty 002, 0.110293, 33.835434, | 0.104244, 0.006049
Beauty 003, 0.106605, 33.844116, | 0.100617, 0.005988
Beauty 004, 0.101451, 33.867897, | 0.095602, 0.005849
Beauty 005, 0.099583, 33.888126, | 0.093596, 0.005988
Beauty 006, 0.095417, 33.928127, | 0.089784, 0.005633
Beauty 007, 0.091497, 33.947746, | 0.086296, 0.005201
Beauty 008, 0.089059, 33.971870, | 0.083719, 0.005340
Beauty 009, 0.087022, 34.006771, | 0.081744, 0.005278
Beauty 010, 0.085957, 34.008808, | 0.080664, 0.005293
Beauty 011, 0.082716, 34.018856, | 0.077377, 0.005340

Beauty 012, 0.078920, 34.021229, | 0.078920, 0.000000
Beauty 013, 0.087793, 34.046371, | 0.082330, 0.005463
Beauty 014, 0.082145, 34.026760, | 0.076898, 0.005247
Beauty 015, 0.081883, 34.017952, | 0.076806, 0.005077
Beauty 016, 0.081312, 34.029579, | 0.076281, 0.005031
Beauty 017, 0.080370, 34.024151, | 0.075324, 0.005046
Beauty 018, 0.081497, 34.020878, | 0.076204, 0.005293
Beauty 019, 0.082068, 34.008186, | 0.076991, 0.005077
Beauty 020, 0.081019, 34.009117, | 0.075679, 0.005340
Beauty 021, 0.081852, 34.009365, | 0.076466, 0.005386
Beauty 022, 0.079491, 34.031116, | 0.074336, 0.005154
Beauty 023, 0.076559, 34.029755, | 0.071451, 0.005108

Then I tried to analyze what happened by center-cropping the UVG frames to 768x768 and running the same test conditions (no padding).
I got the following results (which look reasonable, I think).

Video name: Beauty (GOP 12)
frame#, Bpp, PSNR, | Res_Bpp, Motion_Bpp
Beauty 000, 0.086643, 34.743198, | 0.086643, 0.000000
Beauty 001, 0.077854, 34.867981, | 0.055990, 0.005425
Beauty 003, 0.062283, 34.860504, | 0.056641, 0.005642
Beauty 004, 0.056044, 34.889980, | 0.050890, 0.005154
Beauty 005, 0.058485, 34.888424, | 0.053168, 0.005317
Beauty 006, 0.056044, 34.859280, | 0.051161, 0.004883
Beauty 007, 0.057020, 34.869595, | 0.052192, 0.004829
Beauty 008, 0.054145, 34.831181, | 0.049371, 0.004774
Beauty 009, 0.050836, 34.863861, | 0.046224, 0.004612
Beauty 010, 0.050727, 34.832378, | 0.045953, 0.004774
Beauty 011, 0.051107, 34.835670, | 0.046441, 0.004666

Beauty 012, 0.072862, 34.713913, | 0.072862, 0.000000
Beauty 013, 0.064724, 34.843842, | 0.060113, 0.004612
Beauty 014, 0.051866, 34.804825, | 0.047201, 0.004666
Beauty 015, 0.051215, 34.842400, | 0.046604, 0.004612
Beauty 016, 0.047092, 34.867538, | 0.042860, 0.004232
Beauty 017, 0.048611, 34.870323, | 0.044434, 0.004178
Beauty 018, 0.045193, 34.853519, | 0.040961, 0.004232
Beauty 019, 0.047092, 34.858543, | 0.043077, 0.004015
Beauty 020, 0.045898, 34.857796, | 0.041558, 0.004340
Beauty 021, 0.043620, 34.851822, | 0.039117, 0.004503
Beauty 022, 0.039876, 34.868626, | 0.035590, 0.004286
Beauty 023, 0.040744, 34.894634, | 0.036404, 0.004340
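To compare the two runs more directly, logs in the format above can be reduced to a mean I-frame and mean P-frame Bpp. The sketch below assumes the log layout shown in this post and a GOP of 12; `parse_frame_log` and `mean_bpp_by_type` are hypothetical helper names, and the three sample lines are copied from the padded run.

```python
from statistics import mean


def parse_frame_log(lines):
    """Parse lines like 'Beauty 000, 0.096373, 33.829868, | 0.096373, 0.000000'."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip the blank separator between GOPs
        left, right = line.split("|")
        name, bpp, psnr = [f.strip() for f in left.split(",") if f.strip()]
        res_bpp, motion_bpp = [f.strip() for f in right.split(",")]
        rows.append({
            "index": int(name.split()[-1]),
            "bpp": float(bpp),
            "psnr": float(psnr),
            "res_bpp": float(res_bpp),
            "motion_bpp": float(motion_bpp),
        })
    return rows


def mean_bpp_by_type(rows, gop=12):
    """Return (mean I-frame Bpp, mean P-frame Bpp); I-frames start each GOP."""
    i_bpp = [r["bpp"] for r in rows if r["index"] % gop == 0]
    p_bpp = [r["bpp"] for r in rows if r["index"] % gop != 0]
    return mean(i_bpp), mean(p_bpp)


sample = [
    "Beauty 000, 0.096373, 33.829868, | 0.096373, 0.000000",
    "Beauty 001, 0.113426, 33.837170, | 0.107114, 0.006312",
    "Beauty 002, 0.110293, 33.835434, | 0.104244, 0.006049",
]
i_mean, p_mean = mean_bpp_by_type(parse_frame_log(sample))
print(f"I: {i_mean:.4f}  P: {p_mean:.4f}")  # I: 0.0964  P: 0.1119
```

Feeding the full logs of both runs through the same function gives one pair of numbers per run, which is easier to compare than the per-frame tables.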

What would be the best way to solve this padding issue?

(Added: Bpp and PSNR calculation code)

            # Counting: process one GOP (12 frames) at a time.
            # `imgs`, `model`, `h`, and `w` are defined outside this excerpt.
            if len(imgs) == 12:
                bpp_list = []    # total bpp per frame
                bpp_r_list = []  # residual (or key-frame) bpp per frame
                bpp_m_list = []  # motion bpp per frame
                psnr_list = []

                frame_strings, shape_infos = model.compress(imgs)

                # Bitrate
                for j, string in enumerate(frame_strings):
                    bpp_r = 0
                    bpp_m = 0

                    if j == 0:  # for key frame
                        # Sum the bits of every stream; [0] selects the
                        # first (only) batch element.
                        for sub_string in string:
                            bpp_r += len(sub_string[0]) * 8
                        bpp_r /= h * w
                        bpp_list.append(bpp_r)
                        bpp_r_list.append(bpp_r)
                        bpp_m_list.append(0)

                    else:  # for inter frame
                        r_strings = string['residual']
                        m_strings = string['motion']

                        for sub_string in r_strings:
                            bpp_r += len(sub_string[0]) * 8

                        for sub_string in m_strings:
                            bpp_m += len(sub_string[0]) * 8

                        # Normalize bit counts by the number of pixels.
                        bpp_r /= h * w
                        bpp_m /= h * w
                        bpp = bpp_r + bpp_m
                        bpp_list.append(bpp)
                        bpp_r_list.append(bpp_r)
                        bpp_m_list.append(bpp_m)
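One detail the excerpt above leaves open is whether `h` and `w` hold the padded or the original frame size. The sketch below shows how much that choice alone shifts the reported Bpp. It is not a diagnosis of the results: `bits_per_pixel` is a hypothetical helper, the byte payload is made up, and the nested-list layout simply mirrors what the excerpt indexes with `sub_string[0]`.

```python
def bits_per_pixel(strings, height, width):
    """Total bits in nested per-batch byte strings, divided by height * width.

    `strings` is a list whose items are lists of bytes objects, one per batch
    element; only element 0 is counted, as in the excerpt above.
    """
    total_bits = sum(len(sub_string[0]) * 8 for sub_string in strings)
    return total_bits / (height * width)


# Made-up payload: two streams of 1000 and 500 bytes for one frame.
fake_strings = [[b"\x00" * 1000], [b"\x00" * 500]]

bpp_original = bits_per_pixel(fake_strings, 1080, 1920)  # original UVG size
bpp_padded = bits_per_pixel(fake_strings, 1152, 1920)    # padded size

# Same bits, different denominators: the ratio is 1080 / 1152.
print(round(bpp_padded / bpp_original, 4))  # 0.9375
```

Dividing by the original 1080x1920 pixel count keeps the number comparable with results reported on unpadded frames, assuming that is the convention being compared against.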

SSF pre-trained model results (quality 3, 4) compared with the paper's results


Replies: 0 comments
