Reason to set min-scene-len == TimeCodeValue("0.6s") as a default value #477

awkrail · 2025-01-30T01:48:45Z

Description:

Thank you for your work! I downloaded a video from here and applied HistogramDetector to it via both the Python API and the command line. I found that they output different scene cuts, and that the min-scene-len parameter differs between them: the Python API uses a default of 15, while the command line sets TimeCodeValue("0.6s") in the code.
Could you tell me why min-scene-len is set to TimeCodeValue("0.6s") by default?
Example:

Python API

from scenedetect import detect, HistogramDetector
scene_list = detect('RoripwjYFp8_60.0_210.0.mp4', HistogramDetector())

Command line

scenedetect -i RoripwjYFp8_60.0_210.0.mp4 list-scenes detect-hist

Environment:
PySceneDetect: 0.6.5
Ubuntu

Media/Files:
Input video file:

RoripwjYFp8_60.0_210.0.mp4


Breakthrough · 2025-01-30T02:48:19Z

I should start this answer off by pointing out that the current default values were both chosen very empirically by myself or another developer. Unfortunately, I haven't had the time to find a good dataset for determining an optimal value, but I am more than open to suggestions on how we should approach that. Feedback on the default values is always welcome too.

As for why the values are different, right now the SceneDetector interface assumes frame numbers are ints. This is something that I think could and should be fixed so that it takes a FrameTimecode instead. However, in order to create a FrameTimecode you must have the video framerate. This means we cannot set the default value for min_scene_len in the API in seconds, as we do not know the FPS of the video we are going to process statically.

The config file/command line parser handles this by storing the value as a string until after the video is loaded. As for why the values are the way they are, I recall finding 0.6 seconds seemed like a good reasonable default between what one would consider strobing/flashing images (e.g. as in #35), and was using a 23.976 FPS video to test. I rounded the frame count up from 14.3 to 15 and set that as the default for the API. I can't recall if it's ever been changed since to be honest.
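
As a rough illustration of the arithmetic described above, the sketch below converts a duration in seconds into a whole number of frames for a few framerates. The helper name `min_scene_len_frames` is hypothetical, and rounding up is an assumption that mirrors how the API default of 15 was derived; the command line's own conversion may round differently.

```python
import math


def min_scene_len_frames(seconds: float, fps: float) -> int:
    """Hypothetical helper: smallest whole frame count covering `seconds`."""
    # round(..., 6) guards against float noise (e.g. 17.9999999) before ceil.
    return math.ceil(round(seconds * fps, 6))


# 0.6 s at 23.976 FPS is about 14.39 frames, which rounds up to 15.
for fps in (23.976, 24.0, 30.0, 60.0):
    print(f"{fps} FPS -> {min_scene_len_frames(0.6, fps)} frames")
```

This shows why a fixed frame count and a fixed duration only agree near the framerate they were derived from: at 60 FPS, 15 frames is 0.25 seconds rather than 0.6. Assuming the detector takes `min_scene_len` as an integer frame count, as described in this thread for the Python API, a value computed this way from the video's framerate could be passed to the detector to approximate the command line behaviour.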

These are things I think should be fixed, namely:

SceneDetector classes should take FrameTimecode objects instead of integer frame numbers. The unit of time between frames is also potentially useful information for a detector.
FrameTimecode objects should be able to be created in the absence of a framerate. In the event an operation can't be calculated due to missing information (e.g. you want to add frames with seconds, but neither have a defined framerate), an error can be thrown.
Eventually, a better default for min_scene_len should be chosen more systematically, even if the end result is still somewhat empirical (e.g. a survey).
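
One possible shape for the second item is sketched below. `LazyTimecode` is a hypothetical class written only for illustration, not part of the scenedetect API: it stores either frames or seconds, treats the framerate as optional, and raises an error only when an operation actually needs a framerate that is missing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LazyTimecode:
    """Hypothetical time value: holds frames OR seconds, framerate optional."""
    frames: Optional[int] = None
    seconds: Optional[float] = None
    fps: Optional[float] = None

    def __post_init__(self):
        # Exactly one of the two units must be given.
        if (self.frames is None) == (self.seconds is None):
            raise ValueError("specify exactly one of frames or seconds")

    def to_seconds(self, fps: Optional[float] = None) -> float:
        fps = fps or self.fps
        if self.seconds is not None:
            return self.seconds
        if fps is None:
            raise ValueError("framerate required to convert frames to seconds")
        return self.frames / fps

    def __add__(self, other: "LazyTimecode") -> "LazyTimecode":
        fps = self.fps or other.fps  # assumes both agree when both are set
        if self.frames is not None and other.frames is not None:
            # Same unit on both sides: no framerate needed.
            return LazyTimecode(frames=self.frames + other.frames, fps=fps)
        # At least one side is in seconds; to_seconds raises if a frame
        # count must be converted and no framerate is known.
        total = self.to_seconds(fps) + other.to_seconds(fps)
        return LazyTimecode(seconds=total, fps=fps)


print(LazyTimecode(frames=10) + LazyTimecode(frames=5))  # works without fps
try:
    LazyTimecode(frames=10) + LazyTimecode(seconds=0.6)  # mixed units, no fps
except ValueError as err:
    print("error:", err)
```

With a design like this, a default such as 0.6 seconds could be declared statically and resolved to a frame count only once the video, and therefore its framerate, is known.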

I'll make sure each of the above items has a GitHub issue filed against it before closing this out.

Lastly, how do you find the default values for your purpose? Do they work well or are there any changes you would make?

Thanks for the question, hope this helps!

awkrail · 2025-01-30T03:14:51Z

@Breakthrough
Thank you for the quick and detailed response! I now understand how the values were determined.

> The config file/command line parser handles this by storing the value as a string until after the video is loaded. As for why the values are the way they are, I recall finding 0.6 seconds seemed like a good reasonable default between what one would consider strobing/flashing images (e.g. as in #35), and was using a 23.976 FPS video to test. I rounded the frame count up from 14.3 to 15 and set that as the default for the API. I can't recall if it's ever been changed since to be honest.

I applied different detectors to several videos, and most of the cuts are accurate, so min_scene_len=15 seems to be a reasonable value. But as you pointed out, these values, including threshold, should probably be determined with a more statistical approach. I will investigate whether suitable training datasets exist.

> SceneDetector classes should take FrameTimecode objects instead of integer frame numbers. The unit of time between frames is also potentially useful information for a detector.
>
> FrameTimecode objects should be able to be created in the absence of a framerate. In the event an operation can't be calculated due to missing information (e.g. you want to add frames with seconds, but neither have a defined framerate), an error can be thrown.
>
> Eventually, a better default for min_scene_len should be chosen more systematically, even if the end result is still somewhat empirical (e.g. a survey).

That sounds interesting. I will think about how to address these issues.

> Lastly, how do you find the default values for your purpose? Do they work well or are there any changes you would make?

Inspired by PySceneDetect, I am developing Shutoh, yet another scene detector, written in C++20. It is a personal project: I wanted to learn C++ and thought that reimplementing PySceneDetect in C++ would be a good starting point. PySceneDetect is also a great project, so I am thinking about how I can contribute to it.

Breakthrough · 2025-02-01T01:21:25Z

> Inspired by PySceneDetect, I am developing Shutoh, yet another scene detector, written in C++20. It is a personal project: I wanted to learn C++ and thought that reimplementing PySceneDetect in C++ would be a good starting point. PySceneDetect is also a great project, so I am thinking about how I can contribute to it.

Very cool - I've wanted to do something similar for a while, but haven't had time to pursue it. It looks like you have made a lot of progress, that is great to see!

There are many API changes I wish I could make, but can't anymore (at least not without causing significant churn) due to how many packages depend on the scenedetect API. If you are curious, I can discuss with you further some of those changes if you think it would help your project. If you are only making a CLI tool though, the API is less important than the user interface :)

awkrail · 2025-02-01T13:55:36Z

> Very cool - I've wanted to do something similar for a while, but haven't had time to pursue it. It looks like you have made a lot of progress, that is great to see!

Thank you very much! As discussed in SceneStats, my main concern is speed. PySceneDetect runs reasonably fast on 2-3 minute videos (2.0 secs via the API), but it takes much longer on longer 10-minute videos. This motivated me to develop Shutoh.

> There are many API changes I wish I could make, but can't anymore (at least not without causing significant churn) due to how many packages depend on the scenedetect API. If you are curious, I can discuss with you further some of those changes if you think it would help your project. If you are only making a CLI tool though, the API is less important than the user interface :)

Thank you! I am very interested. I am currently focusing on the CLI tool, but I plan to develop an API in the future. I will join the PySceneDetect Discord channel and would be glad to talk with you there.
