Consider video length #68

DuendeInexistente · 2024-09-03T00:19:44Z

I find it odd that video length isn't taken into consideration. At the very least, videos with different lengths should be given a greater dupe distance.

ianwal · 2024-09-28T19:17:00Z

I thought about considering videos of different lengths to be non-duplicates, but some people requested that a small clip contained in a larger video still be matched, so I just left it as is.

Is there a specific reason that you would prefer to not consider them? Are you getting lots of small clips that are included in larger videos marked as duplicates?

DuendeInexistente · 2024-09-28T19:36:05Z

It's more that it creates a lot of false positives, and I imagine computing the distance from the actual difference between two videos would be a nightmare in terms of both processing power and programming. So making length affect the distance should be a good compromise.

IMO the best implementation, if we want to base it on available data alone, would be: only videos with the same number of frames and the same length are distance 0; videos where only one of the two matches (e.g. a gif with the same frames but played slower, so a different length, or the same length but twice the frames) are distance 2; and anything that matches neither is distance 4. This keeps clips of larger videos as positives but gives you degrees of closeness to filter out the same video from different sources. Resolution could be counted as a data point too, but one of the main purposes of deduplicating is picking the highest resolution. Maybe make a resolution + frames + length match distance 0, a frames + length match distance 2, only one of frames or length matching distance 4, and neither distance 6?
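
A minimal sketch of the second scheme proposed above (0 / 2 / 4 / 6), to make the tiers concrete. Everything here is an assumption for illustration: `VideoInfo`, `metadata_distance`, and the `duration_tol_s` tolerance are hypothetical names and not part of the project's code, and the metadata is assumed to have been read already by some other tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoInfo:
    """Hypothetical container for metadata that is assumed to be available."""
    frame_count: int
    duration_s: float
    width: int
    height: int


def metadata_distance(a: VideoInfo, b: VideoInfo, duration_tol_s: float = 0.05) -> int:
    """Return a coarse distance based only on frame count, length, and resolution.

    Tiers follow the proposal in this thread:
      0 -> resolution, frame count, and length all match
      2 -> frame count and length match, resolution differs
      4 -> only one of frame count / length matches
      6 -> neither frame count nor length matches
    The duration tolerance is an assumption, added because container
    durations rarely agree to the exact millisecond.
    """
    same_frames = a.frame_count == b.frame_count
    same_length = abs(a.duration_s - b.duration_s) <= duration_tol_s
    same_resolution = (a.width, a.height) == (b.width, b.height)

    matches = int(same_frames) + int(same_length)
    if matches == 2:
        return 0 if same_resolution else 2
    if matches == 1:
        return 4
    return 6


if __name__ == "__main__":
    original = VideoInfo(frame_count=300, duration_s=10.0, width=1920, height=1080)
    downscaled = VideoInfo(frame_count=300, duration_s=10.0, width=1280, height=720)
    slowed = VideoInfo(frame_count=300, duration_s=20.0, width=1920, height=1080)
    clip = VideoInfo(frame_count=90, duration_s=3.0, width=1920, height=1080)

    for name, other in [("downscaled", downscaled), ("slowed", slowed), ("clip", clip)]:
        print(name, metadata_distance(original, other))
```

A value like this could be added to the existing perceptual distance rather than replace it, so a short clip of a larger video still shows up as a match, only further away than an exact re-encode.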
