YTDL2TRANSCRIPT

Experiments in converting YT subtitle formats to a JSON transcript with word-level timings.

Setup

Clone the repository with git clone and cd into the repo.

Run npm install or yarn.

Usage

Run make.

System Architecture

ytdl subtitles and automatic captions formats

ytdl can get subtitles or automatic captions in several formats: ttml, vtt, srv1, srv2 and srv3. The first ytdl call in the Makefile requests srv3/ttml/vtt, in that order of preference. The remaining ytdl calls force ttml and vtt downloads as well, just to have something to test the conversion on. The conversion script converts from srv3, ttml or vtt.
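As a rough illustration of the first call, the sketch below builds an argument list for youtube-dl with the srv3/ttml/vtt preference. The helper name `buildYtdlArgs` and the default language are assumptions made for this example; the flags actually used in the Makefile may differ.

```javascript
// Hypothetical helper, not taken from the repository's Makefile.
// Builds youtube-dl arguments that fetch only subtitles or automatic
// captions, asking for srv3 first, then ttml, then vtt.
function buildYtdlArgs(url, lang = 'en') {
  return [
    '--skip-download',   // captions only, no media download
    '--write-sub',       // uploaded subtitles, if any
    '--write-auto-sub',  // automatic captions, if any
    '--sub-lang', lang,
    '--sub-format', 'srv3/ttml/vtt', // preference order
    url,
  ];
}
```

The resulting array can be passed to something like Node's `child_process.spawn('youtube-dl', args)`.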

However, from YT STT only srv3 and vtt have word timing. For now, real word timing is processed only from srv3; for ttml and vtt the word timings are interpolated.

With youtube-dl, there is no guarantee that srv3 is always available, so it falls back to the other formats. The precedence is srv3/ttml/vtt.
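On the conversion side, the same precedence can be applied to whatever files were downloaded. This is a minimal sketch that assumes the format can be read from the file extension; `pickSubtitleFile` and the example file names are hypothetical and not part of the repository.

```javascript
// Precedence described above: srv3 first, then ttml, then vtt.
const PRECEDENCE = ['srv3', 'ttml', 'vtt'];

// Hypothetical helper: returns the first file matching the precedence
// order, or null when none of the supported formats is present.
function pickSubtitleFile(filenames) {
  for (const ext of PRECEDENCE) {
    const match = filenames.find((name) =>
      name.toLowerCase().endsWith('.' + ext)
    );
    if (match) return match;
  }
  return null;
}
```

For example, given `['talk.en.vtt', 'talk.en.ttml']` this picks `talk.en.ttml`, because no srv3 file is available.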

### Non-word-level timing

Now, in the formats without word-level timing, the timings per line overlap because the text is displayed in 2 lines that shift up.

Basically, I have to discard the end time of each line and set it to the start of the next line, then interpolate the words.
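The two steps could look roughly like the sketch below. It assumes a cue shape of `{ start, end, text }` with times in milliseconds and spreads words evenly across each line; both the shape and the function names are assumptions for illustration, not the conversion script's actual code.

```javascript
// Step 1: discard each line's end time and use the start of the next
// line instead. The last line keeps its own end time.
function clampCueEnds(cues) {
  return cues.map((cue, i) => {
    const next = cues[i + 1];
    return next ? { ...cue, end: next.start } : { ...cue };
  });
}

// Step 2: linear interpolation, giving every word in the line an equal
// share of the line's duration.
function interpolateWords(cue) {
  const words = cue.text.trim().split(/\s+/).filter(Boolean);
  const step = (cue.end - cue.start) / words.length;
  return words.map((text, i) => ({
    text,
    start: Math.round(cue.start + i * step),
    end: Math.round(cue.start + (i + 1) * step),
  }));
}

// Convenience wrapper: overlapping lines in, flat list of timed words out.
function linesToWords(cues) {
  return clampCueEnds(cues).flatMap(interpolateWords);
}
```

Equal spacing is the simplest choice; weighting each word by its character count would be a possible refinement.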

I think I can fix the repetition. The next step is to lift the timecodes for the words, which won't be present on all the words, and then most likely use stt-align-node to spread them to the rest of the words.
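One possible way to handle the repetition, sketched under the assumption that each cue carries its displayed lines as an array (`{ start, end, lines }`) and that a repeated line is an exact copy of a line in the previous cue. The function name and cue shape are hypothetical, and the stt-align-node step is not shown.

```javascript
// Hypothetical sketch: with a two-line display that shifts up, a cue can
// repeat a line already shown by the previous cue. Keep only the lines
// that are new, and drop cues that contribute nothing new.
function dropRepeatedLines(cues) {
  const result = [];
  let previousLines = [];
  for (const cue of cues) {
    const lines = cue.lines.map((line) => line.trim()).filter(Boolean);
    const fresh = lines.filter((line) => !previousLines.includes(line));
    previousLines = lines;
    if (fresh.length > 0) {
      result.push({ start: cue.start, end: cue.end, text: fresh.join(' ') });
    }
  }
  return result;
}
```

A limitation of this approach: a speaker genuinely saying the same line twice in a row would also be dropped, so a real implementation may need to compare timings as well.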

Development env

Requires Node >= 12 and youtube-dl.

Build

NA

Tests

NA

Deployment

on npm, TBC
