video_utils.subtitles package

Submodules

video_utils.subtitles.ccextract module

Wrapper utils for running ccextractor CLI

video_utils.subtitles.ccextract.ccextract(in_file: str, out_base: str, text_info: dict) list[str] | None

Wrapper for the ccextractor CLI

Simply calls ccextractor using subprocess.Popen

Parameters:
  • in_file (str) – File to extract closed captions from

  • out_base (str) – Base name for output file(s)

  • text_info (dict) – Text information from call to mediainfo

Keyword Arguments:

None.

Returns:

Paths to files that ccextractor created

Return type:

list
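The pattern described above (run a CLI through subprocess.Popen, then report the files it created) can be sketched roughly as follows. The helper name run_and_collect is hypothetical and not part of video_utils, and the caller supplies the command, so no ccextractor options are assumed.

```python
import glob
import subprocess


def run_and_collect(cmd, out_base):
    """Run cmd, then return sorted paths starting with out_base (None on failure).

    Hypothetical helper for illustration only; not the package's implementation.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    proc.communicate()  # block until the process has finished
    if proc.returncode != 0:
        return None
    # glob.escape protects against special characters in the base name
    return sorted(glob.glob(glob.escape(out_base) + "*"))
```

Returning None on a non-zero exit code mirrors the list[str] | None signature shown above, although the conditions under which the real function returns None are not documented here.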

video_utils.subtitles.ccextract.dir_list(path: str)

Generate paths to all files in a directory

Parameters:

path (str) – Path to the directory to search. If a file path is given, its parent directory (dirname) is searched instead.

Returns:

File path generator

Return type:

generator
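A minimal sketch of such a generator is shown below. The name iter_files is hypothetical, and the non-recursive, sorted traversal is an assumption; the real dir_list may behave differently.

```python
import os


def iter_files(path):
    """Yield the path of every regular file directly inside a directory.

    Assumption: the search is not recursive and the order is alphabetical.
    """
    if os.path.isfile(path):
        path = os.path.dirname(path)  # a file was given; search its directory
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if os.path.isfile(full):
            yield full
```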

video_utils.subtitles.opensubtitles module

Interface with opensubtitles API

Various utilities that try to download SRT subtitle files from the opensubtitles website, using information entered by the user or parsed from the file name.

class video_utils.subtitles.opensubtitles.OpenSubtitles(username: str | None = None, userpass: str | None = None, verbose: bool = False, **kwargs)

Bases: ServerProxy

A Python class to download SRT subtitles from opensubtitles.org.

api_url = 'https://api.opensubtitles.org:443/xml-rpc'
attempts = 10
check_status(resp)

Check request status

Anything other than “200 OK” raises a UserWarning
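A status check of this kind might look like the sketch below. It assumes the response is a dict with a 'status' key; both that key and the True return value are assumptions made for illustration.

```python
def check_status(resp):
    """Raise UserWarning unless the response reports '200 OK'.

    Assumption: resp is a dict carrying its status under the 'status' key.
    """
    status = resp.get("status", "")
    if status != "200 OK":
        raise UserWarning(f"Unexpected response status: {status!r}")
    return True
```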

download(sub)

Download subtitle file and return the decompressed data.
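The decompression step could be sketched as follows, assuming the server returns the subtitle as a base64-encoded, gzip-compressed string. That encoding is an assumption here, and decode_subtitle is a hypothetical helper name.

```python
import base64
import gzip


def decode_subtitle(payload):
    """Decode a base64 string holding gzip-compressed subtitle bytes.

    Assumption: the payload is base64 text wrapping gzip data.
    """
    return gzip.decompress(base64.b64decode(payload))
```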

get_forced = False
get_subtitles(fpath: str, **kwargs) list[str] | None

Attempt to log-in to, download from, and log-out of the server.

No user interaction required.

Parameters:

fpath (str) – Full path to the movie file to download SRT file for.

Keyword Arguments:
  • title (str) – Set to title of movie to search for. Default is to use title from file.

  • imdb (str) – Set to IMDb id of movie to search for. Default is to try to get IMDb id from file name.

  • lang (str,list) – String or list of strings giving the language(s) to download subtitles in, using ISO 639-2 codes. Default is English (eng).

  • verbose (bool) – Set to True to increase verbosity. Default is False.

  • nsubs (int) – Set to the number of subtitles to download for each file. Default is one (1).

  • sort (str) –

    Set the sorting method used for downloading. Sorted in descending order. Options are

    • score : Sort based on score (default)

    • downloads : Sort based on number of times downloaded

    • date : Sort based on upload date

  • track_num (int) – Set to specific ‘track’ number for labeling. Default is to start at zero.

  • get_forced (bool) – Set to True to get only forced subtitles. Default is to get full.

Returns:

Saves an SRT subtitle file, named using the same convention as the movie file, if a subtitle is found.

imdb = None
lang = ['eng']
login()

Log in to OpenSubtitles

logout()

Log out from OpenSubtitles

nsubs = 1
save_srt(fpath='')

Save the SRT subtitle data to file

search_subs(**kwargs) None

Search for, download, and save subtitles.

server_lang = 'en'
sort = 'score'
sort_subs(sub_data)

Sort subtitles by score, download count, and date.
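As a rough illustration of descending sorting by one of the three documented methods, the sketch below orders a list of dicts. The dict keys 'score', 'downloads', and 'date' are hypothetical stand-ins; the field names in the actual server response are not given in this reference.

```python
def sort_subs(sub_data, sort="score"):
    """Return entries ordered from highest to lowest value of the chosen key.

    Assumption: each entry is a dict with 'score', 'downloads', and 'date' keys.
    """
    if sort not in ("score", "downloads", "date"):
        raise ValueError(f"Unknown sort method: {sort}")
    return sorted(sub_data, key=lambda sub: sub[sort], reverse=True)
```

ISO-formatted date strings (for example '2021-05-01') sort correctly as plain strings, which is why no date parsing is done in this sketch.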

title = None
track_num = None
user_agent = 'makemkv_to_mp4'
verbose = False

video_utils.subtitles.pgs_to_srt module

Parser/converter for PGS (.sup) files

Makes use of pgsrip package to make a ‘simplified’ converter for PGS files

class video_utils.subtitles.pgs_to_srt.MPath(path, lang: str = 'und')

Bases: MediaPath

Overrides some of MediaPath functionality

Allows the language to be set explicitly and gives a slightly different string representation.

class video_utils.subtitles.pgs_to_srt.MyPgsToSrtRipper(pgs: Pgs, options: Options)

Bases: PgsToSrtRipper

process(subs, items, post_process, confidence, max_width, oem, psm)

Slightly smarter processing

This processing method looks through subtitle items, checking to see if combining them will be too large. When combining will be too large, we run super().process() on the subtitles. Any unprocessed subtitles are returned as a list.
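The idea of combining items until the result would be too large can be sketched as a greedy batching step. This is only an illustration: it works on plain integers standing in for image heights, and both the size measure and the helper name batch_by_height are assumptions, not the pgsrip-based logic itself.

```python
def batch_by_height(heights, max_height):
    """Group consecutive heights so each batch's total stays within max_height.

    A single item taller than max_height still gets a batch of its own.
    """
    batches, current, total = [], [], 0
    for height in heights:
        if current and total + height > max_height:
            batches.append(current)  # adding this item would be too large
            current, total = [], 0
        current.append(height)
        total += height
    if current:
        batches.append(current)
    return batches
```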

rip(post_process: Callable[[str], str])

Slightly smarter ripper

We use the new process() method to ensure that images for tesseract are not too large. Then, for any subtitles that are not processed, we keep reducing the confidence level to try to get some kind of match.
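The retry strategy of lowering the confidence level for unprocessed subtitles might look roughly like this. The recognize callable, the starting confidence, and the step size are all hypothetical; the values used by the real ripper are not stated in this reference.

```python
def recognize_with_fallback(items, recognize, start=65, step=10, floor=0):
    """Retry unrecognized items with progressively lower confidence thresholds.

    recognize(item, confidence) is assumed to return text, or None on failure.
    Returns (results, pending), where pending holds items that never matched.
    """
    results = {}
    pending = list(items)
    confidence = start
    while pending and confidence >= floor:
        still_pending = []
        for item in pending:
            text = recognize(item, confidence)
            if text is None:
                still_pending.append(item)
            else:
                results[item] = text
        pending = still_pending
        confidence -= step
    return results, pending
```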

class video_utils.subtitles.pgs_to_srt.PgsParser(pgs, lang)

Bases: object

Parse information from PGS file

gen_display_sets()

Generate DisplaySet(s) from PGS file

gen_pgs_subtitle_items()

Generate PgsSubtitleItem(s) from PGS file

gen_segments()

Generate segments from PGS file

video_utils.subtitles.pgs_to_srt.pgs_to_srt(out_file, text_info, delete_source=False, **kwargs)

Convert PGS (.sup) to SRT

Parameters:
  • File (path-like) – Location of source PGS file to convert to SRT

  • lang (str) – 2-character code for subtitle language.

video_utils.subtitles.pgs_to_srt.rmfile(*args)

Remove file(s) without errors

Try to remove file(s); suppress errors

Parameters:

*args – Any number of files to remove

Returns:

None
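One common way to remove files while suppressing errors is contextlib.suppress, as in this minimal sketch. Suppressing OSError specifically is an assumption; the real function may catch a different set of exceptions.

```python
import contextlib
import os


def rmfile(*args):
    """Try to remove each path; ignore files that are missing or not removable."""
    for path in args:
        with contextlib.suppress(OSError):
            os.remove(path)
```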

video_utils.subtitles.srt_utils module

Utilities for working with SRT files

These tools can be used to clean up SRT files after conversion from image-based formats, or to interact with and update existing SRT files.

class video_utils.subtitles.srt_utils.SRTsubs(fpath)

Bases: object

A python class to parse SRT subtitle files

adjust_timing(offset: float) None

Adjust timing of subtitles

Parameters:

offset (float) – Change in time in seconds. Positive numbers shift to later time, negative to earlier.

Keyword Arguments:

None

Returns:

Updates the self.subs array

Return type:

None
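Shifting a single SRT timestamp by an offset in seconds can be sketched as below. The helper shift_timestamp is hypothetical, and clamping negative results to zero is an assumption about how early cues should be handled.

```python
from datetime import timedelta


def shift_timestamp(stamp, offset):
    """Shift an SRT timestamp ('HH:MM:SS,mmm') by offset seconds, clamping at zero."""
    hms, millis = stamp.split(",")
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    current = timedelta(hours=hours, minutes=minutes, seconds=seconds,
                        milliseconds=int(millis))
    shifted = max(current + timedelta(seconds=offset), timedelta(0))
    total_ms = shifted // timedelta(milliseconds=1)  # whole milliseconds
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"
```

A positive offset moves the cue later and a negative offset moves it earlier, matching the parameter description above.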

parse_subs()

Parse subtitles from an SRT file into a list of dictionaries

Parameters:

None

Keyword Arguments:

None

Returns:

Adds a list of dictionaries to the class

Return type:

None
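Parsing SRT text into a list of dictionaries could be sketched as follows. The dict keys ('index', 'start', 'end', 'text') and the function name parse_srt are assumptions; the layout used by SRTsubs is not documented here.

```python
import re

# One timing line, e.g. "00:00:01,000 --> 00:00:02,000"
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(text):
    """Parse SRT text into a list of dicts with index, start, end, and text."""
    subs = []
    for block in re.split(r"\n\s*\n", text.strip()):  # cues are blank-line separated
        lines = block.splitlines()
        if len(lines) < 2:
            continue
        match = TIMING.match(lines[1].strip())
        if match is None:
            continue  # skip malformed blocks
        subs.append({
            "index": int(lines[0].strip()),
            "start": match.group(1),
            "end": match.group(2),
            "text": "\n".join(lines[2:]),
        })
    return subs
```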

write_file(raw: bool = False) None

Write subtitle data to SRT file.

Parameters:

None

Keyword Arguments:

raw – Set to True to write raw data.

Returns:

Updates SRT file input

Return type:

None

video_utils.subtitles.srt_utils.srt_cleanup(fname, **kwargs) int

Fix some known bad characters in SRT file

A Python function to replace 'J' characters at the beginning or end of a subtitle line with a musical note character, as this seems to be an issue with the vobsub2srt program.

Parameters:

fname – Path to a file. This file will be overwritten.

Keyword Arguments:

verbose – Set to increase verbosity.

Returns:

Outputs a file with the same name as the input; i.e., overwrites the file.

Return type:

int
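The per-line replacement might be sketched like this. The exact matching rule (a lone 'J' separated from the text by whitespace) and the choice of U+266A as the note character are assumptions, and fix_music_notes is a hypothetical helper.

```python
import re

NOTE = "\u266a"  # musical note character (assumed choice)


def fix_music_notes(line):
    """Replace a stray 'J' at the start or end of a subtitle line with a note."""
    line = re.sub(r"^J(?=\s)", NOTE, line)   # leading 'J' followed by whitespace
    line = re.sub(r"(?<=\s)J$", NOTE, line)  # trailing 'J' preceded by whitespace
    return line
```

Requiring whitespace next to the 'J' keeps ordinary words such as "Just" untouched.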

video_utils.subtitles.subtitle_extract module

Utilities for extracting subtitles

video_utils.subtitles.subtitle_extract.check_files_exist(files: list[str]) bool

Check that all files exist

If any of the files in the input list do NOT exist, then False is returned.

Parameters:

files (array-like) – List of files to check for existence

Returns:

True if all of the files exist, False otherwise

Return type:

bool
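A check of this kind reduces to a single all() call, as in this minimal sketch. Using os.path.isfile (rather than os.path.exists) is an assumption.

```python
import os


def check_files_exist(files):
    """Return True only if every path in files is an existing regular file."""
    return all(os.path.isfile(path) for path in files)
```

Note that all() over an empty list is True, so an empty input counts as "all files exist" in this sketch.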

video_utils.subtitles.subtitle_extract.gen_sub_info(out_base: str, stream: dict) dict | None

Generate information for subtitle streams

Build a dict that contains subtitle type and list of output subtitle files after extraction

Parameters:
  • out_base (str) – Base output file name that information for subtitle files will be appended to

  • stream (dict) – Information about the given text stream for building/extracting text files

Returns:

dict

video_utils.subtitles.subtitle_extract.subtitle_extract(in_file: str, out_base: str, text_info: dict, **kwargs) tuple

Extract subtitle(s) from a file and convert them to SRT file(s).

If a file fails to convert, the VobSub files are removed and the program continues.

Parameters:
Keyword Arguments:

srt (bool) – Set to convert image based subtitles to srt format; Default does NOT convert file

Returns:

Updates subtitle_status and creates/updates list of subtitle files that failed extraction. Return codes for success/failure of extraction are as follows:

  • 0 : Completed successfully

  • 1 : VobSub(s) already exist

  • 2 : No VobSub(s) to extract

  • 3 : Error extracting VobSub(s).

  • 10 : mkvextract not found/installed

Return type:

int

Dependencies:

mkvextract - A CLI for extracting streams from an MKV file.
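A dependency check that yields return code 10 when the CLI is missing could be sketched with shutil.which. The helper name missing_tool_code is hypothetical, and how the package actually detects mkvextract is not stated here.

```python
import shutil

TOOL_NOT_FOUND = 10  # code documented above for a missing mkvextract


def missing_tool_code(name="mkvextract"):
    """Return TOOL_NOT_FOUND if the executable is not on PATH, else None."""
    if shutil.which(name) is None:
        return TOOL_NOT_FOUND
    return None
```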

video_utils.subtitles.vobsub_to_srt module

Utilities for converting VobSubs to SRT

video_utils.subtitles.vobsub_to_srt.vobsub_to_srt(out_file: str, text_info: dict | None, delete_source: bool = False, cpulimit: int | None = None, **kwargs) tuple

Convert VobSub(s) to SRT(s).

Will convert all VobSub(s) in the output directory as long as a matching SRT file does NOT exist.

Parameters:

None

Keyword Arguments:

None

Returns:

Updates vobsub_status and creates/updates list of VobSubs that failed vobsub2srt conversion. Return codes for success/failure of conversion are as follows:

  • 0 : Completed successfully.

  • 1 : SRT(s) already exist

  • 2 : No VobSub(s) to convert.

  • 3 : Some VobSub(s) failed to convert.

Return type:

int

Dependencies:

vobsub2srt - A CLI for converting VobSub images to SRT
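Selecting only the VobSubs that still lack a matching SRT file might be sketched as follows. Identifying VobSubs by their .idx index file and matching on the shared base name are assumptions, and pending_vobsubs is a hypothetical helper.

```python
import os


def pending_vobsubs(directory):
    """Return VobSub index files (.idx) in directory with no matching .srt file."""
    pending = []
    for name in sorted(os.listdir(directory)):
        base, ext = os.path.splitext(name)
        if ext.lower() != ".idx":
            continue
        if not os.path.isfile(os.path.join(directory, base + ".srt")):
            pending.append(os.path.join(directory, name))
    return pending
```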

Module contents

Subtitle utilities

Various functions and classes for extracting subtitle files from video files, converting image-based subtitles to SRT, and downloading subtitles from OpenSubtitles.com

video_utils.subtitles.sub_to_srt(out_file: str, text_info: list[dict], **kwargs) list[str]

Convert image based subtitles to srt

DVDs and BluRays make use of image based subtitles in the form of VobSubs and PGS files, respectively. This wrapper function is used to convert either format to SRT using the vobsub_to_srt and pgs_to_srt functions, respectively.

Parameters:
  • out_file (str) – Base path and name for the output video file.

  • text_info (list) – List of dicts containing information about text streams in video file.

Keyword Arguments:

**kwargs – Passed directly to the converter functions.

Returns:

Paths to SRT files created. Will be empty if no files to convert OR if all conversions failed.

Return type:

list