How to quickly search autogenerated subtitles for my YouTube channel

I have created a GitHub repo, essentially a collection of Markdown files, that lets me and others search across the auto-generated subtitles from my YouTube videos.

Check out the GitHub repo

Since YouTube does not auto-generate subtitles for all my videos, and I have excluded Unlisted videos from the search, this repo does not cover all the videos in my channel.

How to use the GitHub repo

By making use of GitHub’s support for rendering and searching Markdown files, I have converted the autogenerated subtitles into a searchable repo.

Here is a list of things you can do:

Skim the transcript of an individual video

You can quickly skim the transcript of a video to see what it covers. As you can see in the example below, a quick read is enough to get the gist of the video.

You can see that there is a timestamp and a small blurb used as the heading for each paragraph.

Clicking the timestamp jumps directly to that position in the video.
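As a rough illustration of how such a heading could be generated, here is a minimal Python sketch. The helper names, the `##` heading level, and the `VIDEO_ID` placeholder are assumptions made for this example, not the exact format used in the repo. It relies on YouTube's `t=` URL parameter, which starts playback at the given offset in seconds.

```python
def timestamp_to_seconds(timestamp: str) -> int:
    """Convert 'MM:SS' or 'HH:MM:SS' into a number of seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def transcript_heading(video_id: str, timestamp: str, blurb: str) -> str:
    """Build a Markdown heading whose timestamp links to that point in the video.

    The heading layout is a hypothetical one chosen for this sketch.
    """
    # The t= query parameter asks YouTube to start playback at that offset.
    offset = timestamp_to_seconds(timestamp)
    url = f"https://www.youtube.com/watch?v={video_id}&t={offset}s"
    return f"## [{timestamp}]({url}) {blurb}"


# VIDEO_ID is a placeholder, not a real video.
print(transcript_heading("VIDEO_ID", "01:30", "what is an intent"))
```

When GitHub renders the file, each heading produced this way shows the timestamp as a clickable link followed by the blurb.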

Search across all video subtitles

You can use GitHub’s built-in search to search across a single repo.

Searching inside the current repo is the default, so you usually don’t have to select the “In this repository” option yourself. Just check that it is the selected option before running the search.

Search only file name

GitHub allows you to search a specific repo for a filename containing specific words.

The name of each Markdown file I have uploaded includes the date the video was published. If you want to find videos published in a given month or year, you can narrow the search by using the filename: prefix.

For example, the following search query will find all files which include 2021 in the file name, so you can use it like a date filter.

These are the results from executing the search.
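For reference, a search like this can also be written out as a URL. The sketch below is only an illustration: `owner/repo-name` is a placeholder rather than the real repo, and the helper function is hypothetical. It uses the filename: qualifier described above; newer versions of GitHub code search may expect a path: qualifier instead, so treat the exact qualifier as an assumption.

```python
from urllib.parse import urlencode


def filename_search_url(repo: str, text: str) -> str:
    """Build a GitHub code search URL limited to one repo and to file names.

    'repo' is an 'owner/name' string; 'text' is what the file name should contain.
    """
    # repo: restricts the search to one repository,
    # filename: restricts matches to file names (qualifier taken from the post).
    query = f"repo:{repo} filename:{text}"
    return "https://github.com/search?" + urlencode({"q": query, "type": "code"})


# 'owner/repo-name' is a placeholder, not the actual repository.
print(filename_search_url("owner/repo-name", "2021"))
```

Changing `2021` to something like `2021-03` would narrow the filter to a month, assuming the file names write the date in that form.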

Why Markdown?

I have generated Markdown corresponding to the subtitles for each video, and created one Markdown file per video.

GitHub automatically converts Markdown to HTML

Every Markdown file uploaded to a GitHub repo is automatically rendered as HTML. This is how you are able to skim the subtitles so easily.

Autogenerated table of contents for individual Markdown files

In addition, GitHub automatically turns the headings inside Markdown files into a Table Of Contents. You can access it by clicking on the “Filter Headings” feature on the top left of the file preview.

It would be even better if the filter dropdown were wider, but this is still pretty handy.

Markdown support in IDEs

I use the WebStorm editor, which lets me edit the Markdown and see an immediate preview of it in the right pane. In other words, the Markdown is easy to edit, and since the IDE is already connected to GitHub, making edits and checking them in is very easy.

Most IDEs have this kind of Markdown editing feature nowadays, so it is quite simple to make updates if you create such a repo for your own YouTube videos.

Search and replace text

One clear problem with autogenerated subtitles is that they contain quite a lot of errors.

Because Markdown is essentially plain text without any tags, it is very easy to do a global search and replace for words. In fact, this feature is well supported in most IDEs, including WebStorm.

For example, this is what a search for Chatbase looks like before doing a global find and replace. Notice that there are only 3 results.

I then did a search and replace and updated “chat base” to “Chatbase”.

Then I checked in all my changes. After a few minutes (GitHub needs a little time to rebuild its search index), the same query returned more search results.

All I had to do was run a global “Find and Replace” inside my WebStorm IDE, and the transcripts became more searchable.
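If you prefer a script over an IDE, the same kind of global replace can be sketched in a few lines of Python. This is not what I actually ran (I used WebStorm); the function name and the case-insensitive matching are assumptions made for this example.

```python
import re
from pathlib import Path


def replace_in_markdown(root: Path, pattern: str, replacement: str) -> int:
    """Replace every case-insensitive match of 'pattern' in *.md files under 'root'.

    Returns the number of files that were changed.
    """
    regex = re.compile(pattern, flags=re.IGNORECASE)
    changed = 0
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        new_text, count = regex.subn(replacement, text)
        # Only rewrite files whose contents actually changed.
        if count and new_text != text:
            path.write_text(new_text, encoding="utf-8")
            changed += 1
    return changed
```

For example, calling `replace_in_markdown(Path("."), r"\bchat base\b", "Chatbase")` from the root of the repo would fix the two-word spelling in every transcript. It is worth reviewing the resulting diff before checking the changes in.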
