Use spaCy to add timestamps to video chapters

[00:00:00] Adding Timestamps to Subheadings

Okay, so now that you have added subheadings, it would be even better if you could get the timestamp where each section starts and add it as the timestamp for that particular chapter, like what you see over here. Now, as it turns out, this task is a bit tricky. The timestamps generated by the Whisper library are based on the original audio, so they correspond to the original set of words that it was able to transcribe. That is the text from the second step, where you converted the MP3 to a text file. The text you have now, after doing the other two steps, is not the same. There is enough difference between the two texts that finding these timestamps and adding them in the appropriate place is not straightforward.

[00:01:05] Using NLP Libraries for Timestamp Mapping

That is why you need to use an NLP library like spaCy. A library like spaCy can be really helpful, and I’ll explain this in the lesson where I actually go through the code. The basic idea here is that spaCy is able to detect similarities between sentences, along with related features like that. These are all built into spaCy itself, which allows you to do this mapping pretty easily.
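To make the mapping idea concrete, here is a minimal sketch. It is not the lesson's code: it assumes the Whisper output is available as a list of segments with `start` and `text` fields, and it uses Python's standard-library `difflib` as a stand-in for spaCy's similarity scoring so that it runs without downloading a spaCy model. The segment data and the helper names (`normalize`, `similarity`, `find_start_time`) are made up for illustration.

```python
from difflib import SequenceMatcher

PUNCTUATION = ".,!?;:\"'()"

def normalize(text):
    """Lowercase the words and strip surrounding punctuation,
    so small edits to the transcript do not affect the match."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]

def similarity(a, b):
    """Word-level similarity between 0.0 and 1.0.
    Stand-in scoring: a spaCy-based version would compare the two texts
    with spaCy instead (an assumption about how the lesson does it)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def find_start_time(chapter_opening, segments):
    """Return the start time of the segment most similar to the
    first sentence of a chapter."""
    best = max(segments, key=lambda seg: similarity(chapter_opening, seg["text"]))
    return best["start"]

# Hypothetical Whisper-style segments (times in seconds).
segments = [
    {"start": 0.0, "text": " Okay so now that you have added subheadings"},
    {"start": 33.0, "text": " that is why you need to use an NLP library like spaCy"},
    {"start": 61.5, "text": " now I'll also bring up one more thing"},
]

# First sentence of a chapter in the edited transcript.
chapter_opening = "That is why you need to use an NLP library like spaCy."
start = find_start_time(chapter_opening, segments)
print(start)
```

The point of the sketch is only the shape of the problem: the edited sentence and the raw segment differ in casing and punctuation, yet a similarity score still picks out the right segment, and that segment carries the timestamp.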

[00:01:39] Limitations of Large Language Models

Now, I’ll also bring up one more thing which is quite relevant here. There is a tendency to use LLMs like GPT for every task that involves natural language processing, and this is a good example where you can see some of the limitations of LLMs. What I’m doing here, where I use spaCy to map the chapter’s subheading to the exact timestamp, is actually pretty tricky to do using just a prompt. Prompt engineering is not going to help you find the timestamp, and there are a lot of reasons for that. I can even point you to a video where the spaCy founders talk about a concept called LLM maximalism. This is the idea that people use large language models even where they don’t make sense. One of the things they talk about is precision.

[00:02:46] Precision Mapping with spaCy

So, you want to add this timestamp, and you want it to be exact. You can see that, for example, I added this at the 33rd second, and if you in fact go to the 33rd second, you’ll find that it does start with this text. The way I know that is because I do a precision mapping, and that’s something that is possible, I’ll say, only using spaCy. At least for now, that’s what I think. I don’t think you can do something that precise using just the GPT API. But the bigger point I’m making is that you need to learn multiple tools to be able to do these kinds of things, rather than getting stuck with one tool and trying to do everything with it. The more tools you have in your toolbox, the better the output is going to be.

[00:03:35] Enhancing Transcripts with spaCy and GPT-4

So, in this case, using the spaCy library will allow you to add these timestamps to the subheadings that GPT-4 generated. As you can see, that makes it much easier for people to skim the transcript and also get an idea of what each paragraph is talking about.
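Once a start time is known for a chapter, adding it to the subheading is a small formatting step. The sketch below assumes the start time is given in seconds and that headings follow the `[HH:MM:SS] Title` layout used in this transcript. The function names are hypothetical.

```python
def format_timestamp(seconds):
    """Turn a time in seconds into a [HH:MM:SS] label.
    Fractions of a second are dropped."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"[{hours:02d}:{minutes:02d}:{secs:02d}]"

def add_timestamp(heading, start_seconds):
    """Prefix a chapter subheading with its start time."""
    return f"{format_timestamp(start_seconds)} {heading}"

print(add_timestamp("Using NLP Libraries for Timestamp Mapping", 65.4))
# prints: [00:01:05] Using NLP Libraries for Timestamp Mapping
```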

About this website

BotFlo [1] was created by Aravind Mohanoor as a website which provided training and tools for non-programmers who were [2] building Dialogflow chatbots.

This website has now expanded into other topics in Natural Language Processing, including the recent Large Language Models (GPT etc.) with a special focus on helping non-programmers identify and use the right tool for their specific NLP task.

For example, when not to use GPT

[1] BotFlo was previously called MiningBusinessData. That is why you see that name in many videos.

[2] And still are building Dialogflow chatbots. Dialogflow ES first evolved into Dialogflow CX, and Dialogflow CX itself evolved to add Generative AI features in mid-2023.