Summary

[00:00:00] Summary of the Transcript Formatting Process

Now, I want to summarize what we have learned so far: all the steps we had to perform to get from the raw transcript to the formatted transcript, with the subheadings, the timestamps, and all that. The easiest way to do that is to show a side-by-side comparison. What you see on the left-hand side here is the raw output from the step where you convert the MP3 file to text using speech recognition, that is, the VSPR library. What you see on the right is the result after post-processing: asking GPT to add formatting and then subheadings, and then using the spaCy library to find the timestamps for those subheadings. So the right-hand side is what the fully processed transcript looks like.
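To make the last step concrete, here is a minimal sketch of attaching timestamps to subheadings. It is not the method used in the course: the course uses the spaCy library for this step, while the sketch swaps in plain word matching from the Python standard library so that it runs without extra installs. The sample data and the names `segments`, `sections`, `normalize`, `format_timestamp`, `find_start`, and `add_timestamps` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical speech-recognition output: (start time in seconds, raw text).
segments = [
    (0.0, "now i want to summarize what we have learned till now"),
    (12.5, "the easiest way to do that is to show a side by side comparison"),
    (53.0, "as i mentioned before it is not a 100 percent quality"),
    (120.0, "now i will just add one more thing"),
]

# Hypothetical formatted transcript: (subheading, opening sentence of the section).
sections = [
    ("Summary of the Transcript Formatting Process",
     "Now, I want to summarize what we have learned till now."),
    ("Quality and Usefulness of Automated Transcripts",
     "As I mentioned before, it is not a 100 percent quality."),
    ("Exceptions and Final Thoughts",
     "Now, I will just add one more thing."),
]


def normalize(text):
    """Lowercase the text and drop punctuation, returning a list of words."""
    return re.sub(r"[^a-z0-9 ]+", " ", text.lower()).split()


def format_timestamp(seconds):
    """Format a number of seconds as [HH:MM:SS]."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"[{hours:02d}:{minutes:02d}:{secs:02d}]"


def find_start(opening_sentence, segments, n_words=5):
    """Return the start time of the first raw segment that begins with the
    same n_words as the section's opening sentence, or None if none match."""
    key = normalize(opening_sentence)[:n_words]
    for start, text in segments:
        if normalize(text)[:n_words] == key:
            return start
    return None


def add_timestamps(sections, segments):
    """Prefix each subheading with the timestamp of its opening sentence.
    Subheadings with no matching segment are returned unchanged."""
    result = []
    for heading, opening in sections:
        start = find_start(opening, segments)
        if start is None:
            result.append(heading)
        else:
            result.append(f"{format_timestamp(start)} {heading}")
    return result


for line in add_timestamps(sections, segments):
    print(line)
```

Exact matching on the first few words is fragile when GPT rewords a sentence during formatting, which is one reason to use a more tolerant matching approach in practice.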

[00:00:53] Quality and Usefulness of Automated Transcripts

As I mentioned before, this is not 100 percent quality. If you go for professional transcription, you will probably find that they can improve this even more and catch a few mistakes in it. But what I will say is that this is probably good enough for 99 percent of use cases. There will be a few mistakes, but they are not going to be too noticeable. More importantly, because this is an online course that we are talking about, it is not likely that people are going to rely on the transcript to understand the material. They are usually going to watch the video, see the visuals, and hear your voice and the words that you speak. The transcript is going to be a sort of supplement to help them in case they are not able to follow what you are saying, or if they want to quickly jump to a specific point in the video. So, you may not need 100 percent accuracy for these kinds of transcripts.

[00:02:00] Exceptions and Final Thoughts

Now, I will just add one more thing. There are some use cases where 100 percent accuracy is a requirement, so I am obviously not talking about those scenarios. But for the most part, people who are creating online courses should be able to just follow these steps and automate this whole process. You will find that this sequence of steps is sufficient to produce a very professional transcript.


About this website

BotFlo [1] was created by Aravind Mohanoor as a website which provided training and tools for non-programmers who were [2] building Dialogflow chatbots.

This website has now expanded into other topics in Natural Language Processing, including the recent Large Language Models (GPT etc.) with a special focus on helping non-programmers identify and use the right tool for their specific NLP task.

For example, when not to use GPT

[1] BotFlo was previously called MiningBusinessData. That is why you see that name in many videos.

[2] And still are building Dialogflow chatbots. Dialogflow ES first evolved into Dialogflow CX, and Dialogflow CX itself evolved to add Generative AI features in mid-2023.