In case you are wondering, I added this to my "Intro to spaCy" course because I think it is important to pick the right tool for the job before learning how to use that tool. To be clear, I don't think (as of August 2023) that spaCy is a good choice for any kind of generative text task. On the other hand, people are using the GPT API even for tasks where spaCy and other tools are better choices.
GPT is very impressive, but people sometimes mistakenly believe that it should be used for every Natural Language Processing (NLP) task.
In fact, Dialogflow is still better than GPT in many ways. So I don’t think GPT is the best tool even for building chatbots.
On top of that, there are some NLP tasks where GPT is actually a poor choice.
For example, tasks you would normally do with the spaCy library, such as tokenization, part-of-speech tagging, and named entity recognition, are not good candidates for GPT.
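To make that concrete, here is a minimal sketch of the kind of task I mean: tokenizing a sentence and tagging entities locally, with no API call. It assumes only that spaCy is installed. The blank pipeline, the two patterns, and the sample sentence are my own illustrative choices, and a real project would more likely use a trained pipeline than hand-written rules.

```python
import spacy

# A blank English pipeline: tokenizer only, so no model download is needed.
nlp = spacy.blank("en")

# Rule-based entity matching. These patterns are illustrative assumptions.
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "ORG", "pattern": "OpenAI"},
    {"label": "PRODUCT", "pattern": "spaCy"},
])

doc = nlp("OpenAI released an API, but spaCy runs locally.")

# Everything below happens on your machine, deterministically.
tokens = [token.text for token in doc]
entities = [(ent.text, ent.label_) for ent in doc.ents]

print(tokens)
print(entities)
```

The same input always gives the same tokens and entities, and there is no per-request cost, which is a large part of why these tasks fit spaCy better.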
spaCy vs ChatGPT
A student of mine recently asked “How is spaCy different from ChatGPT?”
spaCy is an NLP library, and ChatGPT is a large language model-based chatbot with some interesting “instruction following” capabilities.
So they are not really comparable.
However, it is possible to use GPT (the OpenAI API, to be more precise) for some tasks which are much better suited to spaCy. Using GPT for tasks where it does not make much sense is called LLM Maximalism (a term coined by the folks who built the spaCy library).
spaCy + GPT
But this does not mean GPT has no role in those tasks. In fact, it is possible to use GPT to automate the process of generating training data for use within spaCy.
This helps you do two things simultaneously:
a) it reduces the time taken to generate high-quality training data for your NLP tasks
b) it reduces the cost of doing the task, especially if you want to do it over a high volume of text
You can do this using the spacy-llm Python library.
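As a rough sketch of the step that comes after the LLM has done its labelling, the code below turns entity annotations into spaCy's binary training format. It does not call spacy-llm or any API. The `llm_annotations` list is hard-coded, hypothetical data standing in for whatever an LLM would return, and `build_docbin` and the `train.spacy` file name are names I made up for illustration.

```python
import spacy
from spacy.tokens import DocBin

# Hypothetical annotations, standing in for LLM output.
# Each entity is (start_char, end_char, label).
llm_annotations = [
    {"text": "Apple hired Tim Cook.",
     "entities": [(0, 5, "ORG"), (12, 20, "PERSON")]},
    {"text": "Google is based in California.",
     "entities": [(0, 6, "ORG"), (19, 29, "GPE")]},
]


def build_docbin(annotations, nlp):
    """Convert character-offset annotations into a DocBin of training docs."""
    doc_bin = DocBin()
    for item in annotations:
        doc = nlp.make_doc(item["text"])
        spans = []
        for start, end, label in item["entities"]:
            span = doc.char_span(start, end, label=label)
            # char_span returns None when offsets do not line up with
            # token boundaries; skip those rather than fail.
            if span is not None:
                spans.append(span)
        doc.ents = spans
        doc_bin.add(doc)
    return doc_bin


nlp = spacy.blank("en")
doc_bin = build_docbin(llm_annotations, nlp)

# Uncomment to write a file usable with `spacy train` (file name is an assumption):
# doc_bin.to_disk("train.spacy")
```

Once you have a file like this, you train a regular spaCy pipeline on it and run that pipeline over your high-volume text, so the LLM is only paid for during annotation.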
I am creating a course about spacy-llm to explain some use cases, and will explain this in more detail in that course.