What are spaCy models?

When you use spaCy, you will see its documentation refer to “models” quite often.

So what are models?

A “model” in machine learning is the output of a machine learning algorithm run on data.

Source

It is easiest to explain using an example.

Create a file called spacy_model.py and add the following code:

import spacy

nlp = spacy.blank("en")
text = 'Dialogflow, previously known as api.ai, is a chatbot framework provided by Google. Google acquired API.AI in 2016.'
doc = nlp(text)

print('Printing entities using blank model....')

for ent in doc.ents:
    print(ent)

print('Completed....')

nlp = spacy.load("en_core_web_sm")
doc = nlp(text)

print("Printing entities using default model....")

for ent in doc.ents:
    print(ent)

print('Completed....')

This is the output when you run this program

Here is what the program does:

First, we create a blank model (line 3).

Then we print all the entities in the document. The blank model has no trained components, so spaCy does not recognize any entities at all.

Then we load the pretrained en_core_web_sm model (line 14). With the pretrained model, spaCy is able to identify some entities in the same text.
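One way to see why the blank model finds nothing is to look at the components in each pipeline. The following is a minimal sketch, assuming spaCy v3. It uses spacy.util.is_package to check whether en_core_web_sm is installed and skips the pretrained part if it is not. The short sample sentence is taken from the text above.

```python
import spacy
from spacy.util import is_package

text = "Google acquired API.AI in 2016."

# A blank pipeline only has a tokenizer; it has no trained components.
blank_nlp = spacy.blank("en")
blank_doc = blank_nlp(text)
print("Blank pipeline components:", blank_nlp.pipe_names)
print("Tokens:", [token.text for token in blank_doc])
print("Entities:", [ent.text for ent in blank_doc.ents])

# Only load the pretrained model if it is installed in this environment.
if is_package("en_core_web_sm"):
    trained_nlp = spacy.load("en_core_web_sm")
    print("Pretrained pipeline components:", trained_nlp.pipe_names)
    for ent in trained_nlp(text).ents:
        # label_ is the entity type predicted by the pipeline (for example ORG or DATE)
        print(ent.text, ent.label_)
else:
    print("en_core_web_sm is not installed; skipping the pretrained model.")
```

The blank pipeline still splits the text into tokens, but with no trained entity recognizer its list of entities stays empty.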

Larger models can do better named entity recognition

As a general rule of thumb, the larger the spaCy model, the more entities it tends to identify in your text, although this is not guaranteed for every input.

For example, here is a comparison between en_core_web_sm, which we used above, and en_core_web_md, a larger model.

Because en_core_web_md is larger, it usually takes a little longer to load. On the other hand, we usually expect it to find more entities.
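If you want a rough idea of the difference in load time, you can time it yourself. This is only a sketch: timed_load is a hypothetical helper name (it is not part of spaCy), and the numbers depend on your machine and on what Python has already imported, so treat them as indicative only.

```python
import time

import spacy
from spacy.util import is_package


def timed_load(model_name):
    """Return (nlp, seconds) for an installed model, or None if it is missing."""
    if not is_package(model_name):
        return None
    start = time.perf_counter()
    nlp = spacy.load(model_name)
    return nlp, time.perf_counter() - start


for name in ("en_core_web_sm", "en_core_web_md"):
    result = timed_load(name)
    if result is None:
        print(f"{name}: not installed")
    else:
        nlp, seconds = result
        print(f"{name}: loaded in {seconds:.2f}s, components: {nlp.pipe_names}")
```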

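Create a new file called large_model.py and add the following code to it. This assumes en_core_web_md is installed, for example with python -m spacy download en_core_web_md: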

import spacy
text = 'Google, headquartered in Mountain View (1600 Amphitheatre Pkwy, Mountain View, CA 940430), unveiled the new Android phone for $799 at the Consumer Electronic Show. Sundar Pichai said in his keynote that users love their new Android phones.'

nlp = spacy.load("en_core_web_sm")
doc = nlp(text)

print("Printing entities using small model....")
for counter, ent in enumerate(doc.ents, start=1):
    print(f'{counter} {ent}')

print('Completed....')

nlp = spacy.load("en_core_web_md")
doc = nlp(text)

print("Printing entities using medium model....")
for counter, ent in enumerate(doc.ents, start=1):
    print(f'{counter} {ent}')

print('Completed....')

Here is the output when you run this program

As you can see, en_core_web_md does better than en_core_web_sm, identifying 10 entities in the same text versus 8.

This may not seem like a big difference, but with a large quantity of text the difference is usually much more obvious.
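Counting entities tells you how many each model found, but not which ones differ. Here is a minimal sketch that compares the two models by entity text and label. The helper names entity_set and compare are made up for this example, and the comparison only runs if both models are installed. The sample sentence comes from large_model.py.

```python
import spacy
from spacy.util import is_package


def entity_set(doc):
    """Collect (text, label) pairs for every entity in a processed doc."""
    return {(ent.text, ent.label_) for ent in doc.ents}


def compare(ents_a, ents_b):
    """Split two entity sets into: only in a, only in b, and in both."""
    return ents_a - ents_b, ents_b - ents_a, ents_a & ents_b


text = "Sundar Pichai said in his keynote that users love their new Android phones."

if is_package("en_core_web_sm") and is_package("en_core_web_md"):
    small = entity_set(spacy.load("en_core_web_sm")(text))
    medium = entity_set(spacy.load("en_core_web_md")(text))
    only_small, only_medium, both = compare(small, medium)
    print("Found only by the small model:", sorted(only_small))
    print("Found only by the medium model:", sorted(only_medium))
    print("Found by both models:", sorted(both))
else:
    print("Both models need to be installed to run the comparison.")
```

Comparing on the label as well as the text also shows cases where both models find the same span but give it a different entity type.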
