How to split text into sentences using spaCy
When you process text with spaCy, one of the first things you will want to do is split it (a paragraph, a document, etc.) into individual sentences.
This tutorial explains how to do that.
First, download and install spaCy and the en_core_web_sm English model, which the code in this tutorial loads.
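As a quick sanity check before continuing, the sketch below reports whether spaCy and the English model can be found by Python. It assumes the usual installation route (pip install spacy, then python -m spacy download en_core_web_sm); the helper name check_install is made up for this example and is not part of spaCy.

```python
import importlib.util


def check_install(names=("spacy", "en_core_web_sm")):
    """Map each package name to True if Python can find it, False otherwise."""
    # Hypothetical helper: spaCy models install as regular Python packages,
    # so the same lookup works for the library and for the model.
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = check_install()
for name, found in status.items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

If either line reports "missing", run the matching install command before moving on.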
Then create a new file in the same project called sentences.py.

Add the following code to the sentences.py file:
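import spacy
# Load the small English pipeline, which includes sentence segmentation
nlp = spacy.load("en_core_web_sm")
doc = nlp('This is the first sentence. This is the second sentence.')
# doc.sents yields one sentence at a time
for sent in doc.sents:
    print(sent)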
Run the Python script.
Here is what you will see in the output:
This is the first sentence.
This is the second sentence.

As you can see, the paragraph has been split into its two sentences.
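Each item in doc.sents is a spaCy Span, so printing is only one option. Here is a minimal sketch, assuming you want the sentences as plain strings. The helper names load_pipeline and split_sentences are invented for this example, and the fallback to spaCy's rule-based sentencizer is an addition for machines where en_core_web_sm has not been downloaded; it is not part of the script above.

```python
import spacy


def load_pipeline(model="en_core_web_sm"):
    """Load the trained model, or fall back to a rule-based sentence splitter."""
    try:
        return spacy.load(model)
    except OSError:
        # spacy.load raises OSError when the model package is not installed.
        # Assumption: a blank English pipeline plus the sentencizer is an
        # acceptable stand-in for simple text.
        nlp = spacy.blank("en")
        nlp.add_pipe("sentencizer")
        return nlp


def split_sentences(text, nlp):
    """Return the sentences of `text` as a list of plain strings."""
    doc = nlp(text)
    return [sent.text for sent in doc.sents]


nlp = load_pipeline()
sentences = split_sentences(
    "This is the first sentence. This is the second sentence.", nlp
)
print(sentences)
```

Note that the sentencizer splits on punctuation only, so on harder text it may draw sentence boundaries differently from the trained model.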
You can also iterate over each token in a sentence.
For example, create a new Python file called sentence_tokens.py and add the following code into it:
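import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp('This is the first sentence. This is the second sentence.')
# Outer loop: one sentence at a time
for sent in doc.sents:
    # Inner loop: every token in that sentence, punctuation included
    for tok in sent:
        print(tok)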
When you run this script, each token is printed on its own line, including the period that ends each sentence.
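Tokens are objects rather than plain strings, so you can also filter on their attributes. Here is a small sketch, assuming you only want the words of each sentence: tok.text and tok.is_punct are standard Token attributes, while the variable name words_per_sentence and the fallback for a missing model are choices made for this example.

```python
import spacy

try:
    nlp = spacy.load("en_core_web_sm")
except OSError:
    # Assumption: fall back to rule-based splitting if the model is missing.
    nlp = spacy.blank("en")
    nlp.add_pipe("sentencizer")

doc = nlp("This is the first sentence. This is the second sentence.")

# For each sentence, keep the text of every token that is not punctuation.
words_per_sentence = [
    [tok.text for tok in sent if not tok.is_punct] for sent in doc.sents
]

for words in words_per_sentence:
    print(len(words), words)
```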
