How to download and install spaCy
In this tutorial I explain how to get started with spaCy. This is intended for programmers who are familiar with Python and are interested in using the PyCharm IDE.
Create a new Python project. By default, PyCharm creates a virtual environment for it.
Add a requirements.txt file into the project
In requirements.txt, add spaCy as a requirement (a single line containing spacy)
Use pip to install spaCy
pip install -r requirements.txt
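To confirm that the install landed in the project's virtual environment, a quick check like the following can help. This is a minimal sketch using only the standard library; the helper name installed_version is my own and not part of spaCy or pip.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical helper: reports what the current interpreter can see.
version = installed_version("spacy")
if version is None:
    print("spaCy is not installed in this interpreter")
else:
    print("spaCy", version, "is installed")
```

If this reports that spaCy is missing even though pip succeeded, the script is probably running under a different interpreter than the one pip installed into.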
You also need to download the en_core_web_sm pipeline package, which the example in this tutorial loads.
What is en_core_web_sm?
en_core_web_sm is a small English pipeline trained on written web text (blogs, news, comments). It includes vocabulary, syntax and entities.
python -m spacy download en_core_web_sm
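The download command installs the pipeline as a regular Python package, so one way to check that it worked is to see whether that package can be found. The sketch below assumes this packaging behaviour; pipeline_available is a hypothetical helper name, while spacy.load and nlp.pipe_names are spaCy's own.

```python
import importlib.util


def pipeline_available(name):
    """Return True if a package with this name can be imported."""
    return importlib.util.find_spec(name) is not None


if pipeline_available("en_core_web_sm"):
    import spacy

    nlp = spacy.load("en_core_web_sm")
    # pipe_names lists the processing components in the loaded pipeline
    print(nlp.pipe_names)
else:
    print("en_core_web_sm is missing; run: python -m spacy download en_core_web_sm")
```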
Now add a file called main.py and add the following code:
import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp('This is the first sentence')
for tok in doc:
    print(tok)
Run the script
Here is the output from the Run window inside PyCharm
As you can see, the code takes the sentence, splits it into words (we refer to them as tokens in NLU) and then prints the tokens one per line.
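Each token carries more than its text. As a small extension of the script, the sketch below prints the part-of-speech tag and dependency label next to each token. The attributes text, pos_ and dep_ are spaCy token attributes; the function names token_rows and main are my own, and the exact tags you see depend on the pipeline version.

```python
def token_rows(doc):
    """Collect (text, part-of-speech, dependency label) for every token in doc."""
    return [(tok.text, tok.pos_, tok.dep_) for tok in doc]


def main():
    # Imported here so the helper above can be used without spaCy installed
    import spacy

    nlp = spacy.load("en_core_web_sm")
    doc = nlp("This is the first sentence")
    for text, pos, dep in token_rows(doc):
        print(text, pos, dep)


if __name__ == "__main__":
    try:
        main()
    except (ImportError, OSError) as exc:
        # ImportError: spaCy is not installed; OSError: the pipeline was not found
        print("spaCy or the pipeline is not available:", exc)
```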