Classifier units are normally considered to separate a database into various classes. The Naive Bayes classifier scheme is widely considered in literature to segregate the texts based on the trained model. This section of the chapter initially considers a text database with keywords; feature extraction extracts the key phrases from the text and trains the classifier system. Then, term frequency-inverse document frequency (tf-idf) transformation is implemented to specify the importance of the word. Finally, the output is predicted and printed using the classifier system.
Building a text classifier
How to do it...
- Include the following lines in a new Python file to add datasets:
from sklearn.datasets import fetch_20newsgroups...