The Yelp Polarity dataset
In this experiment, we will work with the Yelp Polarity dataset. It contains 560,000 reviews for training and 38,000 reviews for testing, with each entry consisting of a text review and a label (1 for positive, 0 for negative). The data was drawn from customer reviews of businesses such as restaurants, hair salons, and locksmiths. This dataset presents some real challenges: the reviews vary widely in length, from short to very long, and the text contains slang and different dialects. The dataset is available at https://www.tensorflow.org/datasets/catalog/yelp_polarity_reviews.
Let’s start building our model:
- We will begin by loading the required libraries:
import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences...
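Because the reviews vary so much in length, the text has to be converted into integer sequences of one fixed length before a model can consume it, which is what the Tokenizer and pad_sequences utilities imported above are typically used for. The following is a minimal, dependency-free sketch of that idea, not the Keras implementation: the helper names build_word_index and texts_to_padded, the toy reviews, and the choice to pad and truncate at the end of each sequence are all assumptions made for illustration.

```python
from collections import Counter


def build_word_index(texts):
    """Map each word to an integer id, most frequent first; 0 is reserved for padding."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: i for i, word in enumerate(ordered, start=1)}


def texts_to_padded(texts, word_index, maxlen):
    """Encode texts as lists of ids, then truncate or right-pad each list to maxlen."""
    padded = []
    for text in texts:
        # Words missing from the index are simply skipped in this sketch.
        ids = [word_index[w] for w in text.lower().split() if w in word_index]
        ids = ids[:maxlen]  # drop anything beyond maxlen
        padded.append(ids + [0] * (maxlen - len(ids)))  # fill the rest with 0
    return padded


# Toy stand-ins for reviews (hypothetical text, not taken from the dataset).
reviews = ["great food great service", "slow service", "never again"]
labels = [1, 0, 0]  # 1 = positive, 0 = negative, as in the dataset

word_index = build_word_index(reviews)
sequences = texts_to_padded(reviews, word_index, maxlen=5)
for seq, label in zip(sequences, labels):
    print(seq, label)
```

Running the sketch prints [1, 4, 1, 2, 0] 1, then [6, 2, 0, 0, 0] 0, then [5, 3, 0, 0, 0] 0. Every review now has the same length, and the reserved id 0 marks the padded positions.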