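Data preparation is an important part of any data science project. In this section, we prepare the image data for training by shuffling it, converting it to grayscale, and normalizing the pixel values. Careful preparation usually helps the model train more reliably and reach better accuracy: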
- We will start by shuffling the dataset:
from sklearn.utils import shuffle
# Shuffle images and labels together so each image keeps its label
X_train, y_train = shuffle(X_train, y_train)
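- Now, we will convert the images to grayscale by averaging the three color channels, and then normalize the pixel values to roughly the range -1 to 1 (assuming 8-bit pixel values from 0 to 255):
# Average the color channels; keepdims=True keeps a single channel axis
X_train_gray = np.sum(X_train/3, axis=3, keepdims=True)
X_test_gray = np.sum(X_test/3, axis=3, keepdims=True)
X_validation_gray = np.sum(X_validation/3, axis=3, keepdims=True)
# Center the values around 0 and scale them by 128
X_train_gray_norm = (X_train_gray - 128)/128
X_test_gray_norm = (X_test_gray - 128)/128
X_validation_gray_norm = (X_validation_gray - 128)/128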
- Next, we will check the images following the grayscale conversion:
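i = 610  # index of the training image to inspect
# Grayscale version; squeeze() drops the single channel axis for imshow
plt.imshow(X_train_gray[i].squeeze(), cmap='gray')
# Open a second figure and show the original color image for comparison
plt.figure()
plt.imshow(X_train[i])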
The output image should look as follows:
Fig 7.3: Grayscale image
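To sanity-check the steps above without loading the real dataset, the following minimal sketch runs the same operations on a small synthetic batch. The array sizes, the names X_demo and y_demo, and the helper to_gray_norm are assumptions made for illustration only. To keep the sketch dependent on NumPy alone, it shuffles with a shared NumPy permutation in place of sklearn.utils.shuffle; the effect on paired arrays is the same.

```python
import numpy as np

# Synthetic stand-in for the real data (assumption): 8 RGB images, 32x32 pixels
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
y_demo = np.arange(8)

# Write each label into one pixel so image/label alignment can be checked later
X_demo[np.arange(8), 0, 0, 0] = y_demo

# Shuffle images and labels with one shared permutation
# (stand-in for sklearn.utils.shuffle in this sketch)
perm = rng.permutation(len(X_demo))
X_shuffled, y_shuffled = X_demo[perm], y_demo[perm]


def to_gray_norm(images):
    """Average the color channels, then center and scale by 128 (hypothetical helper)."""
    gray = np.sum(images / 3, axis=3, keepdims=True)
    return (gray - 128) / 128


X_gray_norm = to_gray_norm(X_shuffled)

print(X_gray_norm.shape)                     # one channel axis remains
print(X_gray_norm.min(), X_gray_norm.max())  # values stay within -1 to 1
```

Keeping the single channel axis (keepdims=True) means the grayscale batch still has four dimensions, which is the layout image models typically expect as input.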
In the next section, we will start the model training process...