Now, let's explore how to implement YOLO v3 with Python. We will be using an implementation of YOLO v3 that has been trained on the COCO dataset.
The COCO dataset contains over 1.5 million object instances within 80 different object categories. We will use a pre-trained model that has been trained on the COCO dataset and explore its capabilities. Realistically, it would take many hours of training, even after using a high-end GPU, to achieve a reasonable model that can predict the required classes with good accuracy. Therefore, we will download the weights of the pre-trained network. This network is hugely complex, and the actual H5 file for the weights is over 200 MB in size.
Common objects in content (COCO) is a large-scale object detection, segmentation, and captioning dataset. The official website for COCO is http://cocodataset.org/#home.
COCO has several features:
- Object segmentation
- Recognition in context
- Superpixel stuff...