The next architecture we're going to discuss is VGG (from Oxford's Visual Geometry Group; Very Deep Convolutional Networks for Large-Scale Image Recognition, https://arxiv.org/abs/1409.1556). The VGG family of networks remains popular today and is often used as a benchmark against newer architectures. In networks that preceded VGG, such as LeNet-5 (http://yann.lecun.com/exdb/lenet/) and AlexNet, the initial convolutional layers used filters with large receptive fields (up to 11×11 in the first layer of AlexNet). Additionally, these networks usually alternated single convolutional layers with pooling layers. The authors of the VGG paper observed that a convolutional layer with a large filter size can be replaced with a stack of two or more convolutional layers with smaller filters (factorized convolution). For example, we can replace...
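
The factorization argument can be checked with a little arithmetic. The following is a minimal sketch, not code from the paper: it assumes stride-1 convolutions without dilation, the same number of input and output channels in every layer, and no bias terms. The helper names `stacked_receptive_field` and `conv_weight_count`, the 3×3 small filter, and the choice of 64 channels are ours, picked only for illustration.

```python
def stacked_receptive_field(kernel_size: int, num_layers: int) -> int:
    """Receptive field of `num_layers` stacked stride-1, undilated convolutions."""
    # Each additional layer widens the field by (kernel_size - 1) pixels.
    return 1 + num_layers * (kernel_size - 1)


def conv_weight_count(kernel_size: int, channels: int) -> int:
    """Weights of one square convolution with `channels` inputs and outputs (no bias)."""
    return kernel_size * kernel_size * channels * channels


channels = 64  # illustrative assumption, not a value taken from the text

# Compare a single large filter with a stack of 3x3 filters covering the same field.
for large_kernel, num_small_layers in [(5, 2), (7, 3)]:
    field = stacked_receptive_field(3, num_small_layers)
    stacked_weights = num_small_layers * conv_weight_count(3, channels)
    single_weights = conv_weight_count(large_kernel, channels)
    print(
        f"{num_small_layers} x (3x3) layers: receptive field {field}x{field}, "
        f"{stacked_weights} weights; one {large_kernel}x{large_kernel} layer: "
        f"{single_weights} weights"
    )
```

Under these assumptions, with C channels, two stacked 3×3 layers see a 5×5 region using 18C² weights instead of 25C², and three stacked 3×3 layers see a 7×7 region using 27C² weights instead of 49C². The stack also applies a nonlinearity after each of its layers, where the single large layer applies only one.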