Convolutional Net
Basic
A Beginner's Guide to ConvNets
filter/neuron/kernel
The depth of the filter/kernel has to match the depth of the input. For an image, the depth is the third dimension, storing the RGB channels.
kernel and receptive field -> element-wise multiplication, then sum -> a single number
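That single convolution step can be sketched with illustrative values (the receptive field and kernel below are made up for demonstration):

```python
import numpy as np

# Hypothetical 3x3 receptive field (a patch of the input) and a 3x3 kernel
# that "looks for" a vertical line.
receptive_field = np.array([[0, 1, 0],
                            [0, 1, 0],
                            [0, 1, 0]])
kernel = np.array([[0, 2, 0],
                   [0, 2, 0],
                   [0, 2, 0]])

# One convolution step: element-wise multiply, then sum to a single number.
activation = np.sum(receptive_field * kernel)
print(activation)  # 6
```

A large activation means the patch resembles the feature the kernel encodes; a patch with no vertical line would give a small or zero number here.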
Each of these filters/kernels is a feature identifier.
stride: the amount by which the filter shifts at each step.
padding: in the early layers of the network, we want to preserve as much information about the original input volume as possible so that we can extract those low-level features. Zero padding helps preserve the input volume's spatial size.
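Stride and padding together determine the spatial size of the output. A quick check using the standard output-size formula (input size W, filter size F, padding P, stride S; the 32/5 numbers are illustrative):

```python
# Output width of a conv layer: (W - F + 2P) / S + 1
def conv_output_size(W, F, P, S):
    return (W - F + 2 * P) // S + 1

# 32x32 input, 5x5 filter, stride 1:
print(conv_output_size(32, 5, 0, 1))  # 28 -> no padding shrinks the volume
print(conv_output_size(32, 5, 2, 1))  # 32 -> padding of 2 preserves the size
```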
The world is compositional: the first layer extracts low-level features (a curve, a straight line, etc.); deeper layers combine them into higher-level features (e.g., a semicircle made up of curves).
FC layer looks at which high-level features most strongly correlate to a particular class.
Back-propagation: how a convolutional network is trained, i.e., how the values in the filters/kernels are learned. Forward pass -> loss function -> backward pass (find out which inputs, here the weights, most directly contributed to the loss) -> weight update (gradient descent).
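The four steps can be sketched on a toy problem: a single weight w fit to y = 3x with squared loss (all values here are illustrative, not from the notes):

```python
# Minimal training-loop sketch: forward pass -> loss -> backward pass -> update.
x, y_true = 2.0, 6.0   # one training example
w = 0.0                # the "filter" weight to learn
lr = 0.1               # learning rate

for _ in range(50):
    y_pred = w * x                    # forward pass
    loss = (y_pred - y_true) ** 2     # loss function
    grad = 2 * (y_pred - y_true) * x  # backward pass: dLoss/dw
    w -= lr * grad                    # gradient descent update

print(round(w, 3))  # 3.0
```

A real ConvNet does exactly this, just with millions of weights and the chain rule carrying the gradient back through every layer.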
Choosing hyperparameters: there isn't a set standard. The network largely depends on the type of data you have, the complexity of the images, the type of processing task, etc. Guide: find the combination that creates abstractions of the image at the proper scale.
ReLU layer: introduces nonlinearity after convolutions (which are linear: just element-wise multiplication and summation). ReLU trains faster (computational efficiency).
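ReLU itself is just max(0, x) applied element-wise; a one-liner with made-up activations:

```python
import numpy as np

# ReLU: negative activations become zero, positive ones pass through unchanged.
def relu(x):
    return np.maximum(0, x)

acts = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(acts))  # [0.  0.  0.  1.5 3. ]
```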
Pooling layer: once we know that a specific feature is in the original input volume, its exact location is not as important as its location relative to the other features. Pooling drastically reduces the spatial dimensions of the input volume (not the depth), so the number of parameters and the computational cost go down. It also helps control overfitting.
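A common choice is 2x2 max pooling with stride 2, which halves each spatial dimension. A sketch on a made-up 4x4 feature map:

```python
import numpy as np

# 2x2 max pooling, stride 2: each 2x2 block collapses to its maximum.
# Spatial size halves in each dimension; depth would be untouched.
fmap = np.array([[1, 3, 2, 1],
                 [4, 6, 5, 0],
                 [7, 2, 9, 8],
                 [1, 0, 3, 4]])

# reshape splits rows and columns into 2x2 blocks, then max over each block.
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[6 5]
#  [7 9]]
```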
Dropout layer: drops a random set of activations in that layer by setting them to zero.
This forces the network to be redundant: it should be able to produce the right classification or output for a specific example even if some of the activations are dropped. It keeps the network from getting too "fitted" to the training data and thus alleviates the overfitting problem.
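A minimal dropout sketch (this uses the common "inverted dropout" variant, an assumption beyond the notes: survivors are rescaled by 1/(1-p) so the expected activation is unchanged):

```python
import numpy as np

rng = np.random.default_rng(0)

# Drop each activation with probability p; scale survivors by 1/(1-p).
def dropout(acts, p=0.5):
    mask = rng.random(acts.shape) >= p
    return acts * mask / (1 - p)

acts = np.ones(8)
print(dropout(acts))  # some entries zeroed, the rest scaled to 2.0
```

At test time dropout is turned off and all activations are used.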
convolution is equivariant to shifts: shifting the input shifts the output in the same way.
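This can be checked numerically in 1-D with circular shifts (an illustrative setup, not from the notes):

```python
import numpy as np

# Circular cross-correlation of signal x with kernel k.
def circ_conv(x, k):
    n = len(x)
    return np.array([sum(k[j] * x[(i + j) % n] for j in range(len(k)))
                     for i in range(n)])

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
k = np.array([1.0, -1.0])

# Convolve the shifted input vs. shift the convolved output:
lhs = circ_conv(np.roll(x, 1), k)
rhs = np.roll(circ_conv(x, k), 1)
print(np.allclose(lhs, rhs))  # True: shift, then convolve == convolve, then shift
```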
Convolutional layer is for feature extraction
pooling layer is for translation invariance
fully connected layer? why do we need it? for global statistics ... for classification.
Lab: Feb 13
Permute: apply the same permutation to all images...
It doesn't hurt a fully connected network because an FC layer doesn't take order/locality into account.
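The reason can be shown directly: if every input is permuted by the same fixed permutation, an FC layer with correspondingly permuted weight columns computes the identical output, so the function class is unchanged (illustrative shapes below):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.random(6)          # "flattened image"
W = rng.random((3, 6))     # FC layer weights
perm = rng.permutation(6)  # the same fixed permutation applied to every image

out_original = W @ x
out_permuted = W[:, perm] @ x[perm]  # permute pixels AND weight columns alike
print(np.allclose(out_original, out_permuted))  # True
```

A ConvNet, by contrast, relies on neighboring pixels being related, so the same permutation destroys the locality its filters exploit.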