First layer weights for transfer learning with new input tensor in keras.applications models?

2017-12-04 12:44:32

In the pre-implemented models in keras.applications (VGG16, etc.) it is specified that we can change the shape of the models' inputs and still load the pre-trained ImageNet weights.

What I am confused about is what happens to the first-layer weights in that case. If the input tensor has a different shape, then the number of weights will differ from that of the pre-trained model. So, more granular questions:

If there are fewer weights, are they discarded at random?

If there are more weights, are they randomly initialised?

Should we always set the first layer as trainable when doing transfer learning and changing the input tensor shape?

Here is the implementation of the Keras VGG16 model for reference.
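For context, here is a minimal sketch of the setup in question, assuming Keras 2.x with a TensorFlow backend (the 160×160 shape is just an arbitrary example):

    from keras.applications import VGG16

    # Load the ImageNet weights with a non-default input shape; Keras
    # accepts this as long as include_top=False and the spatial
    # dimensions meet the model's minimum size.
    model = VGG16(weights='imagenet', include_top=False,
                  input_shape=(160, 160, 3))
    model.summary()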

  • The first layers are convolutional and pooling layers:

    For the convolutional layers, the only weights are the kernels and the biases. These have a fixed size (e.g. 3x3x3 or 5x5x3) determined by the kernel dimensions and the number of input channels, not by the spatial dimensions of the input tensor.

    The pooling layers do not have weights at all.

    That's why you can reuse the convolutional weights independently of the input tensor's spatial shape.

    With dense layers (i.e. the final layers), the number of weights depends on the size of their input, so the shapes need to match exactly; you cannot reuse their weights if they do not (see the sketch below).
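    A minimal sketch illustrating both points, assuming Keras 2.x with a TensorFlow backend (the 160×160 input shape is an arbitrary example):

        from keras.applications import VGG16

        # Two models with different spatial input sizes, both loading the
        # same ImageNet convolutional weights (include_top=False drops
        # the dense layers).
        m224 = VGG16(weights='imagenet', include_top=False,
                     input_shape=(224, 224, 3))
        m160 = VGG16(weights='imagenet', include_top=False,
                     input_shape=(160, 160, 3))

        # The first conv layer's kernel shape is identical in both models:
        # (3, 3, 3, 64) = 3x3 spatial kernel, 3 input channels, 64 filters.
        print(m224.layers[1].get_weights()[0].shape)  # (3, 3, 3, 64)
        print(m160.layers[1].get_weights()[0].shape)  # (3, 3, 3, 64)

        # The dense layers, by contrast, are sized to the flattened feature
        # map, which is why include_top=True only accepts (224, 224, 3)
        # when the ImageNet weights are loaded.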

    2017-12-04 14:39:37