I have a question about how the dimensions change across the convolution and max-pooling layers. I am referring to the example in the TensorFlow tutorial:
http://tensorflow.org/tutorials/mnist/pros/index.html#deep-mnist-for-experts
The original image is 28x28x1.
The first convolutional layer:
- apply convolution to a 5x5 patch with 32 features -> 24x24x32
- apply max-pooling 2x2 -> 12x12x32
Second convolutional layer:
- apply convolution to a 5x5 patch with 64 features -> 8x8x64
- apply max-pooling 2x2 -> 4x4x64
But the tutorial says "Now that the image size has been reduced to 7x7", whereas my calculation gives 4x4.
Did I miss some concept? I am new to CNNs, so this may be a beginner question.
Thanks
Asked By : LKS
Answered By : Wandering Logic
Your calculation would be correct if the example were following the "usual" approach of having convolution chop off the edges.
Instead the example you pointed to says:
"How do we handle the boundaries? What is our stride size? In this example, we're always going to choose the vanilla version. Our convolutions uses a stride of one and are zero padded so that the output is the same size as the input."
So they are:
- zero-padding the 28x28x1 image to 32x32x1
- applying 5x5x1x32 convolution to get 28x28x32
- max-pooling down to 14x14x32
- zero-padding the 14x14x32 to 18x18x32
- applying 5x5x32x64 convolution to get 14x14x64
- max-pooling down to 7x7x64.
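The arithmetic behind both results can be checked with a small sketch. It is only an illustration, not the tutorial's code: the helper names `conv_output_size` and `trace_shapes` are made up here, and the 2x2 max-pooling is assumed to use a stride of 2, which matches the halving in the steps above.

```python
def conv_output_size(size, kernel, stride=1, pad=0):
    """Output size along one spatial axis for a convolution or pooling window."""
    return (size + 2 * pad - kernel) // stride + 1


def trace_shapes(size, same_padding):
    """Spatial size after each conv and pool step of the two-layer example."""
    sizes = []
    for _ in range(2):  # two conv + max-pool stages
        # For a 5x5 kernel, 2 zero pixels on each side keep the size unchanged.
        pad = (5 - 1) // 2 if same_padding else 0
        size = conv_output_size(size, kernel=5, stride=1, pad=pad)
        sizes.append(size)
        # Assumed: 2x2 max-pooling with stride 2 and no padding.
        size = conv_output_size(size, kernel=2, stride=2)
        sizes.append(size)
    return sizes


print(trace_shapes(28, same_padding=False))  # [24, 12, 8, 4]
print(trace_shapes(28, same_padding=True))   # [28, 14, 14, 7]
```

The first line reproduces the 4x4 from the question (no padding), the second the 7x7 from the tutorial (zero padding).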
They probably have an option to turn the zero padding off. In other infrastructures I've used, zero padding is not the default. (In several of them, zero padding isn't even possible.)
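For reference, the usual relation between input size $n$, kernel size $k$, stride $s$ and padding $p$ (zero pixels added on each side) is sketched below; the symbols are my own notation, not the tutorial's. As far as I know, in TensorFlow the choice is made through the `padding` argument of `tf.nn.conv2d` (`'SAME'` versus `'VALID'`).

```latex
o = \left\lfloor \frac{n + 2p - k}{s} \right\rfloor + 1
% n = 28, k = 5, s = 1, p = 0:  o = 24  (edges chopped off)
% n = 28, k = 5, s = 1, p = 2:  o = 28  (output same size as input)
```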
Question Source : http://cs.stackexchange.com/questions/49658