</figure></iframe></div></div></figure><p id="2951">Day 34–35: 2020.05.15–16
<br>Paper: SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and &lt;0.5MB model size<br>Category: Model/Optimization</p>
<h1 id="3957">SqueezeNet</h1>
<h2 id="43c3">Strategy</h2>
<ol><li>Replace 3x3 filters with 1x1 filters; a 1x1 filter has 9x fewer parameters than a 3x3 filter.</li><li>Decrease the number of input channels to the 3x3 filters using <b><i>squeeze layers</i></b>.</li><li>Downsample late in the network so that convolution layers have large activation maps.</li></ol>
<h2 id="8896">Fire Module</h2>
<ul><li>A Fire module consists of a squeeze convolution layer (which has only 1x1 filters, as per Strategy 1)</li><li>feeding into an expand layer that has a mix of 1x1 and 3x3 convolution filters.</li><li>A Fire module exposes three tunable dimensions (hyperparameters): s<sub>1x1</sub>, e<sub>1x1</sub>, and e<sub>3x3</sub>.</li><li>s<sub>1x1</sub> is the number of filters in the squeeze layer (all 1x1), e<sub>1x1</sub> is the number of 1x1 filters in the expand layer, and e<sub>3x3</sub> is the number of 3x3 filters in the expand layer.</li><li>When using Fire modules, set <b>s<sub>1x1</sub> to be less than (e<sub>1x1</sub> + e<sub>3x3</sub>)</b>, so that the squeeze layer limits the number of input channels to the 3x3 filters, as per Strategy 2.</li></ul>
<figure id="5d9e"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*geoa4bSP7Xj3A70MS1CxVw.png"><figcaption></figcaption></figure>
<h2 id="3a92">SqueezeNet Architecture</h2>
<ul><li>Begins with a standalone convolution layer (conv1),</li><li>followed by 8 Fire modules (fire2–9),</li><li>ending with a final conv layer (conv10).</li><li>The number of filters per Fire module gradually increases from the beginning to the end of the network.</li><li>Max-pooling with a stride of 2 is performed after layers conv1, fire4, fire8, and conv10; these relatively late placements of pooling follow Strategy 3.</li></ul>
<figure id="94f4"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*vng_yuHSKPR9N6SXQXNn4g.png"><figcaption></figcaption></figure>
<p id="b986">Other details</p>
<ul><li>To make the output activations of the 1x1 and 3x3 filters the same height and width, a 1-pixel border of zero-padding is added to the input of the 3x3 filters in the expand layers.</li><li>ReLU is applied to the activations from the squeeze and expand layers.</li><li>Dropout with a ratio of 50% is applied after the fire9 module.</li><li>The network has no fully-connected layers.</li><li>Training begins with a learning rate of 0.04, which is decreased linearly throughout training.</li></ul></article></body>
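<p>To make Strategies 1 and 2 and the zero-padding detail concrete, here is a minimal sketch that counts the parameters of one Fire module and compares it with a plain 3x3 convolution producing the same number of output channels. The helper names (<code>conv_params</code>, <code>fire_params</code>, <code>conv_out_size</code>) and the channel sizes are assumptions chosen for illustration; they are not taken from these notes.</p>

```python
def conv_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a 2-D convolution with square kernels."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + out_channels


def fire_params(in_channels, s1x1, e1x1, e3x3):
    """Parameters of one Fire module: a 1x1 squeeze layer feeding an
    expand layer made of 1x1 and 3x3 filters."""
    squeeze = conv_params(in_channels, s1x1, 1)
    expand_1x1 = conv_params(s1x1, e1x1, 1)  # expand sees only s1x1 channels
    expand_3x3 = conv_params(s1x1, e3x3, 3)
    return squeeze + expand_1x1 + expand_3x3


def conv_out_size(size, kernel_size, padding, stride=1):
    """Output height or width of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel_size) // stride + 1


# Illustrative sizes (assumed for this sketch).
in_channels, s1x1, e1x1, e3x3 = 96, 16, 64, 64
assert s1x1 < e1x1 + e3x3  # the squeeze layer must be the narrow one

fire = fire_params(in_channels, s1x1, e1x1, e3x3)
plain = conv_params(in_channels, e1x1 + e3x3, 3)  # same output channels, all 3x3
print("Fire module parameters:", fire)
print("Plain 3x3 conv parameters:", plain)

# With 1 pixel of zero-padding the 3x3 branch keeps the spatial size of the
# 1x1 branch, so the two expand outputs can be concatenated channel-wise.
print(conv_out_size(55, 3, 1) == conv_out_size(55, 1, 0))
```

<p>With these assumed sizes the Fire module has 11,920 parameters, against 110,720 for the plain 3x3 convolution, roughly 9x fewer. Most of the saving comes from the 3x3 filters reading only the s<sub>1x1</sub> squeezed channels instead of all input channels.</p>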