
Lei Yang - Sensetime
Aug 19 2015
BN was proposed by Sergey Ioffe and Christian Szegedy. Which one of the following papers was also co-authored by Christian Szegedy?
A. (DeepID2) Deep Learning Face Representation by Joint Identification-Verification
B. (Joint Bayesian) Bayesian Face Revisited: A Joint Formulation
C. Robust Multi-Resolution Pedestrian Detection in Traffic Scenes
D. RASL: Robust alignment by sparse and low-rank decomposition for linearly correlated images
E. (GoogLeNet) Going Deeper with Convolutions
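For each output channel k, G(k) is a corresponding subset of input channels:
$$G(k) \subset \{1, 2, \dots, D\}$$
$$y_{ijk} = x_{ijk}\left(\kappa + \alpha \sum_{m \in G(k)} x_{ijm}^{2}\right)^{-\beta}$$
For each output channel k:
$$n_{ijk}^{2} = \frac{1}{H'W'} \sum_{1 \le i' \le H',\; 1 \le j' \le W'} x^{2}_{i+i'-1-\lfloor (H'-1)/2 \rfloor,\; j+j'-1-\lfloor (W'-1)/2 \rfloor,\; k}$$
$$y_{ijk} = x_{ijk}\left(1 + \alpha\, n_{ijk}^{2}\right)^{-\beta}$$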
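The two normalizations above can be sketched in plain Python. This is only an illustration under stated assumptions: the exponent is applied as a division (large neighbouring responses shrink the output), G(k) is taken to be a window of channels centred on k and clipped at the borders, and pixels outside the feature map count as zero. The function names and default hyperparameters are hypothetical, not taken from the slides.

```python
def cross_channel_norm(pixel, size=5, kappa=1.0, alpha=1e-4, beta=0.75):
    """Normalize the D channel values of one pixel (i, j).

    Assumption: G(k) is a window of `size` channels centred on k,
    clipped at the channel borders.
    """
    D = len(pixel)
    half = size // 2
    out = []
    for k in range(D):
        group = range(max(0, k - half), min(D, k + half + 1))  # G(k)
        energy = sum(pixel[m] ** 2 for m in group)
        out.append(pixel[k] / (kappa + alpha * energy) ** beta)
    return out


def spatial_norm(fmap, window=(3, 3), alpha=1e-4, beta=0.75):
    """Normalize one H x W feature map (a list of rows) by the mean of
    squares over an H' x W' window; out-of-range pixels count as zero."""
    H, W = len(fmap), len(fmap[0])
    Hp, Wp = window
    top, left = (Hp - 1) // 2, (Wp - 1) // 2
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            total = 0.0
            for di in range(Hp):        # di = i' - 1
                for dj in range(Wp):    # dj = j' - 1
                    r, c = i + di - top, j + dj - left
                    if 0 <= r < H and 0 <= c < W:
                        total += fmap[r][c] ** 2
            n2 = total / (Hp * Wp)      # mean of squares in the window
            out[i][j] = fmap[i][j] / (1.0 + alpha * n2) ** beta
    return out
```

The first function works across channels at a single spatial position, the second works across positions within a single channel, which is the only difference between the two formulas.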
Two modes:
Problem: internal covariate shift
Change in the distribution of network activations due to the change in network parameters during training
Idea: ensure the distribution of nonlinearity inputs remains more stable
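As a rough sketch of how this idea is realized in the BN paper, each feature map is normalized with the mean and variance of its activations over the mini-batch, then scaled by γ and shifted by β. The helper below assumes the N activations of one feature map are given as a flat list; the function name and the default `eps` are hypothetical.

```python
import math


def batch_norm_forward(x, gamma=1.0, beta=0.0, eps=1e-5):
    """x: flat list of the N activations of one feature map in a mini-batch."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n          # biased variance
    x_hat = [(v - mu) / math.sqrt(var + eps) for v in x]
    y = [gamma * xh + beta for xh in x_hat]          # scale and shift
    return y, x_hat, mu, var


# Rescaling and shifting the inputs leaves the normalized values
# (almost) unchanged, which is the stabilizing effect described above.
_, a, _, _ = batch_norm_forward([1.0, 2.0, 3.0, 4.0])
_, b, _, _ = batch_norm_forward([15.0, 25.0, 35.0, 45.0])
```

Here `a` and `b` agree up to the small effect of `eps`, even though the raw inputs differ by a factor of ten and an offset.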
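For one feature map:
$$N = H \times W \times M$$
$$\frac{\partial l}{\partial \gamma} = \sum_{i=1}^{N} \frac{\partial l}{\partial y_i} \cdot \hat{x}_i$$
$$\frac{\partial l}{\partial \beta} = \sum_{i=1}^{N} \frac{\partial l}{\partial y_i}$$
The parameters of the BN layer can be updated using the equations above.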
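The two parameter gradients can be computed in a few lines. The sketch below assumes the N inputs of one feature map and the upstream gradients ∂l/∂y_i are given as flat lists; the function name, the learning rate and the plain SGD step at the end are illustrative assumptions, not details from the slides.

```python
import math


def bn_param_grads(x, dy, eps=1e-5):
    """Return (dl/dgamma, dl/dbeta) for one feature map.

    x:  flat list of the N inputs of the feature map
    dy: flat list of the upstream gradients dl/dy_i
    """
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    x_hat = [(v - mu) / math.sqrt(var + eps) for v in x]
    d_gamma = sum(g * xh for g, xh in zip(dy, x_hat))  # sum of dl/dy_i * x_hat_i
    d_beta = sum(dy)                                   # sum of dl/dy_i
    return d_gamma, d_beta


# One plain SGD step on gamma and beta (hypothetical learning rate).
d_gamma, d_beta = bn_param_grads([1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 1.0])
lr = 0.01
gamma, beta = 1.0, 0.0
gamma, beta = gamma - lr * d_gamma, beta - lr * d_beta
```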
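For one feature map k:
$$\frac{\partial l}{\partial x_{ijkm}} = \sum_{i', j', m'} \frac{\partial l}{\partial y_{i'j'km'}} \frac{\partial y_{i'j'km'}}{\partial x_{ijkm}}$$
$$\frac{\partial y_{i'j'km'}}{\partial x_{ijkm}} = \gamma_k \left( \left( \delta_{i=i',\, j=j',\, m=m'} - \frac{\partial \mu_k}{\partial x_{ijkm}} \right) \frac{1}{\sqrt{\sigma_k^2 + \epsilon}} - \frac{1}{2} \left( x_{i'j'km'} - \mu_k \right) \left( \sigma_k^2 + \epsilon \right)^{-3/2} \frac{\partial \sigma_k^2}{\partial x_{ijkm}} \right)$$
$$\frac{\partial \mu_k}{\partial x_{ijkm}} = \frac{1}{HWM} = \frac{1}{N}$$
$$\frac{\partial \sigma_k^2}{\partial x_{ijkm}} = \frac{2}{N} \left( x_{ijkm} - \mu_k \right)$$
Here δ equals 1 when the output and input positions coincide and 0 otherwise. The gradient (diff) can be passed down through the BN layer using the equations above.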
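A direct, unoptimized translation of these equations is sketched below for one feature map stored as a flat list of N values. It sums over all output positions for each input position, and the leading term of the Jacobian is taken to be 1 only when the output and input positions coincide (a Kronecker delta). The function names are hypothetical, and the O(N²) double loop is chosen to mirror the formulas, not for speed.

```python
import math


def bn_forward(x, gamma, beta, eps):
    """Forward pass for one feature map, used here only as a reference."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    return [gamma * (v - mu) / math.sqrt(var + eps) + beta for v in x]


def bn_input_grad(x, dy, gamma, eps=1e-5):
    """dl/dx for one feature map: sum over q of dl/dy_q * dy_q/dx_p."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    inv_std = (var + eps) ** -0.5
    dx = []
    for p in range(n):
        dmu = 1.0 / n                    # d mu / d x_p
        dvar = 2.0 / n * (x[p] - mu)     # d sigma^2 / d x_p
        total = 0.0
        for q in range(n):
            delta = 1.0 if p == q else 0.0
            dyq_dxp = gamma * (
                (delta - dmu) * inv_std
                - 0.5 * (x[q] - mu) * (var + eps) ** -1.5 * dvar
            )
            total += dy[q] * dyq_dxp
        dx.append(total)
    return dx
```

A quick sanity check is to compare `bn_input_grad` against central finite differences of a loss built from `bn_forward`. The components of the result should also sum to roughly zero, because adding a constant to every input does not change the BN output.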
Network: xd_net_12m
What kind of normalization does the picture below describe?
A. Cross-Channel Normalization
B. Spatial Normalization
C. Batch Normalization
D. Local Response Normalization
E. Mean-Variance Normalization
The number of γ parameters in a BN layer equals:
A. Batch size
B. The number of feature maps
C. The number of activations
Network | Learning rate | Iterations | PCA dim | Accuracy |
---|---|---|---|---|
(tile conv)sn01 bn | 0.01 | 150000 | 300 | 0.984000 |
(tile conv)sn02 bn | 0.05 | 150000 | 400/500/700 | 0.991167 |
(full conv)tn03 7x6x1024(3)->7x6x256(1)->512 bn | 0.03 | 150000 | 300 | 0.989000 |
(full conv)tn01 7x6x1024(3)->7x6x256(1)->512 bn | 0.05 | 150000 | 400 | 0.992667 |
(full conv)tn04 7x6x1024(3)->7x6x256(1)->512 bn | 0.1 | 150000 | 800/900 | 0.990167 |
(full conv)np04 tn01 -> no bn | 0.05 | 150000 | 400 | 0.990500 |
(full conv)np05 tn01 -> no bn | 0.01 | 150000 | 500/800 | 0.990000 |