クルマで想定されるスペック4クラウド エッジMany classes (1000s) Few classes(<10)Large workloads Frame rates (15‐30 FPS)High efficiency(Performance/W)Low cost & low power(1W‐5W)Server form factor Custom form factorJ. Freeman (Intel), “FPGA Acceleration in the era of high level design”, 2017
• Normalizing theresultof MAC operations• Batch normalization isnecessary for theBinarized CNN toimprove its accracy20Normalization for Binarized DNN BatchNorm 0204060801001 80 160 200Error rate[%]epochWithout BNWith BNH. Nakahara, H. Yonekawa, T. Sasao, H. Iwamoto, and M. Motomura, "A Memory‐Based Realization of a Binarized Deep Convolutional Neural Network," The International Conference on Field‐Programmable Technology (FPT 2016), pp.273‐76, 2016.meanvarianceScaling Shift
21.
• Batch Normalizationis implemented by fixedpoint adders and multipliers21バッチ正規化を導⼊した回路Adder treeBatch normalizationSign bitXNOR gate
22.
• The outputfrom batchnormalization( ) is theinput to sign functionConstant factor canbe ignored• The input from batchnormalization( ) is theinteger valueTo integer22バッチ正規化をバイアスで実現