MobileNet

From Wikipedia, the free encyclopedia
Family of computer vision models designed for efficient inference on mobile devices
MobileNet
Developer(s): Google
Initial release: April 2017
Stable release: v4 / September 2024
Repository: github.com/tensorflow/models/tree/master/research/slim/nets/mobilenet
Written in: Python
License: Apache License 2.0

MobileNet is a family of convolutional neural network (CNN) architectures for image classification, object detection, and other computer vision tasks. The models are designed for small size, low latency, and low power consumption, making them suitable for on-device inference and edge computing on resource-constrained hardware such as mobile phones and embedded systems. They were originally designed to run efficiently on mobile devices with TensorFlow Lite.

The need for efficient deep learning models on mobile devices led researchers at Google to develop MobileNet. As of October 2024, the family has four versions, each improving upon the previous one in performance and efficiency.

Features


V1


MobileNetV1 was published in April 2017.[1][2] Its main architectural innovation was the incorporation of depthwise separable convolutions. The depthwise separable convolution was first developed by Laurent Sifre during an internship at Google Brain in 2013, as an architectural variation on AlexNet to improve convergence speed and model size.[3]

The depthwise separable convolution decomposes a single standard convolution into two convolutions: a depthwise convolution that filters each input channel independently, and a pointwise convolution (a 1×1 convolution) that combines the outputs of the depthwise convolution. This factorization significantly reduces computational cost.
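
For a K×K kernel with N output channels, the factorization reduces the multiply–accumulate cost by roughly a factor of 1/N + 1/K² relative to a standard convolution. A minimal Keras sketch of the resulting layer pattern, assuming the batch-normalization-plus-ReLU6 arrangement used in the V1 architecture (layer shapes and channel counts here are illustrative):

import tensorflow as tf

def depthwise_separable_block(x, out_channels, stride=1):
    """One MobileNetV1-style block: depthwise 3x3 followed by pointwise 1x1.
    Each convolution is followed by batch normalization and ReLU6."""
    # Depthwise stage: a single 3x3 filter per input channel, no channel mixing.
    x = tf.keras.layers.DepthwiseConv2D(3, strides=stride, padding="same", use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU(max_value=6.0)(x)
    # Pointwise stage: a 1x1 convolution that mixes channels and sets the output width.
    x = tf.keras.layers.Conv2D(out_channels, 1, padding="same", use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    return tf.keras.layers.ReLU(max_value=6.0)(x)

inputs = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.layers.Conv2D(32, 3, strides=2, padding="same", use_bias=False)(inputs)  # standard stem convolution
x = depthwise_separable_block(x, 64)
model = tf.keras.Model(inputs, x)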

MobileNetV1 has two hyperparameters: a width multiplier α, which controls the number of channels in each layer (smaller values of α yield smaller and faster models at the cost of reduced accuracy), and a resolution multiplier ρ, which controls the input resolution of the images (lower resolutions result in faster processing but potentially lower accuracy).
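
The combined effect of the two multipliers on a single layer can be sketched directly: the depthwise part of the cost scales roughly with α·ρ², and the pointwise part with α²·ρ². An illustrative calculation (the layer shape used here is hypothetical, not taken from the paper):

def separable_layer_macs(h, w, c_in, c_out, k=3, alpha=1.0, rho=1.0):
    """Approximate multiply-accumulate count of one depthwise separable layer
    under a width multiplier alpha and a resolution multiplier rho."""
    h, w = h * rho, w * rho                    # resolution multiplier shrinks the feature map
    c_in, c_out = alpha * c_in, alpha * c_out  # width multiplier thins every layer
    depthwise = k * k * c_in * h * w           # one k x k filter per input channel
    pointwise = c_in * c_out * h * w           # 1x1 convolution mixing channels
    return depthwise + pointwise

baseline = separable_layer_macs(112, 112, 64, 128)
reduced = separable_layer_macs(112, 112, 64, 128, alpha=0.5, rho=0.5)
print(reduced / baseline)  # roughly alpha**2 * rho**2 once the pointwise term dominates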

V2


MobileNetV2 was published in 2018.[4][5] It uses inverted residual layers and linear bottlenecks.

Inverted residuals modify the traditional residual block structure. Instead of compressing the input channels before the depthwise convolution, they expand them: a 1×1 expansion convolution is followed by a depthwise convolution (typically 3×3) and then a 1×1 projection layer that reduces the number of channels back down. This inverted structure maintains representational capacity by letting the depthwise convolution operate in a higher-dimensional feature space, preserving more information as it flows through the block.

Linear bottlenecks remove the usual ReLU activation function from the projection layers. The rationale is that nonlinear activations lose information in low-dimensional spaces, which is problematic when the number of channels is already small.
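
A minimal sketch of one such block in the same Keras style as above, assuming an illustrative expansion factor of 6: the expansion and depthwise stages use ReLU6, the 1×1 projection is left linear, and the residual connection is applied only when input and output shapes match.

import tensorflow as tf

def inverted_residual(x, out_channels, stride=1, expansion=6):
    """MobileNetV2-style block: 1x1 expansion -> 3x3 depthwise -> linear 1x1 projection."""
    in_channels = x.shape[-1]
    # Expand into a higher-dimensional feature space before the depthwise convolution.
    h = tf.keras.layers.Conv2D(expansion * in_channels, 1, padding="same", use_bias=False)(x)
    h = tf.keras.layers.BatchNormalization()(h)
    h = tf.keras.layers.ReLU(max_value=6.0)(h)
    # Depthwise 3x3 convolution in the expanded space.
    h = tf.keras.layers.DepthwiseConv2D(3, strides=stride, padding="same", use_bias=False)(h)
    h = tf.keras.layers.BatchNormalization()(h)
    h = tf.keras.layers.ReLU(max_value=6.0)(h)
    # Linear bottleneck: project back down with no activation.
    h = tf.keras.layers.Conv2D(out_channels, 1, padding="same", use_bias=False)(h)
    h = tf.keras.layers.BatchNormalization()(h)
    # Residual connection only when spatial size and channel count are preserved.
    if stride == 1 and in_channels == out_channels:
        h = tf.keras.layers.Add()([x, h])
    return h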

V3


MobileNetV3 was published in 2019.[6][7] The publication included MobileNetV3-Small, MobileNetV3-Large, and MobileNetEdgeTPU (optimized for the Pixel 4). The architectures were found by a form of neural architecture search (NAS) that takes mobile latency into account, in order to achieve a good trade-off between accuracy and latency.[8][9] MobileNetV3 uses piecewise-linear approximations of the swish and sigmoid activation functions (called "h-swish" and "h-sigmoid"), squeeze-and-excitation modules,[10] and the inverted bottlenecks of MobileNetV2.
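
The hard variants replace the sigmoid inside swish with a piecewise-linear expression built from ReLU6, which is cheap to compute on mobile hardware. A short transcription of the commonly stated definitions, in the same Python style as above:

import tensorflow as tf

def h_sigmoid(x):
    # Piecewise-linear approximation of the sigmoid: ReLU6(x + 3) / 6.
    return tf.nn.relu6(x + 3.0) / 6.0

def h_swish(x):
    # Hard swish: x * h_sigmoid(x), approximating x * sigmoid(x) (swish).
    return x * h_sigmoid(x)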

V4


MobileNetV4 was published in September 2024.[11][12] The publication included a large number of architectures found by NAS. Compared to the architectural modules used in V3, the V4 series adds the "universal inverted bottleneck", which contains the inverted residual and the inverted bottleneck as special cases, and attention modules based on multi-query attention.[13]
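
As described in the MobileNetV4 paper, the universal inverted bottleneck extends the V2-style block with two optional depthwise convolutions, one before the 1×1 expansion and one between the expansion and the 1×1 projection, so that enabling only the middle one recovers the inverted bottleneck. A rough sketch following that description (parameter names, expansion factor, and the placement of normalization and activations are illustrative assumptions, not the reference implementation):

import tensorflow as tf

def universal_inverted_bottleneck(x, out_channels, expansion=4,
                                  start_dw=True, middle_dw=True):
    """Sketch of a universal inverted bottleneck: optional depthwise conv,
    1x1 expansion, optional depthwise conv, then a linear 1x1 projection.
    start_dw=False, middle_dw=True recovers a V2-style inverted bottleneck."""
    in_channels = x.shape[-1]
    h = x
    if start_dw:
        # Optional depthwise convolution applied to the incoming features.
        h = tf.keras.layers.DepthwiseConv2D(3, padding="same", use_bias=False)(h)
        h = tf.keras.layers.BatchNormalization()(h)
    # 1x1 expansion into a wider feature space.
    h = tf.keras.layers.Conv2D(expansion * in_channels, 1, use_bias=False)(h)
    h = tf.keras.layers.BatchNormalization()(h)
    h = tf.keras.layers.ReLU()(h)
    if middle_dw:
        # Optional depthwise convolution in the expanded space.
        h = tf.keras.layers.DepthwiseConv2D(3, padding="same", use_bias=False)(h)
        h = tf.keras.layers.BatchNormalization()(h)
        h = tf.keras.layers.ReLU()(h)
    # Linear 1x1 projection back down, as in MobileNetV2.
    h = tf.keras.layers.Conv2D(out_channels, 1, use_bias=False)(h)
    h = tf.keras.layers.BatchNormalization()(h)
    if in_channels == out_channels:
        h = tf.keras.layers.Add()([x, h])  # residual connection when shapes match
    return h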

References

  1. ^ Howard, Andrew G.; Zhu, Menglong; Chen, Bo; Kalenichenko, Dmitry; Wang, Weijun; Weyand, Tobias; Andreetto, Marco; Adam, Hartwig (2017-04-16). MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv:1704.04861.
  2. ^ "MobileNets: Open-Source Models for Efficient On-Device Vision". research.google. June 14, 2017. Retrieved 2024-10-18.
  3. ^ Chollet, François (2017). "Xception: Deep Learning with Depthwise Separable Convolutions". 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 1800–1807. arXiv:1610.02357. doi:10.1109/CVPR.2017.195. ISBN 978-1-5386-0457-1.
  4. ^ Sandler, Mark; Howard, Andrew; Zhu, Menglong; Zhmoginov, Andrey; Chen, Liang-Chieh (2019-03-21). MobileNetV2: Inverted Residuals and Linear Bottlenecks. arXiv:1801.04381.
  5. ^ "MobileNetV2: The Next Generation of On-Device Computer Vision Networks". research.google. April 3, 2018. Retrieved 2024-10-18.
  6. ^ "Introducing the Next Generation of On-Device Vision Models: MobileNetV3 and Mobi". research.google. November 13, 2019. Retrieved 2024-10-18.
  7. ^ Howard, Andrew; Sandler, Mark; Chu, Grace; Chen, Liang-Chieh; Chen, Bo; Tan, Mingxing; Wang, Weijun; Zhu, Yukun; Pang, Ruoming; Vasudevan, Vijay; Le, Quoc V.; Adam, Hartwig (2019). "Searching for MobileNetV3". pp. 1314–1324. arXiv:1905.02244.
  8. ^ Tan, Mingxing; Chen, Bo; Pang, Ruoming; Vasudevan, Vijay; Sandler, Mark; Howard, Andrew; Le, Quoc V. (June 2019). "MnasNet: Platform-Aware Neural Architecture Search for Mobile". 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. pp. 2815–2823. arXiv:1807.11626. doi:10.1109/CVPR.2019.00293. ISBN 978-1-7281-3293-8.
  9. ^ Yang, Tien-Ju; Howard, Andrew; Chen, Bo; Zhang, Xiao; Go, Alec; Sandler, Mark; Sze, Vivienne; Adam, Hartwig (2018). "NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications". pp. 285–300. arXiv:1804.03230.
  10. ^ Hu, Jie; Shen, Li; Sun, Gang (2018). "Squeeze-and-Excitation Networks". pp. 7132–7141.
  11. ^ Qin, Danfeng; Leichner, Chas; Delakis, Manolis; Fornoni, Marco; Luo, Shixin; Yang, Fan; Wang, Weijun; Banbury, Colby; Ye, Chengxi (2024-09-29). MobileNetV4 -- Universal Models for the Mobile Ecosystem. arXiv:2404.10518.
  12. ^ Wightman, Ross. "MobileNet-V4 (now in timm)". huggingface.co. Retrieved 2024-10-18.
  13. ^ Shazeer, Noam (2019-11-05). Fast Transformer Decoding: One Write-Head is All You Need. arXiv:1911.02150.