Evaluating current state of monocular 3D pose models for golf
Authors
- Christian Keilstrup Ingwersen, TrackMan A/S & Technical University of Denmark
- Janus Nørtoft Jensen, Technical University of Denmark
- Morten Rieger Hannemose, Technical University of Denmark
- Anders Bjorholm Dahl, Technical University of Denmark
DOI:
https://doi.org/10.7557/18.6793
Keywords:
Human pose estimation, smpl, sport, 3D pose, 2D pose, kinematic analysis
Abstract
Monocular 3D human pose estimation has reached impressive performance. State-of-the-art models predict joint locations that can be accurately reprojected back into the image, resulting in visually convincing detections. However, our aim is to use the predicted poses in a domain with high-frequency movements, namely video of athletes performing golf swings. Our investigation is based on accurate marker-based motion capture data. For our data, too, the predicted 3D joint locations look convincing when we reproject them into the image. However, by quantitatively comparing the results with the motion capture data, we find significant model errors, too large for the predictions to be used for any kinematic analysis of the movements. We therefore conclude that current models cannot be used out of the box for advanced golf analytics.
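The gap between a convincing reprojection and an accurate 3D pose can be illustrated with a minimal sketch. It is not the evaluation code used in the paper: the pinhole intrinsics, the joint coordinates, and the helper names `project`, `mpjpe`, and `mean_reprojection_error` are all hypothetical, chosen only to show that moving a joint along its camera ray leaves the 2D projection unchanged while the 3D error grows.

```python
import math

def project(point, focal=1000.0, cx=640.0, cy=360.0):
    """Pinhole projection of a camera-frame 3D point (x, y, z) to pixels.
    The intrinsics are assumed values, not taken from the paper."""
    x, y, z = point
    return (focal * x / z + cx, focal * y / z + cy)

def mpjpe(pred, truth):
    """Mean per-joint position error: average Euclidean distance over joints."""
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(truth)

def mean_reprojection_error(pred, truth):
    """Average pixel distance between projected predicted and reference joints."""
    return sum(
        math.dist(project(p), project(t)) for p, t in zip(pred, truth)
    ) / len(truth)

# Hypothetical reference joints in metres, camera frame (z is depth).
truth = [(0.0, 0.0, 4.0), (0.2, -0.5, 4.1), (-0.3, 0.4, 3.9)]

# Slide each joint along its camera ray: scaling a point by s keeps
# x/z and y/z fixed, so the projected pixel does not move.
scales = [1.05, 0.95, 1.10]
pred = [tuple(s * c for c in joint) for s, joint in zip(scales, truth)]

print(f"reprojection error: {mean_reprojection_error(pred, truth):.6f} px")
print(f"MPJPE: {mpjpe(pred, truth) * 1000:.1f} mm")
```

In this toy setup the reprojection error is zero up to floating-point rounding, while the 3D error is in the hundreds of millimetres. This is why a pose that looks right in the image can still be unusable for kinematic analysis.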
License
Copyright (c) 2023 Christian Keilstrup Ingwersen, Janus Nørtoft Jensen, Morten Rieger Hannemose, Anders Bjorholm Dahl

This work is licensed under a Creative Commons Attribution 4.0 International License.