Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition
Dongqi Cai, Yangyuxuan Kang, Anbang Yao, Yurong Chen
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:3431-3441, 2023.
Abstract
This paper presents Ske2Grid, a new representation learning framework for improved skeleton-based action recognition. In Ske2Grid, we define a regular convolution operation upon a novel grid representation of the human skeleton: a compact image-like grid patch constructed and learned through three novel designs. Specifically, we propose a graph-node index transform (GIT) that constructs a regular grid patch by assigning the nodes of the skeleton graph one by one to desired grid cells. To ensure that the GIT is a bijection and to enrich the expressiveness of the grid representation, an up-sampling transform (UPT) is learned to interpolate the skeleton graph nodes so that the grid patch is filled in full. Because a one-step UPT can be too aggressive, and to further exploit the representation capability of grid patches of increasing spatial size, we propose a progressive learning strategy (PLS) that decouples the UPT into multiple steps and aligns them with multiple paired GITs through a compact cascaded design learned progressively. We construct networks upon prevailing graph convolution networks and conduct experiments on six mainstream skeleton-based action recognition datasets. Experiments show that our Ske2Grid significantly outperforms existing GCN-based solutions under different benchmark settings, without bells and whistles. Code and models are available at https://github.com/OSVAI/Ske2Grid.
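To make the pipeline described above concrete, here is a minimal PyTorch sketch of the overall idea: a learned up-sampling transform (UPT) interpolates the skeleton nodes to as many virtual nodes as there are grid cells, a bijective assignment (standing in for the learned GIT, fixed here for simplicity) places each node into one cell, and a regular 2D convolution then operates on the resulting image-like patch. All module names, shapes, and the random permutation are illustrative assumptions, not the authors' implementation; consult the official repository for the actual method.

import torch
import torch.nn as nn

class Ske2GridSketch(nn.Module):
    """Illustrative sketch (not the authors' code).

    UPT: a learnable linear map interpolating V skeleton nodes to
         H*W virtual nodes so the grid patch can be filled in full.
    GIT: a bijection from nodes to grid cells; the paper learns it,
         here a fixed random permutation stands in for it.
    A regular 2D convolution then runs on the grid patch.
    """
    def __init__(self, num_nodes=25, grid_h=6, grid_w=6, channels=64):
        super().__init__()
        self.grid_h, self.grid_w = grid_h, grid_w
        # UPT: learnable interpolation matrix, (H*W) x V
        self.upt = nn.Parameter(torch.randn(grid_h * grid_w, num_nodes) * 0.01)
        # GIT stand-in: fixed permutation of grid cells (learned in the paper)
        self.register_buffer("git", torch.randperm(grid_h * grid_w))
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        # x: (batch, channels, V) per-joint features for one frame
        b, c, v = x.shape
        nodes = torch.einsum("gv,bcv->bcg", self.upt, x)    # up-sample: (B, C, H*W)
        nodes = nodes[:, :, self.git]                       # assign nodes to cells
        patch = nodes.view(b, c, self.grid_h, self.grid_w)  # image-like grid patch
        return self.conv(patch)                             # regular convolution

# Toy usage: features of a 25-joint skeleton mapped to a 6x6 grid patch.
feat = torch.randn(2, 64, 25)
out = Ske2GridSketch()(feat)
print(out.shape)  # torch.Size([2, 64, 6, 6])

The progressive learning strategy (PLS) would chain several such UPT/GIT pairs of increasing grid size, training them stage by stage; that cascade is omitted here for brevity.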
Cite this Paper
BibTeX
@InProceedings{pmlr-v202-cai23c,
  title     = {{S}ke2{G}rid: Skeleton-to-Grid Representation Learning for Action Recognition},
  author    = {Cai, Dongqi and Kang, Yangyuxuan and Yao, Anbang and Chen, Yurong},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {3431--3441},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/cai23c/cai23c.pdf},
  url       = {https://proceedings.mlr.press/v202/cai23c.html},
  abstract  = {This paper presents Ske2Grid, a new representation learning framework for improved skeleton-based action recognition. In Ske2Grid, we define a regular convolution operation upon a novel grid representation of human skeleton, which is a compact image-like grid patch constructed and learned through three novel designs. Specifically, we propose a graph-node index transform (GIT) to construct a regular grid patch through assigning the nodes in the skeleton graph one by one to the desired grid cells. To ensure that GIT is a bijection and enrich the expressiveness of the grid representation, an up-sampling transform (UPT) is learned to interpolate the skeleton graph nodes for filling the grid patch to the full. To resolve the problem when the one-step UPT is aggressive and further exploit the representation capability of the grid patch with increasing spatial size, a progressive learning strategy (PLS) is proposed which decouples the UPT into multiple steps and aligns them to multiple paired GITs through a compact cascaded design learned progressively. We construct networks upon prevailing graph convolution networks and conduct experiments on six mainstream skeleton-based action recognition datasets. Experiments show that our Ske2Grid significantly outperforms existing GCN-based solutions under different benchmark settings, without bells and whistles. Code and models are available at https://github.com/OSVAI/Ske2Grid.}
}
Endnote
%0 Conference Paper
%T Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition
%A Dongqi Cai
%A Yangyuxuan Kang
%A Anbang Yao
%A Yurong Chen
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-cai23c
%I PMLR
%P 3431--3441
%U https://proceedings.mlr.press/v202/cai23c.html
%V 202
%X This paper presents Ske2Grid, a new representation learning framework for improved skeleton-based action recognition. In Ske2Grid, we define a regular convolution operation upon a novel grid representation of human skeleton, which is a compact image-like grid patch constructed and learned through three novel designs. Specifically, we propose a graph-node index transform (GIT) to construct a regular grid patch through assigning the nodes in the skeleton graph one by one to the desired grid cells. To ensure that GIT is a bijection and enrich the expressiveness of the grid representation, an up-sampling transform (UPT) is learned to interpolate the skeleton graph nodes for filling the grid patch to the full. To resolve the problem when the one-step UPT is aggressive and further exploit the representation capability of the grid patch with increasing spatial size, a progressive learning strategy (PLS) is proposed which decouples the UPT into multiple steps and aligns them to multiple paired GITs through a compact cascaded design learned progressively. We construct networks upon prevailing graph convolution networks and conduct experiments on six mainstream skeleton-based action recognition datasets. Experiments show that our Ske2Grid significantly outperforms existing GCN-based solutions under different benchmark settings, without bells and whistles. Code and models are available at https://github.com/OSVAI/Ske2Grid.
APA
Cai, D., Kang, Y., Yao, A. & Chen, Y. (2023). Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:3431-3441. Available from https://proceedings.mlr.press/v202/cai23c.html.