Computer Science > Computer Vision and Pattern Recognition
arXiv:2403.11959 (cs)
[Submitted on 18 Mar 2024 (v1), last revised 20 Mar 2024 (this version, v2)]
Title:IVAC-P2L: Leveraging Irregular Repetition Priors for Improving Video Action Counting
View a PDF of the paper titled IVAC-P2L: Leveraging Irregular Repetition Priors for Improving Video Action Counting, by Hang Wang and 3 other authors
View PDFHTML (experimental)Abstract:Video Action Counting (VAC) is crucial in analyzing sports, fitness, and everyday activities by quantifying repetitive actions in videos. However, traditional VAC methods have overlooked the complexity of action repetitions, such as interruptions and the variability in cycle duration. Our research addresses the shortfall by introducing a novel approach to VAC, called Irregular Video Action Counting (IVAC). IVAC prioritizes modeling irregular repetition patterns in videos, which we define through two primary aspects: Inter-cycle Consistency and Cycle-interval Inconsistency. Inter-cycle Consistency ensures homogeneity in the spatial-temporal representations of cycle segments, signifying action uniformity within cycles. Cycle-interval inconsistency highlights the importance of distinguishing between cycle segments and intervals based on their inherent content differences. To encapsulate these principles, we propose a new methodology that includes consistency and inconsistency modules, supported by a unique pull-push loss (P2L) mechanism. The IVAC-P2L model applies a pull loss to promote coherence among cycle segment features and a push loss to clearly distinguish features of cycle segments from interval segments. Empirical evaluations conducted on the RepCount dataset demonstrate that the IVAC-P2L model sets a new benchmark in VAC task performance. Furthermore, the model demonstrates exceptional adaptability and generalization across various video contents, outperforming existing models on two additional datasets, UCFRep and Countix, without the need for dataset-specific optimization. These results confirm the efficacy of our approach in addressing irregular repetitions in videos and pave the way for further advancements in video analysis and understanding.
Comments: | Source code:this https URL |
Subjects: | Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM) |
Cite as: | arXiv:2403.11959 [cs.CV] |
(orarXiv:2403.11959v2 [cs.CV] for this version) | |
https://doi.org/10.48550/arXiv.2403.11959 arXiv-issued DOI via DataCite |
Submission history
From: Zhi-Qi Cheng [view email][v1] Mon, 18 Mar 2024 16:56:47 UTC (661 KB)
[v2] Wed, 20 Mar 2024 11:58:23 UTC (661 KB)
Full-text links:
Access Paper:
- View PDF
- HTML (experimental)
- TeX Source
- Other Formats
View a PDF of the paper titled IVAC-P2L: Leveraging Irregular Repetition Priors for Improving Video Action Counting, by Hang Wang and 3 other authors
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer(What is the Explorer?)
Connected Papers(What is Connected Papers?)
Litmaps(What is Litmaps?)
scite Smart Citations(What are Smart Citations?)
Code, Data and Media Associated with this Article
alphaXiv(What is alphaXiv?)
CatalyzeX Code Finder for Papers(What is CatalyzeX?)
DagsHub(What is DagsHub?)
Gotit.pub(What is GotitPub?)
Hugging Face(What is Huggingface?)
Papers with Code(What is Papers with Code?)
ScienceCast(What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower(What are Influence Flowers?)
CORE Recommender(What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community?Learn more about arXivLabs.