Automatic Recognition of Human Interaction via Hybrid Descriptors and Maximum Entropy Markov Model Using Depth Sensors

Ahmad Jalal et al. Entropy (Basel). 2020 Jul 26;22(8):817. doi: 10.3390/e22080817.

Abstract

Automatic identification of human interaction from video sequences is a challenging task, especially in dynamic environments with cluttered backgrounds. Advancements in computer vision sensor technologies provide powerful support for human interaction recognition (HIR) during routine daily life. In this paper, we propose a novel feature extraction method that incorporates robust entropy optimization and an efficient Maximum Entropy Markov Model (MEMM) for HIR via multiple vision sensors. The main objectives of the proposed methodology are: (1) to propose a hybrid of four novel feature types, i.e., spatio-temporal features, energy-based features, shape-based angular and geometric features, and a motion-orthogonal histogram of oriented gradients (MO-HOG); (2) to encode the hybrid feature descriptors using a codebook, a Gaussian mixture model (GMM) and Fisher encoding; (3) to optimize the encoded features using a cross-entropy optimization function; (4) to apply an MEMM classification algorithm that examines empirical expectations and maximum entropy, which measure pattern variances, to achieve high HIR accuracy. Our system is tested on three well-known datasets: the SBU Kinect interaction, UoL 3D social activity and UT-Interaction datasets. Through extensive experimentation, the proposed feature extraction algorithm, along with cross-entropy optimization, achieved average accuracy rates of 91.25% on the SBU, 90.4% on the UoL and 87.4% on the UT-Interaction datasets. The proposed HIR system will be applicable to a wide variety of man-machine interfaces, such as public-place surveillance, future medical applications, virtual reality, fitness exercises and 3D interactive gaming.
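As a rough illustration of objective (2), and assuming the hybrid descriptors arrive as a matrix of per-frame feature vectors, a GMM codebook with Fisher vector encoding could be sketched as below; the component count, diagonal-covariance simplification and normalization steps are conventional choices for Fisher vectors, not the authors' exact implementation.

# Sketch: GMM codebook plus Fisher vector encoding of per-frame hybrid descriptors.
# Dimensions and parameter choices are illustrative assumptions.
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_codebook(descriptors, n_components=16, seed=0):
    """Fit a diagonal-covariance GMM that serves as the codebook."""
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag",
                          random_state=seed)
    gmm.fit(descriptors)                              # descriptors: (num_frames, D)
    return gmm

def fisher_vector(descriptors, gmm):
    """First- and second-order Fisher statistics of descriptors w.r.t. the GMM."""
    X = np.atleast_2d(descriptors)                    # (T, D)
    T, D = X.shape
    Q = gmm.predict_proba(X)                          # (T, K) soft assignments
    pi, mu, sigma = gmm.weights_, gmm.means_, np.sqrt(gmm.covariances_)
    diff = (X[:, None, :] - mu[None, :, :]) / sigma[None, :, :]        # (T, K, D)
    F_mu = (Q[:, :, None] * diff).sum(axis=0) / (T * np.sqrt(pi)[:, None])
    F_sig = (Q[:, :, None] * (diff ** 2 - 1)).sum(axis=0) / (T * np.sqrt(2 * pi)[:, None])
    fv = np.hstack([F_mu.ravel(), F_sig.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))            # power normalization
    return fv / (np.linalg.norm(fv) + 1e-12)          # L2 normalization

The resulting per-sequence vector would then be the input to the cross-entropy optimization and MEMM stages summarized above.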

Keywords: Gaussian mixture model; cross entropy; depth sensors; maximum entropy Markov model.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1. System architecture of the proposed human interaction recognition model.
Figure 2. Example of RGB silhouette segmentation for an exchanging object interaction of the SBU dataset: (a) original image; (b) detected silhouettes; (c) skin coloring on the cropped left and right silhouettes; (d) binary thresholding over the silhouettes; and (e) segmented RGB silhouettes.
Figure 3. Depth silhouette segmentation of a kicking interaction from the SBU dataset: (a) original image; (b) initial ROI; (c) binary image at T = 0.25; (d) binary image at T = 0.22; (e) binary image at T = 0.20; (f) binary image at T = 0.13; (g) segmented binary silhouette at T = 0.19; (h) segmented depth silhouette.
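Figure 3 implies that the depth silhouette is obtained by sweeping a binary threshold T over the depth map. A minimal sketch of that idea in Python, assuming depth values normalized to [0, 1] and a simple area heuristic for picking T (both assumptions, not the authors' criterion):

# Sketch: binary thresholding of a normalized depth map at several values of T,
# in the spirit of Figure 3. The threshold sweep and area heuristic are assumptions.
import numpy as np

def threshold_depth(depth, T):
    """Binary silhouette: pixels closer than threshold T (depth normalized to [0, 1])."""
    return (depth < T).astype(np.uint8)

def sweep_thresholds(depth, candidates=(0.25, 0.22, 0.20, 0.19, 0.13)):
    """Try several thresholds and keep the mask whose foreground area looks most plausible."""
    h, w = depth.shape
    best = (None, None, -np.inf)                      # (T, mask, score)
    for T in candidates:
        mask = threshold_depth(depth, T)
        area = mask.sum() / (h * w)
        # Heuristic: prefer a silhouette covering roughly 5-40% of the frame
        score = -abs(area - 0.2) if 0.05 < area < 0.4 else -np.inf
        if score > best[2]:
            best = (T, mask, score)
    return best[0], best[1]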
Figure 4. Spatial feature extraction: (a) method to find a spatial feature point; (b) depth silhouette of an approaching interaction with marked feature points.
Figure 5. Cases of spatial feature point extraction: (a) three cases of 45° change in direction; (b) three cases of 90° change in direction.
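Figure 5 suggests that spatial feature points are taken where the silhouette boundary turns by roughly 45° or 90°. A rough sketch of such a detector, assuming an ordered boundary contour is already available (the tolerance and the turning-angle formulation are illustrative assumptions):

# Sketch: flag contour points whose local turning angle is close to 45 or 90 degrees,
# in the spirit of Figure 5. Tolerance and angle set are assumptions.
import numpy as np

def direction_change_points(contour, angles=(45.0, 90.0), tol=10.0):
    """contour: (N, 2) array of ordered boundary points.
    Returns indices of interior points where the turning angle matches 45 or 90 degrees."""
    pts = np.asarray(contour, dtype=float)
    v_in = pts[1:-1] - pts[:-2]                       # incoming segment at each interior point
    v_out = pts[2:] - pts[1:-1]                       # outgoing segment
    ang_in = np.degrees(np.arctan2(v_in[:, 1], v_in[:, 0]))
    ang_out = np.degrees(np.arctan2(v_out[:, 1], v_out[:, 0]))
    turn = np.abs((ang_out - ang_in + 180.0) % 360.0 - 180.0)   # turning angle in [0, 180]
    keep = np.zeros(len(turn), dtype=bool)
    for a in angles:
        keep |= np.abs(turn - a) <= tol
    return np.where(keep)[0] + 1                      # shift back to original contour indices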
Figure 6. Key-body points on RGB and depth frames of: (a) pushing; (b) approaching; (c) punching; and (d) kicking interactions of the SBU dataset.
Figure 7. Orthogonal projection of 3D views of DS for: (a) hugging; (b) shaking hands; (c) kicking; and (d) pushing interactions.
Figure 8. Bar graphs showing the HOG feature vectors of (a) hugging and (b) kicking interactions.
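Figure 8 visualizes HOG feature vectors. The paper's MO-HOG additionally operates on motion-orthogonal projections, which is not reproduced here; as a baseline illustration of the underlying HOG computation, one could use scikit-image (cell, block and orientation settings are assumptions):

# Sketch: plain HOG descriptor of a 2D silhouette image with scikit-image.
# Parameter choices are assumptions; MO-HOG in the paper works on motion-orthogonal views.
import numpy as np
from skimage.feature import hog

def hog_descriptor(silhouette):
    """HOG feature vector for a single grayscale silhouette image."""
    return hog(silhouette,
               orientations=9,
               pixels_per_cell=(8, 8),
               cells_per_block=(2, 2),
               block_norm="L2-Hys",
               feature_vector=True)

# Example call on a random 64x64 array, just to show the output dimensionality
print(hog_descriptor(np.random.rand(64, 64)).shape)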
Figure 9. Energy features applied over (a) shaking hands; (b) kicking; and (c) punching interactions of the SBU dataset; (d) color bar showing the energy range.
Figure 10. Three-dimensional clusters of FVC over: (a) the SBU dataset and (b) the UoL dataset.
Figure 11. Cross entropy between probability distributions of interaction classes of the SBU dataset: (a) approaching; (b) departing; (c) exchanging object; (d) punching; (f) kicking; (g) hugging; (h) shaking hands.
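Figure 11 compares interaction classes through the cross entropy between their probability distributions. The quantity plotted is H(p, q) = -sum_i p_i log q_i; a minimal numerical sketch follows (the two distributions below are placeholders, not data from the paper):

# Sketch: cross entropy between two discrete probability distributions,
# the quantity compared across SBU interaction classes in Figure 11.
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    """Cross entropy H(p, q) in nats between discrete distributions p and q."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    return float(-(p * np.log(q + eps)).sum())

# Placeholder distributions over, e.g., codebook bins for two interaction classes
print(cross_entropy([0.4, 0.3, 0.2, 0.1], [0.35, 0.35, 0.2, 0.1]))   # small value -> similar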
Figure 12. Overall flow of the MEMM recognizer engine over different interaction classes of the SBU Kinect interaction dataset.
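Figure 12 depicts the MEMM recognizer engine. An MEMM models the per-step conditional P(state_t | state_{t-1}, observation_t) directly and decodes the best label sequence with a Viterbi search; a generic sketch of that decoding step is given below (the conditional model is passed in as a callable and is not the authors' trained model):

# Sketch: Viterbi decoding for a Maximum Entropy Markov Model. The conditional
# distributions are supplied by the caller; the toy interface is illustrative only.
def memm_viterbi(observations, states, log_cond_prob, log_init):
    """
    observations : list of per-frame observation feature vectors
    states       : list of interaction-class labels
    log_cond_prob(prev_state, obs) -> {state: log P(state | prev_state, obs)}
    log_init(obs) -> {state: log P(state | start, obs)}
    Returns the most probable state sequence.
    """
    V = [log_init(observations[0])]                   # best log-probabilities so far
    back = []                                         # backpointers per time step
    for obs in observations[1:]:
        scores, ptrs = {}, {}
        for s in states:
            cand = {p: V[-1][p] + log_cond_prob(p, obs)[s] for p in states}
            best_prev = max(cand, key=cand.get)
            scores[s], ptrs[s] = cand[best_prev], best_prev
        V.append(scores)
        back.append(ptrs)
    last = max(V[-1], key=V[-1].get)                  # best final state
    path = [last]
    for ptrs in reversed(back):                       # trace the path backwards
        path.append(ptrs[path[-1]])
    return list(reversed(path))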
Figure 13. RGB and depth snapshots of interaction classes of the SBU dataset: (a) approaching; (b) departing; (c) exchanging object; (d) pushing; (e) punching; (f) kicking; (g) hugging; (h) shaking hands.
Figure 14. RGB and depth snapshots of interaction classes of the UoL 3D dataset: (a) fight; (b) help stand-up; (c) push; (d) conversation; (e) hug; (f) help walk; (g) handshake; (h) call attention.
Figure 15. A few examples of interaction classes of the UT-Interaction dataset: (a) shake hands; (b) push; (c) point; (d) hug; (e) kick; (f) punch.
Figure 16. Comparison of other classifiers with MEMM over interaction classes of the SBU dataset.
Figure 17. Comparison of other classifiers with MEMM over interaction classes of the UoL dataset.
Figure 18. Comparison of other classifiers with MEMM over interaction classes of the UT-Interaction dataset: (a) Set 1 and (b) Set 2.