Repetitive Action Counting Through Joint Angle Analysis and Video Transformer Techniques

Goals

  • Enhance Repetitive Action Counting Accuracy: Develop a method that integrates joint angles with pose landmarks to improve the precision of repetitive action counting in various applications.
  • Address Common Challenges: Mitigate common failure modes, including instability under varying camera viewpoints, over-counting, under-counting, and difficulty in distinguishing sub-actions.
  • Leverage Transformer Networks: Utilize Transformer-based models to effectively capture temporal patterns and spatial relationships in action sequences.
  • Validate on Public Datasets: Demonstrate the method’s effectiveness by achieving superior performance metrics on the RepCount public dataset.

Key Findings

  • Integration of Joint Angles and Pose Landmarks: Combined joint angle analysis with pose landmarks to provide a comprehensive understanding of human movements, enhancing action counting accuracy.
  • Transformer-Based Architecture: Implemented a Transformer network to effectively model the temporal dynamics and spatial relationships in repetitive actions.
  • Improved Performance Metrics: Achieved a Mean Absolute Error (MAE) of 0.211 and an Off-By-One Accuracy (OBOA) of 0.599 on the RepCount dataset, surpassing existing state-of-the-art methods; the metric definitions are sketched after this list.
  • Robustness to Environmental Variations: Enhanced the model’s stability under different camera viewpoints and various video effects, ensuring reliable performance across diverse scenarios.
  • Comprehensive Evaluation: Conducted extensive experiments to validate the effectiveness of different joint angle configurations and their impact on repetitive action counting.
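
The MAE and OBOA values above follow the repetition-counting conventions used with the RepCount benchmark: MAE is the mean absolute counting error normalized by the ground-truth count, and OBOA is the fraction of videos whose predicted count is within one repetition of the ground truth. A minimal sketch of both metrics (the function name and example counts below are illustrative):

```python
import numpy as np

def counting_metrics(pred_counts, gt_counts):
    """Repetition-counting metrics as conventionally used on RepCount.

    MAE: mean absolute counting error, normalized by the ground-truth count.
    OBOA: fraction of videos whose predicted count is within +/-1 of the
    ground truth (off-by-one accuracy).
    """
    pred = np.asarray(pred_counts, dtype=float)
    gt = np.asarray(gt_counts, dtype=float)
    mae = np.mean(np.abs(pred - gt) / np.maximum(gt, 1.0))  # guard against zero counts
    oboa = np.mean(np.abs(pred - gt) <= 1.0)
    return mae, oboa

# Example: three videos with predicted vs. ground-truth repetition counts.
mae, oboa = counting_metrics([9, 12, 5], [10, 12, 7])
print(f"MAE={mae:.3f}, OBOA={oboa:.3f}")
```
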
Figure: Addressing instability under varying camera viewpoints.

Figure: Addressing the under-counting issue.

Technologies Utilized

  • Pose Estimation: Employed Google's MediaPipe BlazePose model to extract 33 pose landmarks per frame with high accuracy.
  • Joint Angle Calculation: Computed five key joint angles (elbow, shoulder, hip, knee, ankle) from the pose landmarks to capture movement dynamics; see the sketch after this list.
  • Transformer Networks: Utilized Transformer-based models for processing skeletal and joint angle data, enabling effective temporal pattern recognition; a minimal encoder sketch also follows this list.
  • Video Processing: Applied video transformer techniques to maintain robust counting under varying camera viewpoints and video effects.
  • Data Acquisition and Annotation: Corrected and refined annotations on the RepCount dataset to ensure data integrity and reliability.
  • Software and Frameworks: Python, TensorFlow, Keras, and other deep learning libraries for model development and training.
  • Hardware Setup: High-performance computing resources including NVIDIA GeForce RTX 3090 for accelerated training and inference.
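
To make the pose-estimation and joint-angle steps concrete, the sketch below reads a video with OpenCV, extracts BlazePose landmarks via MediaPipe, and computes one of the five angles (the right elbow) as the angle at the middle landmark of a shoulder-elbow-wrist triplet. The input filename, the `joint_angle` helper, and the specific landmark triplet are illustrative; the paper's exact angle configuration may differ.

```python
import cv2
import mediapipe as mp
import numpy as np

def joint_angle(a, b, c):
    """Angle in degrees at point b, formed by the segments b->a and b->c."""
    ba, bc = a - b, c - b
    cosine = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc) + 1e-8)
    return np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0)))

# BlazePose landmark indices for a right-elbow angle (shoulder, elbow, wrist);
# the five angles in the paper (elbow, shoulder, hip, knee, ankle) would use
# analogous landmark triplets.
R_SHOULDER, R_ELBOW, R_WRIST = 12, 14, 16

cap = cv2.VideoCapture("exercise.mp4")  # hypothetical input video
with mp.solutions.pose.Pose() as pose:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB; OpenCV decodes frames as BGR.
        result = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if result.pose_landmarks:
            lm = result.pose_landmarks.landmark
            p = lambda i: np.array([lm[i].x, lm[i].y, lm[i].z])
            angle = joint_angle(p(R_SHOULDER), p(R_ELBOW), p(R_WRIST))
            print(f"right elbow: {angle:.1f} deg")
cap.release()
```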
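
For the Transformer stage, a minimal Keras encoder over per-frame feature vectors (flattened landmark coordinates concatenated with the five joint angles) could look like the following. The sequence length, feature layout, embedding width, depth, and the density-map-style counting head are placeholder choices for illustration, not the architecture reported in the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers

SEQ_LEN = 64     # frames per clip (assumed)
FEAT_DIM = 104   # 33 landmarks x 3 coords + 5 joint angles (assumed layout)
D_MODEL = 128    # embedding width (placeholder)

class PositionalEmbedding(layers.Layer):
    """Adds a learned positional embedding so the encoder sees frame order."""
    def __init__(self, seq_len, d_model, **kwargs):
        super().__init__(**kwargs)
        self.pos_emb = layers.Embedding(seq_len, d_model)
        self.seq_len = seq_len

    def call(self, x):
        return x + self.pos_emb(tf.range(self.seq_len))

def encoder_block(x, heads=4):
    # Pre-norm self-attention over the frame axis.
    h = layers.LayerNormalization()(x)
    x = x + layers.MultiHeadAttention(num_heads=heads, key_dim=D_MODEL // heads)(h, h)
    # Position-wise feed-forward network with a residual connection.
    h = layers.LayerNormalization()(x)
    return x + layers.Dense(D_MODEL)(layers.Dense(4 * D_MODEL, activation="relu")(h))

inputs = layers.Input(shape=(SEQ_LEN, FEAT_DIM))   # per-frame pose + angle features
x = PositionalEmbedding(SEQ_LEN, D_MODEL)(layers.Dense(D_MODEL)(inputs))
for _ in range(2):                                 # two encoder blocks (placeholder depth)
    x = encoder_block(x)
# Density-map-style head: a per-frame repetition density whose sum is the count.
density = layers.Dense(1, activation="relu")(x)
count = layers.Lambda(lambda d: tf.reduce_sum(d, axis=(1, 2)))(density)
model = tf.keras.Model(inputs, count)
model.compile(optimizer="adam", loss="mae")
model.summary()
```

Summing a per-frame density into a count mirrors common density-map formulations for repetition counting; the paper's exact head and training objective may differ.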

Impact

This project significantly advances the field of repetitive action counting by introducing a robust method that combines joint angle analysis with pose landmarks using Transformer networks. The improved accuracy and robustness of the system have substantial implications for various applications:

  • Fitness and Rehabilitation: Enhances the effectiveness of training and rehabilitation programs by ensuring accurate tracking of exercise repetitions.
  • Manufacturing Monitoring: Provides reliable monitoring of assembly operations, ensuring consistency and quality in manufacturing processes.
  • Human-Computer Interaction: Facilitates more accurate and intuitive interaction systems that rely on repetitive action recognition.
  • Assistive Technologies: Benefits individuals with physical impairments by providing precise action counting, aiding in therapy and daily activities.

The method’s ability to handle environmental variations and differentiate sub-actions makes it a versatile tool for real-world applications, paving the way for future innovations in action recognition and monitoring systems.

References

Please refer to the publication below for detailed references and further reading.

1. Chen, H., Zendehdel, N., Leu, M. C., Moniruzzaman, M., Yin, Z., & Hajmohammadi, S. "Repetitive Action Counting Through Joint Angle Analysis and Video Transformer Techniques." Proceedings of the 2024 International Symposium on Flexible Automation, Seattle, Washington, USA, July 21–24, 2024. V001T08A003. ASME.