Artificial Vision applied to physical activity
It is now accepted by the scientific community that physical activity has beneficial effects that improve our well-being at all stages of life and, in particular, mitigate the signs of ageing. For example, regular physical activity delays the decline of lung capacity [Burtscher et al., 2022] and also helps to prevent, and even reverse, the loss of muscle mass [ABC-link]. The biomechanics of the locomotor system is an area of scientific knowledge that studies human movement from an interdisciplinary perspective [Gianikellis et al., 2002]. Its objectives include, among others, understanding the behaviour of the human locomotor system, preventing injuries and improving the execution technique of sporting gestures [Gianikellis et al., 2002], and it finds application in fields as diverse as work, sport and recreation.
In the study of human movement, the first stage consists of capturing data through a wide variety of measurement systems, among which techniques based on image analysis stand out. The field of artificial vision and image analysis has undergone a spectacular revolution in recent years as a result of the very significant methodological advances made in Artificial Intelligence and Machine Learning [Stanford-Link]. These methods now make it possible to design and build computer vision systems capable of solving real-world problems that, just a few years ago, were out of reach of existing techniques. Motion capture systems, in particular, have benefited enormously from this revolution: low-cost systems are now available that automate the capture of human motion [XNect-link] from a single camera [Ramírez et al., 2020], a set of cameras [Nuñez et al., 2019], or RGBD devices, which simultaneously capture an image and the depth at which the objects in the scene are located.
In the Advanced Computation, Perception and Optimisation (CAPO) group at the Universidad Rey Juan Carlos (URJC) we have a long history of research in computer vision systems applied to human motion capture and its applications. This work has borne fruit in the form of prototypes of various kinds for capturing human movement in the field of physical activity. In [Nuñez et al., 2017], a human motion capture system using several RGBD sensors is proposed. Each sensor provides a pose proposal in the form of an articulated model, and an information fusion method then combines the proposals from all the sensors in real time to produce an improved pose (see Figure 1). In [Nuñez et al., 2019] we pursue the same goal but using multiple cameras as input: the system proposes a 2D pose for each image received from the cameras and a machine-learning-based module then integrates all this information to estimate the 3D pose of the subject over time (a schematic of the proposal is shown in Figure 2). In [Ramírez et al., 2020] we went a step further and tackled a very complex problem: estimating the 3D pose of the subject from a single 2D image. The problem is difficult because the complete information is not available: a single image provides only two-dimensional information, while the pose must be estimated in three dimensions. To solve it, we designed and trained a machine learning system, obtaining very promising results (see Figure 3).
Figure 1: Schematic of the articulated motion tracking model with multiple RGBD devices proposed in [Nuñez et al., 2017].
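To give a flavour of how this kind of fusion can work, the following minimal sketch combines the joint positions proposed by two sensors with a confidence-weighted average. The joint names, coordinates and confidence values are illustrative assumptions and the snippet is not the actual fusion method of [Nuñez et al., 2017].

    import numpy as np

    # Each RGBD sensor proposes a pose: a dict of joint name -> (x, y, z) position,
    # plus a per-joint confidence in [0, 1]. Names and values here are illustrative.
    POSE_SENSOR_A = {"head": (0.02, 1.65, 2.10), "l_knee": (-0.18, 0.48, 2.05)}
    CONF_SENSOR_A = {"head": 0.9, "l_knee": 0.4}

    POSE_SENSOR_B = {"head": (0.05, 1.63, 2.12), "l_knee": (-0.15, 0.50, 2.02)}
    CONF_SENSOR_B = {"head": 0.8, "l_knee": 0.9}


    def fuse_poses(poses, confidences):
        """Fuse per-sensor joint proposals with a confidence-weighted average."""
        fused = {}
        joints = set().union(*(p.keys() for p in poses))
        for joint in joints:
            positions, weights = [], []
            for pose, conf in zip(poses, confidences):
                if joint in pose:
                    positions.append(pose[joint])
                    weights.append(conf.get(joint, 0.0))
            positions = np.asarray(positions, dtype=float)
            weights = np.asarray(weights, dtype=float)
            if weights.sum() == 0:   # no sensor is confident: plain average
                fused[joint] = positions.mean(axis=0)
            else:                    # otherwise weight each proposal by its confidence
                fused[joint] = (weights[:, None] * positions).sum(axis=0) / weights.sum()
        return fused


    if __name__ == "__main__":
        fused_pose = fuse_poses([POSE_SENSOR_A, POSE_SENSOR_B],
                                [CONF_SENSOR_A, CONF_SENSOR_B])
        for joint, xyz in fused_pose.items():
            print(joint, np.round(xyz, 3))

Weighting each joint by the sensor's confidence means that a sensor that sees a joint poorly (for example, because of occlusion) contributes less to the fused estimate.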
Figure 2: Schematic of the multi-camera articulated motion tracking model proposed in [Nuñez et al., 2019].
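The key geometric step in a multi-camera setting is recovering each 3D joint from its 2D detections in the different views by least squares. The sketch below shows the classical direct linear transformation (DLT) triangulation, which conveys the idea behind that step; it is only an illustration, not the improved least-squares and LSTM formulation of [Nuñez et al., 2019], and the camera matrices and pixel coordinates are made up for the example.

    import numpy as np

    def triangulate_joint(projections, points_2d):
        """Triangulate one joint from several calibrated views.

        projections: list of 3x4 camera projection matrices P_i.
        points_2d:   list of (u, v) pixel detections of the joint, one per view.
        Returns the 3D position that minimises the algebraic (DLT) error.
        """
        rows = []
        for P, (u, v) in zip(projections, points_2d):
            # Each view contributes two linear equations in the homogeneous point X:
            #   u * (P[2] @ X) - P[0] @ X = 0
            #   v * (P[2] @ X) - P[1] @ X = 0
            rows.append(u * P[2] - P[0])
            rows.append(v * P[2] - P[1])
        A = np.asarray(rows)
        # Least-squares solution: right singular vector with the smallest singular value.
        _, _, vt = np.linalg.svd(A)
        X = vt[-1]
        return X[:3] / X[3]   # de-homogenise

    if __name__ == "__main__":
        # Two toy cameras observing the point (0.1, 1.2, 3.0); values are illustrative.
        K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
        P1 = K @ np.hstack([np.eye(3), np.array([[0.0], [0.0], [0.0]])])
        P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])
        X_true = np.array([0.1, 1.2, 3.0, 1.0])
        uv1 = (P1 @ X_true)[:2] / (P1 @ X_true)[2]
        uv2 = (P2 @ X_true)[:2] / (P2 @ X_true)[2]
        print(triangulate_joint([P1, P2], [uv1, uv2]))   # ~ [0.1, 1.2, 3.0]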
Figure 3: Results obtained with the single camera motion capture system proposed in [Ramírez et al., 2020]. For each image (left), the output of our system (centre) and the actual pose (right) are shown.
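To illustrate the idea of "lifting" 2D information to 3D, the toy network below regresses 3D joint positions from a vector of 2D keypoints. It is a deliberately simple multilayer perceptron written with PyTorch, not the Bayesian Capsule Network of [Ramírez et al., 2020]; the number of joints, layer sizes and random training data are assumptions made only for the example.

    import torch
    import torch.nn as nn

    N_JOINTS = 17  # illustrative skeleton size; real models may differ


    class Lifter2Dto3D(nn.Module):
        """Toy network that regresses 3D joint positions from 2D keypoints."""

        def __init__(self, n_joints: int = N_JOINTS, hidden: int = 1024):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(2 * n_joints, hidden), nn.ReLU(), nn.Dropout(0.5),
                nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(0.5),
                nn.Linear(hidden, 3 * n_joints),
            )

        def forward(self, kp2d: torch.Tensor) -> torch.Tensor:
            # kp2d: (batch, n_joints * 2) normalised 2D keypoints
            return self.net(kp2d).view(-1, N_JOINTS, 3)


    if __name__ == "__main__":
        model = Lifter2Dto3D()
        optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
        # Random stand-ins for (2D keypoints, 3D ground truth) training pairs.
        kp2d = torch.randn(32, 2 * N_JOINTS)
        kp3d = torch.randn(32, N_JOINTS, 3)
        pred = model(kp2d)                          # one illustrative training step
        loss = nn.functional.mse_loss(pred, kp3d)
        loss.backward()
        optimiser.step()
        print("predicted pose batch:", pred.shape, "loss:", float(loss))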
Once the subject's pose has been automatically captured over time, this information can be used for different purposes. For example, in [Nuñez et al., 2018] a system is proposed that, from a temporal sequence of poses of the subject, recognises the activity being performed. In the specific field of assistive systems applied to physical activity, [Rivero FJ, 2013] proposed a prototype virtual trainer to monitor the correct execution of a specific physical exercise. It is well known that poor technique in the execution of an exercise can reduce its benefit and lead to injuries. The prototype virtual trainer uses an RGBD device to capture the pose of a subject exercising. From this pose, it recognises the exercise being performed and monitors whether it is executed correctly. If it detects poor technique, it suggests specific actions for the user to correct the execution (the attached video shows a demonstration of how the virtual trainer prototype works). Moreover, these systems are general enough for their use to be extended to a wider range of applications, such as automated supervision of a rehabilitation process or the evaluation of a subject's progress in an activity over time.
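To give an idea of how such monitoring can be implemented (the concrete rules used in the prototype are not detailed here), a virtual trainer can compute joint angles from the captured skeleton and compare them with target ranges for each exercise. The sketch below checks knee flexion at the bottom of a squat; the joint coordinates and the angle threshold are hypothetical values chosen only for illustration.

    import numpy as np

    def joint_angle(a, b, c):
        """Angle (degrees) at joint b formed by the segments b->a and b->c."""
        a, b, c = (np.asarray(p, dtype=float) for p in (a, b, c))
        v1, v2 = a - b, c - b
        cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
        return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))


    def check_squat_depth(hip, knee, ankle, max_knee_angle=100.0):
        """Flag a squat repetition whose knee flexion is too shallow.

        max_knee_angle is a hypothetical threshold: at the bottom of a proper
        squat the hip-knee-ankle angle should drop below it.
        """
        angle = joint_angle(hip, knee, ankle)
        if angle > max_knee_angle:
            return f"Knee angle {angle:.0f} deg: go lower to complete the squat."
        return f"Knee angle {angle:.0f} deg: depth OK."


    if __name__ == "__main__":
        # Illustrative 3D joint positions (metres) at the bottom of a repetition.
        print(check_squat_depth(hip=(0.0, 0.55, 2.0),
                                knee=(0.0, 0.50, 2.3),
                                ankle=(0.0, 0.10, 2.3)))

A rule of this kind, evaluated on every captured frame, is enough to trigger simple corrective feedback such as the messages returned above.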
In short, current technological advances are bringing about a revolutionary change in our way of life and in the way we interact with our environment. In the field of physical activity in particular, a multitude of training systems have appeared in recent years that help us plan our training according to our objectives. From a technological point of view, we are ready to take a further leap: developing systems that monitor the correct execution of exercises to maximise their benefit while reducing the risk of injury.
References:
[ABC-link] https://www.abc.es/salud/enfermedades/abci-si-perdida-masa-muscular-env…
[Burtscher et al., 2022] Burtscher J, Millet GP, Gatterer H, et al. "Does Regular Physical Activity Mitigate the Age-Associated Decline in Pulmonary Function?". Sports Medicine, 52:963-970 (2022).
[Gianikellis et al., 2002] Gianikellis K, Pantrigo JJ, Bote A, Vara A. "El desarrollo del paquete BiomSoft y sus aplicaciones en el análisis biomecánico del movimiento humano" (Development of the BiomSoft package and its applications in the biomechanical analysis of human movement). Biomecánica, 10(2):38-43 (2002).
[Stanford-Link] https://www.gsb.stanford.edu/insights/andrew-ng-why-ai-new-electricity
[XNect-link] https://vcai.mpi-inf.mpg.de/projects/XNect/
[Ramírez et al., 2020] Ramírez I, Cuesta A, Schiavi E, Pantrigo JJ. "Bayesian Capsule Networks for 3D human pose estimation from single 2D images". Neurocomputing, 379:64-73 (2020).
[Nuñez et al., 2019] Nuñez JC, Cabido R, Velez J, Montemayor AS, Pantrigo JJ. "Multiview 3D human pose estimation using improved least-squares and LSTM networks". Neurocomputing, 323:335-343 (2019).
[Nuñez et al., 2017] Nuñez JC, Cabido R, Montemayor AS, Pantrigo JJ. "Real-time human body tracking based on data fusion from multiple RGB-D sensors". Multimedia Tools and Applications 76(3): 4249–4271 (2017).
[Nuñez et al., 2018] Nuñez-Moreno JC, Cabido R, Pantrigo JJ, Montemayor AS, Vélez J. "Convolutional Neural Networks and Long Short-Term Memory for Skeleton-Based Human Activity and Hand Gesture Recognition". Pattern Recognition. 76C: 80-94 (2018).
[Rivero FJ, 2013] Rivero FJ. "Reconocimiento y corrección de actividades físicas con Kinect" (Recognition and correction of physical activities with Kinect). Master's thesis, Máster Oficial en Visión Artificial, URJC (2013).