Maximilian Vogel

Master's Thesis

A Compressed Deep Learning Model for Human Activity Recognition (HAR) Using the Hearing Aid Integrated Inertial Sensors

Ann-Kristin Seifer (M.Sc.), An Nguyen (M.Sc.), Prof. Dr. Björn Eskofier

15 / 2021 – 06 / 2022

Modern hearing aids are more than just amplifying devices. Most of them are already equipped with inertial sensors, and the data from these sensors make it possible to use the devices for Human Activity Recognition (HAR). While HAR today is mostly performed with sensors worn on the wrist or waist, the head has also proven to be a viable sensor location that provides good results [1, 2]. Recognized activities could then be used to adapt the hearing aid’s amplification settings to the current activity or to monitor the user’s health. As most people who wear hearing aids are elderly, this can include the early diagnosis of diseases, the observation of rehabilitation [3], and the monitoring of treatments for conditions that require patient movement, such as diabetes or heart disease [4]. A great advantage of performing HAR in hearing aids is that it would be entirely unobtrusive for users, because they would not need an additional device that they might forget to wear.

HAR is a very active field of research in which deep learning (DL) models have shown superior results compared to traditional machine learning models, as they do not require experts to manually extract features and achieve improved recognition rates on temporal features [5]. However, those models are often computationally complex, making integration into mobile and embedded devices an open challenge [6].
The main goal of this thesis is to create a DL model for HAR based on the hearing aid’s Inertial Measurement Unit (IMU) sensor data. The model should achieve reasonable recognition rates on the one hand and be lightweight and efficient enough to run on the hearing aid on the other. In addition, we want to implement a traditional feature-based machine learning model and compare it with our DL model in terms of model size and recognition rate.

Therefore, the first step will be to design and implement at least two DL models and a feature-based machine learning model. This will include the fundamental steps of the activity recognition chain, which are data acquisition, segmentation, feature calculation, modeling and inference, and classification [7]. The data for our models have already been recorded in another study and contain accelerometer and gyroscope data. As gyroscopes are not as commonly used in hearing aids as accelerometers and require a significant amount of additional energy, we also want to evaluate the performance differences with and without the gyroscope data. Other works have reported differing results for HAR with and without gyroscope data. For instance, Wolff et al. [8] and Haescher et al. [9] show only a slight difference in recognition performance, while Ordóñez et al. [10] showed a noticeable difference. Whereas the first two also used the head as a sensor location, Ordóñez et al. [10] had a more complex setup with different body sensors.
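The segmentation and feature calculation steps of the activity recognition chain can be illustrated with a minimal sketch. It is not the implementation used in the thesis: the six-channel sample layout, the window length, the step size, and the choice of mean and standard deviation as features are all assumptions made for illustration, and the signal is synthetic. The sketch also shows how the same pipeline could be run with and without the gyroscope channels.

```python
from statistics import mean, pstdev


def sliding_windows(samples, window_size, step):
    """Split a sequence of samples into fixed-length, possibly overlapping windows."""
    last_start = len(samples) - window_size
    return [samples[s:s + window_size] for s in range(0, last_start + 1, step)]


def window_features(window, channels):
    """Return mean and standard deviation of each selected channel in one window."""
    features = []
    for ch in channels:
        values = [sample[ch] for sample in window]
        features.extend([mean(values), pstdev(values)])
    return features


# Hypothetical sample layout: (acc_x, acc_y, acc_z, gyr_x, gyr_y, gyr_z).
ACC_CHANNELS = (0, 1, 2)
ALL_CHANNELS = (0, 1, 2, 3, 4, 5)

# Synthetic signal with 10 samples; a real recording would be loaded instead.
signal = [(float(i), 0.0, 1.0, 0.5, 0.5, 0.5) for i in range(10)]

# Windows of 4 samples with 50 % overlap (assumed values).
windows = sliding_windows(signal, window_size=4, step=2)

# One feature vector per window, once without and once with gyroscope data.
acc_only = [window_features(w, ACC_CHANNELS) for w in windows]
acc_gyr = [window_features(w, ALL_CHANNELS) for w in windows]
```

Each feature vector could then be passed to a classifier of choice. Dropping the gyroscope halves the feature dimension in this sketch, which is one way the two sensor configurations could be compared on equal terms.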

In the second step, we want to implement different compression techniques for the DL models and evaluate them regarding the trade-off between model performance and model compression. This can include methods like pruning, quantization [11], or a possible sparsification of the model, as shown by Bhattacharya et al. [12], who demonstrated that a Convolutional Neural Network (CNN) could run on a mobile processor such as the Qualcomm Snapdragon 400. We will compare the compressed models to our feature-based machine learning model regarding model size, inference time, and recognition performance.
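To make the idea behind pruning and quantization concrete, the following sketch applies both to a short list of weights. It is a simplified illustration under stated assumptions, not the method of [11] or [12]: it uses one-shot magnitude pruning without retraining and symmetric uniform 8-bit quantization, whereas Han et al. [11] use trained quantization with weight sharing. The weight values and the 50 % sparsity target are made up.

```python
def magnitude_prune(weights, sparsity):
    """Set the given fraction of weights with the smallest magnitude to zero."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by ascending absolute value; the first n_prune are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Symmetric uniform quantization to integer codes in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [round(w / scale) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


# Made-up weights of a single layer, for illustration only.
weights = [0.02, -0.75, 0.10, 1.27, -0.01, 0.40, -0.05, 0.90]

pruned = magnitude_prune(weights, sparsity=0.5)  # half of the weights become zero
codes, scale = quantize_int8(pruned)             # 8-bit codes plus one float scale
restored = dequantize(codes, scale)              # approximation used at inference
```

Storing 8-bit codes instead of 32-bit floats would reduce the weight storage by roughly a factor of four, and the zeros introduced by pruning could be exploited by a sparse storage format. The effect of either step on recognition performance has to be measured on the actual model.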

[1] Louis Atallah, Benny Lo, Rachel King, and Guang-Zhong Yang. Sensor placement for activity detection using wearable accelerometers. In 2010 International conference on body sensor networks, pages 24–29. IEEE, 2010.
[2] Darrell Loh, Tien J Lee, Shaghayegh Zihajehzadeh, Reynald Hoskinson, and Edward J Park. Fitness activity classification by using multiclass support vector machines on head-worn sensors. In 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pages 502–505. IEEE, 2015.
[3] Akin Avci, Stephan Bosch, Mihai Marin-Perianu, Raluca Marin-Perianu, and Paul Havinga. Activity recognition using inertial sensing for healthcare, wellbeing and sports applications: A survey. In 23th International conference on architecture of computing systems 2010, pages 1–10. VDE, 2010.
[4] Oscar D Lara and Miguel A Labrador. A survey on human activity recognition using wearable sensors. IEEE communications surveys & tutorials, 15(3):1192–1209, 2012.
[5] E Ramanujam, Thinagaran Perumal, and S Padmavathi. Human activity recognition with smartphone and wearable sensors using deep learning techniques: A review. IEEE Sensors Journal, 2021.
[6] Fuqiang Gu, Mu-Huan Chung, Mark Chignell, Shahrokh Valaee, Baoding Zhou, and Xue Liu. A survey on deep learning for human activity recognition. ACM Computing Surveys (CSUR), 54(8):1–34, 2021.
[7] Andreas Bulling, Ulf Blanke, and Bernt Schiele. A tutorial on human activity recognition using body-worn inertial sensors. ACM Computing Surveys (CSUR), 46(3):1–33, 2014.
[8] Johann P Wolff, Florian Grützmacher, Arne Wellnitz, and Christian Haubelt. Activity recognition using head worn inertial sensors. In Proceedings of the 5th international Workshop on Sensor-based Activity Recognition and Interaction, pages 1–7, 2018.
[9] Marian Haescher, John Trimpop, Denys JC Matthies, Gerald Bieber, Bodo Urban, and Thomas Kirste. aHead: considering the head position in a multi-sensory setup of wearables to recognize everyday activities with intelligent sensor fusions. In International Conference on Human-Computer Interaction, pages 741–752. Springer, 2015.
[10] Francisco Javier Ordóñez and Daniel Roggen. Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition. Sensors, 16(1):115, 2016.
[11] Song Han, Huizi Mao, and William J Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. arXiv preprint arXiv:1510.00149, 2015.
[12] Sourav Bhattacharya and Nicholas D Lane. Sparsification and separation of deep learning layers for constrained resource inference on wearables. In Proceedings of the 14th ACM Conference on Embedded Network Sensor Systems CD-ROM, pages 176–189, 2016.