Andreas Wagner

Bachelor's Thesis

User adaptive online handwritten text recognition using inertial data

Advisors

Fatemeh Salehi (M. Sc.), Mohamad Wehbi (M. Sc.), Prof. Dr. Björn Eskofier

Duration

02 / 2023 – 07 / 2023

Abstract

Handwriting recognition is a challenging task because writing styles vary widely between individuals [1]. Recognition systems fall into two types: writer-dependent (WD) and writer-independent (WI). WD systems are trained and tested on the same users, while WI systems are tested on data from users not included in the training set [2]. The WI setting encourages a system to generalize and allows it to recognize handwriting from new users. WD models, however, have been shown to produce better results than WI ones by learning writer-specific characteristics [3].

Writer adaptation is the process of transforming a WI model into a WD one. This is achieved by re-training all or part of the WI model on a small data set collected from a specific writer, which shifts the parameters towards that user and thus improves recognition rates [2]. In the handwriting recognition field, this adaptation generally involves re-training the complete model on the small data set and then evaluating the re-trained model on the newly introduced writer [4]. However, this process is computationally expensive, since WI models usually contain a large number of parameters in order to generalize across users.

In this thesis, we plan to investigate adaptation methods that do not require complete model re-training. We use methods inspired by the field of speech recognition that introduce dedicated layers in which the adaptation is modeled. Two methods will be investigated: learning hidden unit contributions (LHUC) [5] and linear input network (LIN) [6]. LHUC adds a layer between two hidden layers that scales each hidden unit's output by a learned factor between 0 and 2, while LIN adds a layer in front of the network that applies a linear mapping to the input, transforming it into a different input space. Both methods leave the rest of the model untouched and train only the weights of the added layers.
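As a rough illustration of where the two adaptation layers sit, the following minimal sketch runs the forward pass of a toy two-layer network with an optional LIN layer in front and an optional LHUC layer after the hidden layer. The network shape, the weight values, the ReLU activation, and all function names are assumptions made for this example and are not taken from the thesis; the 2 * sigmoid(r) parameterization of the LHUC factor follows [5]. Training is omitted: in an adaptation run, only `lin` and `lhuc_r` would be updated, while `frozen` stays fixed.

```python
import math

def matvec(W, x):
    """Multiply a matrix W (list of rows) by a vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, a) for a in v]

def lhuc_scale(h, r):
    """LHUC: scale each hidden activation by 2 * sigmoid(r_k), a factor in (0, 2)."""
    return [2.0 / (1.0 + math.exp(-rk)) * hk for hk, rk in zip(h, r)]

def lin_transform(x, A, b):
    """LIN: linear map A x + b applied to the input before the frozen network."""
    return [a + bi for a, bi in zip(matvec(A, x), b)]

def forward(x, frozen, lin=None, lhuc_r=None):
    """Forward pass; `frozen` holds the WI weights, which adaptation never changes."""
    W1, W2 = frozen
    if lin is not None:
        x = lin_transform(x, *lin)
    h = relu(matvec(W1, x))
    if lhuc_r is not None:
        h = lhuc_scale(h, lhuc_r)
    return matvec(W2, h)

# Toy "pre-trained" WI weights (arbitrary values, for illustration only).
W1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5], [-0.6, 0.1, 0.9], [0.2, 0.2, 0.2]]
W2 = [[1.0, -1.0, 0.5, 0.0], [0.0, 0.5, -0.5, 1.0]]
frozen = (W1, W2)

# Initialize both adaptation layers so that they do nothing at first:
# LIN starts as the identity map, LHUC starts at r = 0 (factor 2 * sigmoid(0) = 1).
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
lin_init = (identity, [0.0, 0.0, 0.0])
lhuc_init = [0.0] * 4

x = [0.1, -0.4, 0.7]
baseline = forward(x, frozen)
adapted_at_init = forward(x, frozen, lin=lin_init, lhuc_r=lhuc_init)
```

With this initialization the adapted network reproduces the WI model exactly, so adaptation starts from the pre-trained behaviour and moves away from it only as far as the writer's data requires. In the toy network the added layers hold 12 (LIN) and 4 (LHUC) trainable values; the number grows with the input dimension and the hidden-layer width, not with the size of the full model.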

These adaptation methods are investigated for online handwriting recognition using inertial data acquired with the STABILO DigiPen. We collect data samples from new users and adapt a model pre-trained on a large data set for each user separately. The final evaluation compares the recognition rates of the WI pre-trained model and the WD adapted models. Furthermore, a detailed evaluation is conducted to determine how many newly introduced samples are required to improve the models.
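The abstract does not state how the recognition rate is measured. One common choice for text recognition, assumed here purely for illustration, is the character error rate (CER): the total edit distance between recognized and reference texts divided by the total number of reference characters. The function names below are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def character_error_rate(references, hypotheses):
    """Total edit distance divided by the total number of reference characters."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    return errors / total
```

Under this assumption, the WI and WD models would be compared by computing `character_error_rate` for each model on the same held-out samples of a writer, and repeating the comparison for different numbers of adaptation samples.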

References

[1] N. Islam, Z. Islam, and N. Noor. A Survey on Optical Character Recognition System. 2017.
[2] S. D. Connell and A. K. Jain. Writer adaptation for online handwriting recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, pp. 329-346, 2002.
[3] C. C. Tappert, C. Y. Suen, and T. Wakahara. The state of the art in online handwriting recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, pp. 787-808, 1990.
[4] A. K. Bhunia, S. Ghose, A. Kumar, P. N. Chowdhury, A. Sain, and Y.-Z. Song. MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 15830-15839, 2021.
[5] P. Swietojanski and S. Renals. Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models. IEEE Spoken Language Technology Workshop (SLT), pp. 171-176, 2014.
[6] J. P. Neto, C. Martins, and L. B. Almeida. Speaker-adaptation in a hybrid HMM-MLP recognizer. 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings, vol. 6, pp. 3382-3385, 1996.