Nils Steinlein

Master's Thesis

The creation and evaluation of a video-based polar bear Re-Identification dataset on current Re-ID methods


Matthias Zürl (M. Sc.), Richard Dirauf (M. Sc.), Prof. Dr. Björn Eskofier


04 / 2023 – 10 / 2023


Keeping animals in zoological institutions poses many challenges for animal caretakers. One of them is monitoring and improving the health and well-being of the animals. Animal observation is a common task carried out by biologists to improve animal welfare: they observe captive animals’ actions and behavioural patterns to gain insight into potential health problems [1]. Although continuous observation would be ideal, the laborious nature of the task makes it impossible to observe each animal around the clock. Therefore, researchers at the Machine Learning and Data Analytics lab developed algorithms to automate this process and provide a video-based monitoring system that operates 24/7 [2]. This camera-based method is non-invasive: unlike body-worn sensors such as accelerometers and GPS trackers, cameras do not disturb the animals. However, measurements from body-worn sensors can be assigned directly to the correct individual [3], whereas this assignment is more complex for vision-based systems. For most animal species, only trained professionals can visually distinguish individuals because recognizable landmarks are missing. A crucial stage of vision-based automated systems is therefore the re-identification (Re-ID) of individuals across different cameras.

Re-ID methods are currently a fast-evolving research topic. As a consequence of video-based surveillance, person Re-ID has become popular in recent years. While large datasets drive the development of video-based person Re-ID methods, most animal research today focuses on image-based Re-ID approaches in which animal landmarks are used as features [4][5][6]. However, the accuracy of video-based Re-ID methods has surpassed that of image-based approaches through the inclusion of video-specific features such as the gait of an individual [7][8]. Extensive datasets are required for benchmarking, developing and refining these video-based deep-learning algorithms for animals.

Because polar bears are endangered, keeping them in captivity is essential to ensure the persistence of the species. This is challenging due to their personalities and extensive range of movement. Research on polar bear Re-ID is also limited because the animals do not have distinctive visual landmarks that could be used as features in image-based approaches. Therefore, this thesis will describe the creation of a video-based animal Re-ID dataset for captive polar bears. It is based on a small dataset created for a previous master's thesis [9], which includes recordings and sequences from only three zoological institutions. To address this limitation, we aim to double the number of collaborating zoos to at least six within this thesis. During dataset creation, sequences are generated from raw video recordings of individuals in different zoos. The human labelling effort is reduced compared to related image-based approaches. Challenges such as collecting data in zoos and dealing with different file and compression standards will also be discussed.

New video-based animal Re-ID methods currently cannot be compared with state-of-the-art models because benchmark datasets are lacking. Researchers have to create their own datasets to demonstrate their methods. In contrast, newly designed and enhanced person Re-ID algorithms can be benchmarked on various available datasets [10][11]. Therefore, creating a video-based Re-ID dataset will advance research on this topic and encourage the development of further landmark-independent Re-ID methods for animals. Using the dataset developed in this thesis, we will benchmark a video-based person Re-ID approach such as Multi-direction and Multi-scale Pyramid in Transformer for Video-based Pedestrian Retrieval [12] and compare the results with the GLTR approach [8] used in the previous thesis [9].


[1] M. Stamp Dawkins, Observing Animal Behaviour. Oxford University Press, Oct. 2007, isbn: 978-0-19-856935-0. doi: 10.1093/acprof:oso/9780198569350.001.0001. [Online]. Available: https://academic.oup.com/book/5180 (visited on 12/21/2022).
[2] M. Zurl, P. Stoll, I. Brehm, et al., ”Automated Video-Based Analysis Framework for Behavior Monitoring of Individual Animals in Zoos Using Deep Learning—A Study on Polar Bears“, Animals, vol. 12, p. 692, Mar. 2022. doi: 10.3390/ani12060692.
[3] J. Rushen, Chapinal, and A. M. de Passille, ”Automated monitoring of behavioural-based animal welfare indicators“, Animal Welfare, vol. 21, pp. 339–350, Aug. 2012. doi: 10.7120/09627286.21.3.339.
[4] K. Papafitsoros, L. Adam, V. Čermák, and L. Picek, SeaTurtleID: A novel long-span dataset highlighting the importance of timestamps in wildlife re-identification, arXiv:2211.10307 [cs], Nov. 2022. [Online]. Available: 10307 (visited on 12/19/2022).
[5] M. Stennett, D. I. Rubenstein, and T. Burghardt, Towards Individual Grevy’s Zebra Identification via Deep 3D Fitting and Metric Learning, arXiv:2206.02261 [cs], Aug. 2022. [Online]. Available: (visited on 12/21/2022).
[6] E. Nepovinnykh, I. Chelak, T. Eerola, and H. Kälviäinen, NORPPA: NOvel Ringed seal re-identification by Pelage Pattern Aggregation, arXiv:2206.02498 [cs], Jun. 2022. [Online]. Available: (visited on 12/21/2022).
[7] O. Elharrouss, N. Almaadeed, S. Al-Maadeed, and A. Bouridane, ”Gait recognition for person re-identification“, The Journal of Supercomputing, vol. 77, no. 4, pp. 3653–3672, Apr. 2021, issn: 1573-0484. doi: 10.1007/s11227-020-03409-5. [Online]. Available: (visited on 01/03/2023).
[8] J. Li, J. Wang, Q. Tian, W. Gao, and S. Zhang, ”Global-Local Temporal Representations For Video Person Re-Identification“, IEEE Transactions on Image Processing, vol. 29, pp. 4461–4473, 2020, arXiv:1908.10049 [cs], issn: 1057-7149, 1941-0042. doi: 10.1109/TIP.2020.2972108. [Online]. Available: 10049 (visited on 12/19/2022).
[9] R. Dirauf, Video-based Re-Identification of Captive Polar Bears. FAU, 2021.
[10] H. Luo, Y. Gu, X. Liao, S. Lai, and W. Jiang, Bag of Tricks and A Strong Baseline for Deep Person Re-identification, arXiv:1903.07071 [cs], Apr. 2019. [Online]. Available: Bag_of_Tricks_and_a_Strong_Baseline_for_Deep_Person_CVPRW_2019_paper.pdf (visited on 12/19/2022).
[11] L. Zheng, Z. Bie, Y. Sun, et al., MARS: A Video Benchmark for Large-Scale Person Re-Identification. Oct. 2016, vol. 9910, Pages: 884, isbn: 978-3-319-46465-7. doi: 10.1007/978-3-319-46466-4_52.
[12] X. Zang, G. Li, and W. Gao, ”Multi-direction and Multi-scale Pyramid in Transformer for Video-based Pedestrian Retrieval“, IEEE Transactions on Industrial Informatics, vol. 18, no. 12, pp. 8776–8785, Dec. 2022, arXiv:2202.06014 [cs], issn: 1551-3203, 1941-0050. doi: 10.1109/TII.2022.3151766. [Online]. Available: http: // (visited on 02/01/2023).