Deep Siamese Similarity Learning: a highly scalable approach to searching unordered sets of trajectories

Master's Thesis


Christoffer Löffler (M.Sc.), Prof. Dr. B. Eskofier




Annotation and scene retrieval in sports currently rely mainly on manual work by game analysts or commercial companies. There have been various attempts to detect scenes automatically in TV broadcast material (video and audio), with “highlight extraction” being the most prominent use case. Today, sports data providers such as ChyronHego deliver far more than video material. In particular, tracking data of the ball and the players is becoming ubiquitous and can be a valuable source for making game scenes searchable. The main goal of this thesis is to retrieve scenes that are similar to a given “query scene”, as in [1]. The query could be either a snippet of tracking data or a class of scenes such as “counter attack”. To this end, the thesis shall implement functionality to search the official DFL tracking data of the entire 2014/15 German Bundesliga season. A literature review covering both approaches (tracking-snippet query vs. class query) shall be conducted to arrive at a theoretical assessment of which approach is most likely to succeed with the available dataset.

Prior work has shown that the similarity between scenes is well described by the Euclidean distance between player trajectories [2]. One major issue with this approach is the comparability of trajectories: during a match, players temporarily fill various roles [2], which raises the question of which trajectories should be compared with each other. Sha et al. [2] and Di et al. [1] resolve this by aligning trajectories to a fixed template based on mean role positions. Sha et al. [3], on the other hand, align trajectories to a learned hierarchy of templates, which significantly improves on [2]. The proposed thesis approaches this problem by learning an embedding that approximates the scene similarity metric used in [2]. The embedding will be learned with neural networks, which have been shown to be a viable option for metric learning [4]. Potential upsides of this approach include lower computational cost during inference, simple integration of new data into the system, and distinctive scene features that can be used for further processing such as clustering and ranking.
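To make the two ingredients of this paragraph concrete, the following minimal Python sketch shows (a) a scene distance built from Euclidean distances between trajectories and (b) a regression-style Siamese objective that pushes the distance between two embeddings towards that scene distance. It is an illustration under stated assumptions, not the method of [1], [2] or [3]: the alignment step is replaced by a brute-force search over all trajectory pairings, which is only feasible for toy scenes, and the names `trajectory_distance`, `scene_distance` and `siamese_loss` are hypothetical.

```python
import math
from itertools import permutations

# Assumption: a trajectory is a list of (x, y) positions sampled at fixed
# time steps, and a scene is an unordered list of equally long trajectories.


def trajectory_distance(a, b):
    """Sum of point-wise Euclidean distances between two trajectories."""
    return sum(math.dist(p, q) for p, q in zip(a, b))


def scene_distance(scene_a, scene_b):
    """Scene distance under the best one-to-one pairing of trajectories.

    Stand-in for template alignment: every pairing is tried, so the cost
    grows factorially with the number of trajectories per scene.
    """
    return min(
        sum(trajectory_distance(a, b) for a, b in zip(scene_a, pairing))
        for pairing in permutations(scene_b)
    )


def siamese_loss(embedding_a, embedding_b, target_distance):
    """Squared error between the embedding distance and the target distance."""
    return (math.dist(embedding_a, embedding_b) - target_distance) ** 2


# Two toy scenes with two players each, listed in a different order.
scene_a = [[(0, 0), (1, 0)], [(0, 5), (1, 5)]]
scene_b = [[(0, 5), (1, 5)], [(0, 1), (1, 1)]]

target = scene_distance(scene_a, scene_b)
# Hypothetical 2-D embeddings that a network might output for the two scenes.
loss = siamese_loss((0.0, 0.0), (3.0, 4.0), target)
```

In this sketch the result of `scene_distance` does not depend on the order in which the trajectories of a scene are listed, which is the property the alignment step is meant to provide. At query time only distances between precomputed embeddings would have to be evaluated, which is where the lower inference cost mentioned above would come from.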



[1] Di, Mingyang, et al. “Large-Scale Adversarial Sports Play Retrieval with Learning to Rank.” ACM Transactions on Knowledge Discovery from Data (TKDD) 12.6 (2018): 1-18.

[2] Sha, Long, et al. “Chalkboarding: A new spatiotemporal query paradigm for sports play retrieval.” Proceedings of the 21st International Conference on Intelligent User Interfaces. 2016.

[3] Sha, Long, et al. “Fine-grained retrieval of sports plays using tree-based alignment of trajectories.” arXiv preprint arXiv:1710.02255 (2017).

[4] Hoffer, Elad, and Nir Ailon. “Deep metric learning using triplet network.” International Workshop on Similarity-Based Pattern Recognition. Springer, Cham, 2015.