Osman Demir

Master's Thesis

Anomaly Semantic Segmentation with Learnable Wavelet Filters

Leo Schwinn (M.Sc.), Franz Köferl (M.Sc.), Prof. Dr. Björn Eskofier

12/2019 – 05/2020

Unbalanced semantic segmentation tackles the problem of classifying the pixels of an image according to their label when one class occurs only infrequently. In manufacturing and medical imaging, detecting such anomalies remains challenging and is mostly done by experts [1, 2].
The most successful algorithms for automatic semantic segmentation are convolutional neural networks [3, 4, 5, 6]. Some of these approaches classify images in the frequency domain, where an irregular pattern may show up as a change in frequency [7, 8]. Usually, the transformation into the frequency domain is applied as a preprocessing step and is not integrated into the neural network [9]. However, the frequency transformation can be integrated into the network to form an end-to-end model in which no expert knowledge is needed to tune the parameters of the transformation. In the speech processing domain, this approach has outperformed applying the frequency transformation as a preprocessing step [10].

In this thesis, a wavelet convolutional layer is presented that extends this approach to images. Classification in the frequency domain has already been explored, but not in an end-to-end manner. The wavelet convolutional layer implements a 2D continuous wavelet transform in which the scale of the wavelets is learned end-to-end with the rest of the neural network. The different wavelet scales are expected to help identify error patterns of varying sizes. All implementations are evaluated and compared on at least two data sets, including MVTec AD [11] and a data set provided by ISRA VISION consisting of 79 images of car keys with different types of production errors [13].
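
The idea of treating the wavelet scale as a trainable parameter can be illustrated with a minimal sketch. It is not the layer developed in the thesis: the Mexican-hat wavelet, the 1/scale normalisation, the toy objective (maximising the response at the centre of a synthetic blob) and the finite-difference gradient are all assumptions chosen to keep the example dependency-free, and the function names are hypothetical. In an actual network, the gradient with respect to the scale would come from backpropagation through the task loss.

```python
import math

def mexican_hat_kernel(scale, size):
    """Sample a 2D Mexican-hat wavelet, dilated by `scale`, on a size x size grid."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            r2 = (x * x + y * y) / (scale * scale)
            # 1/scale is one common normalisation for the 2D transform (assumption).
            row.append((2.0 - r2) * math.exp(-r2 / 2.0) / scale)
        kernel.append(row)
    return kernel

def gaussian_blob(size, sigma):
    """Toy 'defect': an isotropic Gaussian blob centred in a size x size image."""
    half = size // 2
    return [
        [math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2.0 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]

def centre_response(image, scale):
    """Wavelet coefficient at the image centre for the given scale."""
    size = len(image)
    kernel = mexican_hat_kernel(scale, size)
    return sum(
        kernel[y][x] * image[y][x] for y in range(size) for x in range(size)
    )

def learn_scale(image, scale=1.0, lr=0.2, steps=60, eps=1e-3):
    """Gradient ascent on the scale, using a central finite-difference gradient."""
    for _ in range(steps):
        grad = (centre_response(image, scale + eps)
                - centre_response(image, scale - eps)) / (2.0 * eps)
        scale += lr * grad
    return scale

SIGMA = 2.0  # width of the synthetic blob (illustrative value)
image = gaussian_blob(41, SIGMA)
learned_scale = learn_scale(image)
```

In this toy setting the scale starts at 1 and settles near sqrt(3) * SIGMA, the scale at which this particular wavelet and normalisation respond most strongly to a Gaussian blob of width SIGMA. A wider or narrower blob would pull the scale to a different value, which is the behaviour the layer is expected to exploit for error patterns of varying sizes.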


  1. L. Martí, N. Sanchez-Pi, J. M. Molina and A. C. B. Garcia, “Anomaly detection based on sensor data in petroleum industry applications”, in Sensors, 2015, pp. 2774-2797.
  2. E. Schubert, A. Zimek and H.-P. Kriegel, “Local outlier detection reconsidered: a generalized view on locality with applications to spatial, video, and network outlier detection”, in Data Mining and Knowledge Discovery, 2014, pp. 190-237.
  3. K. He, X. Zhang, S. Ren and J. Sun, “Deep residual learning for image recognition”, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770-778.
  4. V. Badrinarayanan, A. Kendall and R. Cipolla, “SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation”, in IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, pp. 2481-2495.
  5. C. Farabet, C. Couprie, L. Najman and Y. LeCun, “Learning Hierarchical Features for Scene Labeling”, in IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, pp. 1915-1929.
  6. L.-C. Chen, Y. Zhu, G. Papandreou, F. Schroff and H. Adam, “Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation”, in The European Conference on Computer Vision (ECCV), 2018, pp. 801-818.
  7. F. Franzen, “Image Classification in the Frequency Domain with Neural Networks and Absolute Value DCT”, in International Conference on Image and Signal Processing, Springer, 2018, pp. 301-309.
  8. A. Rajan, G.P. Ramesh and J. Yuvaraj, “Glaucomatous image Classification using Wavelet Transform”, in IEEE International Conference on Advanced Communications, Control and Computing Technologies, 2014.
  9. Y.-D. Zhang, Z. Dong, L. Wu and S. Wang, “A hybrid method for MRI brain image classification”, in Expert Systems with Applications, 2011, pp. 10049-10053.
  10. H. Khan and B. Yener, “Learning filter widths of spectral decompositions with wavelets”, in Neural Information Processing Systems (NIPS), 2018, pp. 4606-4617.
  11. MVTec Software GmbH, MVTec Anomaly Detection Dataset [online]. Available: https://www.mvtec.com/de/unternehmen/forschung/datasets/mvtec-ad/ (visited on 11/12/2019).
  12. N. Otsu, “A threshold selection method from gray-level histograms”, in IEEE Transactions on Systems, Man, and Cybernetics, 1979, pp. 62-66.
  13. ISRA VISION AG, Anomaly Detection Dataset [confidential], 2019.