12/2019 – 05/2020
Unbalanced semantic segmentation is the task of assigning a class label to every pixel of an image
when one of the classes occurs only rarely. In manufacturing and medical imaging,
detecting such anomalies remains challenging and is still done mostly by experts [1, 2].
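The effect of the class imbalance can be illustrated with a small sketch. The image size, the defect size, and the helper names `pixel_accuracy` and `iou` are assumptions made for illustration only. A trivial prediction that never reports a defect reaches a very high pixel accuracy, while the intersection over union (IoU) of the defect class exposes the failure.

```python
import numpy as np

def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label matches the ground truth."""
    return float((pred == target).mean())

def iou(pred, target, cls=1):
    """Intersection over union for a single class label."""
    p = pred == cls
    t = target == cls
    union = np.logical_or(p, t).sum()
    if union == 0:
        return 1.0  # class absent in both masks: count as perfect agreement
    return float(np.logical_and(p, t).sum() / union)

# Hypothetical 100x100 ground truth with a single 5x5 defect (0.25 % of all pixels).
target = np.zeros((100, 100), dtype=np.uint8)
target[40:45, 60:65] = 1

# Trivial prediction that labels every pixel as defect-free.
trivial = np.zeros_like(target)

acc = pixel_accuracy(trivial, target)   # 0.9975
defect_iou = iou(trivial, target)       # 0.0
print(f"pixel accuracy: {acc:.4f}, defect IoU: {defect_iou:.4f}")
```

Under these assumptions, per-class metrics such as the IoU are more informative for the rare class than overall pixel accuracy.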
The most successful algorithms for automatic semantic segmentation are convolutional neural
networks [3, 4, 5, 6]. Some of these approaches classify images in the frequency domain, where an
irregular pattern may show up as a change in frequency [7, 8]. Usually, the transformation into
the frequency domain is applied as a preprocessing step and is not integrated into the neural
network. However, the frequency transformation can be integrated into the neural network to
create a new end-to-end network, in which no expert knowledge is needed to tune the parameters
of the frequency transformation. In the speech processing domain, this approach has
outperformed applying the frequency transformation as a preprocessing step.
This thesis presents a wavelet convolutional layer that extends this approach to images.
Classification in the frequency domain has already been explored, but not in an end-to-end manner. The
wavelet convolutional layer implements a continuous wavelet transform in 2D, where the scale of
the wavelets is learned end-to-end with the rest of the neural network. It is expected that the
different scales of the wavelets help to identify error patterns of varying sizes.
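The forward pass of such a layer can be sketched as a bank of 2D wavelet filters, one per scale. This is a minimal NumPy sketch under stated assumptions, not the implementation of the thesis: the Mexican hat is assumed as mother wavelet (the text does not name one), the function names `mexican_hat_2d` and `wavelet_layer` are hypothetical, the convolution is circular, and the scales are fixed inputs here, whereas in the end-to-end setting they would be trainable parameters updated by backpropagation.

```python
import numpy as np

def mexican_hat_2d(scale, radius_factor=4.0):
    """Sampled 2D Mexican hat wavelet at the given scale (in pixels)."""
    radius = int(np.ceil(radius_factor * scale))
    coords = np.arange(-radius, radius + 1)
    x, y = np.meshgrid(coords, coords)
    r2 = (x**2 + y**2) / scale**2
    # The factor 1/scale keeps the L2 norm roughly constant across scales in 2D.
    kernel = (2.0 - r2) * np.exp(-r2 / 2.0) / scale
    # Remove the small offset caused by truncation so the kernel has zero mean.
    return kernel - kernel.mean()

def wavelet_layer(image, scales):
    """Filter a 2D image with one wavelet per scale (circular convolution via FFT).

    Returns an array of shape (len(scales), height, width).
    """
    h, w = image.shape
    image_f = np.fft.rfft2(image)
    out = np.empty((len(scales), h, w))
    for i, s in enumerate(scales):
        k = mexican_hat_2d(s)
        kh, kw = k.shape
        padded = np.zeros((h, w))
        padded[:kh, :kw] = k
        # Shift the kernel centre to index (0, 0) so the output is not translated.
        padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
        out[i] = np.fft.irfft2(image_f * np.fft.rfft2(padded), s=(h, w))
    return out

# Hypothetical test image: a single Gaussian blob standing in for a defect.
yy, xx = np.mgrid[0:128, 0:128]
image = np.exp(-((yy - 64) ** 2 + (xx - 64) ** 2) / (2 * 4.0**2))

responses = wavelet_layer(image, scales=[2.0, 4.0, 8.0])
print(responses.shape)  # (3, 128, 128)
```

Each output channel responds to structures of a different size, which is the property the learned scales are expected to exploit for error patterns of varying sizes.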
All implementations are evaluated and compared on at least two different datasets, including
MVTec AD and a dataset provided by ISRA VISION consisting of 79 images of car keys
with different types of production errors.
[1] L. Martí, N. Sanchez-Pi, J. M. Molina, and A. Cr. B. Garcia, “Anomaly detection based on
sensor data in petroleum industry applications”, in Sensors, 2015, pp. 2774-2797.
[2] E. Schubert, A. Zimek and H.-P. Kriegel, “Local outlier detection reconsidered: a generalized
view on locality with applications to spatial, video, and network outlier detection”, in Data
Mining and Knowledge Discovery, 2014, pp. 190-237.
[3] K. He, X. Zhang, S. Ren and J. Sun, “Deep residual learning for image recognition”, in
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770-778.
[4] V. Badrinarayanan, A. Kendall and R. Cipolla, “SegNet: A Deep Convolutional Encoder-Decoder
Architecture for Image Segmentation”, in IEEE Transactions on Pattern Analysis
and Machine Intelligence, 2017, pp. 2481-2495.
[5] C. Farabet, C. Couprie, L. Najman and Y. LeCun, “Learning Hierarchical Features for Scene
Labeling”, in IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, pp.
[6] L.-C. Chen, Y. Zhu, G. Papandreou, F. Schroff and H. Adam, “Encoder-Decoder with
Atrous Separable Convolution for Semantic Image Segmentation”, in The European Conference
on Computer Vision (ECCV), 2018, pp. 801-818.
[7] F. Franzen, “Image Classification in the Frequency Domain with Neural Networks and
Absolute Value DCT”, in International Conference on Image and Signal Processing, Springer,
2018, pp. 301-309.
[8] A. Rajan, G. P. Ramesh and J. Yuvaraj, “Glaucomatous image Classification using Wavelet
Transform”, in IEEE International Conference on Advanced Communications, Control and
Computing Technologies, 2014.
[9] Y.-D. Zhang, Z. Dong, L. Wu and S. Wang, “A hybrid method for MRI brain image
classification”, in Expert Systems with Applications, 2011, pp. 10049-10053.
[10] H. Khan and B. Yener, “Learning filter widths of spectral decompositions with wavelets”,
in Neural Information Processing Systems (NIPS), 2018, pp. 4606-4617.
[11] MVTec Software GmbH, MVTec Anomaly Detection Dataset [online]. Available: https://www.mvtec.com/de/unternehmen/forschung/datasets/mvtec-ad/ (visited on
[12] N. Otsu, “A threshold selection method from gray-level histograms”, in IEEE Transactions
on Systems, Man, and Cybernetics, 1979, pp. 62-66.
[13] ISRA VISION AG, Anomaly Detection Dataset [confidential], 2019.