This paper presented a novel segmentation-based deep-learning architecture designed for the detection and segmentation of surface anomalies using fully supervised learning. The proposed model was designed for use with a small number of training samples, which is an important requirement for practical applications. A comparison with related deep-learning methods, including state-of-the-art commercial software, showed that the proposed approach outperforms them. This was demonstrated on a newly created dataset based on a real-world quality-control case, where the proposed approach could be trained with only approximately 25-30 defective training samples, instead of the hundreds or thousands usually required in deep-learning applications. This makes the method practical for use in industry, where the number of available defective samples is limited. The research community has shown great interest in this work, both in the source code and in the image dataset, which we have made publicly available; the latter has already been downloaded more than 3000 times. The proposed method achieves excellent results in a discriminative supervised learning setup and forms a firm basis for further extensions towards weakly supervised, semi-supervised, and unsupervised learning.
COBISS.SI-ID: 1538225859
This paper proposed a novel displaced aggregation unit (DAU) for deep convolutional networks, which introduces novel compositional properties into deep models. In contrast to classical filters, whose units (pixels) are placed on a fixed regular grid, the displacements of DAUs are learned. This resulted in deep networks with novel properties, such as decoupling of the parameters from the receptive field, learning of receptive field sizes, and automatic adjustment of the spatial focus of features. These properties resulted in more efficient deep networks with fewer operations and parameters, and also enabled novel analysis of the parameters and the spatial coverage of features. The strength of DAUs was extensively demonstrated on classification, semantic segmentation, and blind image deblurring tasks. The results showed that DAUs allocate parameters efficiently, yielding networks up to 4 times more compact in terms of the number of parameters at similar or better performance. The proposed method is therefore suitable for modeling consistency in images in fully convolutional models for segmentation as well as in generative models, since it can adapt the size of receptive fields to the content of the images, i.e. to the consistency of the content of the image set, much more easily than standard convolution.
COBISS.SI-ID: 1538492611
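The idea of displaced aggregation units described above can be illustrated with a minimal sketch. It is not the published implementation: it assumes integer displacements, a single channel, and fixed (not learned) values, and it omits any smoothing or training procedure. The function names `shift2d` and `dau_response` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def shift2d(image, dy, dx):
    """Translate a 2-D array with zero padding: out[y, x] = image[y + dy, x + dx]."""
    h, w = image.shape
    out = np.zeros_like(image)
    # A displacement larger than the image leaves nothing to copy.
    if abs(dy) >= h or abs(dx) >= w:
        return out
    # Source and destination windows that stay inside the image.
    src_y = slice(max(dy, 0), min(h + dy, h))
    src_x = slice(max(dx, 0), min(w + dx, w))
    dst_y = slice(max(-dy, 0), min(h - dy, h))
    dst_x = slice(max(-dx, 0), min(w - dx, w))
    out[dst_y, dst_x] = image[src_y, src_x]
    return out


def dau_response(image, weights, displacements):
    """Weighted sum of displaced copies of the input.

    Each unit contributes one weight and one (dy, dx) displacement, so the
    parameter count depends on the number of units, not on how far apart
    they are placed. Simplifying assumption: integer, fixed displacements.
    """
    out = np.zeros(image.shape, dtype=float)
    for weight, (dy, dx) in zip(weights, displacements):
        out += weight * shift2d(image, dy, dx)
    return out


# Two units: one at the centre, one displaced by (dy, dx) = (1, -1).
impulse = np.zeros((5, 5))
impulse[2, 2] = 1.0
response = dau_response(impulse, [1.0, 2.0], [(0, 0), (1, -1)])
```

In this sketch the two units use six numbers in total (two weights and two displacement pairs), and moving a unit further away changes the spatial extent of the filter without adding parameters, whereas a dense grid filter covering the same extent would need one weight per grid position.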
This paper addressed the issue of detecting and recognizing a large number of object categories, applied to the problem of traffic-sign detection and recognition. A convolutional neural network (CNN) approach, Mask R-CNN, was adapted to address the full pipeline of detection and recognition with automatic end-to-end learning. Several proposed improvements were evaluated on the detection of 200 traffic-sign categories from a newly created dataset. The evaluation focused on highly challenging traffic-sign categories that had not been considered in previous works. A comprehensive analysis of the deep-learning method for the detection of traffic signs with large intra-category appearance variation showed error rates below 3% for the proposed approach, thus outperforming related approaches. We began this research before the start of this project and completed it as part of the project work, thereby also evaluating the two-phase detection approach (detection of region proposals, followed by classification of individual regions) as a complement to the fully convolutional segmentation approach, which is predominant in this project.
COBISS.SI-ID: 1538227907