A representation of objects by their parts is the dominant strategy for modelling complex 3D objects in many disciplines. In computer vision and robotics, superquadrics are among the most widespread part models. Superquadrics are a family of parametric models that cover a wide variety of smoothly changing 3D symmetric shapes, are controlled by a small number of parameters, and can be augmented with global and local deformations. The book covers the geometric properties of superquadrics in depth. Its main contribution is an original approach to the recovery and segmentation of superquadrics from range images. Several applications of superquadrics in computer vision and robotics are thoroughly discussed; in particular, the use of superquadrics for range image registration is demonstrated. We used this method for 3D documentation in underwater archaeology and reported on this activity in an article published in a prestigious archaeological journal (see achievement 22.1), in another journal article, in two conference papers, and in a book chapter. The method is very popular: all our publications related to it have about 1500 citations on Google Scholar, 100 of them after 2014. Others use the method for grasp planning in robotics, modelling the shapes of organs in medicine, etc. Due to its popularity, we have started reimplementing it using CNNs to make it much faster.
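The compactness of the superquadric representation can be seen from its standard inside-outside function: five shape parameters (three semi-axes and two squareness exponents) already describe a whole family of shapes from ellipsoids to near-boxes. A minimal sketch of this standard superellipsoid formulation (the function name and interface are illustrative, not taken from the book):

```python
def superquadric_f(x, y, z, a1, a2, a3, e1, e2):
    """Inside-outside function of a superquadric (superellipsoid).

    F < 1: the point lies inside the surface, F == 1: on it, F > 1: outside.
    a1, a2, a3 are the semi-axis lengths; the exponents e1 (north-south)
    and e2 (east-west) control the squareness of the shape.
    """
    return (((x / a1) ** 2) ** (1.0 / e2)
            + ((y / a2) ** 2) ** (1.0 / e2)) ** (e2 / e1) \
           + ((z / a3) ** 2) ** (1.0 / e1)
```

With e1 = e2 = 1 and unit semi-axes this reduces to the unit sphere, which makes the function easy to sanity-check; recovery from range images amounts to fitting these few parameters (plus pose and deformations) to measured points.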
COBISS.SI-ID: 256892416
In this work we present a new approach to interest point detection. Different types of image features are detected using a common computational concept: the proposed approach considers the total variability of local regions. Results obtained on a wide variety of image sets compare favourably with those of the leading interest point detectors from the literature. The proposed approach yields a rich set of highly distinctive local regions that can be used for object recognition and image matching.
COBISS.SI-ID: 41304418
Automatic identity recognition from ear images represents an active field of research within the biometric community. The ability to capture ear images from a distance and in a covert manner makes the technology an appealing choice for surveillance and security applications as well as other application domains. Significant contributions have been made in the field over recent years, but open research problems remain and hinder a wider (commercial) deployment of the technology. This work presents an overview of the field of automatic ear recognition (from 2D images) and focuses specifically on the most recent, descriptor-based methods proposed in this area. Open challenges are discussed and potential research directions are outlined with the goal of providing the reader with a point of reference for issues worth examining in the future. In addition to a comprehensive review of ear recognition technology, this work also introduces a new, fully unconstrained dataset of ear images gathered from the web and a toolbox implementing several state-of-the-art techniques for ear recognition. The dataset and toolbox are meant to address some of the open issues in the field and are made publicly available to the research community. - In the last three years we have been very active in this field of research. We published one more journal paper with an impact factor (another is in the final revision stage) and six conference papers. We organised a very well received challenge at the most prestigious conference on biometrics and are editing a special issue on unconstrained ear recognition in the impact-factor journal IET Biometrics.
COBISS.SI-ID: 1537395395
Image and video data is today being shared between government entities and other relevant stakeholders on a regular basis and requires careful handling of the personal information contained therein. A popular approach to ensuring privacy protection in such data is the use of deidentification techniques, which aim to conceal the identity of individuals in the imagery while still preserving certain aspects of the data after deidentification. In this work, we propose a novel approach to face deidentification, called k-Same-Net, which combines recent Generative Neural Networks (GNNs) with the well-known k-Anonymity mechanism and provides formal guarantees regarding privacy protection on a closed set of identities. Our GNN is able to generate synthetic surrogate face images for deidentification by seamlessly combining features of identities used to train the GNN model. Furthermore, it allows us to control the image-generation process with a small set of appearance-related parameters that can be used to alter specific aspects (e.g., facial expressions, age, gender) of the synthesized surrogate images. We demonstrate the feasibility of k-Same-Net in comprehensive experiments on the XM2VTS and CK+ datasets. We evaluate the efficacy of the proposed approach through reidentification experiments with recent recognition models and compare our results with competing deidentification techniques from the literature. We also present facial expression recognition experiments to demonstrate the utility-preservation capabilities of k-Same-Net. Our experimental results suggest that k-Same-Net is a viable option for facial deidentification that exhibits several desirable characteristics when compared to existing solutions in this area. - This paper represents the latest result in a series of publications in this relatively new area of research conducted by members of the research group. We also published one more journal paper with an impact factor and two conference papers.
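The k-Anonymity guarantee that k-Same-Net inherits is easiest to see in the classic pixel-averaging k-Same baseline: every surrogate is the average of at least k gallery identities, so no surrogate can be mapped back to fewer than k originals. A minimal sketch of that baseline (not the GNN-based variant, which blends identity features and decodes a synthetic surrogate; the function name is illustrative):

```python
import numpy as np

def k_same_pixel(faces, k):
    """Classic k-Same de-identification baseline (pixel averaging).

    Each face is replaced by the mean of its k nearest faces in the
    gallery (itself included), giving the k-anonymity guarantee.
    faces: (n, d) array of vectorised face images, with n >= k.
    """
    n = faces.shape[0]
    surrogates = np.empty_like(faces, dtype=float)
    for i in range(n):
        dists = np.linalg.norm(faces - faces[i], axis=1)
        nearest = np.argsort(dists)[:k]          # k closest, incl. itself
        surrogates[i] = faces[nearest].mean(axis=0)
    return surrogates
```

With k equal to the gallery size every surrogate collapses to the global mean face, and with k = 1 the originals are returned unchanged, which illustrates the privacy-utility trade-off that the parameter k controls.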
COBISS.SI-ID: 1537688771
We introduce the channel and spatial reliability concepts to DCF tracking and provide a learning algorithm for their efficient and seamless integration into the filter update and the tracking process. The spatial reliability map adjusts the filter support to the part of the object suitable for tracking. This both allows the search region to be enlarged and improves the tracking of non-rectangular objects. Reliability scores reflect the channel-wise quality of the learned filters and are used as feature weighting coefficients in localization. Using only hand-crafted features, the tracker achieves state-of-the-art performance on VOT 2016, VOT 2015 and OTB100. The tracker, implemented in Matlab, runs close to real time on a CPU. After publication the method was ported to C++ and now comfortably runs in real time. The tracker achieved first place in the VOT2017 challenge in the category of real-time trackers, among 51 submissions. Due to the high exposure of VOT, we expect the tracker to attract significant attention from the computer vision and robotics communities.
COBISS.SI-ID: 1537691075
We addressed the problem of single-target tracker performance evaluation. We consider the performance measures, the dataset and the evaluation system to be the most important components of tracker evaluation and propose requirements for each of them. The requirements are the basis of a new evaluation methodology that aims at a simple and easily interpretable tracker comparison. The ranking-based methodology addresses tracker equivalence in terms of statistical significance and practical differences. A fully annotated dataset with per-frame annotations of several visual attributes is introduced. The diversity of its visual properties is maximized in a novel way by clustering a large number of videos according to their visual attributes. This makes it the most systematically constructed and annotated dataset to date. A multi-platform evaluation system allowing easy integration of third-party trackers is presented as well. The performance evaluation methodology underlies the largest visual object tracking challenge series, VOT, and is currently one of the major evaluation methodologies in visual object tracking. The proposed methodology was used in the challenge issued by OpenCV, the largest open-source computer vision library, to select state-of-the-art trackers that were contributed to the library (https://opencv.org/opencv-vision-challenge.html). In just two years the paper collected 86 citations on Google Scholar and 267 reads on ResearchGate.
COBISS.SI-ID: 1536872643
We have proposed a new method for supervised online estimation of probabilistic discriminative models for classification tasks. The method estimates the class distributions from a stream of data in the form of Gaussian mixture models (GMMs). The reconstructive updates of the distributions are based on the recently proposed online kernel density estimator (oKDE). We keep the number of components in the model low by compressing the GMMs from time to time. We propose a new cost function that measures the loss of interclass discrimination during compression, thus guiding the compression toward simpler models that still retain discriminative properties. The resulting classifier thus independently updates the GMM of each class, but these GMMs interact during their compression through the proposed cost function. We call the proposed method the online discriminative kernel density estimator (odKDE). We compare the odKDE to the oKDE, batch state-of-the-art kernel density estimators (KDEs), and batch/incremental support vector machines (SVMs) on publicly available datasets. The odKDE achieves classification performance comparable to that of the best batch KDEs and SVMs, while allowing online adaptation from large datasets, and produces models of lower complexity than the oKDE.
COBISS.SI-ID: 9907284
We have proposed an improved model for visual tracking using an adaptive coupled model. The model's advantage lies in its capability to track articulated objects using simple local features connected in a weak geometrical constellation. The model can robustly add and remove local features depending on probability maps of high-level image properties such as motion and colour, and it also enables the inclusion of additional probability maps based on arbitrary high-level image properties. We analyzed the proposed tracker on a large video database and compared it with the current state of the art. The experiments have shown that the proposed tracker outperforms the reference trackers on multiple criteria. The article has 137 citations on Google Scholar.
COBISS.SI-ID: 9431124
In the framework of the EU FP7 project CogX, our group was in charge of cross-modal learning and the development of an interactive system for learning in dialogue with a tutor. In this paper we presented representations and mechanisms that facilitate continuous learning of visual concepts in dialogue with a tutor and showed the implemented robot system. We presented how beliefs about the world are created by processing visual and linguistic information and showed how they are used for planning system behaviour with the aim of satisfying the system's internal drive to extend its knowledge. The system facilitates different kinds of learning initiated by the human tutor or by the system itself. We demonstrated these principles in the case of learning about object colours and basic shapes. This distributed, heterogeneous, and very complex system was developed in collaboration with other project partners under our guidance and supervision, and served as a basis for further research and development of artificial cognitive systems in our group as well as in collaboration with other partners. Results of this research have been published in several papers, including the article Skočaj et al., An integrated system for interactive continuous learning of categorical knowledge, Journal of Experimental & Theoretical Artificial Intelligence, 2016.
COBISS.SI-ID: 8305492
We present a new formulation of the constellation model with correlation filters that treats the geometric and visual constraints within a single convex cost function, and we derive a highly efficient optimization for maximum a posteriori inference of a fully connected constellation. We propose a tracker that models the object at two levels of detail: the coarse level approximately localizes the object, while the mid-level representation carries out fine localization. The model is capable of adapting to target aspect change and partial occlusion, which is often the case in tracking for robotic applications. The resulting tracker is rigorously analyzed on the highly challenging OTB, VOT2014, and VOT2015 benchmarks, exhibits state-of-the-art performance, and runs in real time.
COBISS.SI-ID: 1537625283