Computational modeling of the primate visual system yields insights of potential relevance to some of the challenges that computer vision is facing, such as object recognition and categorization, motion detection and activity recognition, and vision-based navigation and manipulation. This paper reviews some functional principles and structures that are generally thought to underlie the primate visual cortex, and attempts to extract biological principles that could further advance computer vision research. Writing for a computer vision audience, we present functional principles of the processing hierarchies in the primate visual system in light of recent discoveries in neurophysiology. Hierarchical processing in the primate visual system is characterized by a sequence of different levels of processing (on the order of 10) that constitute a deep hierarchy, in contrast to the flat vision architectures predominantly used in today’s mainstream computer vision. We hope that the functional description of the deep hierarchies realized in the primate visual system provides valuable insights for the design of computer vision algorithms, fostering increasingly productive interaction between biological and computer vision research.
COBISS.SI-ID: 10445908
We propose a new method for supervised online estimation of probabilistic discriminative models for classification tasks. The method estimates the class distributions from a stream of data in the form of Gaussian mixture models (GMMs). The reconstructive updates of the distributions are based on the recently proposed online kernel density estimator (oKDE). We keep the number of components in the model low by periodically compressing the GMMs. We propose a new cost function that measures the loss of interclass discrimination during compression, thus guiding the compression toward simpler models that still retain discriminative properties. The resulting classifier updates the GMM of each class independently, but these GMMs interact during their compression through the proposed cost function. We call the proposed method the online discriminative kernel density estimator (odKDE). We compare the odKDE to the oKDE, batch state-of-the-art kernel density estimators (KDEs), and batch/incremental support vector machines (SVMs) on publicly available datasets. The odKDE achieves classification performance comparable to that of the best batch KDEs and SVMs, while allowing online adaptation from large datasets, and produces models of lower complexity than the oKDE.
COBISS.SI-ID: 9907284
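The odKDE abstract above describes per-class GMMs that are used for classification and periodically compressed. The following is a minimal one-dimensional sketch of those two ingredients, not the authors' implementation: classification uses Bayes' rule over the class GMMs, and compression is illustrated by a standard moment-preserving merge of two components. The function names, the toy models, and the choice of which components to merge are assumptions made for illustration; the discriminative cost function proposed in the paper is not reproduced here.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gmm_pdf(x, components):
    """Density of a GMM given as a list of (weight, mean, variance) tuples."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)

def classify(x, class_gmms, priors):
    """Return the most probable class and all class posteriors via Bayes' rule."""
    scores = {c: priors[c] * gmm_pdf(x, g) for c, g in class_gmms.items()}
    total = sum(scores.values())
    posteriors = {c: s / total for c, s in scores.items()}
    return max(posteriors, key=posteriors.get), posteriors

def merge_components(c1, c2):
    """Merge two components into one, preserving weight, mean, and variance."""
    (w1, m1, v1), (w2, m2, v2) = c1, c2
    w = w1 + w2
    mean = (w1 * m1 + w2 * m2) / w
    second_moment = (w1 * (v1 + m1 ** 2) + w2 * (v2 + m2 ** 2)) / w
    return (w, mean, second_moment - mean ** 2)

# Hypothetical toy models: class "a" has two components, class "b" has one.
class_gmms = {
    "a": [(0.5, -2.0, 1.0), (0.5, 0.0, 1.0)],
    "b": [(1.0, 4.0, 1.0)],
}
priors = {"a": 0.5, "b": 0.5}

label, posteriors = classify(-1.0, class_gmms, priors)
compressed_a = [merge_components(*class_gmms["a"])]
```

In this sketch, compressing class "a" to a single component changes its density and therefore the posteriors of every class; that interaction is what a discrimination-aware compression criterion would have to account for.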
In this article, we propose a novel stroke width transform (SWT) voting-based color reduction method for detecting text in natural scene images. Unlike other text detection approaches, which mostly rely on either text structure or color, the proposed method combines both by supervising a text-oriented color reduction process with additional SWT information. SWT pixels mapped to color space vote in favor of the color they correspond to. Colors receiving a high SWT vote most likely belong to text areas and are blocked from being mean-shifted away. The literature does not explicitly address the issue of SWT search direction; we therefore propose an adaptive sub-block method for determining the correct SWT direction. Both the SWT voting-based color reduction method and the SWT direction determination method are evaluated on binary (text/non-text) images obtained from a challenging Computer Vision Lab optical character recognition database. The SWT voting-based color reduction method outperforms the state-of-the-art text-oriented color reduction approach.
COBISS.SI-ID: 9854292
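The voting step in the SWT abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the published method: the stroke mask is taken as given (no SWT is computed here), colors are binned by a simple uniform quantization, and the `min_share` threshold and the function names are hypothetical.

```python
from collections import Counter

def quantize(rgb, step=32):
    """Map an (r, g, b) triple to a coarse color bin (assumed uniform binning)."""
    return tuple(channel // step for channel in rgb)

def protected_colors(pixels, stroke_mask, min_share=0.5, step=32):
    """Return the color bins that receive at least min_share of the stroke votes.

    pixels      -- list of (r, g, b) tuples
    stroke_mask -- list of booleans, True where a pixel lies on a detected stroke
    """
    votes = Counter(
        quantize(rgb, step)
        for rgb, is_stroke in zip(pixels, stroke_mask)
        if is_stroke
    )
    total = sum(votes.values())
    if total == 0:
        return set()
    return {color for color, count in votes.items() if count / total >= min_share}

# Hypothetical toy input: three dark stroke pixels, one red stroke pixel,
# and one bright background pixel that does not vote.
pixels = [(10, 10, 10), (12, 8, 11), (200, 200, 200), (250, 0, 0), (9, 9, 9)]
stroke_mask = [True, True, False, True, True]

keep = protected_colors(pixels, stroke_mask)
```

A color reduction step would then leave the bins in `keep` untouched while merging the remaining colors.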
Ground-based Mode-S radars can acquire a wide range of data from aircraft, including meteorological readings. In this article, we analyze meteorological data collected by one Mode-S radar and compare it to corresponding radiosonde measurements. We show that temperature, wind direction and wind speed available from Mode-S radars are reliable enough to be used in different applications. Similar measurements are collected in the Aircraft Meteorological Data Relay (AMDAR) project, but only a few airlines participate in the project. Since ground-based radars can fetch new data on every turn, the amount of meteorological data that they can obtain is much larger than with AMDAR. However, AMDAR can provide data from areas with no radar coverage, such as over the oceans. Since there is no systematic way of alerting aircraft with uncalibrated or faulty sensors, errors in meteorological readings obtained by means of Mode-S radars must be identified and eliminated. The large amount of correct data makes identification of faulty data possible. We selected the Kalman filter for data smoothing and for elimination of small measurement errors. To exclude large errors, which are due to uncalibrated or biased sensors, we developed an automatic error elimination method that is also based on the Kalman filter. We compared the obtained results with numerical weather predictions (NWP) calculated by a meteorological agency. The comparison shows surprisingly small differences. However, the data are available only over areas and at altitudes where aircraft are flying. Radiosondes, in comparison, give meteorological readings without the altitude gaps that we typically get at altitudes with sparse aircraft traffic. On the other hand, meteorological measurements from aircraft are available practically throughout the day, while a radiosonde is usually deployed only once or twice a day.
We have demonstrated that meteorological data collected with the help of Mode-S radars can be used as a very good substitute for upper wind tables provided by meteorological agencies from NWP for air traffic control trajectory calculations. The data could also be fed into AMDAR or serve as a valuable input for NWP models.
COBISS.SI-ID: 10338900
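The Mode-S abstract above uses a Kalman filter both to smooth readings and to exclude large errors. A minimal scalar sketch of that idea follows. It is not the authors' method: the random-walk state model, the noise variances `q` and `r`, the innovation gate of three standard deviations, and the sample temperatures are all assumptions chosen for illustration.

```python
def kalman_filter(measurements, q=0.01, r=1.0, gate=3.0):
    """Filter a scalar series and flag measurements that fail an innovation gate.

    q    -- assumed process noise variance (random-walk state model)
    r    -- assumed measurement noise variance
    gate -- a reading is rejected if its innovation exceeds this many
            standard deviations of the predicted innovation
    Returns (estimates, rejected_indices).
    """
    x = measurements[0]  # the first reading initializes the state (assumed valid)
    p = r
    estimates = [x]
    rejected = []
    for i, z in enumerate(measurements[1:], start=1):
        p_pred = p + q            # predict: state unchanged, uncertainty grows
        innovation = z - x
        s = p_pred + r            # predicted innovation variance
        if innovation * innovation > gate * gate * s:
            rejected.append(i)    # large error: skip the update step
            p = p_pred
        else:
            k = p_pred / s        # Kalman gain
            x = x + k * innovation
            p = (1.0 - k) * p_pred
        estimates.append(x)
    return estimates, rejected

# Hypothetical temperature readings in degrees Celsius with one gross error.
readings = [-50.0, -50.2, -49.9, -30.0, -50.1, -50.0]
estimates, rejected = kalman_filter(readings)
```

Because the first reading seeds the state, a faulty first value would bias this sketch; a practical version would need a more careful initialization.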
We present a quantitative study of digital signage audience measurement using computer vision. We developed a camera-enhanced digital signage display that acquires audience measurement metrics with computer vision algorithms. Temporal metrics of a person's dwell time, display in-view time, and attention time are extracted. The system also determines the demographic metrics of gender and age group. The digital signage display was deployed in a real-world environment, a clothing boutique, where demographic and viewership data of 1294 store customers were recorded, manually verified and analysed. The analysis shows that 35% of customers specifically looked at the display, with an average attention time of 0.7 s. Interestingly, the attention time was substantially higher for men (1.2 s) than for women (0.4 s). Age group comparison reveals that children (1-14 years) are the most responsive to the digital signage. Finally, the analysis shows that the average attention time is significantly higher for dynamic content (0.9 s) than for static content (0.6 s).
COBISS.SI-ID: 9659732
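The group comparisons reported in the digital signage abstract above amount to aggregating per-customer attention times by an attribute. The sketch below shows one way to do that. The record layout, the field names, and the toy values are hypothetical and are not the study's data, and averaging over all customers in a group (including those who never looked) is an assumption, since the abstract does not state how its averages were computed.

```python
from collections import defaultdict

def summarize(records, key):
    """Group records by `key` and report count, share who looked, and mean attention."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["attention_s"])
    return {
        group: {
            "n": len(times),
            "looked_share": sum(1 for t in times if t > 0) / len(times),
            "mean_attention_s": sum(times) / len(times),
        }
        for group, times in groups.items()
    }

# Hypothetical toy records; attention_s is the attention time in seconds.
records = [
    {"content": "static", "attention_s": 0.0},
    {"content": "static", "attention_s": 1.0},
    {"content": "dynamic", "attention_s": 2.0},
    {"content": "dynamic", "attention_s": 0.0},
]

by_content = summarize(records, "content")
```

The same function can group by any other recorded attribute, such as gender or age group, by passing a different `key`.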