A Sparse Structure for Fast Circle Detection

Publication date: Available online 26 August 2019

Source: Pattern Recognition

Author(s): Yuanqi Su, Xiaoning Zhang, Bonan Cuan, Yuehu Liu, Zehao Wang

Abstract

In this paper, we present a circle detector that achieves state-of-the-art performance on almost every type of image. The detector represents each circle instance by a set of equally distributed arcs and searches for the same number of edge points to cover these arcs. The new formulation leads to voting in a minimizing/maximizing manner, which differs from the typical accumulative voting adopted by the Hough transform. From this formulation, circle detection is decomposed into radius-dependent and radius-independent parts. The calculation of the radius-independent part is computationally expensive but is shared across different radii. This decomposition removes the redundant computation in handling multiple radii and therefore speeds up detection. Exploiting the sparse nature of the radius-independent part, we design a sparse structure for its batch computation, which is completed in a single sweep of the edge points. We then propose a circle detector based on this sparse structure that achieves time complexity comparable to that of the Hough-transform algorithm with a 2D accumulator array. For testing, we created an information-rich dataset with images from multiple sources. It contains five categories and covers a wide spectrum of images, ranging from true-color images to binary ones. The experimental results demonstrate that the proposed approach outperforms solutions based on accumulative voting.
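
The abstract describes the decomposition only at a high level; the toy sketch below illustrates the idea in a much-simplified form of our own, not the authors' algorithm. Here the gradient-angle quantization plays the role of the radius-independent part, computed once and reused for every radius, while only the center voting depends on r; a center is accepted when enough distinct arcs are covered, a coverage criterion rather than a raw accumulative count.

```python
import numpy as np

def detect_circles(edge_points, gradients, radii, shape, n_arcs=16, min_cover=14):
    """Toy arc-coverage circle detector (illustrative only)."""
    H, W = shape
    # Radius-independent part: quantize each edge point's gradient angle
    # once; this is shared by all radii.
    theta = np.arctan2(gradients[:, 1], gradients[:, 0])
    arc_bin = ((theta + np.pi) / (2 * np.pi) * n_arcs).astype(int) % n_arcs

    detections = []
    for r in radii:
        # Radius-dependent part: each edge point votes for the center
        # lying r pixels along its negative gradient direction.
        cx = np.round(edge_points[:, 0] - r * np.cos(theta)).astype(int)
        cy = np.round(edge_points[:, 1] - r * np.sin(theta)).astype(int)
        ok = (cx >= 0) & (cx < W) & (cy >= 0) & (cy < H)
        # Record which arc bins are hit at each candidate center.
        cover = np.zeros((H, W, n_arcs), dtype=bool)
        cover[cy[ok], cx[ok], arc_bin[ok]] = True
        n_covered = cover.sum(axis=2)
        for y, x in zip(*np.where(n_covered >= min_cover)):
            detections.append((x, y, r, int(n_covered[y, x])))
    return detections
```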

Time series classification using local distance-based features in multi-modal fusion networks

Publication date: Available online 26 August 2019

Source: Pattern Recognition

Author(s): Brian Kenji Iwana, Seiichi Uchida

Abstract

We propose the use of novel features, called local distance features, for time series classification. The local distance features are extracted using Dynamic Time Warping (DTW) and classified using Convolutional Neural Networks (CNNs). DTW is classically used as a robust distance measure for distance-based time series recognition methods. However, when DTW is used strictly as a global distance measure, information about the matching is discarded. We show that this information can be used as supplementary input to temporal CNNs. This is done by feeding both the raw data and the features extracted by DTW into multi-modal fusion CNNs. Furthermore, we explore the effects that different prototype selection methods, prototype numbers, and data fusion schemes have on accuracy. We perform experiments on a wide range of time series datasets, including three Unipen handwriting datasets, four UCI Machine Learning Repository datasets, and 85 UCR Time Series Classification Archive datasets.
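
As a rough illustration of what such features could look like, the sketch below keeps the per-timestep matching costs along the optimal DTW path instead of only the global distance, so they can be stacked with the raw series as an extra CNN input channel. This is our reading of the abstract, not the authors' exact feature definition.

```python
import numpy as np

def dtw_local_distance_features(x, prototype):
    """Per-timestep matching costs along the optimal DTW path (a toy
    interpretation of 'local distance features'). Classic DTW would
    return only acc[n, m], the global distance."""
    n, m = len(x), len(prototype)
    cost = np.abs(x[:, None] - prototype[None, :])   # pairwise local costs
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(acc[i - 1, j],
                                                 acc[i, j - 1],
                                                 acc[i - 1, j - 1])
    # Backtrack and record, for each timestep of x, the cost of its match.
    feat = np.zeros(n)
    i, j = n, m
    while i > 0 and j > 0:
        feat[i - 1] = cost[i - 1, j - 1]
        step = int(np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return feat  # shape (n,): one local distance per timestep of x

# A raw series and its DTW features stacked as two CNN input channels:
# x = np.sin(np.linspace(0, 6, 100))
# p = np.cos(np.linspace(0, 6, 80))
# inp = np.stack([x, dtw_local_distance_features(x, p)], axis=-1)
```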

Robust Visual Tracking using Unlabeled Adversarial Instance Generation and Regularized Label Smoothing

Publication date: Available online 27 August 2019

Source: Pattern Recognition

Author(s): Yamin Han, Peng Zhang, Wei Huang, Yufei Zha, Garth Douglas Cooper, Yanning Zhang

Abstract

Recent studies have shown that deep neural networks have pushed visual tracking accuracy to new heights, but robust long-term tracking remains challenging because of dynamic foreground and background changes, which affect overall performance through online training-sample generation. Although the dense sampling strategy has been widely used for its convenience, the appearance variation it captures is severely limited by its high spatial overlap. Moreover, evaluating sample candidates with a classification-score metric is not always reliable throughout the tracking process, so tracking failures are inevitable. As an effective solution, this paper proposes a novel sample-level generative adversarial network (GAN) to enrich the training data by generating massive amounts of sample-level GAN samples. These samples not only resemble real-life scenarios but also carry greater diversity in deformation and motion blur. For occlusion invariance, a feature-level GAN is incorporated to generate more challenging feature-level data by creating random occlusion masks in the deep feature space. To facilitate online learning, a label-smoothing loss regularization is introduced to regularize the model and reduce over-fitting by integrating the unlabeled GAN-generated training data with the real labeled data. In addition, a re-detection correlation filter, conservatively trained with reliable training data, is combined with the classification-score metric to perform reliable model updates and avoid heavy degradation. Furthermore, we also apply the re-detection correlation filter to candidate region proposals to handle tracking failures. The proposed tracker shows superior performance in comparison to other state-of-the-art tracking approaches on the OTB-2013, OTB-100, UAV123, UAV20L, and VOT2016 benchmark datasets.
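
Two of the building blocks can be pictured with a minimal sketch: zeroing out a random block of a deep feature map to mimic occlusion, and softening training targets with label smoothing. Note the simplifications: in the paper the occlusion masks are generated in an adversarial GAN framework and the smoothing is integrated into online learning, whereas the stand-ins below use plain random masks and one-shot smoothing.

```python
import numpy as np

def random_occlusion_mask(features, drop_ratio=0.3, rng=None):
    """Zero out a random spatial block of a (C, H, W) feature map to mimic
    occlusion in deep feature space (random here; adversarial in the paper)."""
    rng = rng or np.random.default_rng()
    c, h, w = features.shape
    bh, bw = max(1, int(h * drop_ratio)), max(1, int(w * drop_ratio))
    y0 = rng.integers(0, h - bh + 1)
    x0 = rng.integers(0, w - bw + 1)
    out = features.copy()
    out[:, y0:y0 + bh, x0:x0 + bw] = 0.0  # occlude the block in every channel
    return out

def smoothed_labels(hard_labels, n_classes=2, eps=0.1):
    """Label smoothing: soften one-hot targets so the classifier is not
    pushed to full confidence on unlabeled GAN-generated samples."""
    one_hot = np.eye(n_classes)[np.asarray(hard_labels)]
    return one_hot * (1.0 - eps) + eps / n_classes
```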

Scalable Logo Detection by Self Co-Learning

Publication date: Available online 28 August 2019

Source: Pattern Recognition

Author(s): Hang Su, Shaogang Gong, Xiatian Zhu

Abstract

Existing logo detection methods usually consider a small number of logo classes and limited images per class, and assume fine-grained object bounding-box annotations. This limits their scalability to real-world dynamic applications. In this work, we tackle these challenges by exploring a web-data learning principle that requires no exhaustive manual labelling. Specifically, we propose a novel incremental learning approach, called Scalable Logo Self-co-Learning (SL2), capable of automatically self-discovering informative training images from noisy web data and progressively improving model capability in a cross-model co-learning manner. Moreover, we introduce a very large logo dataset, “WebLogo-2M” (2,190,757 images of 194 logo classes), built by an automatic data collection and processing method. Extensive comparative evaluations demonstrate the superiority of SL2 over state-of-the-art strongly and weakly supervised detection models and contemporary web-data learning approaches.
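
The cross-model co-learning idea can be sketched abstractly as below. The fit/predict interface, the confidence threshold tau, and the round structure are hypothetical placeholders, since the abstract does not specify how SL2 selects or exchanges self-discovered training images.

```python
def self_co_learning(model_a, model_b, labeled, web_pool, rounds=5, tau=0.9):
    """Toy cross-model co-learning loop: each round, every model labels
    the noisy web pool and hands its most confident detections to the
    *other* model, so the two progressively bootstrap each other."""
    train_a, train_b = list(labeled), list(labeled)
    for _ in range(rounds):
        model_a.fit(train_a)
        model_b.fit(train_b)
        remaining = []
        for img in web_pool:
            pred_a, conf_a = model_a.predict(img)
            pred_b, conf_b = model_b.predict(img)
            if conf_a >= tau:           # A's confident detection trains B
                train_b.append((img, pred_a))
            elif conf_b >= tau:         # B's confident detection trains A
                train_a.append((img, pred_b))
            else:
                remaining.append(img)   # too noisy; retry in a later round
        web_pool = remaining
    return model_a, model_b
```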

Similarity Learning with Joint Transfer Constraints for Person Re-Identification

Publication date: Available online 28 August 2019

Source: Pattern Recognition

Author(s): Cairong Zhao, Xuekuan Wang, Wangmeng Zuo, Fumin Shen, Ling Shao, Duoqian Miao

Abstract

The inconsistency of data distributions among multiple views is one of the most crucial issues hindering the accuracy of person re-identification. To solve this problem, this paper presents a novel similarity learning model that combines the optimization of feature representation, via multi-view visual-word reconstruction, with the optimization of metric learning, via joint discriminative transfer learning. The starting point of the proposed model is to capture multiple groups of multi-view visual words (MvVW) through an unsupervised clustering method (i.e., K-means) applied to human parts (e.g., head, torso, legs). Then, we construct a joint feature matrix by combining the feature matrices of the different body parts. To handle the inconsistent distributions under different views, we propose a joint transfer constraint that learns the similarity function by combining multiple common subspaces, each in charge of a sub-region. In these common subspaces, the original samples can be reconstructed from the MvVW under low-rank and sparse representation constraints, which enhances structural robustness and noise resistance. During objective-function optimization, based on the fusion of multiple views and multiple sub-regions, a solution strategy using a joint matrix transform is proposed. Taking all of this into account, person re-identification under inconsistent data distributions can be transformed into a consistent iterative convex optimization problem and solved via the inexact augmented Lagrange multiplier (IALM) algorithm. Extensive experiments conducted on three challenging person re-identification datasets (i.e., VIPeR, CUHK01, and PRID450S) show that our model outperforms several state-of-the-art methods.
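
The low-rank-plus-sparse building block solved by IALM can be illustrated with the generic robust-PCA form below; the paper's actual model couples such a decomposition with multi-view visual words and metric learning, which this sketch deliberately omits.

```python
import numpy as np

def ialm_low_rank_sparse(X, lam=None, mu=1.0, rho=1.5, n_iter=100):
    """Minimal IALM sketch for X = L + S with L low-rank and S sparse
    (the generic building block, not the paper's full model)."""
    m, n = X.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    L, S, Y = np.zeros_like(X), np.zeros_like(X), np.zeros_like(X)
    for _ in range(n_iter):
        # Singular-value thresholding gives the low-rank update.
        U, sig, Vt = np.linalg.svd(X - S + Y / mu, full_matrices=False)
        L = U @ np.diag(np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # Elementwise soft thresholding gives the sparse update.
        T = X - L + Y / mu
        S = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0.0)
        # Dual ascent on the constraint X = L + S.
        Y = Y + mu * (X - L - S)
        mu = min(mu * rho, 1e7)  # cap the penalty growth
    return L, S
```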

Auto-weighted Multi-view Clustering via Deep Matrix Decomposition

Publication date: Available online 28 August 2019

Source: Pattern Recognition

Author(s): Shudong Huang, Zhao Kang, Zenglin Xu

Abstract

Real data are often collected from multiple channels or composed of different representations (i.e., views). Multi-view learning provides an elegant way to analyze multi-view data for low-dimensional representation. In recent years, several multi-view learning methods have been designed and successfully applied to various tasks. However, existing multi-view learning methods usually work in a single-layer formulation. Since the mapping between the obtained representation and the original data contains rather complex hierarchical information with implicit lower-level hidden attributes, it is desirable to fully explore the hidden structures hierarchically. In this paper, a novel deep multi-view clustering model is proposed that uncovers the hierarchical semantics of the input data in a layer-wise way. By utilizing a novel collaborative deep matrix decomposition framework, the hidden representations are learned with respect to different attributes. The proposed model is able to collaboratively learn the hierarchical semantics obtained by each layer. Instances from the same class are forced closer layer by layer in the low-dimensional space, which benefits the subsequent clustering task. Furthermore, an ideal weight is automatically assigned to each view without introducing an extra hyperparameter, as previous methods require. To solve the optimization problem of our model, an efficient iterative updating algorithm is proposed, and its convergence is guaranteed theoretically. Our empirical study on the multi-view clustering task shows encouraging results for our model in comparison to state-of-the-art algorithms.
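
A toy version of the two ingredients, layer-wise decomposition and parameter-free view weighting, is sketched below. It uses plain alternating least squares and the common auto-weighting rule w_v ∝ 1/(2·sqrt(loss_v)); the paper's collaborative updates, constraints, and exact weighting scheme are not reproduced.

```python
import numpy as np

def autoweighted_deep_decomposition(views, dims, n_iter=30, seed=0):
    """Toy two-layer decomposition X_v ≈ W_v1 @ W_v2 @ H with a shared
    final representation H and automatic view weights (illustrative)."""
    rng = np.random.default_rng(seed)
    n = views[0].shape[1]
    H = rng.standard_normal((dims[1], n))
    weights = np.full(len(views), 1.0 / len(views))
    Ws = [[rng.standard_normal((X.shape[0], dims[0])),
           rng.standard_normal((dims[0], dims[1]))] for X in views]
    for _ in range(n_iter):
        # Per-view layer mappings by least squares, with H fixed.
        for v, X in enumerate(views):
            W1, W2 = Ws[v]
            W1[:] = X @ np.linalg.pinv(W2 @ H)
            W2[:] = np.linalg.pinv(W1) @ X @ np.linalg.pinv(H)
        # Shared H: weighted least-squares solution across all views.
        P = [Ws[v][0] @ Ws[v][1] for v in range(len(views))]
        A = sum(w * p.T @ p for w, p in zip(weights, P))
        B = sum(w * p.T @ X for w, p, X in zip(weights, P, views))
        H = np.linalg.solve(A + 1e-8 * np.eye(A.shape[0]), B)
        # Parameter-free reweighting: a view with a larger reconstruction
        # error receives a smaller weight; no hyperparameter is tuned.
        errs = np.array([np.linalg.norm(X - p @ H)
                         for X, p in zip(views, P)])
        weights = 1.0 / (2.0 * errs + 1e-12)
        weights /= weights.sum()
    return Ws, H, weights
```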