In this paper, we establish a baseline for object symmetry detection in complex backgrounds by presenting a new benchmark and an end-to-end deep learning approach, opening up a promising direction for symmetry detection in the wild. The new benchmark, named Sym-PASCAL, spans challenges including object diversity, multiple objects, part invisibility, and various complex backgrounds that are far beyond those in existing datasets. The proposed symmetry detection approach, named Side-output Residual Network (SRN), leverages output Residual Units (RUs) to fit the errors between the object symmetry ground-truth and the outputs of RUs. By stacking RUs in a deep-to-shallow manner, SRN exploits the ‘flow’ of errors among multiple scales to ease the problems of fitting complex outputs with limited layers, suppressing the complex backgrounds, and effectively matching object symmetry of different scales. Experimental results validate both the benchmark and its challenging aspects related to real-world images, and the state-of-the-art performance of our symmetry detection approach. The benchmark and the code for SRN are publicly available at https://github.com/KevinKecc/SRN.
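The deep-to-shallow stacking described above can be illustrated with a toy 1-D sketch. This is not the authors' network: the helper names (`block_mean`, `upsample`, `deep_to_shallow`) are hypothetical, and the residual at each stage is computed directly from the target, standing in for what a trained RU would have to learn. It only shows how each shallower stage corrects the error left by the deeper one.

```python
def block_mean(signal, block):
    """Average `signal` over consecutive blocks of length `block`."""
    return [sum(signal[i:i + block]) / block for i in range(0, len(signal), block)]


def upsample(signal, factor):
    """Nearest-neighbour upsampling by an integer factor."""
    return [value for value in signal for _ in range(factor)]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def deep_to_shallow(target, blocks=(4, 2, 1)):
    """Refine a coarse estimate stage by stage, from deep (coarse) to shallow (fine).

    Assumption: len(target) is divisible by every entry in `blocks`.
    """
    # Deepest stage: only a coarse, block-constant estimate is available.
    output = upsample(block_mean(target, blocks[0]), blocks[0])
    errors = [mse(output, target)]
    for block in blocks[1:]:
        # Stand-in for the side output of a shallower stage (finer resolution).
        side = upsample(block_mean(target, block), block)
        # The unit only has to model the residual between the finer
        # side output and the result handed down from the deeper stage.
        residual = [s - o for s, o in zip(side, output)]
        output = [o + r for o, r in zip(output, residual)]
        errors.append(mse(output, target))
    return output, errors


symmetry_axis = [0, 0, 1, 1, 1, 0, 0, 0]  # toy 1-D "ground truth"
refined, stage_errors = deep_to_shallow(symmetry_axis)
```

In this toy setting the per-stage error shrinks as the stages move from coarse to fine, which is the intuition behind passing errors across scales.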
二、Self-learning Scene-specific Pedestrian Detectors using a Progressive Latent Model. 2017.CVPR
In this paper, a self-learning approach is proposed for solving the scene-specific pedestrian detection problem without any human annotation involved. The self-learning approach is deployed as progressive steps of object discovery, object enforcement, and label propagation. In the learning procedure, object locations in each frame are treated as latent variables that are solved with a progressive latent model (PLM). Compared with conventional latent models, the proposed PLM incorporates a spatial regularization term to reduce ambiguities in object proposals and to enforce object localization, and also a graph-based label propagation to discover harder instances in adjacent frames. With difference of convex (DC) objective functions, PLM can be efficiently optimized with concave-convex programming, thus guaranteeing the stability of self-learning. Extensive experiments demonstrate that even without annotation the proposed self-learning approach outperforms weakly supervised learning approaches, while achieving comparable performance with transfer learning and fully supervised approaches.
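The concave-convex programming step mentioned above can be seen on a one-dimensional toy problem. The sketch below does not use the PLM objective; it assumes an illustrative DC function f(x) = u(x) - v(x) with u(x) = x^4 and v(x) = 2x^2 (both convex), chosen only because the convex subproblem has a closed-form solution. Each iteration replaces v by its tangent at the current point and minimizes the resulting convex surrogate.

```python
import math


def objective(x):
    """Toy DC objective f(x) = u(x) - v(x), with u(x) = x**4 and v(x) = 2 * x**2."""
    return x ** 4 - 2 * x ** 2


def cccp_step(x_t):
    """One concave-convex step: linearize v at x_t, then minimize the surrogate.

    Surrogate: x**4 - v'(x_t) * x, where v'(x_t) = 4 * x_t.
    Setting its derivative to zero gives 4 * x**3 = 4 * x_t, so x = cbrt(x_t).
    """
    slope = 4.0 * x_t
    root = slope / 4.0
    # Real cube root that also handles negative inputs.
    return math.copysign(abs(root) ** (1.0 / 3.0), root)


def cccp(x0, iterations=30):
    """Run CCCP from x0 and return the whole trajectory of iterates."""
    trajectory = [x0]
    for _ in range(iterations):
        trajectory.append(cccp_step(trajectory[-1]))
    return trajectory


iterates = cccp(0.5)
values = [objective(x) for x in iterates]  # never increases along the trajectory
```

Because the surrogate upper-bounds the objective and touches it at the current iterate, the objective value cannot increase from one step to the next, which is the stability property the abstract refers to.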
三、Texture Classification in Extreme Scale Variations using GANet 2019.CVPR
Research in texture recognition often concentrates on recognizing textures with intraclass variations such as illumination, rotation, viewpoint and small scale changes. In contrast, in real-world applications a change in scale can have a dramatic impact on texture appearance, to the point of changing completely from one texture category to another. As a result, texture variations due to changes in scale are amongst the hardest to handle. In this work we conduct the first study of classifying textures with extreme variations in scale. To address this issue, we first propose and then reduce scale proposals on the basis of dominant texture patterns. Motivated by the challenges posed by this problem, we propose a new GANet network where we use a Genetic Algorithm to change the filters in the hidden layers during network training, in order to promote the learning of more informative semantic texture patterns. Finally, we adopt a FV-CNN (Fisher Vector pooling of a Convolutional Neural Network filter bank) feature encoder for global texture representation. Because extreme scale variations are not necessarily present in most standard texture databases, to support the proposed extreme-scale aspects of texture understanding we are developing a new dataset, the Extreme Scale Variation Textures (ESVaT), to test the performance of our framework. It is demonstrated that the proposed framework significantly outperforms gold-standard texture features by more than 10% on ESVaT. We also test the performance of our proposed approach on the KTHTIPS2b and OS datasets and a further dataset synthetically derived from Forrest, showing superior performance compared to the state of the art.
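As a rough illustration of how a genetic algorithm can modify a set of filters, the sketch below runs a generic selection, crossover and mutation loop over small weight vectors. Everything specific here is an assumption: the abstract does not say how GANet scores or recombines filters, so the `fitness` function (closeness to a fixed reference pattern), the one-point crossover and the Gaussian mutation are placeholders chosen for illustration only.

```python
import random


def fitness(filt, pattern):
    """Hypothetical score: higher when the filter is closer to a reference pattern."""
    return -sum((f - p) ** 2 for f, p in zip(filt, pattern))


def evolve(population, pattern, rng, mutation_scale=0.1):
    """One generation: keep the fitter half, refill with mutated crossovers."""
    ranked = sorted(population, key=lambda f: fitness(f, pattern), reverse=True)
    survivors = ranked[: len(ranked) // 2]  # elitism: the best filters survive unchanged
    children = []
    while len(survivors) + len(children) < len(population):
        parent_a, parent_b = rng.sample(survivors, 2)
        cut = rng.randrange(1, len(parent_a))      # one-point crossover
        child = parent_a[:cut] + parent_b[cut:]
        child = [w + rng.gauss(0.0, mutation_scale) for w in child]  # mutation
        children.append(child)
    return survivors + children


rng = random.Random(0)                    # fixed seed for repeatability
pattern = [1.0, -1.0, 1.0, -1.0]          # placeholder target pattern
filters = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(6)]
for _ in range(20):
    filters = evolve(filters, pattern, rng)
```

Because the fitter half is carried over unchanged, the best score in the population never gets worse from one generation to the next. In a real network the fitness would have to come from a training or validation signal rather than a fixed pattern.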
四、Saliency Integration: An Arbitrator Model 2018.TMM
Saliency integration, which unifies saliency maps from multiple saliency models, has attracted much attention. Previous offline integration methods usually face two challenges: 1. if most of the candidate saliency models misjudge the saliency on an image, the integration result will lean heavily on those inferior candidate models; 2. the lack of ground truth saliency labels makes it difficult to estimate the expertise of each candidate model. To address these problems, in this paper, we propose an arbitrator model (AM) for saliency integration. Firstly, we incorporate the consensus of multiple saliency models and external knowledge into a reference map to effectively rectify the misleading influence of candidate models. Secondly, our quest for ways of estimating the expertise of the saliency models without ground truth labels gives rise to two distinct online model-expertise estimation methods. Finally, we derive a Bayesian integration framework to reconcile the saliency models of varying expertise and the reference map. To extensively evaluate the proposed AM model, we test twenty-seven state-of-the-art saliency models, covering both traditional and deep learning ones, on various combinations over four datasets. The evaluation results show that the AM model improves the performance substantially compared to the existing state-of-the-art integration methods, regardless of the chosen candidate saliency models.
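The three steps above (reference map, label-free expertise estimate, integration) can be sketched in a much simplified form. This is not the paper's Bayesian framework: the sketch assumes flattened saliency maps with values in [0, 1], a hypothetical `prior` map standing in for external knowledge, expertise measured as agreement with the reference map, and a plain expertise-weighted average in place of Bayesian integration.

```python
def reference_map(maps, prior, prior_weight=0.5):
    """Blend the candidates' consensus (pixel-wise mean) with an external prior."""
    consensus = [sum(pixel) / len(maps) for pixel in zip(*maps)]
    return [(1.0 - prior_weight) * c + prior_weight * p
            for c, p in zip(consensus, prior)]


def expertise(candidate, reference):
    """Label-free expertise proxy: 1 minus mean absolute deviation from the reference."""
    deviation = sum(abs(c - r) for c, r in zip(candidate, reference)) / len(candidate)
    return 1.0 - deviation


def integrate(maps, prior):
    """Fuse candidate maps, weighting each by its normalized expertise."""
    reference = reference_map(maps, prior)
    scores = [expertise(m, reference) for m in maps]
    total = sum(scores)
    weights = [s / total for s in scores]
    fused = [sum(w * value for w, value in zip(weights, pixel))
             for pixel in zip(*maps)]
    return fused, weights


# Three toy candidate maps over four pixels; the third one misjudges the scene.
candidates = [
    [0.9, 0.8, 0.1, 0.0],
    [1.0, 0.7, 0.2, 0.1],
    [0.1, 0.2, 0.9, 1.0],
]
prior = [1.0, 1.0, 0.0, 0.0]  # hypothetical external knowledge
fused_map, model_weights = integrate(candidates, prior)
```

With these toy inputs the misjudging third model receives the smallest weight, so the fused map stays close to the two agreeing candidates.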
五、Hierarchical Contour Closure based Holistic Salient Object Detection 2017.TIM
Most existing salient object detection methods compute the saliency for pixels, patches or superpixels by contrast. Such fine-grained contrast based salient object detection methods suffer from saliency attenuation of the salient object and saliency overestimation of the background when the image is complicated. To better compute the saliency for complicated images, we propose a hierarchical contour closure based holistic salient object detection method, in which two saliency cues, i.e., closure completeness and closure reliability, are thoroughly exploited. The former pops out the holistic homogeneous regions bounded by completely closed outer contours, and the latter highlights the holistic homogeneous regions bounded by outer contours that are highly reliable on average. Accordingly, we propose two computational schemes to compute the corresponding saliency maps in a hierarchical segmentation space. Finally, we propose a framework to combine the two saliency maps, obtaining the final saliency map. Experimental results on three publicly available datasets show that each saliency map on its own reaches state-of-the-art performance. Furthermore, our framework, which combines the two saliency maps, outperforms the state of the art. Additionally, we show that the proposed framework can be easily used to extend existing methods and further improve their performance substantially.