• Title/Summary/Keyword: multi-scale features

Scale-aware Faster R-CNN for Caltech Pedestrian Detection (Caltech 보행자 감지를 위한 Scale-aware Faster R-CNN)

  • Byambajav, Batkhuu;Alikhanov, Jumabek;Jo, Geun-Sik
    • Proceedings of the Korea Information Processing Society Conference / 2016.10a / pp.506-509 / 2016
  • We present a real-time pedestrian detector that exploits the accuracy of the Faster R-CNN network. Faster R-CNN has proven successful on PASCAL VOC multi-object detection tasks, and its ability to operate on raw pixel input without hand-designed features is very appealing. In this work we therefore adapt Faster R-CNN to single-class object detection, namely pedestrian detection. A drawback of Faster R-CNN is that it fails when objects are small. The small-object problem was previously addressed by the Scale-aware Network, so we incorporate the Scale-aware Network into Faster R-CNN. The resulting method, Scale-aware Faster R-CNN (DF R-CNN), is both fast and very accurate. We split the Faster R-CNN network into two sub-networks, one for large objects and one for small objects. The approach achieves a 28.3% average miss rate on the Caltech Pedestrian detection benchmark, which is competitive with the best reported results.

Change Detection of a Small Town Area from Multi-Temporal Aerial Photos using Image Differencing and Image Ratio Techniques (다시기 항공사진으로부터 영상대차법과 영상대비법을 이용한 소도읍 지역의 변화 검출)

  • Lee, Jin-Duk;Yeon, Sang-Ho;Lee, Dong-Ho
    • Journal of the Korean Association of Geographic Information Studies / v.11 no.1 / pp.116-124 / 2008
  • This study applies multi-temporal, multi-scale panchromatic aerial photos to change detection in a small urban area. The photos were taken at a scale of 1:20,000 in 1987 and 1996 and at a scale of 1:37,500 in 2000. Pre-processing to bring all photos to the same conditions was carried out through geometric correction, registration, contrast adjustment, resampling, and mosaicking. Change detection was then performed separately with the image differencing and image ratio techniques. As a result, changes in urban features and land cover could be detected from panchromatic aerial photos, which are single-band images, and the detected changes were compared between the two techniques.
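
The two change-detection techniques named in this abstract can be illustrated with a minimal sketch. It assumes two co-registered, equally sized single-band images stored as nested lists; the function names, the `eps` guard, and both thresholds are illustrative assumptions, not values from the paper.

```python
def image_difference(before, after):
    """Pixel-wise difference (after - before) of two co-registered single-band images."""
    return [[a - b for b, a in zip(row_b, row_a)]
            for row_b, row_a in zip(before, after)]


def image_ratio(before, after, eps=1e-6):
    """Pixel-wise ratio (after / before); eps is an assumed guard against division by zero."""
    return [[(a + eps) / (b + eps) for b, a in zip(row_b, row_a)]
            for row_b, row_a in zip(before, after)]


def change_masks(diff, ratio, diff_thresh=50, ratio_thresh=1.5):
    """Flag changed pixels under each technique; both thresholds are illustrative only."""
    by_diff = [[abs(d) > diff_thresh for d in row] for row in diff]
    by_ratio = [[r > ratio_thresh or r < 1.0 / ratio_thresh for r in row]
                for row in ratio]
    return by_diff, by_ratio


# Toy 2x3 grey-level images (0-255), assumed already registered and resampled.
before = [[100, 100, 20], [100, 100, 20]]
after = [[100, 200, 60], [100, 100, 20]]

diff = image_difference(before, after)
ratio = image_ratio(before, after)
by_diff, by_ratio = change_masks(diff, ratio)
```

In this toy case the dark pixel that brightens from 20 to 60 is flagged by the ratio test but not by the difference test, which is one reason the two techniques are worth comparing.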

Robust AAM-based Face Tracking with Occlusion Using SIFT Features (SIFT 특징을 이용하여 중첩상황에 강인한 AAM 기반 얼굴 추적)

  • Eom, Sung-Eun;Jang, Jun-Su
    • The KIPS Transactions:PartB / v.17B no.5 / pp.355-362 / 2010
  • Face tracking estimates the motion of a non-rigid face together with a rigid head in 3D, and plays an important role in higher-level tasks such as face, facial expression, and emotion recognition. In this paper, we propose an AAM-based face tracking algorithm. AAM has been widely used to segment and track deformable objects, but many difficulties remain. In particular, it tends to diverge or converge to local minima when the target object is self-occluded, partially occluded, or completely occluded. To address this problem, we use the scale-invariant feature transform (SIFT). SIFT is effective under self-occlusion and partial occlusion because it can find correspondences between feature points under partial loss. It also lets an AAM continue tracking without re-initialization after complete occlusion, thanks to the good performance of global matching. In addition, we register and use SIFT features extracted from multi-view face images during tracking, so that a face can be tracked effectively across large pose changes. The proposed algorithm is validated by comparison with other algorithms under the three kinds of occlusion above.

EAR: Enhanced Augmented Reality System for Sports Entertainment Applications

  • Mahmood, Zahid;Ali, Tauseef;Muhammad, Nazeer;Bibi, Nargis;Shahzad, Imran;Azmat, Shoaib
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.12 / pp.6069-6091 / 2017
  • Augmented Reality (AR) overlays virtual information on real-world data, for example by displaying useful information on videos or images of a scene. This paper presents an Enhanced AR (EAR) system that displays players' statistics on captured images of a sports game. We focus on the situation where the input image is degraded by strong sunlight. The proposed EAR system includes an image enhancement technique that improves the accuracy of subsequent player and face detection. Image enhancement is followed by player and face detection, face recognition, and display of the players' statistics. First, an algorithm based on multi-scale retinex is proposed for image enhancement. Then, to detect players and faces, we use Haar features for feature extraction and adaptive boosting for classification. The player face recognition algorithm uses boosted linear discriminant analysis to select features and a nearest-neighbor classifier for classification. The system can be adapted to different types of sports in which the input is an image and the desired output is information displayed near the recognized players. Simulations are carried out on 2096 different images that contain players in diverse conditions. The proposed EAR system demonstrates the great potential of computer-vision-based approaches for developing AR applications.
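
As a rough illustration of the multi-scale retinex idea mentioned in this abstract, the sketch below implements the classic formulation: the average, over several Gaussian scales, of log(I) minus log(Gaussian-blurred I). It is not the authors' enhancement algorithm. The sigma values, the +1 offset, and all function names are assumptions chosen for a small toy image.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve each row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [[sum(k * row[min(max(x + j - radius, 0), width - 1)]
                 for j, k in enumerate(kernel))
             for x in range(width)]
            for row in image]


def transpose(image):
    """Swap rows and columns of a nested-list image."""
    return [list(col) for col in zip(*image)]


def gaussian_blur(image, sigma):
    """Separable 2-D Gaussian blur: filter rows, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))


def multi_scale_retinex(image, sigmas=(1.0, 2.0, 4.0)):
    """Equal-weight average of log(I) - log(G_sigma * I) over the given scales."""
    shifted = [[p + 1.0 for p in row] for row in image]  # +1 avoids log(0)
    output = [[0.0] * len(image[0]) for _ in image]
    for sigma in sigmas:
        blurred = gaussian_blur(shifted, sigma)
        for y, row in enumerate(shifted):
            for x, p in enumerate(row):
                output[y][x] += (math.log(p) - math.log(blurred[y][x])) / len(sigmas)
    return output


# A uniformly lit toy image: retinex output should be close to zero everywhere.
flat = [[80.0] * 5 for _ in range(5)]
result = multi_scale_retinex(flat)
```

Because each scale subtracts a smoothed estimate of the illumination in the log domain, a uniformly lit region maps to roughly zero, while pixels brighter than their surroundings map to positive values.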

Face Super-Resolution using Adversarial Distillation of Multi-Scale Facial Region Dictionary (다중 스케일 얼굴 영역 딕셔너리의 적대적 증류를 이용한 얼굴 초해상화)

  • Jo, Byungho;Park, In Kyu;Hong, Sungeun
    • Journal of Broadcast Engineering / v.26 no.5 / pp.608-620 / 2021
  • Recent deep learning-based face super-resolution (FSR) methods have achieved strong performance by using facial prior knowledge, such as facial landmarks and dictionaries that reflect the structural or semantic characteristics of the human face. However, most of these methods require additional processing time and memory. To solve this issue, this paper proposes an efficient FSR model based on knowledge distillation. The intermediate features of the teacher network, which contain dictionary information based on major face regions, are transferred to the student network through adversarial multi-scale feature distillation. Experimental results show that the proposed model is superior to other SR methods and demonstrate its effectiveness compared with the teacher model.

A Saliency-Based Focusing Region Selection Method for Robust Auto-Focusing

  • Jeon, Jaehwan;Cho, Changhun;Paik, Joonki
    • IEIE Transactions on Smart Processing and Computing / v.1 no.3 / pp.133-142 / 2012
  • This paper presents a salient region detection algorithm for auto-focusing based on the characteristics of human visual attention. To describe saliency at the local, regional, and global levels, this paper proposes a set of novel features including multi-scale local contrast, variance, center-surround entropy, and closeness to the center. These features are then prioritized to produce a saliency map. The proposed approach has two major advantages: (i) robustness to changes in focus and (ii) low computational complexity. The experimental results show that the proposed method outperforms existing low-level feature-based methods in both robustness and accuracy for auto-focusing.

Enhanced SIFT Descriptor Based on Modified Discrete Gaussian-Hermite Moment

  • Kang, Tae-Koo;Zhang, Huazhen;Kim, Dong W.;Park, Gwi-Tae
    • ETRI Journal / v.34 no.4 / pp.572-582 / 2012
  • The discrete Gaussian-Hermite moment (DGHM) is a global feature representation method that can be applied to square images. We propose a modified DGHM (MDGHM) method and an MDGHM-based scale-invariant feature transform (MDGHM-SIFT) descriptor. In the MDGHM, we devise a movable mask to represent the local features of a non-square image. The complete set of non-square image features is then represented by the summation of all MDGHMs. We also propose applying an accumulated MDGHM that uses multi-order derivatives to obtain distinguishable feature information in the third stage of SIFT. Finally, we calculate an MDGHM-based magnitude and an MDGHM-based orientation using the accumulated MDGHM. We carry out experiments with the proposed method under six kinds of deformation. The results show that the proposed method can be applied to non-square images without any image truncation and that it significantly outperforms other SIFT algorithms in matching accuracy.

The Application of Dyadic Wavelet In the RS Image Edge Detection

  • Qiming, Qin;Wenjun, Wang;Sijin, Chen
    • Proceedings of the KSRS Conference / 2003.11a / pp.1268-1271 / 2003
  • In edge detection on remote sensing (RS) images, useful detail is often lost and spurious edges often appear. To solve this problem, we use the dyadic wavelet to detect the edges of surface features by combining edge detection with the multi-resolution analysis of the wavelet transform. Through dyadic wavelet decomposition, we obtain the RS image at an appropriate scale and compute the edge data in the horizontal and vertical directions separately. We then compute the gradient vector modulus of the surface features and, finally, trace it to obtain the edge data of the object and build an RS image containing the detected edges. This method suppresses the effect of noise and detects the edge data of the object accurately. In an experiment on an RS image containing an airport, we verify the feasibility of applying the dyadic wavelet to object edge detection.

Vehicle Image Recognition Using Deep Convolution Neural Network and Compressed Dictionary Learning

  • Zhou, Yanyan
    • Journal of Information Processing Systems / v.17 no.2 / pp.411-425 / 2021
  • In this paper, a vehicle recognition algorithm based on a deep convolutional neural network and a compressed dictionary is proposed. First, the network structure for fine-grained vehicle recognition based on a convolutional neural network is introduced. Then, a vehicle recognition system based on a multi-scale pyramid convolutional neural network is constructed. The contribution of each network to the recognition result is adjusted by an adaptive fusion method, which sets the proportion of each network's output in the output of the entire multi-scale network according to the recognition accuracy of that single network. Next, compressed dictionary learning and data dimension reduction are carried out using an effective block structure method combined with a very sparse random projection matrix, which reduces the computational complexity caused by high-dimensional features and shortens the dictionary learning time. Finally, the sparse representation classification method is used to recognize the vehicle type. The experimental results show that the detection performance of the proposed algorithm is stable in sunny, cloudy, and rainy weather, and that it adapts well to typical application scenarios such as occlusion and blurring, with an average recognition rate of more than 95%.
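
The dimension-reduction step mentioned in this abstract relies on a very sparse random projection matrix. A minimal sketch of one common construction follows: entries are sqrt(s) times +1, 0, or -1, with s equal to the square root of the input dimension, so most entries are zero. The block structure method and dictionary learning from the paper are not reproduced here, and the dimensions, seed, and function names are illustrative assumptions.

```python
import math
import random


def very_sparse_projection_matrix(in_dim, out_dim, seed=0):
    """Random matrix with entries sqrt(s) * {+1, 0, -1}, where s = sqrt(in_dim).

    +1 and -1 each occur with probability 1/(2s); all other entries are zero.
    """
    rng = random.Random(seed)
    s = math.sqrt(in_dim)
    scale = math.sqrt(s)
    matrix = []
    for _ in range(out_dim):
        row = []
        for _ in range(in_dim):
            u = rng.random()
            if u < 1.0 / (2.0 * s):
                row.append(scale)
            elif u < 1.0 / s:
                row.append(-scale)
            else:
                row.append(0.0)
        matrix.append(row)
    return matrix


def project(matrix, vector):
    """Map a high-dimensional vector to len(matrix) dimensions, scaled by 1/sqrt(k)."""
    k = len(matrix)
    return [sum(r * v for r, v in zip(row, vector) if r != 0.0) / math.sqrt(k)
            for row in matrix]


# Stand-in for a 400-dimensional feature vector (hypothetical data).
features = [float(i % 7) for i in range(400)]
R = very_sparse_projection_matrix(in_dim=400, out_dim=32)
reduced = project(R, features)
```

With a 400-dimensional input, s is 20, so only about 5% of the matrix entries are non-zero, which is what keeps the projection cheap on high-dimensional features.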

Multi-scale agglomerates and photocatalytic properties of ZnS nanostructures

  • Man, Min-Tan;Lee, Hong-Seok
    • Proceedings of the Korean Vacuum Society Conference / 2016.02a / pp.267.2-267.2 / 2016
  • Semiconductor photo-catalysis offers the potential for complete removal of toxic chemicals and has a broad range of applications. Various new compounds and materials for chemical catalysts have been synthesized in the past few decades. As one of the most important II-VI group semiconductors, zinc sulfide (ZnS), with a wide direct band gap of 3.8 eV, has been extensively investigated and used as a catalyst in photochemistry, in environmental protection, and in optoelectronic devices. In this work, ZnS films and nanostructures were successfully prepared by a wet chemical method. We show that agglomerates with four successive scales are always observed in the homogeneous precipitation of zinc sulfide. Hydrodynamics plays a crucial role in determining the size of the largest agglomerates; however, other factors must be invoked to interpret the complete structure. In addition, studies of the photocatalytic properties under UV light irradiation demonstrated that ZnS nanocrystals (NCs) are good photo-catalysts as a result of the rapid generation of electron-hole pairs by photo-excitation and the highly negative reduction potentials of the excited electrons. The combination of their high surface-to-volume ratios, carrier dynamics, and rich photo-catalytic properties suggests that these ZnS NCs will find many interesting applications in semiconductor photo-catalysis, solar cells, environmental remediation, and nano-devices.