• Title/Summary/Keyword: Image Feature Detection

Change Detection for High-resolution Satellite Images Using Transfer Learning and Deep Learning Network (전이학습과 딥러닝 네트워크를 활용한 고해상도 위성영상의 변화탐지)

  • Song, Ah Ram;Choi, Jae Wan;Kim, Yong Il
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.37 no.3 / pp.199-208 / 2019
  • As the number of available satellites increases and technology advances, image information outputs are becoming increasingly diverse and a large amount of data is accumulating. In this study, we propose a change detection method for high-resolution satellite images that uses transfer learning and a deep learning network to overcome the limitation caused by insufficient training data through the use of pre-trained information. The deep learning network used in this study comprises convolutional layers to extract spatial and spectral information and convolutional long short-term memory layers to analyze time-series information. To use the learned information, the two initial convolutional layers of the change detection network are designed to take as initial values the weights learned from 40,000 patches of the ISPRS (International Society for Photogrammetry and Remote Sensing) dataset. In addition, 2D (2-dimensional) and 3D (3-dimensional) kernels were used to find the optimal structure for high-resolution satellite images. The experimental results for KOMPSAT-3A (KOrea Multi-Purpose SATellite-3A) satellite images show that this change detection method can effectively extract changed/unchanged pixels while remaining less sensitive to spurious changes caused by shadows and relief displacement. In addition, the change detection accuracy at two sites was improved by using 3D kernels, because a 3D kernel can consider spectral as well as spatial information. This study indicates that changes in high-resolution satellite images can be detected effectively using the constructed image information and deep learning network. In future work, a pre-trained change detection network will be applied to newly obtained images to extend the scope of the application.
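
A rough sketch of the architecture this abstract describes may help: two transferable convolutional layers feeding a ConvLSTM over a two-date patch sequence. This is not the authors' code; the layer sizes, layer names, and the pretrained source model are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_change_net(h=64, w=64, bands=4):
    # Input: a time series of two co-registered image patches (dates t0 and t1).
    inp = layers.Input(shape=(2, h, w, bands))
    # Two initial conv layers extract spatial/spectral features; their weights
    # can be seeded from a network pretrained on ISPRS-style patches.
    x = layers.TimeDistributed(
        layers.Conv2D(32, 3, padding="same", activation="relu"), name="conv1")(inp)
    x = layers.TimeDistributed(
        layers.Conv2D(32, 3, padding="same", activation="relu"), name="conv2")(x)
    # A ConvLSTM layer analyzes the temporal relation between the two dates.
    x = layers.ConvLSTM2D(32, 3, padding="same", return_sequences=False)(x)
    out = layers.Conv2D(1, 1, activation="sigmoid")(x)  # per-pixel change map
    return models.Model(inp, out)

model = build_change_net()
# Transfer learning (hypothetical `pretrained` model trained on ISPRS patches):
# model.get_layer("conv1").set_weights(pretrained.get_layer("conv1").get_weights())
model.compile(optimizer="adam", loss="binary_crossentropy")
```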

A Novel Video Copy Detection Method based on Statistical Analysis (통계적 분석 기반 불법 복제 비디오 영상 감식 방법)

  • Cho, Hye-Jeong;Kim, Ji-Eun;Sohn, Chae-Bong;Chung, Kwang-Sue;Oh, Seoung-Jun
    • Journal of Broadcast Engineering / v.14 no.6 / pp.661-675 / 2009
  • Carelessly and illegally copied content is raising serious social problems as internet and multimedia technologies advance, so the development of video copy detection systems cannot be delayed. In this paper, we propose a hierarchical video copy detection method that estimates the similarity between an original video and a manipulated (transformed) copy using their statistical characteristics. We rank frames according to luminance value to be robust to spatial transformations, and choose similar videos categorized as candidate segments from a huge database to reduce processing time and complexity. Because copied videos generally insert black areas at the edges of the image, we remove the black border area and decide whether a video is a copy by comparing the statistical characteristics of the original and copied videos over the center part of the frame, which contains the video's important information. Experimental results show that the proposed method has keyframe accuracy similar to that of the reference method but uses less memory to store feature information, because it uses 61% fewer keyframes than the reference method. The proposed method also detects copies efficiently despite extensive spatial transformations such as blurring, contrast change, zoom in, zoom out, aspect ratio change, and caption insertion.
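
The two preprocessing ideas above, trimming inserted black borders and comparing luminance statistics of the frame center, can be sketched as follows. This assumes an OpenCV/NumPy setting; the threshold, center fraction, and distance measure are illustrative choices, not the paper's.

```python
import cv2
import numpy as np

def crop_black_border(gray, thresh=16):
    # Keep only rows/columns whose mean luminance exceeds a small threshold.
    rows = np.where(gray.mean(axis=1) > thresh)[0]
    cols = np.where(gray.mean(axis=0) > thresh)[0]
    if rows.size == 0 or cols.size == 0:
        return gray
    return gray[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

def center_stats(gray, frac=0.5):
    # Statistics of the central region, which carries the important content.
    h, w = gray.shape
    ch, cw = int(h * frac), int(w * frac)
    y0, x0 = (h - ch) // 2, (w - cw) // 2
    center = gray[y0:y0 + ch, x0:x0 + cw]
    return np.array([center.mean(), center.std()])

def frame_similarity(orig_bgr, copy_bgr):
    a = crop_black_border(cv2.cvtColor(orig_bgr, cv2.COLOR_BGR2GRAY))
    b = crop_black_border(cv2.cvtColor(copy_bgr, cv2.COLOR_BGR2GRAY))
    # Smaller statistical distance -> more likely a copy.
    return np.linalg.norm(center_stats(a) - center_stats(b))
```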

Cascade CNN with CPU-FPGA Architecture for Real-time Face Detection (실시간 얼굴 검출을 위한 Cascade CNN의 CPU-FPGA 구조 연구)

  • Nam, Kwang-Min;Jeong, Yong-Jin
    • Journal of IKEEE / v.21 no.4 / pp.388-396 / 2017
  • Because a face detection problem involves many variables, such as various poses, illuminations, and occlusions, a high-performance detection system is required. Although CNNs are excellent at image classification, CNN operation requires high-performance hardware resources, while small and mobile systems demand low-cost, low-power environments. In this paper, we therefore design a CPU-FPGA integrated system based on a 3-stage cascade CNN architecture using a small FPGA. An adaptive Region of Interest (ROI) is applied to reduce the number of CNN operations by using face information from the previous frame. We use a Field Programmable Gate Array (FPGA) to accelerate the CNN computations: the accelerator reads multiple feature maps at once on the FPGA and performs Multiply-Accumulate (MAC) operations in parallel for the convolution operation. The system is implemented on an Altera Cyclone V FPGA with an embedded ARM Cortex-A9 and on-chip SRAM, and runs at 30 FPS on HD-resolution input images. The CPU-FPGA integrated system showed 8.5 times the power efficiency of a CPU-only system.
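
A software-level sketch may clarify the adaptive ROI step: the detector scans only a window around the previous frame's face and falls back to a full-frame scan on a miss. This is an illustrative stand-in, not the paper's CPU-FPGA design; `detect_fn` is a hypothetical placeholder for the 3-stage cascade CNN.

```python
import cv2

def expand_box(box, margin, w, h):
    # Grow the previous face box by a margin, clipped to the frame bounds.
    x, y, bw, bh = box
    m = int(margin * max(bw, bh))
    x0, y0 = max(0, x - m), max(0, y - m)
    x1, y1 = min(w, x + bw + m), min(h, y + bh + m)
    return x0, y0, x1 - x0, y1 - y0

def detect_adaptive(frame, detect_fn, prev_box=None, margin=0.5):
    h, w = frame.shape[:2]
    if prev_box is not None:
        rx, ry, rw, rh = expand_box(prev_box, margin, w, h)
        faces = detect_fn(frame[ry:ry + rh, rx:rx + rw])
        if len(faces) > 0:  # map ROI coordinates back to the full frame
            x, y, bw, bh = faces[0]
            return (x + rx, y + ry, bw, bh)
    faces = detect_fn(frame)  # fall back to a full-frame scan
    return tuple(faces[0]) if len(faces) > 0 else None
```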

Gaze Detection System using Real-time Active Vision Camera (실시간 능동 비전 카메라를 이용한 시선 위치 추적 시스템)

  • Park, Kang Ryoung
    • Journal of KIISE: Software and Applications / v.30 no.12 / pp.1228-1238 / 2003
  • This paper presents a new and practical computer-vision method for detecting the monitor position at which the user is looking. In general, the user moves both the face and the eyes to gaze at a certain monitor position. Previous studies used only one wide-view camera capturing the user's whole face; in such a case the image resolution is too low and the fine movements of the user's eyes cannot be detected exactly. We therefore implement the gaze detection system with a dual-camera setup (a wide-view and a narrow-view camera). To locate the user's eye position accurately, the narrow-view camera has auto-focusing and auto panning/tilting functionality based on the 3D facial feature positions detected by the wide-view camera. In addition, we use dual IR-LED illuminators to detect facial features, especially eye features. Experimental results show that the gaze detection system runs in real time, with an RMS error of about 3.44 cm between the computed gaze positions and the real ones.
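
Under the simplifying assumption that both cameras share one origin (the paper's calibrated setup is more involved), the auto panning/tilting step reduces to aiming the narrow-view camera at a 3D feature point:

```python
import math

def pan_tilt_to_target(x, y, z):
    # x: right, y: up, z: forward, in the wide camera's coordinate frame (cm).
    pan = math.degrees(math.atan2(x, z))                   # rotate left/right
    tilt = math.degrees(math.atan2(y, math.hypot(x, z)))   # rotate up/down
    return pan, tilt

# Example: an eye 60 cm ahead, 8 cm right, 5 cm above the camera axis.
print(pan_tilt_to_target(8.0, 5.0, 60.0))  # ~ (7.6, 4.7) degrees
```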

Image Based Damage Detection Method for Composite Panel With Guided Elastic Wave Technique Part I. Damage Localization Algorithm (복합재 패널에서 유도 탄성파를 이용한 이미지 기반 손상탐지 기법 개발 Part I. 손상위치 탐지 알고리즘)

  • Kim, Changsik;Jeon, Yongun;Park, Jungsun;Cho, Jin Yeon
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.49 no.1 / pp.1-12 / 2021
  • In this paper, a new algorithm is proposed to estimate the damage location in a composite panel by extracting the elastic wave signal reflected from the damaged area. The guided elastic wave is generated by a piezoelectric actuator and sensed by a piezoelectric sensor. The proposed algorithm takes a diagnostic approach: it compares the undamaged signal with the damaged signal and extracts damage information using the sensor network and the Lamb wave group velocity estimated by signal correlation. However, it is difficult to distinguish the damage location clearly because of the nonlinear properties of Lamb waves and the complex information composed of various signals. To overcome this difficulty, a cumulative summation feature vector (CSFV) algorithm and a visualization technique are newly proposed in this paper. The CSFV algorithm finds the center position of the damage by converting the signals reflected from the damage into the distances at which the signals arrive, and the visualization technique expresses the feature vectors by multiplying damage indexes. Experiments are performed on a composite panel and a comparative study with existing algorithms is carried out. The results confirm that the proposed algorithm detects the damage location with more reliable accuracy.
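
As a hedged stand-in for the CSFV algorithm (whose details are in the paper), the sketch below shows the generic time-of-flight principle such methods build on: an echo arriving at time t constrains the damage to an ellipse with the actuator and sensor as foci, and accumulating these loci over all sensing paths concentrates energy at the damage center. All geometry, velocities, and times here are invented for illustration.

```python
import numpy as np

def damage_map(pairs, grid_x, grid_y, v_g, tol=5.0):
    # pairs: list of (actuator_xy, sensor_xy, echo_time) tuples.
    X, Y = np.meshgrid(grid_x, grid_y)
    image = np.zeros_like(X, dtype=float)
    for (ax, ay), (sx, sy), t in pairs:
        # Total path length: actuator -> grid point -> sensor.
        path = np.hypot(X - ax, Y - ay) + np.hypot(X - sx, Y - sy)
        # Accumulate where the path matches the measured time of flight.
        image += np.exp(-((path - v_g * t) ** 2) / (2.0 * tol ** 2))
    return image  # the peak indicates the estimated damage center

# Example: 3 sensing paths on a 500 x 500 mm panel, v_g in mm/us, t in us.
grid = np.linspace(0.0, 500.0, 251)
pairs = [((0, 0), (500, 0), 180.0), ((0, 500), (0, 0), 175.0),
         ((500, 500), (0, 500), 190.0)]
img = damage_map(pairs, grid, grid, v_g=5.0)
iy, ix = np.unravel_index(np.argmax(img), img.shape)
print("estimated damage at ~(%.0f, %.0f) mm" % (grid[ix], grid[iy]))
```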

Design of Port Security System Using Deep Learning and Object Features (딥러닝과 객체 특징점을 활용한 항만 보안시스템 설계)

  • Wang, Tae-su;Kim, Minyoung;Jang, Jongwook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.50-53 / 2022
  • Recently, there have been cases in which counterfeit foreign ships have entered and left domestic ports several times. Vessels carry a ship-specific serial number assigned by the International Maritime Organization (IMO) for identification, and IMO marking has been mandatory on all ships built since 2004. Airports and ports, the representative logistics platforms, require security systems, but establishing one at a port is difficult and many blind spots remain, so insufficient coverage can cause security problems. In this paper, a port security system is designed using deep learning object recognition and OpenCV. The system recognizes an arriving vessel, extracts its IMO number, determines whether it is the same ship through feature-point matching against vessels with entry records, and stores the ship image and IMO number in the entry/exit DB for first-arrival vessels. The proposed system can strengthen port security and improve the efficiency of port logistics by reducing the workload of port management personnel and the incidental costs caused by unauthorized entry.

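The feature-point matching step could look roughly like the following OpenCV ORB sketch; the detector choice and the match threshold are illustrative assumptions, not details from the paper.

```python
import cv2

def same_ship(img_a, img_b, min_good_matches=40, ratio=0.75):
    # Compare a newly captured ship image against a stored entry-record image.
    orb = cv2.ORB_create(nfeatures=1000)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        return False
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    matches = matcher.knnMatch(des_a, des_b, k=2)
    # Lowe's ratio test keeps only distinctive matches.
    good = [p[0] for p in matches
            if len(p) == 2 and p[0].distance < ratio * p[1].distance]
    return len(good) >= min_good_matches
```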

The Research of Shape Recognition Algorithm for Image Processing of Cucumber Harvest Robot (오이수확로봇의 영상처리를 위한 형상인식 알고리즘에 관한 연구)

  • Min, Byeong-Ro;Lim, Ki-Taek;Lee, Dae-Weon
    • Journal of Bio-Environment Control / v.20 no.2 / pp.63-71 / 2011
  • Pattern recognition of cucumbers was performed directly on binary images obtained by thresholding at the optimum intensity value. By restricting the conditions of the learning patterns, the algorithm could extract output patterns from identical and similar input patterns. The pattern recognition algorithm was developed to determine the position of a cucumber in a real image under working conditions. The algorithm designed and developed for this project learned two, three, or four learning patterns, and each learning pattern was applied to twenty sample patterns. The success rates of restoring the output pattern to the sample pattern from two, three, or four learning patterns were 65.0%, 45.0%, and 12.5%, respectively; the more learning patterns were used, the more distinct output patterns were detected. Feature patterns of cucumbers were detected by automatically scanning the real image with a 30 by 30 pixel window. The processing time for cucumber recognition was 0.5 to 1 second. In tests on five real images, false patterns were eliminated at rates ranging from 96 to 98%, while some output patterns were falsely recognized as cucumbers at rates ranging from 0.1 to 4.2%.
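
A minimal sketch of the threshold binarization and 30 by 30 auto-scanning described above, with illustrative fill-ratio conditions standing in for the paper's learned patterns:

```python
import cv2
import numpy as np

def scan_candidates(gray, thresh=128, win=30, min_fill=0.4, max_fill=0.9):
    # Binarize at a fixed intensity threshold (foreground = 1).
    _, binary = cv2.threshold(gray, thresh, 1, cv2.THRESH_BINARY)
    h, w = binary.shape
    hits = []
    # Auto-scan the image with a 30 x 30 window.
    for y in range(0, h - win + 1, win):
        for x in range(0, w - win + 1, win):
            fill = binary[y:y + win, x:x + win].mean()  # foreground ratio
            if min_fill <= fill <= max_fill:
                hits.append((x, y))  # candidate cucumber region
    return hits
```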

A Study on Lip-reading Enhancement Using Time-domain Filter (시간영역 필터를 이용한 립리딩 성능향상에 관한 연구)

  • Shin, Do-Sung;Kim, Jin-Young;Choi, Seung-Ho
    • The Journal of the Acoustical Society of Korea / v.22 no.5 / pp.375-382 / 2003
  • Bimodal lip-reading techniques aim to enhance speech recognition rates in noisy environments. Detecting the correct lip image is most important, but stable performance is hard to obtain in dynamic environments because many factors degrade lip-reading performance: illumination changes, speakers' pronunciation habits, variability of lip shape, and rotation or size changes of the lips. In this paper, we propose time-domain IIR filtering for stable performance; digital filtering in the time domain is well suited to removing noise from the signal and enhancing recognition performance. While lip-reading on the whole lip image produces massive amounts of data, Principal Component Analysis (PCA) in the preprocessing stage reduces the data quantity by extracting features without losing image information. To observe speech recognition performance using only image information, we conducted recognition experiments on 22 words usable in in-car services, using a Hidden Markov Model (HMM) as the recognition algorithm to compare word recognition performance. As a result, while the recognition rate of lip-reading with PCA alone was 64%, applying the time-domain filter raised the recognition rate to 72.4%.
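
A minimal sketch of the processing chain described above: project each lip frame onto PCA components, then smooth each component's trajectory with a low-order IIR filter along the time axis. A Butterworth low-pass is assumed here; the paper's exact filter design may differ.

```python
import numpy as np
from scipy.signal import butter, lfilter
from sklearn.decomposition import PCA

def pca_iir_features(frames, n_components=20, cutoff=0.2):
    # frames: (T, H, W) grayscale lip images -> (T, H*W) data matrix.
    T = frames.shape[0]
    X = frames.reshape(T, -1).astype(float)
    feats = PCA(n_components=n_components).fit_transform(X)  # (T, n_components)
    # 2nd-order low-pass IIR along the time axis removes frame-to-frame jitter.
    b, a = butter(2, cutoff)  # cutoff as a fraction of the Nyquist rate
    return lfilter(b, a, feats, axis=0)
```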

Automatic gasometer reading system using selective optical character recognition (관심 문자열 인식 기술을 이용한 가스계량기 자동 검침 시스템)

  • Lee, Kyohyuk;Kim, Taeyeon;Kim, Wooju
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.1-25 / 2020
  • In this paper, we propose an application system architecture that provides accurate, fast, and efficient automatic gasometer reading. The system captures a gasometer image with a mobile device camera, transmits the image to a cloud server over a private LTE network, and analyzes the image to extract the device ID and gas usage amount by selective optical character recognition based on deep learning. In general, an image contains many types of characters, and optical character recognition extracts all of them; some applications, however, need to ignore characters that are not of interest and focus only on specific types. An automatic gasometer reading system, for example, only needs to extract the device ID and gas usage amount from gasometer images to bill users; strings that are not of interest, such as the device type, manufacturer, manufacturing date, and specifications, are not valuable to the application. The application therefore has to analyze the region of interest and specific character types to extract valuable information only. We adopted CNN (Convolutional Neural Network) based object detection and CRNN (Convolutional Recurrent Neural Network) technology for selective optical character recognition that analyzes only the region of interest. We built three neural networks for the application system: the first is a convolutional neural network that detects the regions of interest containing the gas usage amount and device ID character strings; the second is another convolutional neural network that transforms the spatial information of a region of interest into spatially sequential feature vectors; and the third is a bi-directional long short-term memory network that converts the sequential information into character strings through time-series analysis mapping feature vectors to characters. In this research, the character strings of interest are the device ID, which consists of 12 Arabic numerals, and the gas usage amount, which consists of 4-5 Arabic numerals. All system components are implemented in the Amazon Web Services cloud with Intel Xeon E5-2686 v4 CPUs and an NVIDIA Tesla V100 GPU. The architecture adopts a master-slave processing structure for efficient and fast parallel processing, coping with about 700,000 requests per day. A mobile device captures a gasometer image and transmits it to the master process in the AWS cloud. The master process runs on the Intel Xeon CPU and pushes reading requests from mobile devices into an input queue with a FIFO (First In, First Out) structure. The slave process, which runs on the NVIDIA GPU module and comprises the three deep neural networks that conduct character recognition, continuously polls the input queue for recognition requests. When requests arrive, the slave process converts the queued image into the device ID string, the gas usage amount string, and the positions of those strings, returns the information to an output queue, and switches back to polling the input queue. The master process gets the final information from the output queue and delivers it to the mobile device. We used a total of 27,120 gasometer images for training, validation, and testing of the three deep neural networks: 22,985 images for training and validation and 4,135 images for testing. The 22,985 images were randomly split at an 8:2 ratio into training and validation sets for each training epoch. The 4,135 test images were categorized into five types: normal (clean images), noise (images with noise signals), reflex (images with light reflection in the gasometer region), scale (images with small object size due to long-distance capture), and slant (images that are not horizontally level). The final character string recognition accuracies on normal data are 0.960 for the device ID and 0.864 for the gas usage amount.
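
A much-simplified, single-process sketch of the master-slave queueing pattern described above; `recognize()` is a placeholder for the three networks, and the real system distributes the master and slaves across CPU and GPU nodes.

```python
import queue
import threading

input_q, output_q = queue.Queue(), queue.Queue()

def recognize(image):
    # Placeholder for: ROI detection CNN -> feature CNN -> bi-LSTM decoding.
    return {"device_id": "000000000000", "usage": "0000"}

def slave_worker():
    while True:
        req_id, image = input_q.get()   # poll the FIFO input queue
        output_q.put((req_id, recognize(image)))
        input_q.task_done()

def master_submit(req_id, image):
    input_q.put((req_id, image))        # enqueue a mobile reading request

threading.Thread(target=slave_worker, daemon=True).start()
master_submit(1, b"...jpeg bytes...")
print(output_q.get())                   # deliver the result back to the device
```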

A Study on the Applicability of Deep Learning Algorithm for Detection and Resolving of Occlusion Area (영상 폐색영역 검출 및 해결을 위한 딥러닝 알고리즘 적용 가능성 연구)

  • Bae, Kyoung-Ho;Park, Hong-Gi
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.11 / pp.305-313 / 2019
  • Recently, spatial information has been constructed actively from images obtained by drones. Because occlusion areas arise from buildings as well as from many obstacles, such as trees, pedestrians, and banners in urban areas, an efficient way to resolve the problem is necessary. Instead of the traditional approach, which replaces the occlusion area with other images obtained from different positions, various deep learning-based models were examined and compared. A comparison of the HOG feature descriptor with the machine learning-based SVM and the deep learning-based DNN, CNN, and RNN showed that the CNN is the most broadly used for detecting and classifying objects. So far, many studies have focused on developing and applying individual models, so it is not yet possible to select a single optimal model. On the other hand, improvements to deep learning-based detection and classification techniques are expected, because many researchers are working to raise model accuracy and reduce computation time. In that case, the procedures for generating spatial information will change to detect occlusion areas and replace them with simulated images automatically, improving the efficiency of time, cost, and workforce.
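
For reference, a minimal sketch of the HOG + SVM baseline that such comparisons typically use, written with scikit-image and scikit-learn; the parameters and data names are assumptions.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def hog_features(images):
    # images: iterable of equally sized grayscale arrays.
    return np.array([hog(im, orientations=9, pixels_per_cell=(8, 8),
                         cells_per_block=(2, 2)) for im in images])

# Hypothetical usage: X_train holds image patches (occluded vs. clear),
# y_train holds 0/1 labels.
# clf = LinearSVC().fit(hog_features(X_train), y_train)
# pred = clf.predict(hog_features(X_test))
```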