• Title/Summary/Keyword: image feature extraction

Automatic gasometer reading system using selective optical character recognition (관심 문자열 인식 기술을 이용한 가스계량기 자동 검침 시스템)

  • Lee, Kyohyuk;Kim, Taeyeon;Kim, Wooju
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.1-25 / 2020
  • In this paper, we suggest an application system architecture that provides accurate, fast and efficient automatic gasometer reading. The system captures a gasometer image with a mobile device camera, transmits the image to a cloud server over a private LTE network, and analyzes the image to extract the device ID and gas usage amount by selective optical character recognition based on deep learning. In general, an image contains many types of characters, and optical character recognition extracts all of them; some applications, however, need to ignore characters that are not of interest and focus only on specific types. An automatic gasometer reading system, for example, only needs to extract the device ID and gas usage amount from gasometer images in order to bill users; strings such as the device type, manufacturer, manufacturing date and specification are not valuable to the application. The application therefore has to analyze only the regions of interest and the specific character types that carry valuable information. We adopted CNN (Convolutional Neural Network) based object detection and CRNN (Convolutional Recurrent Neural Network) technology for selective optical character recognition, which analyzes only the regions of interest. We built three neural networks for the application system: the first is a convolutional neural network that detects the regions of interest containing the gas usage amount and device ID strings; the second is another convolutional neural network that transforms the spatial information of a region of interest into sequential feature vectors; and the third is a bi-directional long short-term memory network that converts the sequential feature vectors into character strings by time-series analysis. In this research, the strings of interest are the device ID, which consists of 12 Arabic numeral characters, and the gas usage amount, which consists of 4-5 Arabic numeral characters. All system components are implemented on Amazon Web Services with Intel Xeon E5-2686 v4 CPUs and NVIDIA Tesla V100 GPUs. The architecture adopts a master-slave processing structure for efficient and fast parallel processing, coping with about 700,000 requests per day. The mobile device captures a gasometer image and transmits it to the master process in the AWS cloud. The master process runs on the Intel Xeon CPU and pushes each reading request from a mobile device into an input queue with a FIFO (First In First Out) structure. The slave process consists of the three deep neural networks that perform character recognition and runs on the NVIDIA GPU. The slave process continuously polls the input queue for recognition requests; when a request is present, it converts the image into the device ID string, the gas usage amount string and the position information of the strings, returns this information to the output queue, and goes back to polling the input queue. The master process gets the final information from the output queue and delivers it to the mobile device. We used a total of 27,120 gasometer images for training, validation and testing of the three deep neural networks.
22,985 images were used for training and validation, and 4,135 images were used for testing. The 22,985 images were randomly split in an 8:2 ratio into training and validation sets for each training epoch. The 4,135 test images were categorized into five types (normal, noise, reflex, scale and slant): normal data are clean images, noise means images with noise signals, reflex means images with light reflections in the gasometer region, scale means images in which the object is small due to long-distance capture, and slant means images that are not horizontally level. The final string recognition accuracies for the device ID and gas usage amount on normal data are 0.960 and 0.864, respectively.
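
The abstract describes both a three-network recognition pipeline and a master/slave queueing architecture. The sketch below illustrates only the queueing pattern, using Python's standard `queue` and `threading` modules; `recognize()` is a placeholder for the ROI-detection, CRNN and bi-directional LSTM stages, whose implementations are not given in the abstract, and threads stand in here for the separate CPU master and GPU slave processes.

```python
import queue
import threading

# Minimal sketch of the master/slave FIFO structure described above.
input_queue = queue.Queue()    # FIFO: master pushes reading requests
output_queue = queue.Queue()   # slave returns recognized strings

def recognize(image_bytes):
    # Placeholder for the three-network pipeline (ROI detection ->
    # CRNN feature extraction -> bi-directional LSTM decoding).
    return {"device_id": "000000000000", "usage": "0000", "boxes": []}

def slave_worker():
    # Slave: poll the input queue, run recognition, push the result.
    while True:
        request_id, image_bytes = input_queue.get()   # blocks until a request arrives
        result = recognize(image_bytes)
        output_queue.put((request_id, result))
        input_queue.task_done()

def master_submit(request_id, image_bytes):
    # Master: enqueue a reading request received from a mobile device.
    input_queue.put((request_id, image_bytes))

threading.Thread(target=slave_worker, daemon=True).start()
master_submit(1, b"...captured gasometer image...")
print(output_queue.get())      # result delivered back to the mobile device
```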

Vapor Recognition Using Image Matching of Micro-Array Sensor Response from Portable Electronic Nose (휴대용 전자 후각 장치에서 다채널 마이크로 센서 신호의 영상 정합을 이용한 가스 인식)

  • Yang, Yoon-Seok
    • Journal of the Institute of Electronics Engineers of Korea SC / v.48 no.2 / pp.64-70 / 2011
  • A portable artificial electronic nose (E-nose) system suffers from noisy fluctuations in its surroundings, such as temperature, vapor concentration and gas flow, because its measuring conditions are not controlled as precisely as in the laboratory. It is therefore important to develop a simple and robust vapor recognition technique applicable to such uncontrolled measurements, especially for portable measurement and diagnostic systems, whose application area is expanding with improvements in micro bio-sensor technology. This study used a PDA-based portable E-nose to collect vapor measurement signals under uncontrolled conditions and applied the image matching algorithm developed in a previous study to the measured signals to verify its robustness and improved accuracy for portable vapor recognition. The results showed not only consistent performance under the noisy fluctuations of the portable measurement signals, but also improved recognition accuracy for two similar vapor species that have been hard to discriminate with the conventional maximum-sensitivity feature extraction method. The proposed method can easily be applied to data processing in ubiquitous sensor networks (USN), which are usually exposed to various operating conditions. Furthermore, its robust performance and high accuracy will greatly help to realize portable medical diagnostic and environmental monitoring systems.
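
The abstract refers to an image matching algorithm from the authors' previous study without specifying it. The sketch below shows one plausible, minimal reading: stack the multi-channel responses into a 2D "response image", z-score it to suppress baseline and gain fluctuation, and score it against reference images by correlation. The function names and the correlation-based matcher are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

def response_image(channels):
    """Stack per-channel time series (n_channels x n_samples) into a 2D image
    and z-score it to suppress baseline and gain fluctuations."""
    img = np.asarray(channels, dtype=float)
    return (img - img.mean()) / (img.std() + 1e-9)

def match_score(measured, reference):
    """Normalized correlation between two response images of equal shape."""
    a, b = response_image(measured), response_image(reference)
    return float((a * b).mean())

def classify(measured, references):
    """Pick the vapor whose reference response image correlates best."""
    return max(references, key=lambda name: match_score(measured, references[name]))
```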

A Study of Feature-Extraction from the Specifically Intended Product Designs (제품의 특성추출을 통한 디자인 적용 방법에 관한 연구)

  • Hyoung, Sung-Eun;Cho, Un-Dea;Cho, Kwang-Soo
    • Science of Emotion and Sensibility / v.10 no.1 / pp.87-98 / 2007
  • The aim of this study is to grasp the features of an object that reveal its specific purposes and to apply them to the product concept and design form when designers develop products. For this study, the subjects of the experiment were chosen to fill out a basic questionnaire, and an image analysis of the subjects was performed. After the analysis, the functional design elements of the subjects were extracted and coded, and the correlation between the results of the image analysis and the characteristics of the subjects was examined. The questionnaire was carried out to determine the characteristics of the subjects. As the features of specific products were extracted through this experiment, they can be used as basic data for analyzing consumer needs and for better understanding the products being designed. This can serve as fundamental data enabling designers to understand products easily and to establish concepts for their designs. In the case of the MP3 player examined in this study, the image analysis results turned out to be sound quality, compatibility, portability, employment, interface and personality, and their respective related features were investigated as well. The important features for designing the MP3 player were presented. Through this fundamental study it will be possible to understand consumers' needs more effectively, which will contribute to developing the fundamental basis of various fields in design.

A Robust Pattern Watermarking Method by Invisibility and Similarity Improvement (비가시성과 유사도 증가를 통한 강인한 패턴 워터마킹 방법)

  • 이경훈;김용훈;이태홍
    • Journal of KIISE:Software and Applications / v.30 no.10 / pp.938-943 / 2003
  • In this paper, we propose a method using the Tikhonov-Miller process to improve the robustness of watermarking under various attacks. A visually recognizable pattern watermark is embedded in the LH2, HL2 and HH2 subbands of the wavelet-transformed domain using a threshold, and the watermark is additionally embedded by exploiting HVS (Human Visual System) characteristics. The pattern watermark is interlaced after a random permutation for security and a better extraction rate. To demonstrate the improvement in robustness and similarity of the proposed method, we applied basic image processing operations such as scaling, filtering, cropping, histogram equalization and lossy compression (JPEG, GIF). The experimental results show that the proposed method can embed a robust watermark invisibly and extract it with excellent normalized correlation under various attacks.
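
As a rough illustration of embedding a pattern watermark into level-2 wavelet detail subbands, the sketch below uses PyWavelets. The paper's threshold rule, HVS weighting and interlaced random permutation are not detailed in the abstract, so a simple additive embedding with a fixed strength `alpha` and a non-blind extraction stand in for them here.

```python
import numpy as np
import pywt

def embed_pattern(image, pattern, alpha=2.0, wavelet="haar"):
    """Add a (resized) pattern to all three level-2 detail subbands."""
    coeffs = pywt.wavedec2(image.astype(float), wavelet, level=2)
    cA2, (cH2, cV2, cD2), level1 = coeffs               # level-2 detail subbands
    pat = np.resize(pattern.astype(float), cH2.shape)   # pattern sized to the subband
    marked = (cH2 + alpha * pat, cV2 + alpha * pat, cD2 + alpha * pat)
    return pywt.waverec2([cA2, marked, level1], wavelet)

def extract_pattern(marked_image, original_image, wavelet="haar"):
    """Non-blind extraction: average difference of the level-2 detail subbands.
    Assumes image dimensions divisible by 4 so subband shapes match."""
    cm = pywt.wavedec2(marked_image.astype(float), wavelet, level=2)[1]
    co = pywt.wavedec2(original_image.astype(float), wavelet, level=2)[1]
    return sum(m - o for m, o in zip(cm, co)) / 3.0
```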

Hangul Component Decomposition in Outline Fonts (한글 외곽선 폰트의 자소 분할)

  • Koo, Sang-Ok;Jung, Soon-Ki
    • Journal of the Korea Computer Graphics Society / v.17 no.4 / pp.11-21 / 2011
  • This paper proposes a method for decomposing a Hangul glyph of an outline font into its initial, medial and final components using statistical-structural information. In a font family, the positions of components are statistically consistent, and the stroke relationships of a Hangul character reflect its structure. First, we create component histograms that accumulate the shapes and positions of the same components. Second, we form pixel clusters from the character image based on pixel direction probabilities and extract candidate strokes using the position, direction and size of the clusters and the adjacencies between them. Finally, we find the best structural match between the candidate strokes and a predefined character model by relaxation labeling. The proposed method can be used for studying the formative characteristics of Hangul fonts and for font classification/retrieval systems.
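
A minimal sketch of the first step described above, accumulating position histograms of the same component across a font family, is given below. How the component masks are obtained (the clustering and relaxation-labeling steps) is not reproduced, and the scoring function is an illustrative assumption.

```python
import numpy as np

def accumulate_component_histogram(component_masks):
    """Sum binary masks of the same component over many glyphs, then normalize
    so each pixel holds the probability of the component occupying that
    position in this font family. Masks are assumed to share a common size."""
    acc = np.zeros(component_masks[0].shape, dtype=float)
    for mask in component_masks:
        acc += (mask > 0)
    return acc / len(component_masks)

def position_score(candidate_mask, histogram):
    """Score a candidate stroke by how well it overlaps the histogram of the
    component it is hypothesized to belong to."""
    overlap = histogram[candidate_mask > 0]
    return float(overlap.mean()) if overlap.size else 0.0
```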

On Motion Planning for Human-Following of Mobile Robot in a Predictable Intelligent Space

  • Jin, Tae-Seok;Hashimoto, Hideki
    • International Journal of Fuzzy Logic and Intelligent Systems / v.4 no.1 / pp.101-110 / 2004
  • The robots that will be needed in the near future are human-friendly robots that are able to coexist with humans and support them effectively. To realize this, humans and robots need to be in close proximity to each other as much as possible, and their interactions need to occur naturally. It is desirable for a robot to carry out human following as one of its human-affinitive movements. A human-following robot requires several techniques: recognition of moving objects, feature extraction and visual tracking, and trajectory generation for following a human stably. In this research, a predictable intelligent space is used to achieve these goals. An intelligent space is a 3-D environment in which many sensors and intelligent devices are distributed; mobile robots exist in this space as physical agents providing humans with services. A mobile robot is controlled to follow a walking human using distributed intelligent sensors as stably and precisely as possible. The moving object is assumed to be a point object and is projected onto an image plane to form a geometric constraint equation that provides the position of the object based on the kinematics of the intelligent space. Uncertainties in the position estimation caused by the point-object assumption are compensated using a Kalman filter. To generate the shortest-time trajectory for following the walking human, the linear and angular velocities are estimated and utilized. Computer simulation and experimental results of estimating and following a walking human with the mobile robot are presented.
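
The abstract states that uncertainties from the point-object assumption are compensated with a Kalman filter. Below is a generic constant-velocity Kalman filter sketch for the 2D position of the walking human; the actual projection model and noise covariances of the intelligent space are not given in the abstract, so the values here are illustrative.

```python
import numpy as np

def kalman_step(x, P, z, dt=0.1, q=0.01, r=0.05):
    """One predict/update cycle with state [x, y, vx, vy] and a position
    measurement z = [x_meas, y_meas]. Q and R are illustrative values."""
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1,  0],
                  [0, 0, 0,  1]], dtype=float)   # constant-velocity model
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)    # only position is measured
    Q, R = q * np.eye(4), r * np.eye(2)

    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update with the measured position
    y = z - H @ x
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    # The filtered state also yields the velocity estimates that can feed
    # the robot's trajectory generation.
    return x, P
```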

On Robust Principal Component Analysis using Neural Networks (신경망을 이용한 로버스트 주성분 분석에 관한 연구)

  • Kim, Sang-Min;Oh, Kwang-Sik;Park, Hee-Joo
    • Journal of the Korean Data and Information Science Society / v.7 no.1 / pp.113-118 / 1996
  • Principal component analysis (PCA) is an essential technique for data compression and feature extraction, and has been widely used in statistical data analysis, communication theory, pattern recognition and image processing. Oja (1992) found that a linear neuron with a constrained Hebbian learning rule can extract the principal component by a stochastic gradient ascent method. In practice, real data often contain outliers, which significantly deteriorate the performance of PCA algorithms. In order to make PCA robust, Xu & Yuille (1995) applied statistical physics to the problem of robust principal component analysis (RPCA), while Devlin et al. (1981) obtained principal components using techniques such as M-estimation. The purpose of this paper is to investigate, from the statistical point of view, how Xu & Yuille's (1995) RPCA works under the same simulation conditions as in Devlin et al. (1981).
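
The abstract cites Oja's (1992) constrained Hebbian rule as the neural route to the first principal component. Below is a minimal sketch of that rule; the robust weighting of Xu & Yuille (1995) and the M-estimation of Devlin et al. (1981) are not reproduced here.

```python
import numpy as np

def oja_first_pc(X, lr=0.01, epochs=50, seed=0):
    """Estimate the first principal component of X (n_samples x n_features)
    with a single linear neuron trained by Oja's rule."""
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)                    # center the data
    w = rng.normal(size=X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        for x in rng.permutation(X):          # stochastic (sample-by-sample) updates
            y = w @ x                         # neuron output
            w += lr * y * (x - y * w)         # Oja's rule keeps ||w|| close to 1
    return w / np.linalg.norm(w)
```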

Similar Movie Contents Retrieval Using Peak Features from Audio (오디오의 Peak 특징을 이용한 동일 영화 콘텐츠 검색)

  • Chung, Myoung-Bum;Sung, Bo-Kyung;Ko, Il-Ju
    • Journal of Korea Multimedia Society / v.12 no.11 / pp.1572-1580 / 2009
  • Combing through entire video files to recognize and retrieve matching movies requires much time and memory. Instead, most current movie-matching methods analyze only part of each movie's video information, yet these methods share a critical problem: videos that have only been changed in resolution or merely converted with a different codec are erroneously recognized as different. This paper proposes an audio-based search algorithm by which similar movies can be identified. The proposed method builds and searches a database of each movie's spectral peak information, which remains relatively stable even when the bit rate, codec or sample rate changes. The method showed a 92.1% search success rate on a set of 1,000 video files whose audio bit rate had been altered or which had been deliberately re-encoded with a different codec.
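
A hedged sketch of peak-based audio matching in the spirit of the abstract: keep, for each short-time frame, the frequency bin with the largest magnitude and compare peak sequences. The authors' exact peak definition and database search are not specified, so the frame-wise argmax and the simple match ratio below are assumptions.

```python
import numpy as np
from scipy.signal import stft

def peak_sequence(audio, sr, nperseg=2048):
    """One dominant-peak frequency bin per short-time frame."""
    _, _, Z = stft(audio, fs=sr, nperseg=nperseg)
    return np.argmax(np.abs(Z), axis=0)

def match_ratio(peaks_a, peaks_b, tol=1):
    """Fraction of aligned frames whose peak bins agree within `tol` bins;
    intended to be robust to bit-rate or codec changes that preserve the
    dominant spectral content."""
    n = min(len(peaks_a), len(peaks_b))
    return float(np.mean(np.abs(peaks_a[:n] - peaks_b[:n]) <= tol))
```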

  • PDF

Behavior Recognition Algorithm Using Skeleton Vector Information and RNN Learning (스켈레톤 벡터 정보와 RNN 학습을 이용한 행동인식 알고리즘)

  • Kim, Mi-Kyung;Cha, Eui-Young
    • Journal of Broadcast Engineering / v.23 no.5 / pp.598-605 / 2018
  • Behavior recognition is a technology that recognizes human actions from data and can be used in applications such as detecting risky behavior in video surveillance systems. Conventional behavior recognition algorithms use 2D camera images, multi-modal sensors, multi-view setups or 3D equipment. With two-dimensional data alone, the recognition rate for actions in three-dimensional space has been low, while the other approaches suffer from complicated equipment configurations and expensive additional hardware. In this paper, we propose a method for recognizing human behavior using only CCTV (RGB) images, without depth sensors or other additional equipment. First, a skeleton extraction algorithm is applied to extract the joints and body parts. We then transform the joint coordinates into vectors, including displacement vectors and relational vectors, and learn the resulting vector sequences with an RNN model. Applying the learned model to various data sets and measuring the recognition accuracy shows that performance similar to existing algorithms that use 3D information can be achieved with 2D information only.
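
A minimal sketch of the vector transform and RNN classifier described above, assuming 2D joint coordinates per frame. The exact displacement/relational-vector equations and network sizes are not given in the abstract, so the definitions below (frame-to-frame displacement, position relative to a reference joint, a PyTorch LSTM with illustrative dimensions) are assumptions.

```python
import numpy as np
import torch
import torch.nn as nn

def skeleton_vectors(joints, ref_joint=0):
    """joints: (T, J, 2) array of 2D joint coordinates per frame.
    Returns per-frame feature vectors combining displacement and
    relational vectors, shape (T-1, J*4)."""
    joints = np.asarray(joints, dtype=np.float32)
    disp = np.diff(joints, axis=0)                               # frame-to-frame displacement
    rel = joints[1:] - joints[1:, ref_joint:ref_joint + 1, :]    # relative to a reference joint
    feats = np.concatenate([disp, rel], axis=-1)                 # (T-1, J, 4)
    return feats.reshape(feats.shape[0], -1)

class BehaviorRNN(nn.Module):
    def __init__(self, in_dim, n_classes, hidden=64):
        super().__init__()
        self.rnn = nn.LSTM(in_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):                  # x: (batch, T, in_dim)
        _, (h, _) = self.rnn(x)
        return self.fc(h[-1])              # one class score vector per sequence
```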

Environmental IoT-Enabled Multimodal Mashup Service for Smart Forest Fires Monitoring

  • Elmisery, Ahmed M.;Sertovic, Mirela
    • Journal of Multimedia Information System / v.4 no.4 / pp.163-170 / 2017
  • The Internet of Things (IoT) is a new paradigm for collecting, processing and analyzing various contents in order to detect anomalies and to monitor particular patterns in a specific environment, and the collected data can be used to discover new patterns and offer new insights. IoT-enabled data mashup is a new technology that combines various types of information from multiple sources into a single web service, and mashup services create a new horizon for different applications. Environmental monitoring is an important tool for state and private organizations located in regions with environmental hazards that seek insights to detect hazards and locate them clearly. These organizations may utilize an IoT-enabled data mashup service to merge different types of datasets from different IoT sensor networks in order to improve their data analytics performance and the accuracy of their predictions. This paper presents an IoT-enabled data mashup service in which multimedia data are collected from various IoT platforms and then fed into an environmental cognition service that executes image processing techniques such as noise removal, segmentation and feature extraction in order to detect interesting patterns in hazardous areas. The noise present in the captured images is eliminated by noise removal and background subtraction. A Markov-based approach is utilized to segment the possible regions of interest. The viable features within each region are extracted using a multiresolution wavelet transform and then fed into a discriminative classifier to extract various patterns. Experimental results show accurate detection performance and adequate processing time for the proposed approach. We also provide a data mashup scenario for an IoT-enabled environmental hazard detection service and experimentation results.
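
As an illustration of the feature-extraction stage described above, the sketch below computes multiresolution wavelet subband energies for an already-segmented candidate region; the Markov-based segmentation and the specific discriminative classifier are not detailed in the abstract and are omitted here.

```python
import numpy as np
import pywt

def wavelet_energy_features(region, wavelet="db2", level=3):
    """Energy of each detail subband of a segmented region over `level`
    decomposition levels; the resulting vector would be fed to a
    discriminative classifier (e.g. an SVM or logistic model)."""
    coeffs = pywt.wavedec2(region.astype(float), wavelet, level=level)
    feats = []
    for detail in coeffs[1:]:              # skip the approximation band
        for band in detail:                # horizontal, vertical, diagonal details
            feats.append(float(np.mean(band ** 2)))
    return np.array(feats)
```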