• Title/Summary/Keyword: Recognition algorithm

HEVC Encoder Optimization using Depth Information (깊이정보를 이용한 HEVC의 인코더 고속화 방법)

  • Lee, Yoon Jin;Bae, Dong In;Park, Gwang Hoon
    • Journal of Broadcast Engineering / v.19 no.5 / pp.640-655 / 2014
  • Many of today's video systems have an additional depth camera to provide extra features such as 3D support. Thanks to these changes in multimedia systems, it is now much easier to obtain depth information for a video. Depth information can be used in various areas such as object classification and background area recognition. With depth information, we can achieve higher coding efficiency than with conventional methods alone. Thus, in this paper, we propose a 2D video coding algorithm that uses depth information on top of the next-generation 2D video codec HEVC. The background area can be recognized from depth information, and by running HEVC with it, coding complexity can be reduced. If the current CU is a background area, we propose the following three methods: 1) early termination of the CU split structure with PU SKIP mode, 2) limiting the CU split structure using CU information at the temporal position, and 3) limiting the motion search range. We implement our proposal using the HEVC HM 12.0 reference software. The results show that with these methods encoding complexity is reduced by more than 40% with only 0.5% BD-Bitrate loss. In particular, for video acquired through the Kinect developed by Microsoft Corp., encoding complexity is reduced by up to 53% without a loss of quality. It is therefore expected that these techniques can be applied to real-time online communication, mobile or handheld video services, and so on.

Smart Ship Container With M2M Technology (M2M 기술을 이용한 스마트 선박 컨테이너)

  • Sharma, Ronesh;Lee, Seong Ro
    • The Journal of Korean Institute of Communications and Information Sciences / v.38C no.3 / pp.278-287 / 2013
  • Modern information technologies continue to provide industries with new and improved methods. With the rapid development of Machine to Machine (M2M) communication, smart container supply chain management is being built on high-performance sensors, computer vision, Global Positioning System (GPS) satellites, and Global System for Mobile (GSM) communication. Existing supply chain management is limited in its ability to track containers in real time. This paper focuses on the study and implementation of real-time container chain management through the development of a container identification system and an automatic alert system for interrupts and for normal periodic alerts. The concept and methods of smart container modeling are introduced, and the structure is explained, before the implementation of the smart container tracking alert system. First, the paper introduces the container code identification and recognition algorithm, implemented in Visual Studio 2010 with OpenCV (a computer vision library) and Tesseract (an OCR engine) for real-time operation. Second, it discusses the current automatic alert systems provided for real-time container tracking and their limitations. Finally, the paper summarizes the challenges and possibilities for future work on real-time container tracking solutions using ubiquitous mobile and satellite networks together with high-performance sensors and computer vision. Together, these components deliver supply chain management with outstanding operation and security.

Edge Grouping and Contour Detection by Delaunary Triangulation (Delaunary 삼각화에 의한 그룹화 및 외형 탐지)

  • Lee, Sang-Hyun;Jung, Byeong-Soo;Jeong, Je-Pyong;Kim, Jung-Rok;Moon, Kyung-li
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.13 no.1 / pp.135-142 / 2013
  • Contour detection is important for many computer vision applications, such as shape discrimination and object recognition. In many cases, local luminance changes turn out to be stronger in textured areas than on object contours. Therefore, local edge features, which only look at a small neighborhood of each pixel, cannot be reliable indicators of the presence of a contour, and some global analysis is needed. The novelty of this operator is that dilation is limited to Delaunay triangles. An efficient implementation is presented. The grouping algorithm is then embedded in a multi-threshold contour detector. At each threshold level, small groups of edges are removed, and contours are completed by means of a generalized reconstruction from markers. Both qualitative and quantitative comparisons with existing approaches demonstrate the superiority of the proposed contour detector in terms of a larger amount of suppressed texture and more effective detection of low-contrast contours.

LASPI: Hardware friendly LArge-scale stereo matching using Support Point Interpolation (LASPI: 지원점 보간법을 이용한 H/W 구현에 용이한 스테레오 매칭 방법)

  • Park, Sanghyun;Ghimire, Deepak;Kim, Jung-guk;Han, Youngki
    • Journal of KIISE / v.44 no.9 / pp.932-945 / 2017
  • In this paper, a new hardware and software architecture was developed for a stereo vision processing system that includes rectification, disparity estimation, and visualization. The developed method, named LArge scale stereo matching method using Support Point Interpolation (LASPI), excels at real-time processing for obtaining dense disparity maps from high-quality image regions that contain a high density of support points. In real-time processing of high definition (HD) images, LASPI does not degrade the quality of disparity maps compared to existing stereo-matching methods such as Efficient LArge-scale Stereo matching (ELAS). LASPI has been designed to achieve a high frame rate, accurate distance resolution, and low resource usage even in a resource-limited environment. These characteristics enable LASPI to be deployed in safety-critical applications such as obstacle recognition and distance detection systems for autonomous vehicles. The LASPI algorithm has been implemented on a Field Programmable Gate Array (FPGA) in order to support parallel processing and 4-stage pipelining. Various experiments verified that the developed FPGA system (Xilinx Virtex-7 FPGA, 148.5 MHz clock) is capable of processing 30 HD ($1280 \times 720$ pixels) frames per second in real time while generating disparity maps that are applicable to real vehicles.

Analysis of Galvanic Skin Response Signal for High-Arousal Negative Emotion Using Discrete Wavelet Transform (이산 웨이브렛 변환을 이용한 고각성 부정 감성의 GSR 신호 분석)

  • Lim, Hyun-Jun;Yoo, Sun-Kook;Jang, Won Seuk
    • Science of Emotion and Sensibility / v.20 no.3 / pp.13-22 / 2017
  • Emotion directly influences decision-making, perception, and so on, and plays an important role in human life. The purpose of this paper is to design a bio-signal analysis algorithm for convenient and accurate recognition of high-arousal negative emotion. In this study, after inducing two emotions with 'normal' and 'fear' videos, we measured the Galvanic Skin Response (GSR) signal, which is a simple bio-signal. We then decomposed the measured GSR into Tonic and Phasic components, and further decomposed the Phasic component, which is associated with emotional stimulation, into Skin Conductance Very Slow Response (SCVSR) and Skin Conductance Slow Response (SCSR). To extract the major features of these components for accurate analysis, we used a discrete wavelet transform, which has excellent time-frequency localization characteristics, rather than the previously used method. The features extracted for distinguishing high-arousal negative emotion are the maximum value of the Phasic component, the amplitude of the Phasic component, the zero crossing rate of SCVSR, and the zero crossing rate of SCSR. As a result, high-arousal negative emotion exhibited higher values than low-arousal normal emotion in all four features, and the difference between the two emotions was statistically more significant than with the previous analysis method. Accordingly, the results of this study indicate that the GSR may be a useful indicator for measuring high-arousal negative emotion and may contribute to the development of a real-time emotion rating system using the GSR.
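
    The zero crossing rate and Phasic features named in the abstract can be illustrated with a minimal Python sketch. It assumes the SCVSR, SCSR, and Phasic components have already been obtained as lists of samples (the wavelet decomposition itself is not shown), that "amplitude" means peak-to-peak range, and that a zero crossing is a sign change between adjacent samples. The function names are hypothetical and are not taken from the paper.

```python
def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose signs differ.

    Assumption: zero is treated as non-negative.
    """
    if len(signal) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(signal, signal[1:])
        if (a >= 0) != (b >= 0)
    )
    return crossings / (len(signal) - 1)


def phasic_features(phasic):
    """Maximum value and peak-to-peak amplitude of a Phasic component.

    Assumption: 'amplitude' is read as max minus min; the abstract
    does not define it.
    """
    highest = max(phasic)
    return {"max": highest, "amplitude": highest - min(phasic)}
```

    A call such as `zero_crossing_rate(scsr_samples)` would give one of the four features for a recording, where `scsr_samples` is a placeholder for the decomposed signal.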

A Study on the Recognition Algorithm of Paprika in the Images using the Deep Neural Networks (심층 신경망을 이용한 영상 내 파프리카 인식 알고리즘 연구)

  • Hwa, Ji Ho;Lee, Bong Ki;Lee, Dae Weon
    • Proceedings of the Korean Society for Agricultural Machinery Conference / 2017.04a / pp.142-142 / 2017
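  • In this study, as part of developing a system for automatically harvesting paprika, we designed and trained an artificial neural network whose inputs are the RGB values of paprika and non-paprika regions in images acquired in a paprika cultivation environment. The trained network is expected to make it possible to distinguish paprika regions from non-paprika regions in an image. To design the deep neural network, we used C++ and MFC in MS Visual Studio 2015 together with Python and TensorFlow. The deep neural network consists of an input layer, an output layer, and 8 hidden layers, with 3 input neurons, 4 output neurons, and 5 neurons in each hidden layer. In general, the deeper the hidden layers of a deep neural network, the better the learning result that can be expected from few inputs, but the longer the training time and the higher the likelihood of overfitting. Therefore, in this study we used Xavier initialization to reduce training time and the ReLU function as the activation function to reduce overfitting. RGB values of paprika and non-paprika regions were extracted from images acquired in the paprika cultivation environment and used as training inputs, with expected outputs of [0 0 1] for red paprika, [0 1 0] for yellow paprika, and [1 0 0] for non-paprika regions, yielding a training set of 3538 samples. A test set of 30 samples was used to evaluate the result after training. Training was run on the training set while varying the learning rate and checking the result. With a learning rate of 0.01 or higher, training did not succeed. This is thought to be because the weight updates determined by the learning rate were too large, so the cost function tended to diverge instead of converging to 0. With learning rates of 0.005 and 0.001, training succeeded. At a learning rate of 0.005, training took 3146 iterations and 20.48 seconds, with a training accuracy of 99.77% and a test accuracy of 100%; at a learning rate of 0.001, it took 38931 iterations and 181.39 seconds, with a training accuracy of 99.95% and a test accuracy of 100%. A smaller learning rate allowed more accurate training but took longer and was more likely to fall into a local minimum. A larger learning rate shortened the training time, but in many cases the cost diverged during training and training failed.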
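
    The network shape described above (3 inputs, 8 hidden layers of 5 neurons, Xavier initialization, ReLU activations) can be sketched as a forward pass in NumPy. This is an illustration under stated assumptions, not the authors' TensorFlow code: the softmax output, the division of RGB values by 255, and every function name are assumptions. The abstract gives 4 output neurons but three-element target vectors, so the sketch leaves the output width as a parameter.

```python
import numpy as np


def xavier_init(n_in, n_out, rng):
    """Xavier (Glorot) uniform initialization for one dense layer."""
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_in, n_out))


def build_network(n_inputs=3, n_hidden_layers=8, hidden_width=5,
                  n_outputs=3, seed=0):
    """Return a list of (weights, biases) pairs, one per dense layer."""
    rng = np.random.default_rng(seed)
    sizes = [n_inputs] + [hidden_width] * n_hidden_layers + [n_outputs]
    return [(xavier_init(a, b, rng), np.zeros(b))
            for a, b in zip(sizes, sizes[1:])]


def forward(layers, rgb):
    """ReLU on hidden layers; softmax on the output (an assumption)."""
    x = np.atleast_2d(np.asarray(rgb, dtype=float)) / 255.0  # assumed scaling
    for weights, biases in layers[:-1]:
        x = np.maximum(0.0, x @ weights + biases)
    weights, biases = layers[-1]
    logits = x @ weights + biases
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


layers = build_network()
probabilities = forward(layers, [200, 30, 40])  # one hypothetical RGB pixel
```

    With untrained weights the output is only a valid probability vector, not a meaningful classification; training with the learning rates reported in the abstract is outside this sketch.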

A screening of Alzheimer's disease using basis synthesis by singular value decomposition from Raman spectra of platelet (혈소판 라만 스펙트럼에서 특이값 분해에 의한 기저 합성을 통한 알츠하이머병 검출)

  • Park, Aaron;Baek, Sung-June
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.5 / pp.2393-2399 / 2013
  • In this paper, we propose a method for screening Alzheimer's disease (AD) from Raman spectra of platelets by synthesizing basis spectra using singular value decomposition (SVD). Raman spectra of platelets from AD transgenic mice are preprocessed with denoising, background removal, and normalization. The column vectors of each data matrix consist of the Raman spectra of AD or normal (NR) samples. Each matrix is factorized using the SVD algorithm, and the basis spectra of AD and NR are then determined by 12 column vectors of each matrix. Classification is completed by selecting the class that minimizes the root-mean-square error between the validation spectrum and the spectrum linearly synthesized from the basis spectra. In experiments involving 278 Raman spectra, the proposed method gave a classification rate of about 97.6%, which is about 6.1% better than a multi-layer perceptron (MLP) with features extracted using principal component analysis (PCA). The results show that basis spectra obtained using SVD are well suited for the diagnosis of AD from Raman spectra of platelets.
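
    The classification rule in this abstract can be sketched with NumPy as follows. The sketch assumes that the "12 column vectors" are the first 12 left singular vectors of each class matrix and that the linear synthesis is a least-squares projection onto them; the abstract does not confirm either detail, and all function names are hypothetical.

```python
import numpy as np


def basis_from_spectra(spectra, n_basis=12):
    """Leading left singular vectors of a (n_points, n_spectra) matrix.

    Assumption: each column of `spectra` is one preprocessed spectrum.
    """
    u, _, _ = np.linalg.svd(spectra, full_matrices=False)
    return u[:, :n_basis]


def reconstruction_rmse(basis, spectrum):
    """RMSE between a spectrum and its best linear synthesis from the basis."""
    # The basis columns are orthonormal, so the least-squares
    # coefficients are simply the projections onto each column.
    coefficients = basis.T @ spectrum
    residual = spectrum - basis @ coefficients
    return float(np.sqrt(np.mean(residual ** 2)))


def classify(spectrum, bases):
    """Return the label whose basis reconstructs the spectrum best."""
    return min(bases, key=lambda label: reconstruction_rmse(bases[label], spectrum))
```

    In use, `bases` would be a dictionary such as `{"AD": basis_from_spectra(ad_matrix), "NR": basis_from_spectra(nr_matrix)}`, where `ad_matrix` and `nr_matrix` are placeholders for the training spectra of each class.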

Study on Development of Automated Program Model for Measuring Sensibility Preference of Portrait (인물사진의 감성 선호도 측정 자동화 프로그램 모형 개발 연구)

  • Lee, Chang-Seop;Jung, Da-Yeon;Lee, Eun-Ju;Har, Dong-Hwan
    • The Journal of the Korea Contents Association / v.18 no.9 / pp.34-43 / 2018
  • The purpose of this study is to develop a measurement program model for human-oriented products through the relationship between the evaluation factors of portraits and general preferences for portraits. We added new items that are essential to image evaluation by analyzing previous studies. In this study, we identified the facial focus as the first step, and the portraits were then evaluated using objective and subjective image quality evaluation items. RSC Contrast and Dynamic Range were selected as the objective evaluation items, and the numerical values of each image could be evaluated by statistical analysis. Facial Exposure, Composition, Position, Ratio, Out of focus, and Emotions and Color tone of image were selected as the subjective evaluation items. In addition, a new face recognition algorithm is applied to judge emotions, so that the manufacturer can obtain information for analyzing people's emotions. The developed program compiles the evaluation items quantitatively and qualitatively when evaluating portraits. The program developed through this study can be used as an analysis program that produces data for developing an evaluation model of products better suited to general users of imaging systems.

Improvements of an English Pronunciation Dictionary Generator Using DP-based Lexicon Pre-processing and Context-dependent Grapheme-to-phoneme MLP (DP 알고리즘에 의한 발음사전 전처리와 문맥종속 자소별 MLP를 이용한 영어 발음사전 생성기의 개선)

  • 김회린;문광식;이영직;정재호
    • The Journal of the Acoustical Society of Korea / v.18 no.5 / pp.21-27 / 1999
  • In this paper, we propose an improved MLP-based English pronunciation dictionary generator for use with a variable vocabulary word recognizer. The variable vocabulary word recognizer can process any words specified in a Korean word lexicon that is determined dynamically according to the current recognition task. To extend the system to tasks involving English words, it is necessary to build a pronunciation dictionary generator that can process words not included in a predefined lexicon, such as proper nouns. To build the English pronunciation dictionary generator, we use a context-dependent grapheme-to-phoneme multi-layer perceptron (MLP) architecture for each grapheme. To train each MLP, grapheme-to-phoneme training data must be obtained from a general pronunciation dictionary. To automate this process, we use a dynamic programming (DP) algorithm with some distance metrics. For training and testing the grapheme-to-phoneme MLPs, we use a general English pronunciation dictionary with about 110 thousand words. With 26 MLPs, each having 30 to 50 hidden nodes, and the exception grapheme lexicon, we obtained a word accuracy of 72.8% for the 110 thousand words, superior to a rule-based method, which showed a word accuracy of 24.0%.
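
    The DP pre-processing step, which pairs each letter of a dictionary word with a phoneme or with nothing, can be sketched as an edit-distance style alignment. The abstract does not specify its distance metrics, so the cost function, the gap cost, the `_` null symbol, and the toy match table below are all assumptions made for illustration.

```python
def align(graphemes, phonemes, distance, gap_cost=1.0):
    """Align two sequences by dynamic programming; '_' marks a gap."""
    n, m = len(graphemes), len(phonemes)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0], back[i][0] = i * gap_cost, "up"
    for j in range(1, m + 1):
        cost[0][j], back[0][j] = j * gap_cost, "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = [
                (cost[i - 1][j - 1] + distance(graphemes[i - 1], phonemes[j - 1]), "diag"),
                (cost[i - 1][j] + gap_cost, "up"),    # letter with no phoneme
                (cost[i][j - 1] + gap_cost, "left"),  # phoneme with no letter
            ]
            cost[i][j], back[i][j] = min(options, key=lambda option: option[0])
    # Follow the stored moves from the end back to the start.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "diag":
            pairs.append((graphemes[i - 1], phonemes[j - 1]))
            i, j = i - 1, j - 1
        elif move == "up":
            pairs.append((graphemes[i - 1], "_"))
            i -= 1
        else:
            pairs.append(("_", phonemes[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


# Hypothetical letter/phoneme pairs that count as a perfect match.
TOY_MATCHES = {("k", "K"), ("n", "N"), ("i", "IH"), ("t", "T")}


def toy_distance(grapheme, phoneme):
    return 0.0 if (grapheme, phoneme) in TOY_MATCHES else 1.0


alignment = align("knit", ["N", "IH", "T"], toy_distance)
```

    Here the silent "k" is paired with the null symbol, which is the kind of per-grapheme training target that a grapheme-specific MLP would need.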

Development of K-Touch™ API for kinesthetic/tactile haptic interaction (역/촉감 햅틱 상호작용을 위한 "K-Touch™" API 개발 - 햅틱(Haptic) 개발자 및 응용분야를 위한 소프트웨어 인터페이스 -)

  • Lee, Beom-Chan;Kim, Jong-Phil;Ryu, Je-Ha
    • Journal of the HCI Society of Korea / v.1 no.2 / pp.1-8 / 2006
  • This paper presents the development of a new haptic API (Application Programming Interface) called the K-Touch™ haptic API. It is designed to allow users to interact with objects through kinesthetic and tactile modalities via haptic interfaces. The K-Touch™ API serves two types of users: high-level programmers who need an easy-to-use haptic API for creating haptic applications, and researchers in the haptic field who need to experiment with or develop new devices and new algorithms without rewriting all the required code from scratch. Since the graphics-hardware-based kinesthetic rendering algorithm implemented in the K-Touch™ API differs from conventional kinesthetic algorithms, this API can provide users with haptic interaction for various data representations such as 2D, 2.5D depth (height field), 3D polygon, and volume data. In addition, this API supports kinesthetic and tactile interaction simultaneously in order to provide users with realistic haptic interaction. Given this wide range of applications, it is expected that the proposed K-Touch™ haptic API will help users gain a deeper recognition of environments and enhance the sense of immersion in them. Moreover, it will be a useful development toolkit for investigating new devices and algorithms in the haptic research field.
