• Title/Abstract/Keywords: optimal learning model construction

Development of Facial Emotion Recognition System Based on Optimization of HMM Structure by using Harmony Search Algorithm (Harmony Search 알고리즘 기반 HMM 구조 최적화에 의한 얼굴 정서 인식 시스템 개발)

  • Ko, Kwang-Eun;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems / v.21 no.3 / pp.395-400 / 2011
  • This paper studies facial emotion recognition that accounts for the dynamic variation of emotional state in facial image sequences. The proposed system consists of two main steps: emotional feature extraction from facial images, and emotional state classification and recognition. First, we propose a method for extracting and analyzing the emotional feature regions using a combination of the Active Shape Model (ASM) and Facial Action Units (FAUs). Next, we propose an emotional state classification and recognition method based on the Hidden Markov Model (HMM), a type of dynamic Bayesian network. We also adopt a heuristic optimization procedure based on the Harmony Search (HS) algorithm for HMM parameter learning in order to classify the emotional state more accurately. Combining these methods, we construct an emotion recognition system based on variations in dynamic facial image sequences and attempt to improve recognition performance.
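
The abstract does not say how Harmony Search is wired into HMM training, so the following is only a minimal sketch of a generic Harmony Search loop. The parameter names (`hms` for harmony memory size, `hmcr` for the memory considering rate, `par` for the pitch adjusting rate, `bandwidth`) follow common HS terminology, and `toy_objective` is a hypothetical stand-in for whatever score the authors optimize (for example, a negative log-likelihood over HMM parameters).

```python
import random

def harmony_search(objective, bounds, hms=10, hmcr=0.9, par=0.3,
                   bandwidth=0.05, iterations=2000, seed=0):
    """Minimize `objective` over the box `bounds` with a basic Harmony Search."""
    rng = random.Random(seed)
    # Harmony memory: hms random candidate solutions and their scores.
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(hms)]
    scores = [objective(h) for h in memory]
    for _ in range(iterations):
        new = []
        for d, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                # Memory consideration: reuse a value from a stored harmony.
                value = memory[rng.randrange(hms)][d]
                if rng.random() < par:
                    # Pitch adjustment: small perturbation of the reused value.
                    value += rng.uniform(-1, 1) * bandwidth * (hi - lo)
            else:
                # Random selection from the full range.
                value = rng.uniform(lo, hi)
            new.append(min(hi, max(lo, value)))
        new_score = objective(new)
        # Replace the worst stored harmony if the new one is better.
        worst = max(range(hms), key=scores.__getitem__)
        if new_score < scores[worst]:
            memory[worst], scores[worst] = new, new_score
    best = min(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]

def toy_objective(x):
    # Hypothetical stand-in for an HMM training loss; minimum at (1, -2).
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

best_x, best_score = harmony_search(toy_objective, [(-5, 5), (-5, 5)])
print(best_x, best_score)
```

In an HMM setting, each harmony would presumably encode transition and emission parameters, with the objective evaluated on training sequences; that mapping is an assumption, not something the abstract states.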

Performance Comparison of State-of-the-Art Vocoder Technology Based on Deep Learning in a Korean TTS System (한국어 TTS 시스템에서 딥러닝 기반 최첨단 보코더 기술 성능 비교)

  • Kwon, Chul Hong
    • The Journal of the Convergence on Culture Technology / v.6 no.2 / pp.509-514 / 2020
  • A conventional TTS system consists of several modules, including text preprocessing, parsing, grapheme-to-phoneme conversion, boundary analysis, prosody control, acoustic feature generation by an acoustic model, and synthesized speech generation. A deep-learning TTS system, by contrast, consists of a Text2Mel process that generates a spectrogram from text and a vocoder that synthesizes speech signals from the spectrogram. In this paper, to construct an optimal Korean TTS system, we apply Tacotron2 to the Text2Mel process and implement WaveNet, WaveRNN, and WaveGlow as vocoders to verify and compare their performance. Experimental results show that WaveNet has the highest MOS and a trained model hundreds of megabytes in size, but its synthesis time is about 50 times real time. WaveRNN shows MOS similar to that of WaveNet with a model size of several tens of megabytes, but it also cannot run in real time. WaveGlow can run in real time, but its model is several GB in size and its MOS is the worst of the three vocoders. Based on these results, the paper presents reference criteria for selecting an appropriate method according to the hardware environment in which the TTS system is deployed.
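
The "times real time" comparison above is usually expressed as a real-time factor. The sketch below shows one way such a figure might be computed; the function name and the timings are illustrative assumptions, not the paper's measurement code or data.

```python
def real_time_factor(synthesis_seconds, audio_seconds):
    """Return synthesis time divided by the duration of the generated audio.

    Values above 1.0 mean the vocoder is slower than real time.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds

# Illustrative numbers only: 4 s of audio that takes 200 s to synthesize.
rtf = real_time_factor(200.0, 4.0)
print(rtf, "real-time capable" if rtf <= 1.0 else "slower than real time")
```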

Fast Distributed Network File System using State Transition Model in the Media Streaming System (미디어 스트리밍 시스템에서의 상태 천이 모델을 활용한 고속 분산 네트워크 파일 시스템)

  • Woo, Soon;Lee, Jun-Pyo
    • Journal of the Korea Society of Computer and Information / v.17 no.6 / pp.145-152 / 2012
  • Because streaming media files are large, existing delivery techniques do not provide optimal performance. A video proxy server is therefore employed to reduce bandwidth consumption, network congestion, and network traffic. This paper proposes a fast distributed network file system that uses a state transition model in the media streaming system to make efficient use of the video proxy server. The proposed method consists of three steps: (1) training using the state transition model, (2) generation of base and decision probabilities, and (3) storing and deletion based on probability. In addition, the storage space of the video proxy server is divided into segment areas in order to store segments efficiently and avoid fragmentation. Simulation results show that the proposed method performs better than other methods in terms of hit rate and number of deletions. The proposed method therefore provides the lowest user start-up latency and the highest bandwidth savings.

Modular UI research as a user customization method on handheld devices (User Customization이 가능한 Handheld device의 Modular interface 설계)

  • Song, Sang-Gon;Park, Bo-Eun;Jang, Hyun-Kook;Kim, Young-Sun;Kim, Na-Young;Park, Hyun-Chul
    • Proceedings of the HCI Society of Korea Conference (한국HCI학회:학술대회논문집) / 2007.02b / pp.153-158 / 2007
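  • Earlier mobile devices shipped with generalized, fixed interfaces, and users merely learned and repeated a set of interactions and configurations. As devices have become more intelligent and diverse, users have begun to demand interfaces that reflect individual needs and characteristics. As a result, the Universal UI, tailored to general tendencies, and the Customizing UI, tailored to individual characteristics, have remained in continual tension. Designing the two so that they complement each other at an appropriate level has therefore become a basic necessary and sufficient condition for modern UIs, regardless of form or attributes. For the Universal UI, a standardized solution could be found by identifying the common denominator among people with different perceptions and behavior patterns, whereas an answer for customization must also take user psychology, culture, and history into account. We developed an optimal interface for mobile devices by establishing multifaceted mental models that satisfy both aspects and validating them through usability testing (UT). Because research on customization makes it possible to provide interfaces optimized for users' regional and cultural characteristics, it is well worth pursuing for device manufacturers, not least for future work. We believe that the personalization described in this paper, together with customization that lets users adapt to and use preferred functions more easily, advances mobile interface product development one step toward actively reflecting user tendencies.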

A Design of New Digital Adaptive Predistortion Linearizer Algorithm Based on DFP(Davidon-Fletcher-Powell) Method (DFP Method 기반의 새로운 적응형 디지털 전치 왜곡 선형화기 알고리즘 개발)

  • Jang, Jeong-Seok;Choi, Yong-Gyu;Suh, Kyoung-Whoan;Hong, Ui-Seok
    • The Journal of Korean Institute of Electromagnetic Engineering and Science / v.22 no.3 / pp.312-319 / 2011
  • In this paper, a new linearization algorithm for a DPD (Digital PreDistorter) is proposed. The algorithm uses the DFP (Davidon-Fletcher-Powell) method. It is more accurate than existing algorithms, and it renews the best-fit value in every iteration without requiring an initial step-size setting. The power amplifier is modeled with a memory polynomial model, which can capture the amplifier's memory effect. The overall structure of the linearizer is based on an indirect learning architecture. To verify the performance of the proposed algorithm, we compared it with the LMS (Least Mean Squares) and RLS (Recursive Least Squares) algorithms.
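
For readers unfamiliar with DFP, the sketch below applies the standard DFP inverse-Hessian update to a small quadratic with an exact line search, which is one way a quasi-Newton method can avoid a hand-tuned step size. It is not the paper's predistorter: the matrix `A`, vector `b`, and helper names are illustrative assumptions.

```python
def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def outer(u, v):
    return [[a * b for b in v] for a in u]

def dfp_quadratic(A, b, x, steps):
    """Minimize 0.5*x^T A x - b^T x with DFP and an exact line search."""
    n = len(x)
    # Start from the identity as the inverse-Hessian estimate.
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = [ax - bi for ax, bi in zip(matvec(A, x), b)]  # gradient A x - b
    for _ in range(steps):
        if dot(g, g) < 1e-20:
            break
        d = [-v for v in matvec(H, g)]             # search direction
        alpha = -dot(g, d) / dot(d, matvec(A, d))  # exact step for a quadratic
        s = [alpha * di for di in d]               # s = x_new - x
        x = [xi + si for xi, si in zip(x, s)]
        g_new = [ax - bi for ax, bi in zip(matvec(A, x), b)]
        y = [gn - go for gn, go in zip(g_new, g)]  # y = g_new - g
        Hy = matvec(H, y)
        sy, yHy = dot(s, y), dot(y, Hy)
        ss, HyHy = outer(s, s), outer(Hy, Hy)
        # DFP update: H += s s^T / (s^T y) - (H y)(H y)^T / (y^T H y)
        H = [[H[i][j] + ss[i][j] / sy - HyHy[i][j] / yHy
              for j in range(n)] for i in range(n)]
        g = g_new
    return x

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_min = dfp_quadratic(A, b, [0.0, 0.0], steps=5)
print(x_min)
```

The minimizer solves `A x = b`, so the printed point can be checked against that linear system. In a predistorter the unknowns would be the memory polynomial coefficients, and the line search would need to suit that cost function.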

Indoor Localization in Wireless Sensor Network using LVQ (LVQ를 이용한 무선 센서 네트워크의 실내 위치 인식)

  • Park, Jin-Woo;Jung, Kyung-Kwon;Eom, Ki-Hwan
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.5 / pp.1295-1302 / 2010
  • This paper proposes an indoor location recognition method based on RSSI (received signal strength indication) using an LVQ network. To verify the effectiveness of the proposed method, we performed experiments and compared it with the conventional triangulation method. In the experiments, we set up the system in a laboratory, divided the space into 40 sections, and installed 6 nodes as reference nodes. We obtained the log-normal path loss model of the wireless channel and converted RSSI into distance. The distance values were used as the input to the LVQ network. To train the LVQ network, we set the target values to the section indices. In the experiments, we determined the optimal number of subclasses and confirmed a success rate of 96% in the training phase and 91% in the test phase.
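
The two steps described above, converting RSSI to distance and training LVQ on the distances, can be sketched as follows. This is a minimal illustration assuming the common log-distance path loss form and the basic LVQ1 rule; the reference RSSI, path loss exponent, learning rate, and two-node example are placeholder assumptions, not values from the paper (which uses 6 reference nodes and 40 sections).

```python
import math

def rssi_to_distance(rssi_dbm, rssi_at_d0=-45.0, path_loss_exponent=2.5, d0=1.0):
    """Invert the log-distance model RSSI = RSSI0 - 10*n*log10(d/d0)."""
    return d0 * 10 ** ((rssi_at_d0 - rssi_dbm) / (10.0 * path_loss_exponent))

def lvq1_step(prototypes, labels, x, target, lr=0.1):
    """One LVQ1 update: pull the nearest prototype toward x if its label
    matches the target section, otherwise push it away."""
    nearest = min(range(len(prototypes)),
                  key=lambda i: math.dist(prototypes[i], x))
    sign = 1.0 if labels[nearest] == target else -1.0
    prototypes[nearest] = [p + sign * lr * (xi - p)
                           for p, xi in zip(prototypes[nearest], x)]
    return nearest

# Distances to two hypothetical reference nodes form the LVQ input vector.
x = [rssi_to_distance(-52.5), rssi_to_distance(-52.5)]
prototypes = [[1.0, 1.0], [5.0, 5.0]]  # one prototype per section (toy values)
labels = [0, 1]                        # section index of each prototype
winner = lvq1_step(prototypes, labels, x, target=0)
print(x, winner, prototypes)
```

Using several prototypes per section corresponds to the "number of subclasses" that the abstract says was tuned experimentally.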

Inspection of Coin Surface Defects using Multiple Eigen Spaces (다수의 고유 공간을 이용한 주화 표면 품질 진단)

  • Kim, Jae-Min;Ryoo, Ho-Jin
    • The Journal of the Korea Contents Association / v.11 no.3 / pp.18-25 / 2011
  • In the manufacturing process of metal coins, surface defects are detected manually. This paper describes a new method for detecting surface defects of metal coins on a moving conveyor belt using image processing. The method consists of multiple procedures: segmentation of a coin from the background, alignment of the coin to the model, projection of the aligned coin onto the best eigen image space, and detection of defects by comparing the projection error with an adaptive threshold. Of these procedures, the alignment and the projection are newly developed in this paper for the detection of coin surface defects. For alignment, we use the histogram of the segmented coin, which converts two-dimensional image alignment into one-dimensional alignment. The projection reduces the intensity variation of the coin image caused by changes in illumination and coin rotation. For projection, we build multiple eigen image spaces and choose the best eigen space using the estimated coin direction. Since each eigen space consists of a small number of eigen image vectors, the projection can be implemented in real time.

A Neural Networks Model for Flow Forecasting in Nakdong River Basin (낙동강 유역에서의 유량 예측 신경망 모형에 관한 연구)

  • Han, Kun-Yeun;Kim, Dong-Il;Choi, Hyun-Gu;Yoon, Young-Sam
    • Proceedings of the Korea Water Resources Association Conference / 2008.05a / pp.1727-1731 / 2008
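  • Obtaining reliable flow data is very important for the efficient management of water resources. Korea invests considerable time and money every year to obtain high-quality flow data, but the results have not been satisfactory in terms of data quality. To date, flow data in Korea have depended on hydrological data from dams and on data derived from stage-discharge curves at gauging stations operated by the Ministry of Construction and Transportation, the ministry responsible for water quantity management. However, flow measurement projects for calibrating the stage-discharge relationship have not been continuous, and the relationship is inaccurate during low-flow and drought periods. In addition, the Nakdong River Environment Research Center of the National Institute of Environmental Research conducts a flow measurement project in the Nakdong River system for total pollution load management, but its purpose is to measure flow efficiently at the 47 outlet points of the unit watersheds and so provide basic data for water quality policy. Because these measurements are also made for total pollution load management, they do not provide the daily flow needed for efficient water resources management. Providing basic data for water quality policy during low-flow and drought periods therefore requires an accurate understanding of the rainfall-runoff characteristics of the basin, including its rivers. However, rainfall varies in space and time within a basin, and the rainfall-runoff relationship is itself strongly nonlinear and subject to many sources of variability, so an exact analysis of river runoff from rainfall is not possible. Neural networks, which are used in artificial intelligence for signal processing, intelligent control, and pattern recognition, can build a nonlinear input-output system through the optimization process of learning, and this advantage has led to a variety of applications in water resources. The purpose of this study is to develop a model that estimates reliable flow data, as basic data for water quality policy, by applying a neural network model that best reflects the nonlinear characteristics of rainfall-runoff data and dam release data. To this end, we developed FFBN (Flow Forecasting By Neural), a neural network flow forecasting model that takes as input rainfall data and daily release data from dams upstream of the flow measurement points used by the Nakdong River Environment Research Center for total pollution load management. We also developed FFBNS (Flow Forecasting By Neural and SWAT), which adds simulation results from SWAT, a long-term runoff model, to the inputs. The network is a multilayer neural network with one hidden layer between the input and output layers, and training used the momentum method, a variant of error backpropagation. To compare predicted runoff with observations, simulation results for 2005-2006 at the Nakbon D and Nakbon E points are checked against measurements from the Nakdong and Gumi stage gauging stations; on this basis the study aims to develop an effective, optimal neural network model for runoff time series with complex nonlinearity, forecast flow, and examine the model's applicability. The simulation results are expected to contribute to providing basic data for water quality policy.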
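
The network structure described above (one hidden layer, backpropagation with momentum) can be sketched in a few lines. This is a minimal illustration on synthetic data, not FFBN itself: the hidden size, learning rate, momentum coefficient, tanh activation, and the toy rainfall/release samples are all assumptions.

```python
import math
import random

def train_mlp(samples, hidden=4, lr=0.02, momentum=0.8, epochs=500, seed=0):
    """Train a one-hidden-layer network (tanh hidden, linear output) with
    backpropagation plus momentum; returns (predict, per-epoch losses)."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Each hidden row holds n_in weights followed by a bias.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]
    v1 = [[0.0] * (n_in + 1) for _ in range(hidden)]  # previous weight changes
    v2 = [0.0] * (hidden + 1)

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + row[-1]) for row in w1]
        return h, sum(w * hi for w, hi in zip(w2, h)) + w2[-1]

    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, target in samples:
            h, out = forward(x)
            err = out - target
            total += 0.5 * err * err
            # Hidden deltas use the output weights before they are updated.
            deltas = [err * w2[j] * (1.0 - h[j] ** 2) for j in range(hidden)]
            for j, g in enumerate(h + [1.0]):
                # Momentum: new change = momentum * old change - lr * gradient.
                v2[j] = momentum * v2[j] - lr * err * g
                w2[j] += v2[j]
            for j in range(hidden):
                for i, g in enumerate(list(x) + [1.0]):
                    v1[j][i] = momentum * v1[j][i] - lr * deltas[j] * g
                    w1[j][i] += v1[j][i]
        losses.append(total / len(samples))
    return (lambda x: forward(x)[1]), losses

# Synthetic (rainfall, dam release) -> flow samples, scaled to [0, 1].
samples = [([r, d], 0.6 * r + 0.3 * d)
           for r in (0.0, 0.5, 1.0) for d in (0.0, 0.5, 1.0)]
predict, losses = train_mlp(samples)
print(losses[0], losses[-1], predict([0.5, 0.5]))
```

An FFBNS-style variant would simply append the SWAT-simulated flow as a third input feature.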

A study on improving self-inference performance through iterative retraining of false positives of deep-learning object detection in tunnels (터널 내 딥러닝 객체인식 오탐지 데이터의 반복 재학습을 통한 자가 추론 성능 향상 방법에 관한 연구)

  • Kyu Beom Lee;Hyu-Soung Shin
    • Journal of Korean Tunnelling and Underground Space Association / v.26 no.2 / pp.129-152 / 2024
  • In the application of deep learning object detection via CCTV in tunnels, a large number of false positive detections occur due to the poor environmental conditions of tunnels, such as low illumination and severe perspective effect. This problem directly impacts the reliability of the tunnel CCTV-based accident detection system reliant on object detection performance. Hence, it is necessary to reduce the number of false positive detections while also enhancing the number of true positive detections. Based on a deep learning object detection model, this paper proposes a false positive data training method that not only reduces false positives but also improves true positive detection performance through retraining of false positive data. This paper's false positive data training method is based on the following steps: initial training of a training dataset - inference of a validation dataset - correction of false positive data and dataset composition - addition to the training dataset and retraining. In this paper, experiments were conducted to verify the performance of this method. First, the optimal hyperparameters of the deep learning object detection model to be applied in this experiment were determined through previous experiments. Then, in this experiment, training image format was determined, and experiments were conducted sequentially to check the long-term performance improvement through retraining of repeated false detection datasets. As a result, in the first experiment, it was found that the inclusion of the background in the inferred image was more advantageous for object detection performance than the removal of the background excluding the object. In the second experiment, it was found that retraining by accumulating false positives from each level of retraining was more advantageous than retraining independently for each level of retraining in terms of continuous improvement of object detection performance. 
After retraining on the false positive data with the method determined in the two experiments, the car object class showed excellent inference performance, with an AP value of 0.95 or higher after the first retraining, and by the fifth retraining its inference performance had improved by about 1.06 times compared to the initial inference. The person object class continued to improve as retraining progressed, and by the 18th retraining its inference performance had self-improved by more than 2.3 times compared to the initial inference.
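
The retraining cycle in this abstract (train, infer on validation data, correct false positives, add them to the training set, retrain) can be outlined as below. This is a hypothetical sketch: `train` and `find_false_positives` are caller-supplied placeholders, and the `cumulative` flag is only one reading of the paper's comparison between accumulating false positives across rounds and retraining on each round independently.

```python
def retrain_with_false_positives(train_set, validation_rounds, train,
                                 find_false_positives, cumulative=True):
    """Hypothetical outline of iterative retraining on false positives.

    `train(dataset)` returns a model; `find_false_positives(model, batch)`
    returns corrected false-positive samples for one validation batch.
    """
    base = list(train_set)
    accumulated = []
    model = train(base)
    history = []
    for batch in validation_rounds:
        false_positives = find_false_positives(model, batch)
        if cumulative:
            accumulated.extend(false_positives)  # keep every earlier round
        else:
            accumulated = list(false_positives)  # only the latest round
        model = train(base + accumulated)
        history.append(len(base) + len(accumulated))
    return model, history

# Toy stand-ins: the "model" is just the set of samples it has seen.
toy_train = lambda dataset: set(dataset)
toy_find = lambda model, batch: [s for s in batch if s not in model]
rounds = [["shadow_1", "car_1"], ["shadow_1", "light_2"]]
model, sizes = retrain_with_false_positives(["car_1"], rounds, toy_train, toy_find)
print(sorted(model), sizes)
```

In a real pipeline the placeholders would wrap detector training and a review step that labels false detections as background; the toy version only shows how the training set grows under each strategy.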

Classification of Radar Signals Using Machine Learning Techniques (기계학습 방법을 이용한 레이더 신호 분류)

  • Hong, Seok-Jun;Yi, Yearn-Gui;Choi, Jong-Won;Jo, Jeil;Seo, Bo-Seok
    • Journal of IKEEE / v.22 no.1 / pp.162-167 / 2018
  • In this paper, we propose a method to classify radar signals according to jamming technique by applying machine learning to parameter data extracted from received radar signals. At present, the military classifies radar signals by threat type using a library of radar signal parameters built mostly through preliminary investigation. However, because radar technology is continuously evolving and diversifying, this approach cannot properly classify signals from new threats or threat types that are absent from existing libraries, which limits the choice of appropriate jamming techniques. It is therefore necessary to classify signals so that the optimal jamming technique can be selected using only the parameter data of the radar signal, unlike the method that relies on the existing threat library. In this study, we propose a machine-learning-based method to cope with new forms of threat signals. The method trains a classifier composed of a hidden Markov model and a neural network on the existing library data, and then classifies a new threat signal according to the corresponding jamming method.