• Title/Summary/Keyword: generative model (생성형 모델)

A Korean menu-ordering sentence text-to-speech system using conformer-based FastSpeech2 (콘포머 기반 FastSpeech2를 이용한 한국어 음식 주문 문장 음성합성기)

  • Choi, Yerin;Jang, JaeHoo;Koo, Myoung-Wan
    • The Journal of the Acoustical Society of Korea / v.41 no.3 / pp.359-366 / 2022
  • In this paper, we present a Korean menu-ordering sentence text-to-speech (TTS) system using Conformer-based FastSpeech2. The Conformer is a convolution-augmented Transformer originally proposed for speech recognition. By combining the two structures, the Conformer extracts both local and global features better. It comprises two half-step feed-forward modules, one at the front and one at the end, sandwiching a multi-head self-attention module and a convolution module. We introduce the Conformer into Korean TTS because it is known to work well in Korean speech recognition. To compare a Transformer-based TTS model with a Conformer-based one, we train both FastSpeech2 and Conformer-based FastSpeech2. We collected a phoneme-balanced data set and used it to train our models. This corpus comprises not only general conversation but also menu-ordering conversation consisting mainly of loanwords, and it addresses the degradation of current Korean TTS models on loanwords. With speech synthesized using Parallel WaveGAN, the Conformer-based FastSpeech2 achieved a superior mean opinion score (MOS) of 4.04. We confirm that model performance improved when the structure was changed from Transformer to Conformer in Korean TTS.
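
The block ordering described in this abstract (a half-step feed-forward module, then self-attention, then convolution, then a second half-step feed-forward module) can be sketched as below. This is only an illustration of the residual wiring on a plain list of floats: the sub-modules are passed in as callables, and the function names, the stand-in modules in the usage note, and the closing layer normalization are assumptions for illustration, not the authors' implementation.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale or bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def conformer_block(x, ffn1, self_attention, conv_module, ffn2, norm=layer_norm):
    """Apply the sandwich structure; every sub-module is added as a residual."""
    x = [xi + 0.5 * fi for xi, fi in zip(x, ffn1(x))]        # first half-step feed-forward
    x = [xi + ai for xi, ai in zip(x, self_attention(x))]    # multi-head self-attention
    x = [xi + ci for xi, ci in zip(x, conv_module(x))]       # convolution module
    x = [xi + 0.5 * fi for xi, fi in zip(x, ffn2(x))]        # second half-step feed-forward
    return norm(x)
```

A quick way to check the wiring is to plug in trivial stand-ins, for example `conformer_block([1.0, 2.0], lambda v: [2 * a for a in v], lambda v: [0.0] * len(v), lambda v: [1.0] * len(v), lambda v: [2 * a for a in v], norm=lambda v: v)`, and follow the values through each residual step by hand.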

Human-like Whole Body Motion Generation of Humanoid Based on Simplified Human Model (단순인체모델 기반 휴머노이드의 인간형 전신동작 생성)

  • Kim, Chang-Hwan;Kim, Seung-Su;Ra, Syung-Kwon;You, Bum-Jae
    • The Journal of Korea Robotics Society / v.3 no.4 / pp.287-299 / 2008
  • People expect a humanoid robot to move as naturally as a human being does. Natural movements may allow a humanoid robot to provide people with safer physical services and to communicate with them more accurately through motion. This work presents a methodology for generating natural motions for a humanoid robot by converting human motion capture data. The methodology produces not only kinematically mapped motions but also dynamically mapped ones. The kinematic mapping reflects human-likeness in the converted motions, while the dynamic mapping ensures the stability of the humanoid robot's whole-body motions. The methodology consists of three processes: (a) human modeling, (b) kinematic mapping, and (c) dynamic mapping. The optimization-based human modeling gives the ZMP (Zero Moment Point) and COM (Center of Mass) time trajectories of an actor. Those trajectories are modified for a humanoid robot through the kinematic mapping. In addition to modifying the ZMP and COM trajectories, the lower-body (pelvis and legs) motion of the actor is scaled kinematically and converted to a motion feasible for the humanoid robot, taking dynamic aspects into account. The KIST humanoid robot, Mahru, imitated a dancing motion to evaluate the methodology and showed good agreement with the original motion.

Intelligent System based on Command Fusion and Fuzzy Logic Approaches - Application to mobile robot navigation (명령융합과 퍼지기반의 지능형 시스템-이동로봇주행적용)

  • Jin, Taeseok;Kim, Hyun-Deok
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.5 / pp.1034-1041 / 2014
  • This paper proposes a fuzzy inference model for obstacle avoidance by a mobile robot with an active camera, which intelligently searches for the goal location in unknown environments using command fusion based on situational commands from a vision sensor. Instead of the "physical sensor fusion" method, which generates the robot's trajectory from an environment model and sensory data, a "command fusion" method is used to govern the robot's motions. The navigation strategy is based on a combination of fuzzy rules tuned for both goal approach and obstacle avoidance. We describe experimental results obtained with the proposed method that demonstrate successful navigation using real vision data.
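
As a rough illustration of what fusing commands (rather than sensor data) can look like, the sketch below blends a goal-approach steering command with an obstacle-avoidance steering command, weighted by a fuzzy "obstacle is near" membership. The ramp-shaped membership function, its thresholds, and the function names are assumptions made for this example; the abstract does not specify the paper's rule base or membership functions.

```python
def near_membership(distance, near=0.5, far=2.0):
    """Degree in [0, 1] to which an obstacle counts as 'near' (linear ramp, assumed)."""
    if distance <= near:
        return 1.0
    if distance >= far:
        return 0.0
    return (far - distance) / (far - near)

def fuse_commands(goal_steer, avoid_steer, obstacle_distance):
    """Weighted blend of two steering commands (radians, hypothetical units)."""
    w_avoid = near_membership(obstacle_distance)  # weight of the avoidance behavior
    w_goal = 1.0 - w_avoid                        # remaining weight goes to goal approach
    return w_goal * goal_steer + w_avoid * avoid_steer
```

With these placeholder thresholds, a distant obstacle leaves the goal-approach command unchanged, a very close obstacle hands control entirely to the avoidance command, and intermediate distances give a smooth mix of the two.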

A Study on Automatic Generation for 3-Dimensional Geometry of Cylindrical Gear and Hob (원통 치차와 호브의 3차원 형상 자동 생성에 관한 연구)

  • Jeong, Tae-Hyeong;Ha, Yeong-Uk
    • Transactions of the Korean Society of Mechanical Engineers A / v.25 no.1 / pp.39-46 / 2001
  • When designing a gear, designers determine its basic dimensions from the transmitted power, considering strength, interference, and so on. However, designers cannot easily obtain the tooth profile generated by those dimensions or the geometry of the generating hob that cuts the tooth profile. To resolve these problems, an automatic design system is developed that creates solid models of both the gear and the generating hob from the design parameters. With the developed system, designers can improve design efficiency and satisfy varying design requirements. In this research, three-dimensional solid models of spur and helical gears are generated from the design parameters, and those of the generating hob are created automatically from the hob's design parameters. The system is developed in Visual Basic, and its three-dimensional geometric modeling module is built on SolidWorks.

A Study on NOx Removal Efficiency Depending on Electrode Configurations of Silent Discharges (무성방전 플라즈마 전극구조에 대한 질소산화물 제거효율 연구)

  • Hyung-Taek Kim;Young-Sik Chung;Myung-Whan Whang;Elena A. Filimonova
    • Journal of the Korean Society of Safety / v.17 no.3 / pp.112-117 / 2002
  • A comparative investigation of experiments and a chemical-kinetics simulation of NOx removal by silent (dielectric-barrier) discharges is presented. Several types of dielectric-barrier discharges were implemented, depending on the electrode configuration. The simulation was based on an approximate mathematical model for plasma cleaning of waste gas. The influence of non-uniform species distributions caused by the production of primary active particles in the streamer channel was taken into account. The experimentally observed and the calculated NOx removal efficiencies showed acceptable agreement.

A Speech Recognition in a Wireless Network Environment (무선 네트워크 환경 하에서의 음성인식에 관한 고찰)

  • Lim Soo-Ho;Shen Guang-Hu;Hahm Seong-Jun;Kim Joo-Gon;Jung Ho-Youl;Chung Hyun-Yeol
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.61-64 / 2004
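  • Portable terminals such as PDAs (Personal Digital Assistants) have recently come into the spotlight as information terminals, driven by various multimedia and wireless Internet technologies. However, the processor and memory limits of current terminals make it difficult to build a smoothly running speech recognition system on them. To compensate for this, this paper builds a speech recognition system split into a client and a server. The system uses a wireless network to transmit speech files or feature parameters from the PDA to the server, performs speech recognition on the server side, and returns the result to the mobile terminal. To evaluate the system, acoustic models were built from the Korean Language Engineering center's speech DB (KLE 452DB); utterances were then spoken in various environments (laboratory, corridor, parking lot, library lobby), transmitted over the campus wireless Internet network (Nespot), and recognized in real time. The experiments yielded average recognition rates of 84.04%, 72.28%, 69.47%, and 67.61%, respectively.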

Traffic Collision Detection at Intersections based on Motion Vector and Staying Period of Vehicles (차량의 움직임 벡터와 체류시간 기반의 교차로 추돌 검출)

  • Shin, Youn-Chul;Park, Joo-Heon;Lee, Myeong-Jin
    • Journal of Advanced Navigation Technology / v.17 no.1 / pp.90-97 / 2013
  • Recently, intelligent transportation systems based on image processing have been developed. In this paper, we propose a collision detection algorithm based on the analysis of the motion vectors and staying periods of vehicles in intersections. Objects in the region of interest are extracted from the subtraction image between input images and background images based on a Gaussian mixture model. Collisions and traffic jams are detected by analyzing the measured motion vectors of vehicles and their staying periods in intersections. Experiments are performed on video sequences recorded at real intersections. The correct detection rate and false alarm rate are 85.7% and 7.7%, respectively.
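
The two cues named in this abstract, motion vectors and staying period, can be computed from a tracked vehicle's centroid positions as in the sketch below. The abstract does not give the paper's decision rule, so the rule used here (a track that has remained in the region of interest for many frames and has almost stopped is flagged) and all thresholds and function names are assumptions for illustration only.

```python
import math

def motion_vectors(centroids):
    """Per-frame displacement vectors between consecutive centroid positions."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(centroids, centroids[1:])]

def staying_period(centroids, roi):
    """Number of frames in which the centroid lies inside the rectangular ROI."""
    x_min, y_min, x_max, y_max = roi
    return sum(1 for x, y in centroids if x_min <= x <= x_max and y_min <= y <= y_max)

def is_suspicious(centroids, roi, max_stay_frames=90, min_speed=0.5, window=10):
    """Flag a track that stays long in the ROI and has nearly stopped (assumed rule)."""
    recent = motion_vectors(centroids)[-window:]   # most recent displacement vectors
    if not recent:
        return False
    mean_speed = sum(math.hypot(dx, dy) for dx, dy in recent) / len(recent)
    return staying_period(centroids, roi) > max_stay_frames and mean_speed < min_speed
```

A vehicle that crosses the intersection at a steady pace leaves the region before the staying threshold is reached, while a vehicle that halts inside it for longer than `max_stay_frames` frames is flagged for closer inspection.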

Compiling Haskell to Java via an Intermediate Code L (중간언어 L-코드를 이용한 Haskell-Java 언어 번역기 구현)

  • Choi, Kwang-Hoon;Han, Tai-Sook
    • Journal of KIISE:Software and Applications / v.28 no.12 / pp.955-965 / 2001
  • We propose a systematic method of compiling Haskell for the Java Virtual Machine (JVM) based on the Spineless Tagless G-machine (STGM). We introduce an intermediate language called L-code, in which each micro-operation of the machine is identified by an instruction and each macro-operation by a binding. Each L-code instruction can be easily translated into Java statements. Once the representation is determined, an L-code program obtained from an STG program is translated into a Java program according to our compilation rules. Our experiment shows that the execution times of the translated benchmarks are competitive with those of the Haskell interpreter Hugs, particularly when the Glasgow Haskell Compiler's STG-level optimizations are applied.

Collision-free Path Planning Using Genetic Algorithm (유전자 알고리즘을 이용한 충돌회피 경로계획)

  • Lee, Dong-Hwan;Zhao, Ran;Lee, Hong-Kyu
    • Journal of Advanced Navigation Technology / v.13 no.5 / pp.646-655 / 2009
  • This paper presents a new search strategy based on models of evolution to solve the problem of collision-free robotic path planning. We designed the robot path planning method around a genetic algorithm, which has become a well-known technique for optimization and intelligent search. Treating the path points as genes in a chromosome provides a number of possible solutions on a given map. The path distance produced by each chromosome is then used as the fitness measure for that chromosome. The effectiveness of the proposed genetic algorithm for path planning was demonstrated by simulation. The proposed search strategy can handle multiple static obstacles.
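
The encoding described in this abstract (path points as genes, path length as fitness) can be illustrated with a minimal sketch. Everything beyond that encoding is an assumption made for this example rather than the authors' design: the 10 x 10 map, circular obstacles, the collision penalty, truncation selection, one-point crossover, and Gaussian mutation.

```python
import math
import random

def segment_hits_circle(p, q, center, radius):
    """True if the segment p-q passes within `radius` of `center`."""
    (px, py), (qx, qy), (cx, cy) = p, q, center
    dx, dy = qx - px, qy - py
    seg_len_sq = dx * dx + dy * dy
    t = 0.0 if seg_len_sq == 0.0 else max(0.0, min(1.0, ((cx - px) * dx + (cy - py) * dy) / seg_len_sq))
    return math.dist((px + t * dx, py + t * dy), center) < radius

def path_cost(waypoints, start, goal, obstacles, penalty=1000.0):
    """Path length plus a large penalty for every segment that crosses an obstacle."""
    pts = [start] + list(waypoints) + [goal]
    cost = 0.0
    for p, q in zip(pts, pts[1:]):
        cost += math.dist(p, q)
        cost += sum(penalty for c, r in obstacles if segment_hits_circle(p, q, c, r))
    return cost

def evolve(start, goal, obstacles, n_waypoints=3, pop_size=60, generations=200, seed=0):
    """Evolve chromosomes (lists of waypoints); lower cost means fitter."""
    rng = random.Random(seed)
    lo, hi = 0.0, 10.0  # assumed map bounds
    clamp = lambda v: min(hi, max(lo, v))
    population = [[(rng.uniform(lo, hi), rng.uniform(lo, hi)) for _ in range(n_waypoints)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda c: path_cost(c, start, goal, obstacles))
        parents = population[: pop_size // 2]          # truncation selection, parents survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_waypoints - 1)       # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.3:                      # mutate one gene
                i = rng.randrange(n_waypoints)
                x, y = child[i]
                child[i] = (clamp(x + rng.gauss(0, 0.5)), clamp(y + rng.gauss(0, 0.5)))
            children.append(child)
        population = parents + children
    best = min(population, key=lambda c: path_cost(c, start, goal, obstacles))
    return best, path_cost(best, start, goal, obstacles)
```

For example, `evolve((0.0, 0.0), (10.0, 10.0), [((5.0, 5.0), 1.5)])` returns the best waypoint list found and its cost. Because the penalty is far larger than any path that fits on the map, a cost below the penalty means the returned path avoids the obstacle.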

Advanced JPEG bit rate control for the mobile multimedia device (이동형 멀티미디어 기기를 위한 개선된 JPEG 비트율 조절 알고리즘)

  • Yang, Yoon-Gi;Lee, Chang-Su;Kim, Jin-Yul
    • Journal of Korea Multimedia Society / v.11 no.5 / pp.579-587 / 2008
  • Typically, the file sizes of JPEG-compressed images vary with image complexity even when the image dimensions are the same. It is therefore not easy to estimate how many more images can be stored in the limited storage of a digital camera. To solve this problem, bit rate control modifies the quantization table. Previous work assumed a linear relation between image activity and the modification factor of the quantization table; in this paper, more accurate functional relations based on statistics are employed to improve the accuracy of bit rate control. Computer simulations reveal that the standard deviation of the bit rate error of the proposed scheme is 50% less than that of the conventional method.
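
To make the mechanism in this abstract concrete, the sketch below scales a quantization table by a modification factor derived from an image activity measure, using the linear relation that the abstract attributes to previous work. The activity measure, the slope and intercept values, and the function names are placeholders chosen for illustration; the paper's statistics-based relation is not reproduced here.

```python
def image_activity(pixels):
    """Mean absolute difference between horizontally adjacent pixels (assumed measure)."""
    total, count = 0, 0
    for row in pixels:
        for a, b in zip(row, row[1:]):
            total += abs(a - b)
            count += 1
    return total / count if count else 0.0

def linear_scale_factor(activity, slope=0.05, intercept=0.5):
    """Linear activity-to-factor model; slope and intercept are placeholder values."""
    return intercept + slope * activity

def scale_quant_table(table, factor):
    """Multiply every quantization step by `factor`, clamped to the 8-bit range 1..255."""
    return [[max(1, min(255, round(q * factor))) for q in row] for row in table]
```

A larger factor gives coarser quantization steps and therefore a smaller file, so a busier image (higher activity) is assigned a larger factor to keep its size close to that of a smoother image.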