• Title/Summary/Keyword: 문자-음성 변환 (text-to-speech conversion)

Search Result 52, Processing Time 0.023 seconds

Design and Implementation of Korean Text-to-Speech System (다이폰을 이용한 한국어 문자-음성 변환 시스템의 설계 및 구현)

  • 정준구
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06c / pp.91-94 / 1994
  • This paper presents the design and implementation of a Korean text-to-speech system. Parametric synthesis is used as the speech synthesis method, with PARCOR coefficients, obtained by LPC analysis, as the acoustic parameters. The synthesis unit is the diphone, which retains the basic naturalness of human speech; the diphone database consists of 1,228 PCM files. LPC synthesis has the drawback that it degrades the clarity of synthesized speech when synthesizing unvoiced sounds. We improve the clarity of the synthesized speech by using the residual signal as the excitation signal for unvoiced sounds. In addition, to improve naturalness, we control the prosody of the synthesized speech by controlling the energy and pitch patterns. The synthesis system is implemented on a PC/486 and uses a 70 Hz to 4.5 kHz band-pass filter for speech input/output, an amplifier, and a TMS320C30 DSP board.

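The abstract above names PARCOR coefficients from LPC analysis and a residual-driven excitation, without giving the authors' procedure. As an illustration only, the following minimal sketch uses the standard Levinson-Durbin recursion to obtain the reflection (PARCOR) coefficients of one frame, then shows that driving the all-pole synthesis filter with the prediction residual reproduces the frame. The function names, the frame values, and the sign convention (A(z) = 1 + a1 z^-1 + ...) are assumptions, not details from the paper.

```python
def autocorrelation(frame, max_lag):
    """Return r[0..max_lag] for one speech frame (autocorrelation method)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations with the Levinson-Durbin recursion.

    Returns (a, k, err): predictor polynomial a with a[0] == 1,
    reflection (PARCOR) coefficients k[0..order-1], and the final
    prediction-error energy. Sign conventions for PARCOR vary by textbook.
    """
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        km = -acc / err
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + km * prev[m - i]
        a[m] = km
        err *= 1.0 - km * km
        k.append(km)
    return a, k, err


def residual(frame, a):
    """Inverse-filter the frame with A(z) to get the prediction residual."""
    return [sum(a[i] * frame[n - i] for i in range(len(a)) if n - i >= 0)
            for n in range(len(frame))]


def synthesize(excitation, a):
    """All-pole synthesis filter 1/A(z) driven by the given excitation."""
    out = []
    for n, e in enumerate(excitation):
        feedback = sum(a[i] * out[n - i] for i in range(1, len(a)) if n - i >= 0)
        out.append(e - feedback)
    return out


# Hypothetical 8-sample frame, used only to exercise the functions.
frame = [0.0, 1.0, 0.5, -0.3, -0.8, 0.2, 0.6, -0.1]
a, parcor, energy = levinson_durbin(autocorrelation(frame, 2), 2)
rebuilt = synthesize(residual(frame, a), a)  # residual as excitation
```

A pulse or noise excitation discards the fine structure that the residual keeps, which is one plausible reason a residual excitation would help with unvoiced sounds; the abstract does not state the authors' reasoning.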

Automatic sentence segmentation of subtitles generated by STT (STT로 생성된 자막의 자동 문장 분할)

  • Kim, Ki-Hyun;Kim, Hong-Ki;Oh, Byoung-Doo;Kim, Yu-Seop
    • Annual Conference on Human and Language Technology / 2018.10a / pp.559-560 / 2018
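  • Long Short-Term Memory (LSTM), a recurrent neural network (RNN) architecture, performs well in natural language processing. Services that generate subtitles with speech-to-text (STT), which converts speech into text, and simultaneously translate the generated subtitles into other languages are being actively deployed. When subtitles are extracted with STT, the output is a single run of connected text with no periods, which makes accurate translation impossible. To improve the accuracy of automatic translation of English subtitles, this paper proposes a method that segments the text into sentences and generates periods. An LSTM trained on the data and then tested predicted the positions of periods with 62.3% accuracy.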

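The abstract above does not describe how the training data are encoded or how the 62.3% figure is computed. One common way to frame period prediction, assumed here for illustration, is per-token binary labelling: each token gets label 1 if a period follows it, else 0. The sketch below covers only that data framing and a simple per-token accuracy; it does not include the LSTM, and all function names are hypothetical.

```python
def to_labels(text):
    """Turn punctuated text into lowercase tokens and 0/1 end-of-sentence labels."""
    tokens, labels = [], []
    for word in text.split():
        core = word.rstrip(".").lower()
        if not core:
            continue  # skip stray periods
        tokens.append(core)
        labels.append(1 if word.endswith(".") else 0)
    return tokens, labels


def restore(tokens, labels):
    """Rebuild text, adding a period after each token labelled 1."""
    words = []
    sentence_start = True
    for token, label in zip(tokens, labels):
        word = token[:1].upper() + token[1:] if sentence_start else token
        if label == 1:
            word += "."
        words.append(word)
        sentence_start = label == 1
    return " ".join(words)


def accuracy(predicted, gold):
    """Fraction of token positions where the predicted label matches the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("label sequences must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Training pairs come from punctuated text; at test time the model sees
# only the unpunctuated tokens and must predict the labels.
tokens, gold = to_labels("The model is trained. It predicts periods.")
segmented = restore(tokens, gold)
```

Per-token accuracy is dominated by the many non-boundary tokens, so precision and recall on the boundary label are often reported alongside it; which metric the paper used is not stated in the abstract.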

Automatic Generation of Pronunciation Variants for Korean Continuous Speech Recognition (한국어 연속음성 인식을 위한 발음열 자동 생성)

  • 이경님;전재훈;정민화
    • The Journal of the Acoustical Society of Korea / v.20 no.2 / pp.35-43 / 2001
  • Many speech recognition systems use a pronunciation lexicon with possibly multiple phonetic transcriptions for each word. The pronunciation lexicon is often created manually. This process requires a lot of time and effort, and it is very difficult to keep the lexicon consistent. To handle these problems, we present a model based on morphophonological analysis for automatically generating Korean pronunciation variants. By analyzing phonological variations frequently found in spoken Korean, we derived about 700 phonemic contexts that trigger the multilevel application of the corresponding phonological processes, which consist of phonemic and allophonic rules. When generating pronunciation variants, morphological analysis is performed first to handle variations of phonological words. According to the morphological category, a set of tables reflecting phonemic context is looked up to generate pronunciation variants. Our experiments show that the proposed model produces mostly correct pronunciation variants of phonological words. We then evaluated the usefulness of the pronunciation lexicon and the training phonetic transcriptions produced with the proposed system.

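The table lookup described in the abstract above can be pictured with a toy example. The sketch below is not the authors' model: it skips morphological analysis, applies rules in a single pass rather than at multiple levels, and uses four illustrative rules in place of the roughly 700 contexts the paper reports. The romanized phoneme symbols, the rule table, and the function names are all assumptions. Three rules model obstruent nasalization before a nasal (treated as obligatory), and one models an optional place assimilation.

```python
from itertools import product

NASALS = {"n", "m"}

# (phoneme, following phonemes that trigger the rule, replacement, optional?)
# Illustrative entries only; a real system would hold far more contexts.
RULES = [
    ("k", NASALS, "ng", False),    # e.g. k u k m u l -> k u ng m u l
    ("t", NASALS, "n", False),
    ("p", NASALS, "m", False),
    ("n", {"m", "p"}, "m", True),  # optional: s i n m u n ~ s i m m u n
]


def candidates(phonemes, i):
    """Return the possible surface forms of phonemes[i] given its right neighbour."""
    current = phonemes[i]
    following = phonemes[i + 1] if i + 1 < len(phonemes) else None
    for target, context, replacement, optional in RULES:
        if current == target and following in context:
            return [current, replacement] if optional else [replacement]
    return [current]


def pronunciation_variants(phonemes):
    """Enumerate every variant allowed by one pass over the rule table."""
    options = [candidates(phonemes, i) for i in range(len(phonemes))]
    return [list(variant) for variant in product(*options)]


obligatory = pronunciation_variants("k u k m u l".split())
optional = pronunciation_variants("s i n m u n".split())
```

An obligatory rule yields a single rewritten form, while an optional rule doubles the number of variants at that position, which is how a lexicon entry ends up with multiple transcriptions.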

Cyber Threats Analysis of AI Voice Recognition-based Services with Automatic Speaker Verification (화자식별 기반의 AI 음성인식 서비스에 대한 사이버 위협 분석)

  • Hong, Chunho;Cho, Youngho
    • Journal of Internet Computing and Services / v.22 no.6 / pp.33-40 / 2021
  • Automatic Speech Recognition (ASR) is a technology that analyzes human speech as speech signals and automatically converts them into character strings that humans can understand. Speech recognition technology has evolved from recognizing single words to recognizing sentences consisting of multiple words. In real-time voice conversation, a high recognition rate makes natural information delivery more convenient and expands the scope of voice-based applications. On the other hand, as speech recognition technology is applied more widely, concerns about related cyber attacks and threats are also increasing. Existing studies actively pursue development of the technology itself, such as the design of Automatic Speaker Verification (ASV) techniques and improvements in accuracy. However, few studies analyze attacks and threats in depth or in variety. In this study, we propose a cyber attack model that bypasses voice authentication by simply manipulating voice frequency and voice speed against AI voice recognition services equipped with automated identification technology, and we analyze cyber threats by conducting extensive experiments on the automated identification systems of commercial smartphones. Through this, we intend to convey the seriousness of the related cyber threats and raise interest in research on effective countermeasures.

A study of Implementing An Embedded System for Conversion from Text to Speech (문서-음성 변환 임베디드 시스템 구축에 관한 연구)

  • Lee, Hyun-Chang;Seo, Jeong-Man
    • Journal of the Korea Society of Computer and Information / v.13 no.3 / pp.77-83 / 2008
  • With the recent development and spread of information technology (IT) software and hardware, people with disabilities appear to experience a wider information gap when using IT. IT devices are important tools that let users, including people with disabilities, communicate with each other and obtain information. Although Korea is rapidly becoming an aging society, products for people with disabilities are seldom available when they are needed. Visual impairment is one of the functional disorders that become more common with age. Tools such as braille allow people with visual impairment to communicate or obtain information from books. Compared with general books, however, braille books cover far less of the information available in society. Therefore, in this paper, we implement and present an embedded system that lets people with visual impairment obtain information by scanning text, extracting characters, and automatically converting the text to speech.

The Impact of Speech-To-Text-based Class on Learners' Cognitive Abilities

  • HyunMin Kang;SunKwan Han
    • Journal of the Korea Society of Computer and Information / v.29 no.1 / pp.287-293 / 2024
  • This research studied the cognitive impact of classes using artificial intelligence on aviation technical school students. First, we developed two types of classes: one based on traditional presentation materials and one built on speech-to-text (STT)-based artificial intelligence materials. A total of 133 students from an aviation education institution participated in the two types of classes. We measured students' cognitive load and mind-wandering test results before and after class and conducted an achievement evaluation. The analysis confirmed that extraneous cognitive load was reduced, concentration on content increased, and achievement improved. We hope that AI-based STT classes will be widely used in schools that teach technology.

A Design and Implementation of Speech Recognition and Synthetic Application for Hearing-Impairment

  • Kim, Woo-Lin;Ham, Hye-Won;Yun, Sang-Un;Lee, Won Joo
    • Journal of the Korea Society of Computer and Information / v.26 no.12 / pp.105-110 / 2021
  • In this paper, we design and implement an Android mobile application that helps hearing-impaired people communicate, based on STT (Speech-to-Text) and TTS (Text-to-Speech) APIs and the accelerometer sensor of a smartphone. The application records what the hearing-impaired person's interlocutor is saying with a microphone, converts it to text using the STT API, and displays it to the hearing-impaired person. In addition, when a hearing-impaired person enters text, it is converted into voice using the TTS API and played to the interlocutor. An accelerometer-based background service launches the application when the hearing-impaired person shakes the smartphone. The application implemented in this paper lets hearing-impaired people communicate easily with others without relying on sign language over a video call.

The Implementation Directions and an Analysis of Assistive Devices and Alternative Formats to Improve Accessibility for Disabled People (장애인 접근성 향상을 위한 보조기기 및 대체자료 분석과 구현 방향)

  • Rim, Myunghwan;Gil, Younhee;Jeon, Gwangil
    • The Journal of the Korea Contents Association / v.15 no.7 / pp.664-673 / 2015
  • Assistive devices for disabled people are receiving attention, even from an industrial perspective, as a result of policies and support for disabled people, the enactment of regulations to improve accessibility, technological innovation, and product development. Recently, products of ICT development such as screen readers for visually impaired people, braille displays, screen enlargers, and text converters have made it more convenient to access the internet through touch and hearing, use electronic publishing content, and send e-mail. Even so, in the rapidly changing era of digital media and smart devices, accessibility for visually impaired people is still poor, and assistive devices and alternative formats need improvement. Therefore, from the perspective of research and development innovation, this study analyzes the current state and structure of alternative formats and assistive devices for visually impaired people and proposes implementation directions for improving accessibility. In the future, various types of digital information are expected to be converted into customized, realistic forms and distributed through dedicated products for people with disabilities or through smart devices.

A Machine-to-machine based Intelligent Walking Assistance System for Visually Impaired Person (시각장애인을 위한 M2M 기반의 지능형 보행보조시스템)

  • Kang, Chang-Soon;Jo, Hwa-Seop;Kim, Byung-Hee
    • The Journal of Korean Institute of Communications and Information Sciences / v.36 no.3B / pp.287-296 / 2011
  • The white cane mainly used by visually impaired people has difficulty providing location information and effective countermeasures for emergency situations encountered while walking, as well as detecting floating obstacles on the ground. In this paper, we propose a machine-to-machine (M2M) based intelligent walking assistance system for safe and convenient walking by the visually impaired. The proposed system consists of a walking assistance stick used by the visually impaired and a server that supports multiple stick users in remote places through mobile communication networks. The stick, equipped with ultrasonic sensors, a GPS (Global Positioning System) receiver, and a vibrator, not only detects floating obstacles but also tells users their present location using text-to-voice conversion technology. Besides providing geographic information, the server notifies guardians and aid agencies of users' locations in an emergency, and it provides log information recorded during walking, such as the place, time, and number of accidents. Test results with a developed prototype show that the system performs these functions properly and satisfies overall system performance requirements.

The Design for Self-care System Based on RFID (RFID를 이용한 Self-care System 설계)

  • Xiao, Huang;Zhou, Kun-Peng;Jin, Woo-Jeong;Cho, Yong-Soon;Jung, Hoe-Kyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2010.05a / pp.879-881 / 2010
  • With the rapid development of society, small families and one-person households are becoming more common. As the traditional family changes, more and more older people stay home alone, and their health and safety deserve attention. With the rapid development of RFID (Radio Frequency Identification) technology, its applications have extended to all areas of our lives, and RFID has become a major topic of concern across many industries. With high-speed economic growth and the development of science and medicine, the life expectancy of older people is increasing. It is therefore necessary to design a protective system for their safety. In this work, a self-care system is built that uses RFID technology to authenticate a user, TTS (text-to-speech) to convert character information to voice information, infrared technology to protect the home effectively, and electronic blood pressure monitors to examine the health of older people.
