• Title/Summary/Keyword: Korean Speech Engineering Systems


A Study on the Isolated word Recognition Using One-Stage DMS/DP for the Implementation of Voice Dialing System

  • Seong-Kwon Lee
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.1039-1045 / 1994
  • Speech recognition systems based on VQ often suffer from a reduced recognition rate because MSVQ assigns dissimilar vectors to the same segment. In this paper, we apply the One-Stage DMS/DP algorithm in recognition experiments to examine to what degree this problem can be solved. The recognition experiments are performed on Korean DDD area names with a DMS model of 20 sections and word-unit templates. We carried out speaker-dependent and speaker-independent experiments and obtained recognition rates of 97.7% and 81.7%, respectively.

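As a rough illustration of DP-based template matching for isolated words, the sketch below uses plain dynamic time warping (DTW) against word-unit templates. It is not the One-Stage DMS/DP algorithm of the paper: the 20-section DMS model is omitted, and the function names, the toy one-dimensional features, and the two area-name templates are all assumptions made for the example.

```python
import math

def dtw_distance(a, b):
    """Accumulated alignment cost between two feature sequences (plain DTW)."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local frame-to-frame distance
            # Allow a vertical, horizontal, or diagonal step in the DP grid.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def recognize(utterance, templates):
    """Return the template word with the smallest DTW cost to the utterance."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))

# Hypothetical toy templates: one short 1-D feature sequence per word.
templates = {
    "seoul": [(0.0,), (1.0,), (2.0,), (1.0,)],
    "busan": [(3.0,), (3.0,), (0.0,), (0.0,)],
}
# A time-stretched version of the "seoul" template.
utterance = [(0.0,), (1.0,), (1.0,), (2.0,), (1.0,)]
best_word = recognize(utterance, templates)
```

Because DTW lets one template frame absorb several input frames, the stretched utterance still aligns with its template at zero cost, which is the property that makes DP matching tolerant of speaking-rate variation.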

Towards Effective Entity Extraction of Scientific Documents using Discriminative Linguistic Features

  • Hwang, Sangwon;Hong, Jang-Eui;Nam, Young-Kwang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.3 / pp.1639-1658 / 2019
  • Named entity recognition (NER) is an important technique for improving the performance of data mining and big data analytics. In previous studies, NER systems have been employed to identify named-entities using statistical methods based on prior information or linguistic features; however, such methods are limited in that they are unable to recognize unregistered or unlearned objects. In this paper, a method is proposed to extract objects, such as technologies, theories, or person names, by analyzing the collocation relationship between certain words that simultaneously appear around specific words in the abstracts of academic journals. The method is executed as follows. First, the data is preprocessed using data cleaning and sentence detection to separate the text into single sentences. Then, part-of-speech (POS) tagging is applied to the individual sentences. After this, the appearance and collocation information of the other POS tags is analyzed, excluding the entity candidates, such as nouns. Finally, an entity recognition model is created based on analyzing and classifying the information in the sentences.
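The pipeline described in the abstract above (sentence detection, POS tagging, then collecting the non-noun context around noun candidates) can be sketched roughly as follows. This is only a toy under stated assumptions: the tag lexicon `TOY_TAGS`, the rule that unknown words default to `NOUN`, the window size, and all function names are hypothetical and do not come from the paper, which would rely on a real POS tagger and a trained classification model.

```python
import re

# Hypothetical mini-lexicon; any word not listed is treated as a noun candidate.
TOY_TAGS = {
    "we": "PRON", "propose": "VERB", "a": "DET", "the": "DET",
    "based": "VERB", "on": "ADP", "uses": "VERB",
}

def split_sentences(text):
    """Sentence detection: split after sentence-final punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def pos_tag(sentence):
    """Toy POS tagging by dictionary lookup."""
    tokens = re.findall(r"[A-Za-z][A-Za-z-]*", sentence)
    return [(tok, TOY_TAGS.get(tok.lower(), "NOUN")) for tok in tokens]

def context_features(tagged, window=2):
    """For each noun candidate, collect the non-noun word/tag pairs around it."""
    features = []
    for i, (tok, tag) in enumerate(tagged):
        if tag != "NOUN":
            continue
        neighbours = tagged[max(0, i - window):i] + tagged[i + 1:i + 1 + window]
        context = tuple(f"{w.lower()}/{t}" for w, t in neighbours if t != "NOUN")
        features.append((tok, context))
    return features

text = "We propose a method based on LSTM. The model uses attention."
sentences = split_sentences(text)
features = context_features(pos_tag(sentences[0]))
```

Context tuples such as `("based/VERB", "on/ADP")` are the kind of collocation evidence a downstream classifier could use to decide whether a candidate is, for example, a technology name.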

Optimization of Memristor Devices for Reservoir Computing (축적 컴퓨팅을 위한 멤리스터 소자의 최적화)

  • Kyeongwoo Park;HyeonJin Sim;HoBin Oh;Jonghwan Lee
    • Journal of the Semiconductor & Display Technology / v.23 no.1 / pp.1-6 / 2024
  • Recently, artificial neural networks have been playing a crucial role and advancing across various fields. Artificial neural networks are typically categorized into feedforward neural networks and recurrent neural networks. However, feedforward neural networks are primarily used for processing static spatial patterns such as image recognition and object detection. They are not suitable for handling temporal signals. Recurrent neural networks, on the other hand, face the challenges of complex training procedures and requiring significant computational power. In this paper, we propose memristors suitable for an advanced form of recurrent neural networks called reservoir computing systems, utilizing a mask processor. Using the characteristic equations of Ti/TiOx/TaOy/Pt, Pt/TiOx/Pt, and Ag/ZnO-NW/Pt memristors, we generated current-voltage curves to verify their memristive behavior through the confirmation of hysteresis. Subsequently, we trained and inferred reservoir computing systems using these memristors with the NIST TI-46 database. Among these systems, the accuracy of the reservoir computing system based on Ti/TiOx/TaOy/Pt memristors reached 99%, confirming the Ti/TiOx/TaOy/Pt memristor structure's suitability for inferring speech recognition tasks.


Arduino IoT Studio based on 5W1H Programming Model for non Programmer

  • Im, Hong-Gab;Baek, Yeong-Tae;Lee, Se-Hoon;Kim, Ji-Seong;Sin, Bo-Bae
    • Journal of the Korea Society of Computer and Information / v.22 no.2 / pp.29-35 / 2017
  • In this paper, we present a 5W1H programming model for people without IT experience who are not familiar with computer programming and for those who need programming education. Based on this model, we can design a development tool with which beginners can program easily. The tool applies the 5W1H concept: the user constructs a sentence that satisfies the control conditions 'Who, When, Where, What, and How', which are the sentence elements of 5W1H. The user can therefore develop the target system as if constructing a sentence, without learning the programming language of the target system. To verify the effectiveness of the proposed 5W1H programming model, we applied the concept to Arduino and developed the development tool as a first verification, and then applied it to a speech recognition smart home development platform as a second verification.

Communication Aid System For Dementia Patients (치매환자를 위한 대화 보조 시스템)

  • Sung-Ill Kim;Byoung-Chul Kim
    • Journal of Biomedical Engineering Research / v.23 no.6 / pp.459-465 / 2002
  • The goal of the present research is to improve the quality of life of both elderly patients with dementia and their caregivers. For this purpose, we developed a communication aid system that consists of three modules: a speech recognition engine, a graphical agent, and a database classified by nursing schedule. The system was evaluated in the actual environment of a nursing facility by introducing it to an older male patient with dementia. A comparison study was then carried out with and without the system. Occupational therapists evaluated the subject's reaction to the system by photographing his behavior. The evaluation results revealed that the proposed system was more responsive in catering to the needs of the subject than professional caregivers. Moreover, the frequency of the subject's utterances increased after the system was introduced.

LLM-based chatbot system to improve worker efficiency and prevent safety incidents (작업자의 업무 능률 향상과 안전 사고 방지를 위한 LLM 기반 챗봇 시스템)

  • Doohwan Kim;Yohan Han;Inhyuk Jeong;Yeongseok Hwnag;Jinju Park;Nahyeon Lee;Yujin Lee
    • Proceedings of the Korean Society of Computer Information Conference / 2024.01a / pp.321-324 / 2024
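  • This paper proposes a chatbot system that combines LLMs (Large Language Models) with STT. In manufacturing plants, insufficient safety training and the growing number of foreign workers are emerging as new challenges in work environments where safety is critical. This study aims to address these problems with an innovative chatbot system that uses a language model and speech recognition (Speech-to-Text, STT) technology. The proposed system helps workers easily access equipment manuals and safety guidelines, and enables fast and accurate responses in emergencies. In this study, the LLM identifies the worker's intent, and the STT technology processes voice commands effectively. The experimental results confirmed that the system is effective in increasing workers' efficiency and in overcoming language barriers. This study is expected to contribute to improving worker safety and work efficiency at manufacturing sites.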


Deep Level Situation Understanding for Casual Communication in Humans-Robots Interaction

  • Tang, Yongkang;Dong, Fangyan;Yoichi, Yamazaki;Shibata, Takanori;Hirota, Kaoru
    • International Journal of Fuzzy Logic and Intelligent Systems / v.15 no.1 / pp.1-11 / 2015
  • The concept of Deep Level Situation Understanding is proposed to realize human-like natural communication (called casual communication) among multiple agents (e.g., humans and robots/machines). Deep level situation understanding consists of surface level understanding (such as gesture/posture understanding, facial expression understanding, and speech/voice understanding), emotion understanding, intention understanding, and atmosphere understanding, obtained by applying the customized knowledge of each agent and by taking thoughtfulness into consideration. The proposal aims to reduce the burden on humans in humans-robots interaction, to realize harmonious communication by avoiding unnecessary troubles or misunderstandings among agents, and ultimately to help create a peaceful, happy, and prosperous humans-robots society. A simulated experiment is carried out to validate the deep level situation understanding system on a scenario in which a meeting-room reservation is made between a human employee and a secretary-robot. The proposed system is intended for use in service robot systems to smooth communication and avoid misunderstanding among agents.

Design of a Multi-Agent System Architecture for Implementing CPFR (CPFR 구현을 위한 다중 에이전트 시스템 구조설계)

  • Kim, Chang-Ouk;Kim, Sun-II;Yoon, Jung-Wook;Park, Yun-Sun
    • Journal of Korean Institute of Industrial Engineers / v.30 no.1 / pp.1-10 / 2004
  • Advances in Internet technology have changed traditional production planning and control methods. In particular, collaboration between participants in supply chains is being increasingly addressed in industry to enhance chain-wide productivity. A representative paradigm that emphasizes collaboration in production planning and control is CPFR (Collaborative Planning, Forecasting and Replenishment). In this paper, we present a multi-agent system architecture that supports the collaborations specified in CPFR. The architecture consists of an event manager, a data view agent, a business rule agent, and a collaboration agent. The collaboration agent systematically controls negotiation between supplier and buyer with the aid of a collaboration protocol and a blackboard. The multi-agent system has been implemented with EJB (Enterprise Java Beans).

Acoustic Monitoring and Localization for Social Care

  • Goetze, Stefan;Schroder, Jens;Gerlach, Stephan;Hollosi, Danilo;Appell, Jens-E.;Wallhoff, Frank
    • Journal of Computing Science and Engineering / v.6 no.1 / pp.40-50 / 2012
  • The increase in the number of older people due to demographic change poses great challenges to social healthcare systems in both Western and Eastern countries. Support for older people by formal care givers requires enormous temporal and personal efforts. Therefore, one of the most important goals is to increase the efficiency and effectiveness of today's care. This can be achieved by the use of assistive technologies. These technologies are able to increase the safety of patients or to reduce the time needed for tasks that do not relate to direct interaction between the care giver and the patient. Motivated by this goal, this contribution focuses on applications of acoustic technologies to support users and care givers in ambient assisted living (AAL) scenarios. Acoustic sensors are small and unobtrusive and can easily be added to existing care or living environments. The information gathered by the acoustic sensors can be analyzed to calculate the position of the user by localization, and the context by detection and classification of acoustic events in the captured acoustic signal. In this way, possibly dangerous situations such as falls, screams, or an increased amount of coughing can be detected, and appropriate actions can be initiated by an intelligent autonomous system for the acoustic monitoring of older persons. The proposed system is able to reduce the false alarm rate compared to other existing and commercially available approaches that basically rely only on the acoustic level. This is because it explicitly distinguishes between the various acoustic events and provides information on the type of emergency that has taken place. Furthermore, the system can determine the position of the acoustic event as contextual information using only the acoustic signal. As a result, the position of the user is known even if she or he does not wear a localization device such as a radio-frequency identification (RFID) tag.

Gender Analysis in Elderly Speech Signal Processing (노인음성신호처리에서의 젠더 분석)

  • Lee, JiYeoun
    • Journal of Digital Convergence / v.16 no.10 / pp.351-356 / 2018
  • Changes in the vocal cords due to aging can change the frequency of speech, and the speech signals of the elderly can be automatically distinguished from normal speech signals through various analyses. The purpose of this study is to provide a tool that is easily accessible to the elderly and to disabled people, who can be excluded from the rapidly changing technological society, and to improve voice recognition performance. In the study, the gender of the subjects was reported as a sex analysis, and equal numbers of female and male voice samples were used. In addition, gender analysis was applied by restricting the data to the voices of the elderly rather than using voices of all ages. Finally, we applied a review methodology of standards and reference models to reduce gender difference. 10 Korean women and 10 men aged 70 to 80 participated in this study. A comparison of the F0 values extracted directly from the waveform with the F0 values extracted by the TF32 and WaveSurfer speech analysis programs showed that WaveSurfer analyzed the F0 of elderly voices better than TF32. However, there is still a need for a voice analysis program designed for elderly people. In conclusion, analyzing the voices of the elderly will improve the speech recognition and synthesis capabilities of existing smart medical systems.
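For readers unfamiliar with F0 extraction from a waveform, the sketch below estimates the fundamental frequency of a signal with a basic autocorrelation search. The abstract does not say which method the study, TF32, or WaveSurfer uses, so the autocorrelation approach, the 50 to 400 Hz search range, the function name, and the synthetic 200 Hz test tone are all assumptions made for illustration.

```python
import math

def estimate_f0(samples, sample_rate, f_min=50.0, f_max=400.0):
    """Estimate F0 as the lag with the largest autocorrelation in [f_min, f_max]."""
    lag_min = int(sample_rate / f_max)  # shortest period considered
    lag_max = int(sample_rate / f_min)  # longest period considered

    def autocorr(lag):
        # Unnormalized autocorrelation of the signal with itself shifted by `lag`.
        return sum(samples[n] * samples[n + lag] for n in range(len(samples) - lag))

    best_lag = max(range(lag_min, lag_max + 1), key=autocorr)
    return sample_rate / best_lag

# Synthetic stand-in for a voiced frame: a 200 Hz sine sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(800)]
f0 = estimate_f0(tone, sample_rate)
```

Real elderly speech is far less periodic than this test tone, with jitter, breathiness, and creak, which is one reason different analysis programs can disagree on F0 for the same recording.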