Search | Korea Science

Conformer-based Elderly Speech Recognition using Feature Fusion Module (피쳐 퓨전 모듈을 이용한 콘포머 기반의 노인 음성 인식)

Minsik Lee;Jihie Kim
- Annual Conference on Human and Language Technology
- 2023.10a
- pp.39-43
- 2023
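Automatic speech recognition (ASR) is a technology by which a computer converts human speech into text. ASR systems are used in a wide range of applications, including voice command and control, voice search, text transcription, and automatic speech translation. Despite progress in ASR, the difficulty of elderly speech recognition (ESR) has not diminished. This study proposes an elderly speech recognition model based on the Conformer and a Features Fusion Module (FFM). Training and evaluation use the VOTE400 (Voice Of The Elderly 400 Hours) dataset. The study is significant in that it presents a deep learning model for elderly speech recognition using the Conformer and fused features, a combination that has rarely been explored. It also contributes to research on deep learning models for elderly speech recognition by achieving higher accuracy than the Conformer model.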

A Research of S-100 GI Registry (S-100 표준화 등록소 구축 및 활용방안 연구)

Choi, Hyun-Soo;Oh, Se-Woong;Kang, Dong-Woo
- Proceedings of the Korean Institute of Navigation and Port Research Conference
- 2018.05a
- pp.87-88
- 2018
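In this study, an S-100 GI Registry was newly built to support the continued stabilization of the S-100/10X standards established by the International Hydrographic Organization (IHO). It is expected to allow the feature information and symbols used in the various S-100 based product specifications to be managed and used systematically.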

Automatic Recognition of Analog and Digital Modulation Signals (아날로그 및 디지털 변조 신호의 자동 인식)

Seo Seunghan;Yoon Yeojong;Jin Younghwan;Seo Yongju;Lim Sunmin;Ahn Jaemin;Eun Chang-Soo;Jang Won;Nah Sunphil
- The Journal of Korean Institute of Communications and Information Sciences
- v.30 no.1C
- pp.73-81
- 2005
We propose an automatic modulation recognition scheme that extracts pre-defined key features from the received signal and then applies an equal gain combining method to determine the modulation in use. We also compare the performance of the proposed algorithm with that of a decision-theoretic algorithm. Our scheme extracts five pre-defined key features from each data segment (the data unit for key feature extraction) and averages them over all segments to recognize the modulation according to the decision procedure. We evaluate the proposed algorithm through computer simulations for analog modulations such as AM, FM, and SSB and for digital modulations such as FSK2, FSK4, PSK2, and PSK4, measuring the recognition success rate while varying the SNR and the data collection time. The results show that the performance of the proposed scheme is comparable to that of the decision-theoretic algorithm, with less complexity.
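The segment-wise averaging described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the five key features, so `mean_abs` and `peak` are hypothetical placeholders, and "equal gain combining" is read here simply as giving every segment the same weight in the average.

```python
def split_segments(samples, segment_len):
    """Split samples into consecutive full-length segments, dropping any remainder."""
    count = len(samples) // segment_len
    return [samples[i * segment_len:(i + 1) * segment_len] for i in range(count)]


def average_features(samples, segment_len, feature_fns):
    """Compute each feature on every segment, then average with equal weights."""
    segments = split_segments(samples, segment_len)
    if not segments:
        raise ValueError("need at least one full segment")
    totals = [0.0] * len(feature_fns)
    for seg in segments:
        for i, fn in enumerate(feature_fns):
            totals[i] += fn(seg)
    return [t / len(segments) for t in totals]


# Hypothetical placeholder features (NOT the five features used in the paper).
def mean_abs(seg):
    return sum(abs(x) for x in seg) / len(seg)


def peak(seg):
    return max(abs(x) for x in seg)
```

A decision procedure would then compare the averaged feature vector against thresholds for each candidate modulation; those thresholds are not given in the abstract and are left out here.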

New feature and SVM based advanced classification of Computer Graphics and Photographic Images (노이즈 기반의 새로운 피쳐(feature)와 SVM에 기반한 개선된 CG(Computer Graphics) 및 PI(Photographic Images) 판별 방법)

Jeong, DooWon;Chung, Hyunji;Hong, Ilyoung;Lee, Sangjin
- Journal of the Korea Institute of Information Security & Cryptology
- v.24 no.2
- pp.311-318
- 2014
As modern computer graphics technology has developed, it has become hard to distinguish computer graphics from photographic images with the naked eye. Although advances in graphics technology have brought great convenience, they also have side effects such as image forgery, malicious editing, and fraud. To cope with such problems, various algorithms that use features representing the characteristics of an image have been studied. In this paper, we directly verify the existing algorithms and propose new noise-based features that represent the characteristics of computer graphics well. We also introduce a method that uses an SVM (Support Vector Machine) together with the features proposed in previous research to improve discrimination accuracy.
https://doi.org/10.13089/JKIISC.2014.24.2.311

On the Importance of Tonal Features for Speech Emotion Recognition (음성 감정인식에서의 톤 정보의 중요성 연구)

Lee, Jung-In;Kang, Hong-Goo
- Journal of Broadcast Engineering
- v.18 no.5
- pp.713-721
- 2013
This paper describes the effectiveness of chroma-based tonal features for speech emotion recognition. Just as the tonality produced by major or minor keys affects the perception of musical mood, speech tonality affects the perception of the emotional state of spoken utterances. To support this claim about tonality and emotion, subjective listening tests are carried out using synthesized signals generated from chroma features. The tests show that tonality contributes especially to the perception of negative emotions such as anger and sadness. In automatic emotion recognition tests, the modified chroma-based tonal features produce a noticeable improvement in accuracy when they supplement the conventional log-frequency power coefficient (LFPC)-based spectral features.
https://doi.org/10.5909/JBE.2013.18.5.713

A study on the development of S-100 based product specifications (S-100 범용수로데이터모델 제품표준 개발 연구)

Ko, Hyun-Joo;Oh, Se-Woong;Sim, Woo-Sung
- Proceedings of the Korean Institute of Navigation and Port Research Conference
- 2013.06a
- pp.317-318
- 2013
The International Hydrographic Organization (IHO) has published the S-100 Universal Hydrographic Data Model to support the use of various hydrographic data for navigational safety. By introducing the concept of a registry and its registers, the S-100 standard makes it possible to manage hydrographic data and apply it in various application fields. In this study, an S-100 based product specification for the maritime safety field is developed by designing an application schema according to the general feature model defined in the S-100 standard, and a feature catalogue is produced through a simple registry.

Prediction of Patients with Perioperative Hypotension Using Machine Learning Techniques (머신러닝 기법을 활용한 주술기 저혈압 발생 환자 예측)

Lee, Ji-hyun;Kang, Ah Reum;Kim, Sang-Hyun;Woo, JiYoung
- Proceedings of the Korean Society of Computer Information Conference
- 2019.01a
- pp.27-28
- 2019
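Complications such as hypotension and tachycardia occur to varying degrees during the anesthesia administered for surgery. These can cause serious postoperative complications such as myocardial infarction or acute kidney injury, which can in turn lead to patient death. This study aims to use machine learning to predict which patients will develop hypotension during the induction of general anesthesia. A model for detecting patients with hypotension was built using data from 207 patients collected at Soonchunhyang University Bucheon Hospital. The inputs were the sex, age, weight, height, and physical status information recorded in the medical records, together with vital sign information from the anesthesia induction stage. Detection accuracy was 68.06% when all features except physical status were used and 71.53% when only the important features identified from related papers were used. The best result, 75%, was obtained when the patient's physical status feature was included.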

A Study on a Software Development Approach to Support S-100 Based e-Nav DB Production Guidelines (S-100 기반 e-Nav DB 제작 지침 지원 S/W 개발 방안 연구)

Hwang, Seon-Pil;O, Se-Ung;Sim, U-Seong;Kim, Seon-Yeong
- Proceedings of the Korean Institute of Navigation and Port Research Conference
- 2015.07a
- pp.121-123
- 2015
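The e-Navigation information that the International Maritime Organization (IMO) is promoting for maritime safety and marine environmental protection is diverse, including maritime traffic information, nautical chart and publication information, and dynamic sea area information. Because the IMO has decided to apply the S-100 standard to the development of e-Nav information standards, e-Nav DBs need to be built according to consistent criteria. The S-100 standard defines a template for the DCEG (Data Classification Encoding Guide), the guidance document for DB construction. This document must be written consistently with the other standard documents, but human errors and mistakes are occurring in the S-100 standard development process. To address this, the study examined a software development approach to support S-100 based e-Nav DB construction documents. Noting that the registry's FCD and a feature catalogue DB are operated within the S-100 standard framework, it proposed development approaches for a DCEG DB and a DCEG Editor that can be linked with the feature catalogue DB. The proposed approach is expected to help build consistent e-Navigation DBs in the future.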

Face recognition of Infra-red Images for Interactive TV Control System (인터랙티브 TV 컨트롤 시스템을 위한 근적외선 영상의 얼굴 인식)

Won, Chul-Ho;Lee, Sang-Heon;Lee, Tae-Gyoun
- Journal of Korea Society of Industrial Information Systems
- v.15 no.5
- pp.11-17
- 2010
In this paper, a face recognition method that can be applied to an ITCS (interactive TV control system) is proposed. We extracted ULBP (uniform local binary pattern) histogram features from infra-red images and detected the left and right eyes and the face region using an SVM classifier. We then implemented a face recognition system that uses the Gabor transform and ULBP histogram features, and applied it to personal verification for the ITCS.
https://doi.org/10.9723/jksiis.2010.15.5.011
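For readers unfamiliar with the ULBP histogram feature named in the abstract above, the following is a minimal sketch of the standard 8-neighbor, radius-1 uniform LBP histogram. It is a generic illustration under stated assumptions, not the authors' code: the neighborhood size, the neighbor ordering, and the use of a single shared bin for non-uniform patterns are assumptions, and the image is a plain list of rows of gray levels.

```python
# Neighbor offsets (row, col) in clockwise order, starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """8-bit LBP code of pixel (r, c): bit k is set if neighbor k >= center."""
    center = img[r][c]
    code = 0
    for k, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << k
    return code


def transitions(code):
    """Number of 0/1 changes when the 8 bits are read as a circular sequence."""
    bits = [(code >> k) & 1 for k in range(8)]
    return sum(bits[k] != bits[(k + 1) % 8] for k in range(8))


# A pattern is "uniform" if it has at most two circular transitions.
# Each uniform code gets its own bin; all non-uniform codes share one extra bin.
UNIFORM_BIN = {}
for _code in range(256):
    if transitions(_code) <= 2:
        UNIFORM_BIN[_code] = len(UNIFORM_BIN)
NON_UNIFORM_BIN = len(UNIFORM_BIN)


def ulbp_histogram(img):
    """Histogram of uniform LBP codes over the interior pixels of img."""
    hist = [0] * (NON_UNIFORM_BIN + 1)
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[UNIFORM_BIN.get(lbp_code(img, r, c), NON_UNIFORM_BIN)] += 1
    return hist
```

With 8 neighbors there are 58 uniform patterns, so the histogram has 59 bins. In a pipeline like the one described, such histograms would typically be computed per image region and passed to a classifier such as an SVM.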

An Optimization Strategy for Vector Spatial Data Transmission over the Internet (인터넷을 통한 벡터 공간 데이타의 효율적 전송을 위한 최적화 기법)

Liang Chen;Chung-Ho Lee;Hae-Young Bae
- Journal of KIISE:Databases
- v.30 no.3
- pp.273-285
- 2003
Generally, vector spatial data, with richer information than raster spatial data, enables more flexible and effective manipulation of data sets. However, one of the challenges in publishing vector spatial information on the Internet is the efficient transmission of vector spatial data, which is both large and complex, across the narrow bandwidth of the Internet. This paper proposes a new transmission method, the Scale-Dependent Transmission method, to improve the efficiency of vector spatial data transmission across the Internet. Simply put, its main idea is "Transmit what can be seen". Scale is regarded as a factor naturally associated with spatial features, so not all features are visible to users at a given scale. With the aid of the Wavelet-based Map Generalization Algorithm, the proposed method filters out invisible features from spatial objects according to the display scale and then transmits only the visible features as the final answer for an individual operation. Experiments show that the response times of individual operations were substantially reduced when using the proposed method.
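The "Transmit what can be seen" idea in the abstract above can be illustrated with a small server-side filter. This is only a sketch under loud assumptions: the paper uses a wavelet-based map generalization algorithm to decide visibility, whereas this example swaps in a simple per-feature scale threshold. The `Feature` class and its `min_visible_scale` field are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    # Hypothetical field: the smallest display scale (as a ratio, e.g.
    # 1/50000 = 0.00002) at which this feature is large enough to be drawn.
    min_visible_scale: float


def visible_features(features, display_scale):
    """Keep only the features that would be visible at the requested scale."""
    return [f for f in features if display_scale >= f.min_visible_scale]


# Usage: at a display scale of 1:20000, individual buildings are filtered out
# before transmission, so only the coastline and the road are sent.
layer = [
    Feature("coastline", 1 / 1_000_000),
    Feature("road", 1 / 50_000),
    Feature("building", 1 / 1_000),
]
to_send = visible_features(layer, 1 / 20_000)
```

The point of the design is that the filtering happens before transmission, so the payload shrinks as the user zooms out.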

Search Result 87, Processing Time 0.03 seconds
