• Title/Summary/Keyword: Text-independent

Audio-Based Human-Robot Interaction Technology (오디오 기반 인간로봇 상호작용 기술)

  • Kwak, K.C.;Kim, H.J.;Bae, K.S.;Yoon, H.S.
    • Electronics and Telecommunications Trends / v.22 no.2 s.104 / pp.31-37 / 2007
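  • Human-robot interaction (HRI) is a core technology of intelligent service robots. It covers the design, implementation, and evaluation of robot systems and interaction environments so that robots can interact with people cognitively and emotionally through diverse communication channels such as robot cameras, microphones, and other sensors. This article reviews domestic and international trends in two audio-based HRI technologies, sound localization and speaker recognition, and describes the audio-visual sound localization and text-independent speaker recognition technologies that the ETRI Intelligent Robot Research Division is currently working to commercialize. It also examines scenarios that combine these technologies with speech recognition, face detection, and face recognition so that they can be used effectively in home environments.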

Contents Navigation System using Speech Recognition (음성인식 기반 컨텐츠 네비게이션 시스템)

  • Kim, Kee-Beak;Choi, Jong-Ho
    • KSCI Review / v.15 no.1 / pp.99-102 / 2007
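  • Research on using speech recognition to convey human intent to electronic systems has recently become widespread. The most important issues for speech recognition interfaces are reducing processing time and developing general-purpose interfaces. To address these issues, this study used VR-STAMP, a speech recognition module built around the RSC-4128 speech recognition processor, which is produced as a commercial hardware IC. The newly developed system uses T2SI (Text To Speaker Independent) speaker-independent recognition and feeds the recognized speech signal to a content navigation system as a control signal, so it can control Windows-based application software installed on embedded systems, PCs, and similar devices. Field tests confirmed its usefulness, and the system is expected to be widely applicable not only to content navigation but also to home appliance control and home networks.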

Speaker Indexing using Vowel Based Speaker Identification Model (모음 기반 화자 식별 모델을 이용한 화자 인덱싱)

  • Kum Ji Soo;Park Chan Ho;Lee Hyon Soo
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.151-154 / 2002
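  • This paper proposes and tests a speaker indexing method, which finds the speech segments belonging to the same speaker in audio data, based on speaker models built in advance. The proposed method extracts feature parameters from vowels, which can be used for text-independent speaker identification, and builds a model for each speaker from them. Indexing detects the positions of vowels in the speech segments, computes the distance to each speaker model, and takes the closest model as the identification result. The identification results are then filtered, based on speaker segment changes and the characteristics of the speech data, to obtain the final indexing result. For the experiments, broadcast news was recorded to build models for 10 speakers, and the method achieved a speaker indexing performance of 91.8%.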
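
The abstract gives only the outline of the indexing step: pick the nearest speaker model by distance, then filter the results. A minimal sketch of that outline follows, assuming Euclidean distance, one mean vector per speaker, and a sliding majority vote as the filter. None of these choices, nor the names `nearest_speaker`, `smooth_labels`, and `index_speakers`, come from the paper.

```python
import math
from collections import Counter


def nearest_speaker(feature, models):
    """Return the speaker whose model vector is closest to `feature`.

    Assumption: each model is a single mean vector and the distance is
    Euclidean; the paper does not specify either.
    """
    return min(models, key=lambda name: math.dist(feature, models[name]))


def smooth_labels(labels, window=3):
    """Majority vote over a sliding window.

    Hypothetical stand-in for the paper's filtering step.
    """
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        neighborhood = labels[max(0, i - half): i + half + 1]
        smoothed.append(Counter(neighborhood).most_common(1)[0][0])
    return smoothed


def index_speakers(vowel_features, models, window=3):
    """Label each detected vowel with a speaker, then smooth the labels."""
    raw = [nearest_speaker(f, models) for f in vowel_features]
    return smooth_labels(raw, window)


# Toy data: two speakers with 2-D "vowel features" (illustrative values only).
models = {"A": [0.0, 0.0], "B": [10.0, 10.0]}
features = [[0, 1], [1, 0], [9, 9], [0, 0.5], [10, 9], [9, 10], [10, 10]]
labels = index_speakers(features, models)
```

The smoothing pass is what turns isolated misclassifications into contiguous speaker segments; a wider `window` trades responsiveness to real speaker changes for stability.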

Speaker Identification in Small Training Data Environment using MLLR Adaptation Method (MLLR 화자적응 기법을 이용한 적은 학습자료 환경의 화자식별)

  • Kim, Se-hyun;Oh, Yung-Hwan
    • Proceedings of the KSPS conference / 2005.11a / pp.159-162 / 2005
  • Speaker identification is the process of automatically identifying who is speaking on the basis of information obtained from speech waves. In the training phase, each speaker's model is trained on that speaker's speech data. GMMs (Gaussian Mixture Models), which have been successfully applied to speaker modeling in text-independent speaker identification, are not effective when training data is insufficient. This paper proposes a speaker modeling method based on MLLR (Maximum Likelihood Linear Regression), which is used for speaker adaptation in speech recognition. We build an SD-like model using MLLR adaptation instead of a speaker-dependent (SD) model. The proposed system outperforms GMMs in a small-training-data environment.
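
For orientation, MLLR in its common mean-adaptation form re-estimates each Gaussian mean with a shared affine transform. The notation below is the standard textbook form, not taken from this paper, which does not state which MLLR variant it uses.

```latex
\hat{\mu}_m = A\,\mu_m + b = W\,\xi_m,
\qquad
\xi_m = \begin{bmatrix} 1 \\ \mu_m \end{bmatrix},
\qquad
W = \begin{bmatrix} b & A \end{bmatrix}
```

Because $W$ is shared across many Gaussians, it can be estimated from far less data than a full speaker-dependent model, which is the property the abstract relies on.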

Semantic Word Categorization using Feature Similarity based K Nearest Neighbor

  • Jo, Taeho
    • Journal of Multimedia Information System / v.5 no.2 / pp.67-78 / 2018
  • This article proposes a modified KNN (K Nearest Neighbor) algorithm that considers feature similarity and applies it to word categorization. The texts given as features for encoding words into numerical vectors are semantically related entities rather than independent ones, and a synergy effect between word categorization and text categorization is expected from combining the two. In this research, we define a similarity metric between two vectors that includes the feature similarity, modify the KNN algorithm by replacing the existing similarity metric with the proposed one, and apply it to word categorization. The proposed KNN is empirically validated as the better approach for categorizing words in news articles and opinions. The significance of this research is that it improves classification performance by utilizing feature similarities.
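
The idea of a similarity metric that accounts for related features can be illustrated with a small sketch. It assumes a bilinear, cosine-style form in which a feature-to-feature similarity matrix `S` weights cross-feature products; this form, the matrix values, and the function names are illustrative assumptions, not the metric defined in the article. With `S` set to the identity matrix the measure reduces to ordinary cosine similarity.

```python
import math
from collections import Counter


def bilinear(x, y, S):
    """Compute x^T S y, where S[i][j] is the similarity of features i and j."""
    return sum(x[i] * S[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def feature_aware_similarity(x, y, S):
    """Cosine-style similarity that credits matches between related features.

    Assumption: S is symmetric and positive semi-definite, so the square
    root below is well defined.
    """
    denom = math.sqrt(bilinear(x, x, S) * bilinear(y, y, S))
    return bilinear(x, y, S) / denom if denom else 0.0


def knn_classify(query, examples, S, k=3):
    """Majority vote among the k examples most similar to `query`."""
    ranked = sorted(examples,
                    key=lambda ex: feature_aware_similarity(query, ex[0], S),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


identity = [[1.0, 0.0], [0.0, 1.0]]
related = [[1.0, 0.8], [0.8, 1.0]]   # hypothetical feature similarities

examples = [([1.0, 0.0], "sports"), ([0.9, 0.1], "sports"),
            ([0.0, 1.0], "politics"), ([0.1, 0.9], "politics"),
            ([0.2, 0.8], "politics")]
predicted = knn_classify([1.0, 0.05], examples, identity, k=3)
```

Two vectors that share no features score zero under plain cosine similarity, but score 0.8 under `related`, because their features are treated as near-synonyms rather than independent dimensions.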

Static Analysis Tools Against Cross-site Scripting Vulnerabilities in Web Applications : An Analysis

  • Talib, Nurul Atiqah Abu;Doh, Kyung-Goo
    • Journal of Software Assessment and Valuation / v.17 no.2 / pp.125-142 / 2021
  • Reports of rampant cross-site scripting (XSS) vulnerabilities raise growing concerns about the effectiveness of current Static Analysis Security Testing (SAST) tools as an internet security measure. In response to these concerns, this study examines seven open-source SAST tools to assess their ability to detect XSS vulnerabilities in PHP applications and to measure their performance in terms of effectiveness and analysis runtime. The representative tools, categorized as either text-based or graph-based analysis tools, were all test-run on real-world PHP applications with known XSS vulnerabilities. The vulnerability detection reports collected from each tool were analyzed with the aid of PhpStorm's data flow analyzer. The detection rates of the tools, calculated against the total vulnerabilities in the applications, range from as low as 0.006 to as high as 0.968. Furthermore, the tools took less than a minute on average to complete an analysis. Notably, their runtime is independent of their analysis type.

Wide Sargasso Sea: An Elegy of Class Conflict in Jamaica

  • Park, Jai Young
    • Journal of English Language & Literature / v.57 no.6 / pp.1199-1212 / 2011
  • This paper scrutinizes Jean Rhys' Wide Sargasso Sea through Marxist criticism. While critics have industriously excavated discourses of feminism, post-colonialism, and racism in the novel, they have tended to regard its Marxist attributes as supplementary material and to diminish their significance by not treating them as an independent subject of examination. However, the novel, in which all the major relationships are based on capital, exemplifies class conflict between the bourgeoisie and the proletariat. Marx and Engels hold that the foundation of society is capital and that society evolves through class conflict over capital, and they therefore assert that people's relations are the product of the commodification of individuals. Extending their study, Louis Althusser specifies the power system through the (repressive) state apparatus and the ideological state apparatus. Using the theories of these thinkers, this paper analyzes the relationships between Annette and Mason, Antoinette and her nameless husband (allegedly Rochester), Rochester and Amelie, and Rochester and Daniel Cosway. This paper offers an alternative reading of a classic feminist and post-colonial text.

Effective Text Question Analysis for Goal-oriented Dialogue (목적 지향 대화를 위한 효율적 질의 의도 분석에 관한 연구)

  • Kim, Hakdong;Go, Myunghyun;Lim, Heonyeong;Lee, Yurim;Jee, Minkyu;Kim, Wonil
    • Journal of Broadcast Engineering / v.24 no.1 / pp.48-57 / 2019
  • The purpose of this study is to understand the inquirer's intention from a single text question in goal-oriented dialogue. A goal-oriented dialogue system is a dialogue system that satisfies a user's specific needs via text or voice. Intention analysis is the step that analyzes the intention behind the user's inquiry before an answer is generated, and it has a great influence on the performance of the entire goal-oriented dialogue system. The proposed model was applied to a daily chemical products domain, using Korean text data related to that domain. The analysis is divided into the speech-act, which is independent of any specific field, and the concept-sequence, which depends on a specific field. We propose a classification method that uses a word embedding model and a CNN to analyze the speech-act and the concept-sequence. The semantic information of each word is abstracted through the word embedding model, and concept-sequence and speech-act classification are then performed by the CNN on that abstracted semantic information.