Integrated Search | Korea Science

Speech Feature Extraction Using Auditory Model

박규홍;김영호;정상국;노승용
- 대한전기학회:학술대회논문집 / 대한전기학회 1998년도 하계학술대회 논문집 G / pp.2259-2261 / 1998
Auditory models capable of achieving human performance would provide a basis for realizing effective speech processing systems. Perceptual invariance to adverse signal conditions (noise, microphone and channel distortions, room reverberation) may provide a basis for robust speech recognition and high-efficiency speech coders. This paper describes an auditory model that simulates the auditory periphery up through the auditory nerve level, and a new distance measure defined as the angle between vectors.
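The abstract gives no formula for the distance measure, so the following is only a minimal sketch of an angle-between-vectors distance, assuming the usual arccosine of the normalized dot product. The function name `angular_distance` and the plain-list vector representation are illustrative choices, not taken from the paper.

```python
import math


def angular_distance(x, y):
    """Return the angle in radians between two feature vectors.

    Assumption: the angle is computed as arccos of the normalized dot
    product; the paper's exact definition is not given in the abstract.
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("angle is undefined for a zero vector")
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_theta = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(cos_theta)
```

A measure of this kind depends only on the direction of the vectors, so scaling one feature vector by a positive gain leaves the distance unchanged.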

Human Sensibility Ergonomics Investigation of Car Navigation System Digital Map Color Structure

Cha, Doo-Won;Park, Peom
- 산업경영시스템학회지 / Vol. 23, No. 60 / pp.47-55 / 2000
Two experiments were conducted to examine the relationship between the color structure and the user preference of a CNS (Car Navigation System) digital map in terms of HSE (Human Sensibility Ergonomics). In the first experiment, users' color structure preferences were investigated from the subjects' self-designed digital maps using a CNS digital map UIMS (User Interface Management System); in the second, statistical relation models between the user's color structure satisfaction level and the CIE (Commission Internationale de l'Eclairage) color components of real products were suggested. CIE L*u*v* and CIE LCH color spaces were adopted for the first and second experiments, respectively, because each has its own characteristics of perceptual uniformity, which enables the color components to be transformed by a linear function.
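For readers unfamiliar with the two color spaces, the sketch below shows the standard cylindrical transform from CIE L*u*v* to lightness, chroma, and hue. It assumes the LCH space in question is the L*u*v*-based variant; the abstract does not say which variant the authors used, and the function name `luv_to_lch` is illustrative.

```python
import math


def luv_to_lch(L, u, v):
    """Convert CIE L*u*v* to cylindrical (L*, C*, h) coordinates.

    Assumption: LCH here means the L*u*v*-based form, where chroma is
    the radial distance in the u*v* plane and hue is the angle in degrees.
    """
    chroma = math.hypot(u, v)
    hue = math.degrees(math.atan2(v, u)) % 360.0
    return L, chroma, hue
```

Lightness passes through unchanged; only the two chromatic axes are re-expressed as chroma and hue.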

A Case Study on Designing a Console Design Review System Considering Operators' Viewing Range and Anthropometric Data

Cha, Woo Chang;Choi, Eun Gyeong
- 대한인간공학회지 / Vol. 36, No. 5 / pp.373-383 / 2017
Objective: The aim of this study is to introduce an operator console design review system for designing and evaluating consoles, based on human factors guidelines, for the digitalized main control room of an advanced nuclear power plant that requires the use of anthropometric data. Background: The system interface of the main control room in a nuclear power plant is increasingly digitalized and consists of various consoles with many information displays. Console operators often face human-computer interaction problems due to inappropriate console design stemming from the perceptual constraints of anthropometric data usage. Method: Computational models with a process of visual perception and variables of anthropometric data are used for designing and evaluating operator consoles that conform to the human system interface guidelines used in an advanced nuclear power plant. Results: From the computational model and simulation application, console dimensions and a design test module, which would be used for designing suitable consoles with safety concerns in a nuclear power plant, have been introduced. Conclusion: This case study may encourage the adoption of a suitable design concept with various anthropometric data in many safety-critical areas and may show a feasible solution for designing and evaluating safe console dimensions. Application: The results of this study may be used for designing a control room in which human factors require a safe working environment.
https://doi.org/10.5143/JESK.2017.36.5.373

A Study on Cognition and Perception of Space through Contrast and Integration of Light and Darkness

김종진
- 한국실내디자인학회논문집 / Vol. 19, No. 5 / pp.3-10 / 2010
In the history of art and architecture, the relationship between light and space has taken different forms. Among them, two characteristics seem to be fundamental. The first is that the contrast between light and darkness is strongly articulated. Direct sunlight penetrates into a dark interior space made by a heavy masonry structure. This is generally found in traditional western religious buildings. The second is that light is mixed with darkness and becomes shade. Shade is different from shadow, which is usually perceived as the opposite of light. In traditional far-east Asian architecture, sunlight is filtered through the big horizontal roof and rice paper walls and becomes weak ambient light. In this shade, there is no strong contrast between light and darkness. This difference originates not only from architectural differences but also from conceptual differences about light, space, and the world in the two cultures. This paper studies the philosophical and aesthetic backgrounds of the two characteristics as well as case examples in art and architecture. Based on the case studies, this paper aims to analyze the main perceptual structure. Finding the relationship between light, space, and the human body by making three-dimensional models is the crucial analysis method of this research. Although these two characteristics are not clearly separated in real life and in experiencing the world, a comparative study based on different cultures gives an opportunity to consider diverse perspectives on light and space.

Closure Duration and Pitch as Phonetic Cues to Korean Stop Identity in AP-medial Position: Perception Test

Kang, Hyun-Sook;Dilley, Laura
- 음성과학 / Vol. 14, No. 4 / pp.25-39 / 2007
The present study investigated some perceptual phonetic attributes of two Korean stop types, aspirated and lax, in medial position of an accentual phrase. The intonational pattern across syllables (Jun, 1993) is argued to depend on the type of stop (aspirated vs. lax) only in the initial position of an accentual phrase. In Kang & Dilley (2007), we showed that significant differences between aspirated and lax stops in medial position of an accentual phrase exist in closure duration, voice-onset time, and fundamental frequency (F0) values for post-stop vowels. In the present perception experiment, we investigated whether these phonetic attributes contribute to the perception of these two types of stops: The closure durations and/or F0's of post-stop vowels on accentual-phrase medial words were altered and twenty native Korean speakers then judged these words as beginning with an aspirated or lax stop. Both closure duration and F0 significantly affected judgments of stop identity. These results indicate that a wider range of acoustic cues that distinguish aspirated and lax Korean stops in production also plays a role in perception. To account for these results we suggest some phonetic and phonological models of consonant-tone interactions for Korean.

Sensory Motor Coordination System for Robotic Grasping

김태형;김태선;수동성;이종호
- 대한전기학회논문지:시스템및제어부문D / Vol. 53, No. 2 / pp.127-134 / 2004
In this paper, a sensory motor coordination (SMC) algorithm based on a human motor behavior model is implemented for a robotic grasping task. Compared to conventional SMC models, which connect sensors to motors directly, the proposed method uses a biologically inspired human behavior system in conjunction with the SMC algorithm for fast grasping force control of a robot arm. To characterize various grasping objects, pressure sensors on the hand gripper were used. Measured sensory data are simultaneously transferred to the perceptual mechanism (PM) and long term memory (LTM), and then the sensory information is forwarded to the fastest channel among several information-processing flows in the human motor system. In this model, two motor learning routes are proposed. One route uses PM and the other uses the short term memory (STM) and LTM structure. Through the motor learning procedure, successful information is transferred from STM to LTM. LTM data are also used as reference information for the next motor plan. STM is designed as a single-layer perceptron neural network to generate a fast motor plan and to receive required data from LTM. Experimental results showed that the proposed method can control the grasping force adaptively for various shapes and types of grasping objects, and that it showed a shorter grasping-behavior learning time compared to a simple feedback system.

A 3D Audio-Visual Animated Agent for Expressive Conversational Question Answering

Martin, J.C.;Jacquemin, C.;Pointal, L.;Katz, B.
- 한국정보컨버전스학회:학술대회논문집 / 한국정보컨버전스학회 2008년도 International conference on information convergence / pp.53-56 / 2008
This paper reports on the ACQA (Animated agent for Conversational Question Answering) project conducted at LIMSI. The aim is to design an expressive animated conversational agent (ACA) for conducting research along two main lines: (1) perceptual experiments (e.g., perception of expressivity and 3D movements in both audio and visual channels); (2) design of human-computer interfaces requiring head models at different resolutions and the integration of the talking head in virtual scenes. The target application of this expressive ACA is RITEL, a real-time speech-based question-and-answer system developed at LIMSI. The architecture of the system is based on distributed modules exchanging messages through a network protocol. The main components of the system are as follows. RITEL, a question-and-answer system that searches raw text, produces a text (the answer) and attitudinal information; this attitudinal information is then processed to deliver expressive tags; the text is converted into phoneme, viseme, and prosodic descriptions. Audio speech is generated by the LIMSI selection-concatenation text-to-speech engine. Visual speech uses MPEG-4 keypoint-based animation and is rendered in real time by Virtual Choreographer (VirChor), a GPU-based 3D engine. Finally, visual and audio speech are played in a 3D audio and visual scene. The project also puts considerable effort into realistic visual and audio 3D rendering. A new model of phoneme-dependent human radiation patterns is included in the speech synthesis system, so that the ACA can move in the virtual scene with realistic 3D visual and audio rendering.

Design and Evaluation of U-Publication: Tag-Embedded Publication System and Business Model

박아름;이경전
- 지능정보연구 / Vol. 14, No. 3 / pp.41-57 / 2008
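U-Publication refers to a publication that, unlike existing publications which readers consumed only offline, carries multiple auto-identification tags, so that readers with an auto-identification tag reader can go online through the URLs stored in the tags. If U-Media is defined as media that appeals not only to the human biological system, as existing media do, but also to the digital systems embedded in or carried by a person (Lee & Ju 2007), then U-Publication can be regarded as U-Media in that online and offline are seamlessly connected and information can move in both directions. Consumers can consume not only the printed content of a tag-embedded publication but also additional content through the links in the attached tags, and can carry out various commercial transactions. This paper defines U-Publication, designs a business model based on it, and presents the process of evaluating this business model by means of simulation.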

Multiresolution 3D Facial Model Compression

박동희;이종석;이영식;배철수
- 한국정보통신학회:학술대회논문집 / 한국해양정보통신학회 2002년도 춘계종합학술대회 / pp.602-607 / 2002
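This paper proposes an efficient compression technique, multiresolution 3D facial model transmission for multimedia, and low-bit-rate applications. In general, a facial model is acquired with a 3D laser digitizer and requantized at several resolutions according to the application, such as animation, video games, and video conferencing. A compressed model can be obtained by deforming a 2D template in order to register and requantize the 3D digitized facial model. In the work so far, hierarchical 2D facial wireframe templates have been built at five resolutions. During deformation, the 2D template is altered by the facial feature points and the proposed PCAT (piecewise chainlet affine transformation). After requantization, the loss in the 3D digitized model is reduced to an imperceptible level. Moreover, the multiresolution facial model with the hierarchical data structure proposed in this paper is expected to become progressively known and used in communication networks.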

Performance Improvement of Speaker Recognition by MCE-based Score Combination of Multiple Feature Parameters

강지훈;김보람;김규영;이상훈
- 한국산학기술학회논문지 / Vol. 21, No. 6 / pp.679-686 / 2020
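To improve speaker recognition performance, this paper proposes an improved method of extracting features from the source signal, together with score combination using weights estimated for multiple feature vector scores on the basis of minimum classification error (MCE). The proposed feature vector consists of perceptual linear predictive cepstral coefficients, skewness, and kurtosis extracted from a signal that has been low-pass filtered to remove the flat spectral region of the glottal flow, an interval that carries no meaningful information. The proposed feature vector is used to improve a conventional speaker recognition system in which mel-frequency cepstral coefficients and perceptual linear predictive cepstral coefficients are extracted from the source signal and modeled with a Gaussian mixture model. In addition, to increase the reliability of the score estimation process, the weights are not estimated from the probability distribution of the existing scores; instead, they are estimated by the minimum classification error method from the scores evaluated on the proposed feature vector and the scores evaluated on the conventional feature vector, and the scores are then combined to find the optimal speaker. The experimental results confirmed that the proposed feature vector contains information that is useful for recognizing speakers. They also confirmed that speaker recognition with MCE-based combination of multiple feature parameter scores outperforms conventional speaker recognition, with a larger improvement when the Gaussian mixture count was low.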
https://doi.org/10.5762/KAIS.2020.21.6.679
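The score combination step of the speaker recognition paper above can be illustrated with a small sketch. It assumes a single scalar weight applied as a convex combination of the two per-speaker scores; the MCE training that would produce that weight is not reproduced here, and the names `combine_scores` and `identify_speaker` are hypothetical rather than taken from the paper.

```python
def combine_scores(conventional, proposed, weight):
    """Fuse per-speaker scores from two feature streams.

    Assumptions: both arguments map speaker IDs to scores (for example
    GMM log-likelihoods), and `weight` in [0, 1] was estimated elsewhere,
    e.g. by minimum classification error training.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return {
        speaker: weight * conventional[speaker]
        + (1.0 - weight) * proposed[speaker]
        for speaker in conventional
    }


def identify_speaker(conventional, proposed, weight):
    """Return the speaker ID with the highest fused score."""
    fused = combine_scores(conventional, proposed, weight)
    return max(fused, key=fused.get)
```

With `weight` set to 1.0 the decision reduces to the conventional system alone, which makes it easy to compare fused and unfused results on the same scores.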
