Integrated Search | Korea Science

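Large Vocabulary Continuous Speech Recognition Based on Language Model Network

안동훈; 정민화
- The Journal of the Acoustical Society of Korea (한국음향학회지)
- Vol. 21, No. 6
- pp. 543-551
- 2002
This paper proposes a search method that can perform real-time continuous speech recognition over a large vocabulary of about 20,000 words. The basic search is a single pass that uses a token-passing Viterbi decoding algorithm. A language model network is introduced so that various language models can be organized into one consistent search space, and the search space is rebuilt dynamically from the tokens that survive the pruning stage. To simplify post-processing, the Viterbi algorithm was modified to output a word graph and the N-best sentences. The resulting decoder was tested on a 20,000-word database and evaluated in terms of recognition rate and real-time factor (RTF).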
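As a rough illustration of single-pass, token-passing Viterbi decoding with pruning, the sketch below runs on a toy two-state HMM. It is not the decoder from the paper: the function name `viterbi_beam`, the `beam` parameter, and the toy model are all assumptions made for illustration, and the language model network, word graph, and N-best output are left out.

```python
import math

def viterbi_beam(obs, states, log_start, log_trans, log_emit, beam=10.0):
    """One-pass Viterbi with token passing and beam pruning.

    Each state holds at most one token: (log score, best path so far).
    `beam` is a hypothetical log-domain width, not a value from the paper.
    """
    tokens = {s: (log_start[s] + log_emit[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        # Pruning: keep only tokens within `beam` of the current best score.
        best = max(score for score, _ in tokens.values())
        survivors = {s: t for s, t in tokens.items() if t[0] >= best - beam}
        # Propagation: only surviving tokens are expanded, so the active
        # search space is rebuilt from the survivors at every frame.
        new_tokens = {}
        for prev, (score, path) in survivors.items():
            for nxt, log_t in log_trans[prev].items():
                cand = score + log_t + log_emit[nxt][o]
                if nxt not in new_tokens or cand > new_tokens[nxt][0]:
                    new_tokens[nxt] = (cand, path + [nxt])
        tokens = new_tokens
    return max(tokens.values(), key=lambda t: t[0])

def logs(d):
    """Convert a dict of probabilities to log probabilities."""
    return {k: math.log(v) for k, v in d.items()}

# Toy model (assumed for illustration only).
states = ["Healthy", "Fever"]
log_start = logs({"Healthy": 0.6, "Fever": 0.4})
log_trans = {
    "Healthy": logs({"Healthy": 0.7, "Fever": 0.3}),
    "Fever": logs({"Healthy": 0.4, "Fever": 0.6}),
}
log_emit = {
    "Healthy": logs({"normal": 0.5, "cold": 0.4, "dizzy": 0.1}),
    "Fever": logs({"normal": 0.1, "cold": 0.3, "dizzy": 0.6}),
}

score, path = viterbi_beam(["normal", "cold", "dizzy"], states,
                           log_start, log_trans, log_emit)
```

A narrower `beam` discards more tokens per frame, which shrinks the search at the risk of pruning the path that would have won.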

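A study of user performed Virtual Space Storybook

박수진; 정문열
- Journal of the Korea Computer Graphics Society (한국컴퓨터그래픽스학회논문지)
- Vol. 25, No. 3
- pp. 115-122
- 2019
In this study, we designed, implemented, and experimentally evaluated a virtual-space storybook that encourages user participation. In the implemented storybook, the scenario advances only when the user actively carries out the missions it contains. The scenario proceeds as follows. First, a virtual space is created by projection. Second, following the scenario, the user brings a real object and places it in the virtual space. Third, a 3D model corresponding to the real object is augmented. Finally, the user freely controls the augmented image and the real object, and in doing so experiences the scenario that unfolds in the virtual space. A user study of the implementation was conducted with three 5-year-old children. The children appeared to understand the virtual-space storybook very well, and were observed to understand the process of placing a real object into the virtual space. The children also did not confuse reality with the virtual world, and distinguished real objects from virtual images. The study thus confirmed the potential of a virtual-space storybook that augments real objects onto a virtual space.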
https://doi.org/10.15701/kcgs.2019.25.3.115

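Comparison of recognition rate with distance on stereo face images base PCA

박장한; 남궁재찬
- Journal of the Institute of Electronics Engineers of Korea SP (대한전자공학회논문지SP)
- Vol. 42, No. 1
- pp. 9-16
- 2005
This paper takes the left and right images of a stereo pair as input and compares face recognition rates across changes in distance using the PCA (Principal Component Analysis) algorithm. The proposed method detects the face region by converting from the RGB color space to the YCbCr color space. The distance is then obtained from the stereo images, the extracted face image is scaled up or down to obtain a more robust face region, and the recognition rate is tested with the PCA algorithm. The average recognition rates for the acquired face images were 98.61% (30 cm), 98.91% (50 cm), 99.05% (100 cm), 99.90% (120 cm), 97.31% (150 cm), and 96.71% (200 cm). The experiments thus showed that the proposed method achieves a high recognition rate when scaling is applied according to distance.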
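The RGB-to-YCbCr step can be sketched as follows. The conversion uses the full-range BT.601 (JFIF) coefficients, which is an assumption since the abstract does not say which variant the authors used. The chroma thresholds in `is_skin_candidate` are a commonly cited range and are likewise assumed, not taken from the paper; both function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range BT.601 (JFIF) YCbCr."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def is_skin_candidate(r, g, b, cb_range=(77, 127), cr_range=(133, 173)):
    """Flag a pixel as a possible skin pixel from its chroma alone.

    The default ranges are assumed illustrative thresholds; a real
    detector would tune them on its own data.
    """
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return cb_range[0] <= cb <= cb_range[1] and cr_range[0] <= cr <= cr_range[1]

white = rgb_to_ycbcr(255, 255, 255)           # neutral pixel: chroma near 128
skin_like = is_skin_candidate(224, 172, 105)  # warm, skin-like tone
blue = is_skin_candidate(0, 0, 255)           # saturated blue
```

Thresholding on Cb and Cr alone, and ignoring Y, is a common way to make such a detector less sensitive to brightness.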

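An Adaptive Pruning Threshold Algorithm for the Korean Address Speech Recognition

황철준; 오세진; 김범국; 정호열; 정현열
- The Journal of the Acoustical Society of Korea (한국음향학회지)
- Vol. 20, No. 7
- pp. 55-62
- 2001
In the authors' previous work on speeding up speech recognition, the search-space threshold along the time axis was varied as the search progressed, which improved recognition speed without lowering the recognition rate. That method reduced the search space effectively, but determining the threshold required several preliminary experiments, which was cumbersome. To solve this problem, this paper proposes an adaptive pruning threshold algorithm that derives the threshold for the current search interval automatically during the search, using the maximum likelihood of the previous search interval and the likelihoods of its candidates. To verify its effectiveness, the algorithm was applied to an address recognition system whose vocabulary consists of the words that make up Korean administrative units, namely city (province), district (county), neighborhood (town, township), and lot number, and the proposed method was compared with the existing methods. In the recognition experiments, taking a connected-word recognition rate of 96.0% and a word recognition rate of 98.7% as the baseline, the proposed method reduced the search space by a relative 14.4% and 9.14% compared with the existing fixed and variable pruning thresholds, respectively, with no loss in recognition rate, which confirms its effectiveness.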
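The abstract does not give the threshold formula, so the sketch below only illustrates the general idea of deriving the current beam from the previous interval's likelihoods. The rule used here (scaled gap between the best and the mean log-likelihood) is a hypothetical stand-in, and the names `adaptive_beam`, `prune`, `scale`, and `floor` are assumptions, not the authors' algorithm.

```python
def adaptive_beam(prev_log_likelihoods, scale=1.0, floor=1.0):
    """Derive a beam width from the previous interval's candidates.

    Hypothetical rule: the beam is the gap between the best and the mean
    log-likelihood of the previous interval, scaled and floored.
    """
    best = max(prev_log_likelihoods)
    mean = sum(prev_log_likelihoods) / len(prev_log_likelihoods)
    return max(floor, scale * (best - mean))

def prune(candidates, beam):
    """Keep candidates whose log-likelihood is within `beam` of the best."""
    best = max(candidates.values())
    return {name: ll for name, ll in candidates.items() if ll >= best - beam}

# Toy log-likelihoods (assumed for illustration only).
previous = [-10.0, -12.0, -20.0]
beam = adaptive_beam(previous)
kept = prune({"a": -11.0, "b": -14.0, "c": -16.0}, beam)
```

Because the beam is recomputed at every interval from the scores already available, no separate tuning experiments are needed to pick a fixed value, which is the motivation the abstract describes.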

A Study on the Dissatisfaction Evaluation of Rental Apartment Residents and Recognition of Neighborhoods

김민희; 최정민
- Korean Housing Association: Conference Proceedings (한국주거학회:학술대회논문집)
- Proceedings of the Korean Housing Association 2005 Autumn Conference
- pp. 355-358
- 2005
This study investigates both the dissatisfaction of residents in the Kangseo Permanent Rental Housing (PRH) complex managed by SH Corporation and their neighbors' perception of the PRH complex, using questionnaires administered to the two groups. The results show that residents of the PRH complex are especially dissatisfied with living space and number of rooms, parking capacity, outdated buildings, and the incidence of crime.

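A Recognition Method of HANGEUL Pattern Using a State Space Search

김상진; 이병래; 박규태
- The Journal of Korean Institute of Communications and Information Sciences (한국통신학회논문지)
- Vol. 15, No. 4
- pp. 267-277
- 1990
This paper proposes a method that separates and recognizes the basic graphemes that make up Hangeul by using state-space search, a basic problem-solving technique in artificial intelligence. To couple grapheme separation and recognition more tightly, the problem is represented in a state space, and that space is searched to solve it. To improve search efficiency, the method uses structural information based on Hangeul's composition rules together with the positional information of each grapheme in the matrix plane. Computer experiments confirmed its usefulness.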

Human Gait Recognition Based on Spatio-Temporal Deep Convolutional Neural Network for Identification

Zhang, Ning; Park, Jin-ho; Lee, Eung-Joo
- Journal of Korea Multimedia Society (한국멀티미디어학회논문지)
- Vol. 23, No. 8
- pp. 927-939
- 2020
Gait recognition can identify people from a long distance, which is very important for improving the intelligence of monitoring systems. Among the many human features, gait features have the advantages of being remotely available, robust, and secure. Traditional gait feature extraction, affected by the development of behavior recognition, can only rely on manual feature extraction, which cannot meet the needs of fine-grained gait recognition. The emergence of deep convolutional neural networks has freed researchers from complex feature engineering, since usable features can be learned automatically from data, and such networks are now widely used. In this paper, we conduct feature metric learning in three-dimensional space by combining the three-dimensional convolution features of the gait sequence with a Siamese structure. This method can capture information in both the spatial and temporal dimensions of a continuous periodic gait sequence, and further improves the accuracy and practicality of gait recognition.
https://doi.org/10.9717/kmms.2020.23.8.927

Sequential Adaptation Algorithm Based on Transformation Space Model for Speech Recognition

김동국; 장준혁; 김남수
- Speech Sciences (음성과학)
- Vol. 11, No. 4
- pp. 75-88
- 2004
In this paper, we propose a new approach to sequential linear regression adaptation of continuous density hidden Markov models (CDHMMs) based on a transformation space model (TSM). The proposed TSM, which characterizes the a priori knowledge of the training speakers associated with maximum likelihood linear regression (MLLR) matrix parameters, is effectively described in terms of latent variable models. The TSM provides various sources of information, such as the correlation information, the prior distribution, and the prior knowledge of the regression parameters, that are very useful for rapid adaptation. The quasi-Bayes (QB) estimation algorithm is formulated to incrementally update the hyperparameters of the TSM and the regression matrices simultaneously. Experimental results showed that the proposed TSM approach performs better than the conventional quasi-Bayes linear regression (QBLR) algorithm for a small amount of adaptation data.

Maximum Likelihood Training and Adaptation of Embedded Speech Recognizers for Mobile Environments

Cho, Young-Kyu; Yook, Dong-Suk
- ETRI Journal
- Vol. 32, No. 1
- pp. 160-162
- 2010
For the acoustic models of embedded speech recognition systems, hidden Markov models (HMMs) are usually quantized and the original full space distributions are represented by combinations of a few quantized distribution prototypes. We propose a maximum likelihood objective function to train the quantized distribution prototypes. The experimental results show that the new training algorithm and the link structure adaptation scheme for the quantized HMMs reduce the word recognition error rate by 20.0%.
https://doi.org/10.4218/etrij.10.0209.0242

Directed Identification, Synchronization by Aesthetic Recognition of Animation Field

이현우; 류창수
- Journal of Korea Multimedia Society (한국멀티미디어학회논문지)
- Vol. 25, No. 10
- pp. 1475-1482
- 2022
Mickey Mousing, the perfect match between animation sound and image, was an aesthetic in the field of animation, but since the 2000s, works such as and released by producers such as DreamWorks and Pixar have expanded the perfection of synchronization to irony. It also influenced the identification system of sentiment. It is time to view the directing attempt of these elements as a factor that changed the new paradigm of narrative, and related research is needed. In this study, the scene of was analyzed as a case study for the synchronization of animation sound and image components and the boundary direction on the recognition of identification between reality and fiction. Aesthetic recognition of the research work is based on the premise of real time and space perception, and the audience can recognize it in the conceptual world as an integrated art by playfully producing fictional time and space. The direct antithesis of synchronization and identification was drawn to maintain curiosity about the next scene by repeating selective concealment and disclosure of information, in the direction of conveying an unfamiliar and heterogeneous feeling to the audience.
https://doi.org/10.9717/kmms.2022.25.10.1475
