• Title/Summary/Keyword: Vocabulary recognition

The Problem of multi-dimension communication and 21st century Media Art (21세기 다차원 커뮤니케이션과 매체예술의 문제)

  • Park Ki-Woong
    • Journal of Science of Art and Design / v.3 / pp.5-32 / 2001
  • At the beginning of the 21st century, the human desire to communicate is reaching toward the cosmos. Communication will rely not on simple methods but on complex, high-technology ones. It will be interactive rather than one-way, and it will build an intelligent infrastructure able to carry information from where it is present to where it is not; multimedia able to solve the technical problems of such communication will continue to be developed. The plastic arts are no exception to these changes. Intermedia, as it continues to develop, will make intercommunication possible. Advances in technology enable advances in new plastic art, and advances in science change the genres of art. No part of society is exempt from this change. Art has always been led by people with forward-looking ideas about new values. The plastic art of the 21st century will be part of this process of intercommunication. Humanity's concern is to communicate with the universe, and that will be multi-dimensional (4-, 5-, 6-, 7-, ... dimensional) communication beyond our usual recognition. To this end, the possibility of expression in cyberspace is being considered, and it is being pursued through the development of media art, which can move in and out of cyberspace. The messages will be complex beyond our recognition. Because we will need to communicate with a variety of newly built vocabulary, we need to expand our repertoire of new vocabulary.

Korean Word Segmentation and Compound-noun Decomposition Using Markov Chain and Syllable N-gram (마코프 체인 및 음절 N-그램을 이용한 한국어 띄어쓰기 및 복합명사 분리)

  • 권오욱
    • The Journal of the Acoustical Society of Korea / v.21 no.3 / pp.274-284 / 2002
  • Word segmentation errors in text preprocessing often insert incorrect words into the recognition vocabulary and degrade language models for Korean large-vocabulary continuous speech recognition. We propose an automatic word segmentation algorithm that uses Markov chains and syllable-based n-gram language models to correct word segmentation errors in text corpora. We assume that a sentence is generated from a Markov chain. Spaces and non-space characters are generated on self-transitions and other transitions of the Markov chain, respectively. The word segmentation of the sentence is then obtained by finding the maximum-likelihood path using syllable n-gram scores. In experiments, the algorithm achieved 91.58% word accuracy and 96.69% syllable accuracy when segmenting 254 sentences of newspaper columns with all spaces removed. It improved word accuracy from 91.00% to 96.27% for word segmentation correction at line breaks and yielded 96.22% accuracy for compound-noun decomposition.
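
The idea of choosing spaces along a maximum-likelihood path can be illustrated with a small sketch. This is not the paper's model: it assumes a syllable trigram with add-k smoothing and a two-state dynamic program (space or no space before each syllable), and the toy corpus, `train_trigrams`, `logp`, and `segment` are all hypothetical names chosen for illustration.

```python
import math
from collections import Counter

SPACE = " "
PAD = "<s>"  # hypothetical sentence-start padding symbol


def train_trigrams(sentences):
    """Count symbol trigrams; the space is treated as an ordinary symbol."""
    tri, ctx, vocab = Counter(), Counter(), {SPACE}
    for sent in sentences:
        symbols = [PAD, PAD] + list(sent)
        vocab.update(sent)
        for a, b, c in zip(symbols, symbols[1:], symbols[2:]):
            tri[(a, b, c)] += 1
            ctx[(a, b)] += 1
    return tri, ctx, vocab


def logp(model, a, b, c, k=0.1):
    """Add-k smoothed log P(c | a, b); k is an assumed smoothing constant."""
    tri, ctx, vocab = model
    return math.log((tri[(a, b, c)] + k) / (ctx[(a, b)] + k * len(vocab)))


def segment(text, model):
    """Insert spaces into unspaced text along the highest-scoring path."""
    chars = list(text)
    if not chars:
        return ""
    # best[b] = (log score, decisions); b is 1 if a space precedes the current syllable.
    best = {0: (logp(model, PAD, PAD, chars[0]), [0])}
    for i in range(1, len(chars)):
        new = {}
        for b_prev, (score, path) in best.items():
            # The two symbols emitted just before this boundary.
            if b_prev:
                h = (SPACE, chars[i - 1])
            else:
                h = (chars[i - 2] if i >= 2 else PAD, chars[i - 1])
            no_space = score + logp(model, h[0], h[1], chars[i])
            with_space = (score + logp(model, h[0], h[1], SPACE)
                          + logp(model, h[1], SPACE, chars[i]))
            for b, s in ((0, no_space), (1, with_space)):
                if b not in new or s > new[b][0]:
                    new[b] = (s, path + [b])
        best = new
    _, decisions = max(best.values(), key=lambda t: t[0])
    return "".join((SPACE if b else "") + c for b, c in zip(decisions, chars))


corpus = ["나는 학교에 간다", "나는 집에 간다", "너는 학교에 온다"]  # toy data
model = train_trigrams(corpus)
print(segment("나는학교에간다", model))  # 나는 학교에 간다
```

A trigram is used here because, with a plain bigram, each boundary decision would be independent and no path search would be needed.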

Characteristic on the emotional recognition of consumer about the formative language (디자인 조형언어에 대한 소비자의 감성적 인지특성)

  • Min, Kyung-Taek;Heo, Seong-Cheol
    • Science of Emotion and Sensibility / v.12 no.1 / pp.87-96 / 2009
  • Recently, consumer participation in the design shaping process has been gradually increasing. Consumers evaluate or make suggestions about product shapes, and companies devise schemes to elicit consumer participation. However, consumers and designers view product shape from fundamentally different perspectives, and this difference hinders efficient communication between them. This study therefore examines the difference between consumers' and designers' views of product shape, and guidelines for effective form-giving that elicits consumers' affective responses. First, I established an affective image vocabulary based on product shape. Based on this vocabulary, I then ran the same experiments with consumers and designers. The results show that the affective responses of the two groups toward shape have similar characteristics, and that designers' reactions are more pronounced than consumers'.

Language Model based on VCCV and Test of Smoothing Techniques for Sentence Speech Recognition (문장음성인식을 위한 VCCV 기반의 언어모델과 Smoothing 기법 평가)

  • Park, Seon-Hee;Roh, Yong-Wan;Hong, Kwang-Seok
    • The KIPS Transactions: Part B / v.11B no.2 / pp.241-246 / 2004
  • In this paper, we propose VCCV units as the processing unit of a language model and compare them with the existing processing units, clauses and morphemes. Clauses and morphemes have large vocabularies and high perplexity, whereas VCCV units have low perplexity because of their small lexicon and limited vocabulary. Building a language model also raises the issue of smoothing, a technique for estimating probabilities better when there is insufficient data to estimate them accurately. We built language models over morphemes, clauses, and VCCV units and calculated their perplexity; the perplexity of VCCV units is lower than that of morphemes and clauses. We then constructed N-grams of the low-perplexity VCCV units and tested the language model using Katz, absolute, and modified Kneser-Ney smoothing, among others. In the experimental results, modified Kneser-Ney smoothing proved to be a suitable smoothing technique for VCCV units.
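
How perplexity depends on a smoothed N-gram model can be shown with a minimal sketch. It assumes a bigram model with interpolated absolute discounting (one of the smoothing families the abstract names, not Katz or modified Kneser-Ney) and an add-one unigram fallback. The class name, the discount of 0.75, and the toy unit sequences are assumptions for illustration; the tokens stand in for whatever processing unit is chosen and are not real VCCV units.

```python
import math
from collections import Counter, defaultdict

UNK = "<unk>"  # hypothetical placeholder for units unseen in training


class AbsoluteDiscountBigram:
    """Bigram model with interpolated absolute discounting (illustrative only)."""

    def __init__(self, sequences, discount=0.75):
        self.d = discount
        self.uni = Counter(tok for seq in sequences for tok in seq)
        self.vocab = set(self.uni) | {UNK}
        self.total = sum(self.uni.values())
        self.bi = Counter()               # bigram counts c(v, w)
        self.ctx = Counter()              # how often v occurs as a context
        self.followers = defaultdict(set) # distinct units seen after v
        for seq in sequences:
            for v, w in zip(seq, seq[1:]):
                self.bi[(v, w)] += 1
                self.ctx[v] += 1
                self.followers[v].add(w)

    def p_uni(self, w):
        # Add-one smoothed unigram, so unseen units keep non-zero mass.
        return (self.uni[w] + 1) / (self.total + len(self.vocab))

    def prob(self, v, w):
        w = w if w in self.vocab else UNK
        if self.ctx[v] == 0:
            return self.p_uni(w)  # unseen context: fall back to the unigram
        discounted = max(self.bi[(v, w)] - self.d, 0.0) / self.ctx[v]
        backoff_mass = self.d * len(self.followers[v]) / self.ctx[v]
        return discounted + backoff_mass * self.p_uni(w)


def perplexity(model, sequences):
    """exp of the average negative log-probability per bigram."""
    log_sum, n = 0.0, 0
    for seq in sequences:
        for v, w in zip(seq, seq[1:]):
            log_sum += math.log(model.prob(v, w))
            n += 1
    return math.exp(-log_sum / n)


train = [["a", "b", "a", "b", "a", "b"], ["a", "b", "c"]]  # toy unit sequences
lm = AbsoluteDiscountBigram(train)
print(perplexity(lm, train), perplexity(lm, [["c", "a", "c", "b"]]))
```

Running the same function over differently tokenized versions of one corpus is the kind of comparison the abstract describes, although perplexities over different units are only loosely comparable because the number of predicted tokens differs.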

A Study on Improving Speech Recognition Rate (H/W, S/W) of Speech Impairment by Neurological Injury (신경학적 손상에 의한 언어장애인 음성 인식률 개선(H/W, S/W)에 관한 연구)

  • Lee, Hyung-keun;Kim, Soon-hub;Yang, Ki-Woong
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.11 / pp.1397-1406 / 2019
  • In everyday mobile phone calls between people with neurological speech impairments and non-disabled people, communication accuracy is often hindered by the reduced pronunciation accuracy caused by the impairment combined with the speaker's individual pronunciation features. To address this problem, the H/W improvement is a MEMS (micro-electro-mechanical systems) microphone device that includes an induction line, which artificially corrects difficult vocalizations according to the oral characteristics of the language-impaired speaker and so improves out-of-vocabulary words. For the S/W improvement, a decision tree with an invert function and an improved matrix-vector RNN method are proposed, taking continuous word characteristics into account. Considering the characteristics of both H/W and S/W, a similar dictionary was created, contributing to improved speech intelligibility for smooth communication.

A Study on the Diphone Recognition of Korean Connected Words and Eojeol Reconstruction (한국어 연결단어의 이음소 인식과 어절 형성에 관한 연구)

  • ;Jeong, Hong
    • The Journal of the Acoustical Society of Korea / v.14 no.4 / pp.46-63 / 1995
  • This paper describes an unlimited-vocabulary connected speech recognition system using Time Delay Neural Networks (TDNNs). The recognition unit is the diphone, which includes the transition section between two phonemes, and there are 329 diphone units. Recognition of Korean connected speech consists of three parts: feature extraction from the input speech signal, diphone recognition, and post-processing. In feature extraction, diphone intervals are extracted from the input speech signal, and feature vectors of 16th-order filter-bank coefficients are calculated for each frame in the diphone interval. Diphone recognition has a three-stage hierarchical structure and is carried out using 30 TDNNs; in particular, the TDNN structure is modified to increase the recognition rate. In post-processing, misrecognized diphone strings are corrected using phoneme transition probabilities and phoneme confusion probabilities, and eojeols (Korean words or phrases) are then formed by combining the recognized diphones.

A Study on the Korean Broadcasting Speech Recognition (한국어 방송 음성 인식에 관한 연구)

  • 김석동;송도선;이행세
    • The Journal of the Acoustical Society of Korea / v.18 no.1 / pp.53-60 / 1999
  • This paper is a study of Korean broadcast speech recognition. We present methods for large-vocabulary continuous speech recognition, focusing on language modeling and the search algorithm. The acoustic model is a uni-phone semi-continuous hidden Markov model, and the language model is an N-gram model. The search algorithm consists of three phases in order to use all available acoustic and linguistic information. First, a forward Viterbi beam search finds word end frames and estimates the related scores. Second, a backward Viterbi beam search finds word begin frames and estimates the related scores. Finally, an A* search combines these two results with the N-gram language model to produce the recognition results. With these methods, a maximum word recognition rate of 96.0% and a syllable recognition rate of 99.2% are achieved on a speaker-independent continuous speech recognition task with a vocabulary of about 12,000 words.

A Real-Time Implementation of Isolated Word Recognition System Based on a Hardware-Efficient Viterbi Scorer (효율적인 하드웨어 구조의 Viterbi Scorer를 이용한 실시간 격리단어 인식 시스템의 구현)

  • Cho, Yun-Seok;Kim, Jin-Yul;Oh, Kwang-Sok;Lee, Hwang-Soo
    • The Journal of the Acoustical Society of Korea / v.13 no.2E / pp.58-67 / 1994
  • Hidden Markov Model (HMM)-based algorithms have been used successfully in many speech recognition systems, especially large-vocabulary systems. Although general-purpose processors can be employed for such systems, they inevitably suffer from the computational complexity and the enormous amount of data. It is therefore essential for real-time speech recognition to develop specialized hardware that accelerates the recognition steps. This paper concerns a real-time implementation of an isolated word recognition system based on HMMs. The system consists of a host computer (PC), a DSP board, and a prototype Viterbi scoring board. The DSP board extracts feature vectors from the speech signal. The Viterbi scoring board is implemented using three field-programmable gate array chips; it employs a hardware-efficient Viterbi scoring architecture and performs the Viterbi algorithm for HMM-based speech recognition. At a clock rate of 10 MHz, the system can update about 100,000 states within a single 10 ms frame.
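
Taking the two figures in the abstract at face value, a quick check shows what the stated throughput implies. The number of clock cycles available in one frame is

$$10\ \text{MHz} \times 10\ \text{ms} = 10^{7}\ \tfrac{\text{cycles}}{\text{s}} \times 10^{-2}\ \text{s} = 10^{5}\ \text{cycles per frame},$$

so updating about 100,000 states per frame corresponds to roughly one state update per clock cycle. This is only arithmetic on the quoted numbers; the abstract does not say how the scoring board is pipelined to reach that rate.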

Automatic Target Recognition Study using Knowledge Graph and Deep Learning Models for Text and Image data (지식 그래프와 딥러닝 모델 기반 텍스트와 이미지 데이터를 활용한 자동 표적 인식 방법 연구)

  • Kim, Jongmo;Lee, Jeongbin;Jeon, Hocheol;Sohn, Mye
    • Journal of Internet Computing and Services / v.23 no.5 / pp.145-154 / 2022
  • Automatic Target Recognition (ATR) technology is emerging as a core technology of Future Combat Systems (FCS). Conventional ATR is based on IMINT (image information) collected from SAR sensors, and various image-based deep learning models are used. However, although advances in IT and sensing technology are expanding ATR-related data and information to HUMINT (human information) and SIGINT (signal information), ATR still relies only on image-oriented IMINT data. In complex and diversified battlefield situations, it is difficult to guarantee high ATR accuracy and generalization performance with image data alone. In this paper, we therefore propose a knowledge graph-based ATR method that can use image and text data simultaneously. The main idea of the knowledge graph and deep model-based ATR method is to convert the ATR image and text into graphs according to the characteristics of each data type, align them to the knowledge graph, and connect the heterogeneous ATR data through the knowledge graph. To convert an ATR image into a graph, an object-tag graph with object tags as nodes is generated from the image using a pre-trained image object recognition model and the vocabulary of the knowledge graph. For the ATR text, a pre-trained language model, TF-IDF, a co-occurrence word graph, and the vocabulary of the knowledge graph are used to generate a word graph whose nodes are key vocabulary for ATR. The two generated graphs are connected to the knowledge graph using an entity alignment model to improve ATR performance from images and texts. To demonstrate the superiority of the proposed method, 227 web documents and 61,714 RDF triples from DBpedia were collected, and comparison experiments were performed on precision, recall, and F1-score from the perspective of entity alignment.
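
The evaluation measures named at the end of the abstract can be sketched for an entity alignment setting. This is a minimal illustration that assumes alignments are represented as sets of (graph node, knowledge-graph entity) pairs; the function name and the example pairs are hypothetical and are not taken from the paper's data.

```python
def alignment_scores(predicted, gold):
    """Precision, recall, and F1 over sets of (node, entity) alignment pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical alignments between graph nodes and knowledge-graph entities.
gold = {
    ("img:tank", "kg:Tank"),
    ("txt:radar", "kg:Radar"),
    ("txt:uav", "kg:UAV"),
    ("img:ship", "kg:Ship"),
}
predicted = {
    ("img:tank", "kg:Tank"),
    ("txt:radar", "kg:Radar"),
    ("txt:jeep", "kg:Truck"),
}

p, r, f1 = alignment_scores(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Here two of the three predicted pairs are correct and two of the four gold pairs are recovered, so precision and recall differ and F1 falls between them.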

Automatic Speech Recognition Research at Fujitsu (후지쯔에 있어서의 음성 자동인식의 현상과 장래)

  • Nara, Yasuhiro;Kimura, Shinta;Loken-Kim, K.H.
    • The Journal of the Acoustical Society of Korea / v.10 no.1 / pp.82-91 / 1991
  • This paper introduces the history of automatic speech recognition research at Fujitsu, along with its current and future speech products. Speech recognition research at Fujitsu started in 1970. Our research efforts have resulted in a speaker-dependent 12,000-word discrete/connected word recognizer (F2360) and a speaker-independent 17-word discrete word recognizer (F2355L/S). Currently, we are working on a larger-vocabulary speech recognizer in which an input utterance is matched against networks representing possible phonemic variations. Its application to text input is also discussed.
