• Title/Summary/Keyword: word database


Natural Language Information Retrieval by Fuzzy Inference (퍼지 추론에 의한 자연언어 정보 검색)

  • Park, Hyeon-Gyu;O, Jong-Hun;Kim, Myeong-Ho;Choe, Gi-Seon;Lee, Gwang-Hyeong
    • The KIPS Transactions:PartB / v.8B no.3 / pp.243-250 / 2001
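  • Information retrieval in Internet e-commerce systems consists mostly of users' requests for product information. To retrieve the product information a user wants in a web environment, a system must offer not only a convenient search environment but also efficient search performance. With the rapid growth of the Internet population and of online shopping malls, demand for product search under diverse conditions is increasing. The search results must also be semantically close to the user's intent. Natural language information retrieval is emerging as an important alternative for meeting these demands, but the ambiguity inherent in natural language makes it difficult to apply in commercial systems. This paper uses fuzzy inference to address this problem. Content words that can be used in a database query are extracted from the input natural language query through morphological analysis, and the content words are then reorganized into a template. The template is passed through fuzzy inference to resolve semantic ambiguity and is converted into a database query, which returns search results that match the intent of the user's query.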
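The last step of the abstract above (a template whose vague term is resolved by fuzzy inference and then turned into a database query) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not give the membership functions, the template format, or the schema, so the linear "cheap" membership, the thresholds, the alpha-cut, and the `products` table are all hypothetical.

```python
def cheap_membership(price, full=10000.0, zero=30000.0):
    """Degree (0..1) to which a price counts as 'cheap'.

    Hypothetical thresholds: fully cheap at or below `full`,
    not cheap at all at or above `zero`, linear in between.
    """
    if price <= full:
        return 1.0
    if price >= zero:
        return 0.0
    return (zero - price) / (zero - full)


def price_upper_bound(alpha, full=10000.0, zero=30000.0):
    """Largest price whose 'cheap' membership is at least alpha (an alpha-cut)."""
    return zero - alpha * (zero - full)


def build_query(template, alpha=0.5):
    """Turn a template such as {'category': 'shoes', 'price': 'cheap'}
    into parameterized SQL. Table and column names are hypothetical."""
    sql = "SELECT name, price FROM products WHERE category = ?"
    params = [template["category"]]
    if template.get("price") == "cheap":
        # Replace the vague term with a crisp bound derived from the fuzzy set.
        sql += " AND price <= ?"
        params.append(price_upper_bound(alpha))
    return sql, tuple(params)


sql, params = build_query({"category": "shoes", "price": "cheap"})
print(sql, params)
```

Raising `alpha` tightens the bound, so the same template can be made stricter or looser without changing the query-building code.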


Speaker Adaptation Using i-Vector Based Clustering

  • Kim, Minsoo;Jang, Gil-Jin;Kim, Ji-Hwan;Lee, Minho
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.7 / pp.2785-2799 / 2020
  • We propose a novel speaker adaptation method using acoustic model clustering. The similarity between speakers is defined by the cosine distance between their i-vectors (intermediate vectors), and several efficient clustering algorithms are applied to obtain speaker subsets with different characteristics. The speaker-independent model is then retrained on the training data of each speaker subset produced by the clustering, and unknown speech is recognized by the retrained model of the closest cluster. The proposed method is applied to a large-scale speech recognition system built on a hybrid hidden Markov model and deep neural network framework. Word error rates were evaluated in an experiment on the Resource Management database. Compared with the conventional speaker-independent speech recognition model, the proposed speaker adaptation method using i-vector based clustering yielded a relative improvement of up to 12.2% for the conventional fully connected neural network and up to 10.5% for the bidirectional long short-term memory.
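The two operations the abstract above relies on, cosine distance between i-vectors and selection of the closest cluster, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names and the 3-dimensional vectors are hypothetical stand-ins, and real i-vectors have far more dimensions.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def closest_cluster(ivector, centroids):
    """Return the index of the centroid with the smallest cosine distance."""
    return min(range(len(centroids)),
               key=lambda i: cosine_distance(ivector, centroids[i]))


# Hypothetical cluster centroids and test-speaker vector.
centroids = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(closest_cluster([0.9, 0.1, 0.0], centroids))
```

In the setting the abstract describes, the returned index would select which retrained acoustic model decodes the incoming speech.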

An Effective Metric for Measuring the Degree of Web Page Changes (효과적인 웹 문서 변경도 측정 방법)

  • Kwon, Shin-Young;Kim, Sung-Jin;Lee, Sang-Ho
    • Journal of KIISE:Databases / v.34 no.5 / pp.437-447 / 2007
  • A variety of similarity metrics have been used to measure the degree of web page changes. In this paper, we first define criteria for web page changes so that the effectiveness of similarity metrics can be evaluated against six important types of web page change. Second, we propose a new similarity metric suited to measuring the degree of web page changes. Using real and synthesized web pages, we analyze five existing metrics (byte-wise comparison, TF-IDF cosine distance, word distance, edit distance, and shingling) and our own metric under the proposed criteria. The analysis shows that our metric represents the changes more effectively than the other metrics. We expect that our study can help users select an appropriate metric for particular web applications.
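Of the existing metrics named in the abstract above, shingling is easy to show concretely. The sketch below measures change as one minus the Jaccard resemblance of word-level shingle sets. It illustrates the existing shingling metric only, not the new metric the paper proposes, which the abstract does not define; the shingle width `w=3` and the function names are assumptions.

```python
def shingles(text, w=3):
    """Return the set of contiguous w-word sequences in the text."""
    words = text.split()
    if len(words) < w:
        # Short texts yield a single shingle made of all their words.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + w]) for i in range(len(words) - w + 1)}


def shingle_change(old, new, w=3):
    """Degree of change in [0, 1]: 1 minus the Jaccard resemblance."""
    a, b = shingles(old, w), shingles(new, w)
    if not a and not b:
        return 0.0  # two empty pages are treated as unchanged
    return 1.0 - len(a & b) / len(a | b)


print(shingle_change("a b c d", "a b c e"))
```

Because a single changed word alters every shingle that contains it, the metric reacts more strongly to edits in the middle of a page than a plain word-count comparison would.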

Emotion Robust Speech Recognition using Speech Transformation (음성 변환을 사용한 감정 변화에 강인한 음성 인식)

  • Kim, Weon-Goo
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.5 / pp.683-687 / 2010
  • This paper studies the use of frequency warping, a speech transformation method, to develop a speech recognition system that is robust to emotional variation. To this end, the effect of emotional variation on the speech signal was studied using a speech database containing various emotions. The speech spectrum was observed to be affected by emotional variation, and this effect is one of the reasons the performance of the speech recognition system degrades. A new training method that applies frequency warping during training is presented to reduce the effect of emotional variation, and a speech recognition system based on vocal tract length normalization is developed for comparison with the proposed system. Experimental results from isolated word recognition using HMMs showed that the new training method reduced the error rate of the conventional recognition system on speech signals containing various emotions.

Clinical trials on ophthalmology with Acupuncture Reviewed in PubMed Database (Pubmed 검색을 통한 안 질환 관련 침 임상시험 현황 연구)

  • Jung, Dal-Lim;Kim, Jong-Che;Hong, Seung-Ug
    • The Journal of Korean Medicine Ophthalmology and Otolaryngology and Dermatology / v.25 no.2 / pp.49-60 / 2012
  • Objective: Acupuncture has been used to treat eye disease for thousands of years, but there is little evidence-based medicine (EBM) supporting its use. This study reviews clinical trials on the treatment of eye disease with acupuncture therapy. Methods: We searched PubMed using the search terms "eye, acupuncture" (Limits: 10 Years, Clinical Trials, Humans, English). Results: Twelve papers from eight journals were found. Eight of these studies were randomized, and ten trials of acupuncture treatment reported a significant effect. By topic, five trials concerned dry eye, four refractive disorders, two intraocular pressure, and one visual function. The clinical studies reported significant cure rates. Conclusion: RCTs on acupuncture for eye diseases are being published in increasing numbers. However, their average impact factor was 2.16 and their average modified Jadad score was 3.89, so more high-quality studies are needed.

Frequency Analysis of Scientific Texts on the Hypoxia Using Bibliographic Data (논문 서지정보를 이용한 빈산소수괴 연구 분야의 연구용어 빈도분석)

  • Lee, GiSeop;Lee, JiYoung;Cho, HongYeon
    • Ocean and Polar Research / v.41 no.2 / pp.107-120 / 2019
  • The frequency analysis of scientific terms using bibliographic information is a simple concept, but as relevant data become more abundant, manual analysis of all the data becomes practically impossible or possible only to a very limited extent. In addition, as oceanographic research has become much more comprehensive and widespread, the allocation of research resources across topics has become an important issue. In this study, the frequency analysis of scientific terms was performed using text mining. The data were drawn from a general-purpose scholarly database and total 2,878 articles. Hypoxia, an important issue in the marine environment, was selected as the research field, and the frequencies of related words were analyzed. The most frequently used words were 'Organic matter', 'Bottom water', and 'Dead zone', and specific areas showed high frequency. When planning a large research project represented by a single word, the results can serve as a basis for allocating research resources according to how frequently related terms are used in specific fields.
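The kind of term counting the abstract above describes can be sketched in a few lines. This is only an illustration of frequency analysis over bibliographic text, not the authors' text-mining pipeline: the two records are invented for the example and are not drawn from the 2,878 articles, and the term list simply reuses the three terms the abstract reports.

```python
from collections import Counter
import re


def term_frequencies(records, terms):
    """Count case-insensitive whole-phrase occurrences of each term."""
    counts = Counter()
    for text in records:
        lowered = text.lower()
        for term in terms:
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            counts[term] += len(re.findall(pattern, lowered))
    return counts


# Hypothetical bibliographic records (titles), for illustration only.
records = [
    "Hypoxia and the dead zone in bottom water",
    "Organic matter loading expands the dead zone",
]
terms = ["organic matter", "bottom water", "dead zone"]
print(term_frequencies(records, terms).most_common())
```

Matching whole phrases rather than single tokens keeps multi-word terms such as "dead zone" from being split into unrelated counts for "dead" and "zone".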

Model adaptation employing DNN-based estimation of noise corruption function for noise-robust speech recognition (잡음 환경 음성 인식을 위한 심층 신경망 기반의 잡음 오염 함수 예측을 통한 음향 모델 적응 기법)

  • Yoon, Ki-mu;Kim, Wooil
    • The Journal of the Acoustical Society of Korea / v.38 no.1 / pp.47-50 / 2019
  • This paper proposes an acoustic model adaptation method for effective speech recognition in noisy environments. In the proposed algorithm, the noise corruption function is estimated using a DNN (Deep Neural Network), and the function is applied to the estimation of the model parameters. Experimental results using the Aurora 2.0 framework and database demonstrate that the proposed model adaptation method is more effective than conventional methods in both known and unknown noisy environments. In particular, the experiments in unknown environments show a relative improvement of 15.87% in average WER (Word Error Rate).
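For readers unfamiliar with the figure quoted above, the sketch below shows how a word error rate and a relative improvement are conventionally computed. It uses the standard word-level edit distance definition of WER; the sample sentences and the baseline and new WER values are hypothetical and are not results from the paper.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words.

    Assumes a non-empty reference transcript.
    """
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer, new_wer):
    """Fraction by which the new WER is lower than the baseline WER."""
    return (baseline_wer - new_wer) / baseline_wer


print(word_error_rate("a b c d", "a x c"))   # one substitution, one deletion
print(relative_improvement(0.20, 0.15))      # hypothetical WER values
```

A relative improvement is measured against the baseline error rate, so a drop from 20% to 15% WER is a 25% relative improvement even though the absolute difference is 5 points.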

A Study on the Creating Metaverse Service Platform for Web-based Vehicle Dynamics Simulation (웹 기반 차량동역학 시뮬레이션을 위한 메타버스 서비스 플랫폼 구축에 관한 연구)

  • Kwon, Seong-Jin
    • Journal of the Korean Society of Industry Convergence / v.25 no.5 / pp.757-764 / 2022
  • Recently, car tuning has become a trailblazing and creative culture that expresses the personality of the owner. In this paper, "Car-Vatar", a compound of the words "Car" and "Avatar", is developed to investigate car tuning on a metaverse engineering platform. Car-Vatar is a web-based vehicle dynamics simulation service that provides information about car tuning. It focuses on investigating vehicle performance, such as acceleration, braking, handling, and fuel efficiency, for different tuned vehicles and tuning parts on the virtual engineering platform. The Car-Vatar platform provides two major services: a real-time 3D tuning information system for dress-up and performance-up tuning parts, and a vehicle dynamics system for performance-up tuning parts. To validate the Car-Vatar platform, virtual simulation results are compared with driving test results in various driving environments.

A Knowledge Service Using Automatic Document Sharing based on Intelligent OMDR (지능형 OMDR 기반의 자동 문서 공유 에이전트를 이용한 지식서비스)

  • Su-Kyoung Kim;Kee-Hong Ahn
    • Proceedings of the Korea Information Processing Society Conference / 2008.11a / pp.747-750 / 2008
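  • This study aims at the overall application and use of Semantic Web technologies, such as ontologies, natural language processing, and metadata, for Semantic Web applications. To this end, it builds OWL-based domain ontologies for each knowledge topic of an organization or institution, and it builds existing WordNet, Dublin Core metadata, and the database schemas defined in the organization into an MDR. Linking these together provides a way to combine the intelligent inference and rule services of ontologies with standardized metadata. This has high research value for the reuse and alignment of existing ontologies and metadata. In addition, when a user in the organization writes a document, natural language processing and ontology technologies are applied to its content to supply suitable terms and metadata automatically, which increases the sharing and reusability of the document. The finished document is stored in an XML Based Intelligent Document Database, and the Document Registry and Retrieval Agent provides it to users who need to write or use similar documents. This minimizes the privatization of document knowledge and reduces the time and cost of rewriting similar documents or of writing particular documents. Furthermore, if documents can also be searched and used through the Document Registry and Retrieval Agent on the web or on personal mobile devices such as PDAs, the service could be used anytime and anywhere, which is expected to yield a practical application of ubiquitous computing and the Semantic Web.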

A study on the improving and constructing the content for the Sijo database in the Period of Modern Enlightenment (계몽기·근대시조 DB의 개선 및 콘텐츠화 방안 연구)

  • Chang, Chung-Soo
    • Sijohaknonchong / v.44 / pp.105-138 / 2016
  • Recently, the "XML Digital collection of Sijo Texts in the Period of Modern Enlightenment" DB has been made available, with a search function, through the Korean Research Memory (http://www.krm.or.kr), laying the foundation for building content from Sijo texts of the Period of Modern Enlightenment. This paper reviews the characteristics and problems of the Digital collection of Sijo Texts in the Period of Modern Enlightenment, looks for improvements, and seeks a way to turn it into content. The primary significance of the database is that it integrates the vast body of Sijo from the Period of Modern Enlightenment, amounting to 12,500 pieces, and makes it viewable at a glance. It is also the first Sijo database to provide a variety of search features, by literature, name of poet, title of work, original text, period, and so on. However, the database is limited as a means of verifying the overall aspects of Sijo in the Period of Modern Enlightenment. Titles and original texts written in archaic words or Chinese characters cannot be searched, because no standardized modern-language text is provided. Works and individual Sijo works released after 1945 are also missing from the database. Extracting data by poet is inconvenient, because poets are recorded in various ways, such as by real name or nom de plume. To solve these problems and improve the usability of the database, I propose providing standardized modern-language text, assigning index terms for content, providing information on work format, and so on. Furthermore, if a Sijo database for the Period of Modern Enlightenment with the character of a Sijo Culture Information System could be built, it could be linked with academic and educational content. As specific plans, I suggest the following: learning support materials for modern history and for the recognition of national territory in the modern age; source materials for studying indigenous animals and plants and for creating commercial characters; and use as a Sijo learning tool, such as a Sijo game.
