• Title/Summary/Keyword: word clustering

Trends in Genomics & Informatics: a statistical review of publications from 2003 to 2018 focusing on the most-studied genes and document clusters

  • Kim, Ji-Hyeon;Nam, Hee-Jo;Park, Hyun-Seok
    • Genomics & Informatics / v.17 no.3 / pp.25.1-25.6 / 2019
  • Genomics & Informatics (NLM title abbreviation: Genomics Inform) is the official journal of the Korea Genome Organization. Herein, we conduct a statistical analysis of the publications of Genomics & Informatics over the 16 years since its inception, with a particular focus on issues relating to article categories, word clouds, and the most-studied genes, drawing on recent reviews of the use of word frequencies in journal articles. Trends in the studies published in Genomics & Informatics are discussed both individually and collectively.

Isolated-Word Recognition Using Adaptively Partitioned Multisection Codebooks (음성적응(音聲適應) 구간분할(區間分割) 멀티섹션 코드북을 이용(利用)한 고립단어인식(孤立單語認識))

  • Ha, Kyeong-Min;Jo, Jeong-Ho;Hong, Jae-Kuen;Kim, Soo-Joong
    • Proceedings of the KIEE Conference / 1988.07a / pp.10-13 / 1988
  • An isolated-word recognition method using adaptively partitioned multisection codebooks is proposed. Each training utterance was divided into several sections according to its pattern, which was extracted by a labeling technique. For each pattern, reference codebooks were generated by clustering the training vectors of the same section. In the recognition procedure, input speech was divided into sections by the same method used in the codebook generation procedure and was recognized as the reference word whose codebook gave the smallest average distortion. The proposed method was tested on 100 Korean words and attained a recognition rate of about 96 percent.

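The recognition scheme in the abstract above (per-section codebooks, decision by smallest average distortion) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: sections are split at equal lengths rather than by the paper's labeling technique, the codebooks come from a plain k-means with a deterministic start, and every function name here is hypothetical. Each utterance is assumed to contain at least `n_sections` frames.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def kmeans(vectors, k, iters=10):
    """Plain k-means; centers start from evenly spaced training vectors."""
    step = max(1, len(vectors) // k)
    centers = [list(v) for v in vectors[::step][:k]]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for v in vectors:
            idx = min(range(len(centers)), key=lambda c: sq_dist(v, centers[c]))
            groups[idx].append(v)
        # Keep the old center when a group ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers

def split_sections(frames, n_sections):
    """ASSUMPTION: equal-length split (the paper partitions adaptively)."""
    n = len(frames)
    bounds = [i * n // n_sections for i in range(n_sections + 1)]
    return [frames[bounds[i]:bounds[i + 1]] for i in range(n_sections)]

def train_word(utterances, n_sections, k):
    """Build one codebook per section from all utterances of a word."""
    pooled = [[] for _ in range(n_sections)]
    for utt in utterances:
        for s, sec in enumerate(split_sections(utt, n_sections)):
            pooled[s].extend(sec)
    return [kmeans(p, k) for p in pooled]

def avg_distortion(frames, codebooks):
    """Average distance from each frame to its section's nearest codeword."""
    total = 0.0
    for sec, cb in zip(split_sections(frames, len(codebooks)), codebooks):
        for f in sec:
            total += min(sq_dist(f, c) for c in cb)
    return total / len(frames)

def recognize(frames, models):
    """Return the word whose codebooks give the smallest average distortion."""
    return min(models, key=lambda w: avg_distortion(frames, models[w]))
```

With `models = {word: train_word(utterances, n_sections, k)}` built for each vocabulary word, `recognize(frames, models)` returns the best-matching word for a new sequence of feature frames.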

A Study on Creating Reference Pattern for Recognition of Korean Isolated Word (한국어 단독음 인식을 위한 표준패턴 설정에 관한 연구)

  • Kim, Gye-Guk;Go, Deok-Yeong;Lee, Jong-Ak
    • The Journal of the Acoustical Society of Korea / v.6 no.1 / pp.23-28 / 1987
  • This paper discusses reference pattern creation for speaker-independent Korean isolated word recognition using clustering. In this paper, we permitted the top 3 clusters and created reference patterns by the minimax criterion. LPC coefficients and autocorrelation were used as the feature parameters, and a simple Itakura distance measure was used to measure the similarity between patterns. With the word reference patterns obtained as described above, the recognition rate was 55.9% within the first choice, 76.9% within two choices, and 89.5% within three choices.

Moving Data Pictures (움직이는 데이터 그림)

  • Huh, Myung-Hoe
    • The Korean Journal of Applied Statistics / v.26 no.6 / pp.999-1007 / 2013
  • This research shows several types of moving pictures from the data: 1) the word cloud of Korean texts, 2) the heat map of $n \times p$ matrices, 3) the moving image of the $p \times p$ scatterplot matrix, 4) the local projective display of $k$ clusters (Huh and Lee, 2012). Moving pictures may reveal the hidden information and beauty of the datasets and ignite the curiosity of information consumers. Video files are attached.

Isolated Word Recognition using Modified Dynamic Averaging Method (변형된 Dynamic Averaging 방법을 이용한 단독어인식)

  • Jeoung, Eui-Bung;Ko, Young-Hyuk;Lee, Jong-Arc
    • The Journal of the Acoustical Society of Korea / v.10 no.2 / pp.23-28 / 1991
  • This paper studies speaker-independent isolated word recognition and proposes a DTW speech recognition system that uses the modified dynamic averaging method to build reference patterns. 57 city names are selected as the recognition vocabulary, and 2th LPC cepstrum coefficients are used as the feature parameter. Besides the recognition experiment using the modified dynamic averaging method, we perform recognition experiments using the causal method, dynamic averaging method, linear averaging method, and clustering method with the same data under the same conditions for comparison. The experimental results show that the recognition rate of DTW with the modified dynamic averaging method is the best, at 97.6 percent.

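For readers unfamiliar with DTW-based template matching, which the abstract above relies on, the following is a minimal sketch. It shows only the standard dynamic time warping distance and a nearest-reference decision. It does not reproduce the paper's modified dynamic averaging method for building reference patterns, and the function names and the symmetric step pattern are assumptions made for illustration.

```python
from math import sqrt

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(seq_a, seq_b):
    """Accumulated cost of the best monotonic alignment of two sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i frames with the first j.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # seq_b frame repeated
                                 cost[i][j - 1],      # seq_a frame repeated
                                 cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]

def recognize(utterance, references):
    """Return the vocabulary word whose reference pattern is closest in DTW."""
    return min(references, key=lambda w: dtw_distance(utterance, references[w]))
```

Here `references` is assumed to map each vocabulary word to one reference sequence of feature vectors. A system like the one described would first derive that reference from several training utterances.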

An Informetric Analysis on Intellectual Structures with Multiple Features of Academic Library Research Papers (복수 자질에 의한 지적 구조의 계량정보학적 분석연구: 국내 대학도서관 분야 연구논문을 대상으로)

  • Choi, Sang-Hee
    • Journal of the Korean Society for Information Management / v.28 no.2 / pp.65-78 / 2011
  • The purpose of this study is to identify topic areas of academic library research using two informetric methods: word clustering and Pathfinder network. For the data analysis, 139 articles published in major library and information science journals from 2005 to 2009 were collected from the Korean Science Citation Index database. The keywords that represent research topics were gathered from two sections: abstracts and titles in references. Results showed that reference titles usefully represent topics in detail, and that combining abstracts and reference titles can produce an expanded topic map.

Question and Answering System through Search Result Summarization of Q&A Documents (Q&A 문서의 검색 결과 요약을 활용한 질의응답 시스템)

  • Yoo, Dong Hyun;Lee, Hyun Ah
    • KIPS Transactions on Software and Data Engineering / v.3 no.4 / pp.149-154 / 2014
  • When using a user-participation question answering community such as Knowledge-iN, users must pick out relevant answers themselves from the various search results. If refined answers were provided automatically, the usability of such a community would improve. This paper divides questions in Q&A documents into four types (word, list, graph, and text) and proposes a summarization method for each question type using document statistics. Summarized answers for the word, list, and text types are obtained by question clustering and by scoring words using the frequency, proximity, and confidence of answers. Answers for the graph type are shown by extracting user opinions from the answers.

An Effective Incremental Content Clustering Method for the Large Documents in U-learning Environment (U-learning 환경의 대용량 학습문서 관리를 위한 효율적인 점진적 문서)

  • Joo, Kil-Hong;Choi, Jin-Tak
    • Journal of the Korea Computer Industry Society / v.5 no.9 / pp.859-872 / 2004
  • With the rapid advance of computer and communication technology, the education environment is developing in the direction of ubiquitous learning (u-learning), in which learners select and organize the contents, time, and order of learning by themselves. Since the amount of education information on the internet is increasing rapidly and is managed as documents, an effective way of managing those documents is necessary. Document clustering groups documents by subject, classifying a set of documents according to the similarity among them. Accordingly, document clustering can be used in exploring and searching documents, and it can increase the accuracy of search. This paper proposes an efficient incremental clustering method for a set of documents that grows gradually. The incremental document clustering algorithm assigns a set of new documents to the legacy clusters that have been identified in advance. In addition, stop-word removal is proposed to improve the correctness of the clustering.

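The incremental assignment described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: documents are bag-of-words term counts, similarity is cosine similarity against a summed cluster vector, a new cluster is opened when no existing cluster is similar enough, and the stop-word list, threshold value, and class name are all hypothetical.

```python
from collections import Counter
from math import sqrt

# ASSUMPTION: a tiny illustrative stop-word list, not the paper's.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is"}

def vectorize(text):
    """Lowercase, split on whitespace, drop stop words, count terms."""
    return Counter(t for t in text.lower().split() if t not in STOP_WORDS)

def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

class IncrementalClusterer:
    def __init__(self, threshold=0.3):
        self.threshold = threshold  # minimum similarity to join a cluster
        self.centroids = []         # summed term counts per cluster
        self.members = []           # documents assigned to each cluster

    def add(self, text):
        """Assign one new document and return its cluster index."""
        vec = vectorize(text)
        best, best_sim = None, 0.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is not None and best_sim >= self.threshold:
            # Join the most similar existing cluster and update its vector.
            self.centroids[best].update(vec)
            self.members[best].append(text)
            return best
        # No cluster is similar enough: open a new one.
        self.centroids.append(Counter(vec))
        self.members.append([text])
        return len(self.centroids) - 1
```

Because each document is compared only with the cluster vectors, adding a document does not require re-clustering the documents already stored, which is the point of an incremental scheme.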

The Method of Using the Automatic Word Clustering System for the Evaluation of Verbal Lexical-Semantic Network (동사 어휘의미망 평가를 위한 단어클러스터링 시스템의 활용 방안)

  • Kim, Hae-Gyung;Yoon, Ae-Sun
    • Journal of the Korean Society for Library and Information Science / v.40 no.3 / pp.175-190 / 2006
  • In recent years, there has been much interest in lexical semantic networks. However, it seems to be very difficult to evaluate their effectiveness and correctness and to devise methods for applying them to various problem domains. In order to offer fundamental ideas about how to evaluate and utilize lexical semantic networks, we developed two automatic word clustering systems, called system A and system B. 68,455,856 words were used to train both systems. We compared the clustering results of system A with those of system B, which is extended by the lexical-semantic network. System B is extended by reconstructing the feature vectors using elements of the lexical-semantic network of 3,656 '-ha' verbs. The target data is the 'multilingual Word Net-CoroNet'. When we compared the accuracy of the two systems, we found that system B achieved an accuracy of 46.6%, which is better than the 45.3% of system A.

A Clustering Technique of Radar Signals using 4-Dimensional Features (4차원 특징 벡터에 의한 레이더 신호 클러스터링 기법)

  • Lee, Jong-Tae;Ju, Young-Kwan;Kim, Gwan-Tae;Jeon, Joong-Nam
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.10 / pp.137-144 / 2014
  • The Electronic Support System collects and analyzes received radar signals in order to cope with electronic attack in real time. The radar-pulse clustering system groups the radar signals that are considered to be emitted by a single source. This paper proposes a radar-pulse clustering algorithm based on four kinds of features: the direction, the frequency, the pulse width, and the difference in arrival time between two successive pulses. The experimental results show that the proposed algorithm can trace a moving emitter and classify signals that are separated in time into different classes.