• Title/Summary/Keyword: lexical search

Search Result 21, Processing Time 0.026 seconds

A Study on Natural Language Document and Query Processor for Information Retrieval in Digital Library (디지털 도서관 환경에서의 정보 검색을 위한 자연어 문서 및 질의 처리기에 관한 연구)

  • 윤성희
    • Journal of the Korea Computer Industry Society
    • /
    • v.2 no.12
    • /
    • pp.1601-1608
    • /
    • 2001
  • Digital library is the most important database system that needs information retrieval engine for natural language documents and multimedia data. This paper describes the experimental results of information retrieval engine and browser based on natural language processing. It includes lexical analysis, syntax processing, stemming, and keyword indexing for the natural language text. With the experimental database ‘Earth and Space Science’ that has lots of images and titles and their descriptive text in natural language, text-based search engine was tested. Combined with content-based image search engine, it is expected to be a multimedia information retrieval system in digital library

  • PDF

ON THE INCANTATORY FEATURES OF KOREAN SHAMANIC LANGUAGE (한국 무속어의 주술적 특성과 그 해석)

  • Choong-yon Park
    • Lingua Humanitatis
    • /
    • v.1 no.1
    • /
    • pp.295-321
    • /
    • 2001
  • This paper attempts to demonstrate how the linguistic and mythological features of the shamanic language make it incantatory, or ′enchanting′. Passages used in shamanic rites manifest linguistic characteristics that point to their own norms and conventions, as well as some mythological features that contribute to the undecipherablity of the shamanic language. Focusing on the estranged linguistic and mythological features, I propose that shamanic languages can be best interpreted in terms of the linguistic hierarchization, a notion that has been developed since Roman Jakobson′s poetics. The present study adopts Eisele′s framework that reinterprets Jakobsonian hierarchization into a slightly revised notion on the basis of the "degree of combinatorial freedom" and the "degree of semantic immediacy", looking into a set of paradigm examples in search of some parallel structures characterizing the shamanic language. The enchanting effect of this peculiar form of language, it is argued, is due mostly to the frequent use of lexical parallelism, which works in the reverse direction of the normal process of interpretation.

  • PDF

Analysis of Impact Between Data Analysis Performance and Database

  • Kyoungju Min;Jeongyun Cho;Manho Jung;Hyangbae Lee
    • Journal of information and communication convergence engineering
    • /
    • v.21 no.3
    • /
    • pp.244-251
    • /
    • 2023
  • Engineering or humanities data are stored in databases and are often used for search services. While the latest deep-learning technologies, such like BART and BERT, are utilized for data analysis, humanities data still rely on traditional databases. Representative analysis methods include n-gram and lexical statistical extraction. However, when using a database, performance limitation is often imposed on the result calculations. This study presents an experimental process using MariaDB on a PC, which is easily accessible in a laboratory, to analyze the impact of the database on data analysis performance. The findings highlight the fact that the database becomes a bottleneck when analyzing large-scale text data, particularly over hundreds of thousands of records. To address this issue, a method was proposed to provide real-time humanities data analysis web services by leveraging the open source database, with a focus on the Seungjeongwon-Ilgy, one of the largest datasets in the humanities fields.

An Example-Based Engligh Learing Environment for Writing

  • Miyoshi, Yasuo;Ochi, Youji;Okamoto, Ryo;Yano, Yoneo
    • Proceedings of the Korea Inteligent Information System Society Conference
    • /
    • 2001.01a
    • /
    • pp.292-297
    • /
    • 2001
  • In writing learning as a second/foreign language, a learner has to acquire not only lexical and syntactical knowledge but also the skills to choose suitable words for content which s/he is interested in. A learning system should extrapolate learner\\`s intention and give example phrases that concern with the content in order to support this on the system. However, a learner cannot always represent a content of his/her desired phrase as inputs to the system. Therefore, the system should be equipped with a diagnosis function for learner\\`s intention. Additionally, a system also should be equipped with an analysis function to score similarity between learner\\`s intention and phrases which is stored in the system on both syntactic and idiomatic level in order to present appropriate example phrases to a learner. In this paper, we propose architecture of an interactive support method for English writing learning which is based an analogical search technique of sample phrases from corpora. Our system can show a candidate of variation/next phrases to write and an analogous sentence that a learner wants to represents from corpora.

  • PDF

A Korean Mobile Conversational Agent System (한국어 모바일 대화형 에이전트 시스템)

  • Hong, Gum-Won;Lee, Yeon-Soo;Kim, Min-Jeoung;Lee, Seung-Wook;Lee, Joo-Young;Rim, Hae-Chang
    • Journal of the Korea Society of Computer and Information
    • /
    • v.13 no.6
    • /
    • pp.263-271
    • /
    • 2008
  • This paper presents a Korean conversational agent system in a mobile environment using natural language processing techniques. The aim of a conversational agent in mobile environment is to provide natural language interface and enable more natural interaction between a human and an agent. Constructing such an agent, it is required to develop various natural language understanding components and effective utterance generation methods. To understand spoken style utterance, we perform morphosyntactic analysis, shallow semantic analysis including modality classification and predicate argument structure analysis, and to generate a system utterance, we perform example based search which considers lexical similarity, syntactic similarity and semantic similarity.

  • PDF

A Study on the Extension of Disaster Safety Information Service based on Linked Open Data (LOD기반의 재난안전 정보서비스 확장에 관한 연구)

  • Kim, Tae-Young;Gang, Ju-Yeon;Kim, Hye-Young;Kim, Yong
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.51 no.3
    • /
    • pp.163-188
    • /
    • 2017
  • This study aims to propose disaster safety information service model based on LOD for effective management and dissemination of the information. To achieve the aim of this study, current state of disaster safety information was analyzed through online search and face-to-face interviews, and then the information was divided into 6 types. Finally, this study proposed specific process of building disaster safety information LOD service with considerations reflecting the information characteristics. The process for building LOD was based on Guidelines for Building Linked Data written by National Information Society Agency. Especially, ontology concept model was defined by using standard lexical resources and modeling tools based on 6 types of disaster safety information, and classes and properties were proposed. The results of this study will make disaster safety information more useful for common people.

A Multimedia Bulletin Board System Providing Semantic-based Searching (의미 기반 정보 검색을 제공하는 멀티미디어 게시판 시스템)

  • Jung Eui-Hyun
    • Journal of the Korea Society of Computer and Information
    • /
    • v.10 no.6 s.38
    • /
    • pp.75-84
    • /
    • 2005
  • Bulletin board systems have evolved to support diverse multimedia data as well as text. However, current board systems have an weakness : it takes much time and efforts for users to figure out contents of articles. Most board systems provide a searching function with lexical level data access for solving that problem, however it fails to serve users' intented searching results. Moreover, it is nearly impossible to search proper articles if they contain multimedia data. This paper proposed a bulletin board system adopting the Semantic Web to solve this issue. The proposed system provides users with new ontology which is used for describing articles' domain knowledge and multimedia features. Users can describe their own board ontology using the proposed ontology. To support semantic-based searching for diverse domain knowledge without modification of the system, the system dynamically generated input/query interface and RDF data access module according to the board ontology written by administrators. The proposed board system shows that semantic-based searching is feasible and effective for users to find their intended articles.

  • PDF

Ontology Selection Ranking Model based on Semantic Similarity Approach (의미적 유사성에 기반한 온톨로지 선택 랭킹 모델)

  • Oh, Sun-Ju;Ahn, Joong-Ho;Park, Jin-Soo
    • The Journal of Society for e-Business Studies
    • /
    • v.14 no.2
    • /
    • pp.95-116
    • /
    • 2009
  • Ontologies have provided supports in integrating heterogeneous and distributed information. More and more ontologies and tools have been developed in various domains. However, building ontologies requires much time and effort. Therefore, ontologies need to be shared and reused among users. Specifically, finding the desired ontology from an ontology repository will benefit users. In the past, most of the studies on retrieving and ranking ontologies have mainly focused on lexical level supports. In those cases, it is impossible to find an ontology that includes concepts that users want to use at the semantic level. Most ontology libraries and ontology search engines have not provided semantic matching capability. Retrieving an ontology that users want to use requires a new ontology selection and ranking mechanism based on semantic similarity matching. We propose an ontology selection and ranking model consisting of selection criteria and metrics which are enhanced in semantic matching capabilities. The model we propose presents two novel features different from the previous research models. First, it enhances the ontology selection and ranking method practically and effectively by enabling semantic matching of taxonomy or relational linkage between concepts. Second, it identifies what measures should be used to rank ontologies in the given context and what weight should be assigned to each selection measure.

  • PDF

An XML Tag Indexing Method Using on Lexical Similarity (XML 태그를 분류에 따른 가중치 결정)

  • Jeong, Hye-Jin;Kim, Yong-Sung
    • The KIPS Transactions:PartB
    • /
    • v.16B no.1
    • /
    • pp.71-78
    • /
    • 2009
  • For more effective index extraction and index weight determination, studies of extracting indices are carried out by using document content as well as structure. However, most of studies are concentrating in calculating the importance of context rather than that of XML tag. These conventional studies determine its importance from the aspect of common sense rather than verifying that through an objective experiment. This paper, for the automatic indexing by using the tag information of XML document that has taken its place as the standard for web document management, classifies major tags of constructing a paper according to its importance and calculates the term weight extracted from the tag of low weight. By using the weight obtained, this paper proposes a method of calculating the final weight while updating the term weight extracted from the tag of high weight. In order to determine more objective weight, this paper tests the tag that user considers as important and reflects it in calculating the weight by classifying its importance according to the result. Then by comparing with the search performance while using the index weight calculated by applying a method of determining existing tag importance, it verifies effectiveness of the index weight calculated by applying the method proposed in this paper.

A Korean Document Sentiment Classification System based on Semantic Properties of Sentiment Words (감정 단어의 의미적 특성을 반영한 한국어 문서 감정분류 시스템)

  • Hwang, Jae-Won;Ko, Young-Joong
    • Journal of KIISE:Software and Applications
    • /
    • v.37 no.4
    • /
    • pp.317-322
    • /
    • 2010
  • This paper proposes how to improve performance of the Korean document sentiment-classification system using semantic properties of the sentiment words. A sentiment word means a word with sentiment, and sentiment features are defined by a set of the sentiment words which are important lexical resource for the sentiment classification. Sentiment feature represents different sentiment intensity in general field and in specific domain. In general field, we can estimate the sentiment intensity using a snippet from a search engine, while in specific domain, training data can be used for this estimation. When the sentiment intensity of the sentiment features are estimated, it is called semantic orientation and is used to estimate the sentiment intensity of the sentences in the text documents. After estimating sentiment intensity of the sentences, we apply that to the weights of sentiment features. In this paper, we evaluate our system in three different cases such as general, domain-specific, and general/domain-specific semantic orientation using support vector machine. Our experimental results show the improved performance in all cases, and, especially in general/domain-specific semantic orientation, our proposed method performs 3.1% better than a baseline system indexed by only content words.