• Title/Summary/Keyword: word database (단어 데이터베이스)

An Efficient Two-Level Hybrid Signature File Method for Large Text Databases (대용량 텍스트 데이터베이스를 위한 효율적인 2단계 합성 요약 화일 방법)

  • Yoo, Jae-Soo;Gang, Hyeong-Il
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.4
    • /
    • pp.923-932
    • /
    • 1997
  • In this paper, we propose a two-level hybrid signature file method (THM) that uses a term discrimination concept to deal efficiently with large text databases. In addition, we apply Yoo's clustering scheme to the two-level hybrid signature file method. The clustering scheme groups similar signatures together according to the similarity of their highly discriminatory terms so that we may achieve better retrieval performance. A space-time analytical model of the proposed two-level hybrid method is provided. Based on the analytical model and experiments, we compare it with the existing methods, i.e. the bit-sliced method (BM), the two-level method (TM), and the hybrid method (HM). As a result, we show that THM achieves the best retrieval performance in a large database with 100,000 records when the number of matching records is less than 160.
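
The abstract does not give the details of THM, so the following is only a minimal sketch of the general signature-file idea (superimposed coding) that such methods build on. The signature width, the number of bits per word, and all function names are assumptions chosen for illustration, not values from the paper.

```python
import hashlib

SIG_BITS = 64       # assumed signature width in bits
BITS_PER_WORD = 3   # assumed number of bit positions hashed per word


def word_signature(word: str) -> int:
    """Map a word to a bit pattern with at most BITS_PER_WORD bits set."""
    sig = 0
    for i in range(BITS_PER_WORD):
        digest = hashlib.sha256(f"{i}:{word.lower()}".encode("utf-8")).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return sig


def record_signature(text: str) -> int:
    """Superimpose (bitwise OR) the signatures of all words in a record."""
    sig = 0
    for word in text.split():
        sig |= word_signature(word)
    return sig


def search(records: list[str], query: str) -> list[str]:
    """Filter records by signature, then verify to remove false drops."""
    query_sig = record_signature(query)
    query_words = {w.lower() for w in query.split()}
    hits = []
    for record in records:
        # Cheap filter: every query bit must be set in the record signature.
        if (record_signature(record) & query_sig) != query_sig:
            continue
        # Signatures can match by accident, so check the actual words.
        if query_words <= {w.lower() for w in record.split()}:
            hits.append(record)
    return hits
```

In a real system the record signatures would be computed once and stored, which is where organizations such as bit-sliced or multi-level files come in; the sketch recomputes them on each search only to stay short.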

Development of a Hand Shape Editor for Sign Language Expression (수화 표현을 위한 손 모양 편집 프로그램의 개발)

  • Oh, Young-Joon;Park, Kwang-Hyun;Bien, Zeung-Nam
    • Journal of the Institute of Electronics Engineers of Korea SC
    • /
    • v.44 no.4 s.316
    • /
    • pp.48-54
    • /
    • 2007
  • Hand shape is one of the important elements of Korean Sign Language (KSL), which is a communication method for the deaf. To express sign motion in a virtual reality environment based on OpenGL, we need an editor that can insert and modify sign motion data. However, it is very difficult for people who lack knowledge of sign language to edit and express hand shapes exactly using the existing editors. We also need a program to efficiently construct and store the hand shape data, because the amount of data in a sign word dictionary is very large. In this paper we developed a KSL hand shape editor that makes it easy to construct and edit hand shapes through a graphical user interface (GUI) and to store them in a database. Hand shape codes are used in a sign word editor to synthesize sign motion, and they decrease the total amount of KSL data.

Ontology-based Culture·Tourist Attraction Search Application (온톨로지 기반의 문화·관광지 검색 어플리케이션 구현)

  • Hwang, Tae-won;Seo, Jung-hee;Park, Hung-bog
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2017.05a
    • /
    • pp.772-774
    • /
    • 2017
  • Currently, there are many simple search services for local culture and tourism, but systematic information retrieval using ontology technology is weak. Keyword-based search, the existing search method, often returns results that differ from what the user intended. Semantic search using an ontology, on the other hand, shows information related to the search term by creating relations between words. Therefore, when tourists search for cultural and tourist attractions in an area, the search results include semantically relevant information. If the ontology provides information on culture, sightseeing areas, and transportation, that information can be grasped more easily. In this paper, we propose an ontology-based retrieval system for culture and tourist sites, delivered as a mobile application. It extends a search system that relied only on an existing internal database by also using the databases of public institutions, in order to provide accurate and reliable information to users. This efficient ontology structure makes it possible to provide information suited to the user quickly and accurately.

A Convergence Study of the Research Trends on Stress Urinary Incontinence using Word Embedding (워드임베딩을 활용한 복압성 요실금 관련 연구 동향에 관한 융합 연구)

  • Kim, Jun-Hee;Ahn, Sun-Hee;Gwak, Gyeong-Tae;Weon, Young-Soo;Yoo, Hwa-Ik
    • Journal of the Korea Convergence Society
    • /
    • v.12 no.8
    • /
    • pp.1-11
    • /
    • 2021
  • The purpose of this study was to analyze the trends and characteristics of 'stress urinary incontinence' research through word frequency analysis and to model the relationships between words using word embedding. Abstract data of 9,868 papers containing abstracts in PubMed's MEDLINE were extracted using a Python program. Then, through frequency analysis, the 10 most frequent keywords were selected. The similarity of words related to the keywords was analyzed using the Word2Vec machine learning algorithm. The locations and distances of words were visualized using the t-SNE technique, and the groups were classified and analyzed. The number of studies related to stress urinary incontinence has increased rapidly since the 1980s. The keywords used most frequently in the abstracts were 'woman', 'urethra', and 'surgery'. Through Word2Vec modeling, words such as 'female', 'urge', and 'symptom' were among the words that showed the highest relevance to the keywords in studies on stress urinary incontinence. In addition, through the t-SNE technique, keywords and related words could be classified into three groups focusing on the symptoms, anatomical characteristics, and surgical interventions of stress urinary incontinence. This study is the first to examine trends in stress urinary incontinence-related studies using keyword frequency analysis and word embedding of abstracts. The results of this study can be used as a basis for future researchers to select the subject and direction of research related to stress urinary incontinence.

Development of Similar Bibliographic Retrieval System based on Neighboring Words and Keyword Topic Information (인접한 단어와 키워드 주제어 정보에 기반한 유사 문헌 검색 시스템 개발)

  • Kim, Kwang-Young;Kwak, Seung-Jin
    • Journal of Korean Library and Information Science Society
    • /
    • v.40 no.3
    • /
    • pp.367-387
    • /
    • 2009
  • In a similar bibliographic retrieval system, the search results differ considerably depending on which of the extracted index terms are selected. This research provides a method that minimizes errors in selecting the extracted candidate index terms. It uses neighboring-word information for the candidate index terms extracted from similar documents, together with keyword topic information. In addition, by using related author information and a method for re-ranking the search results, a similar bibliographic retrieval system with high accuracy was developed. We conducted experiments with the system on a collection of Korean journal articles in science and technology. The performance of the system was verified through experiments and user evaluation.

A Study about interception on Hurtfulness Site using Aho-Corasik machine (AC 머신을 이용한 유해 사이트 차단에 관한 연구)

  • 정현수;정규철;김후남;박기홍
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2004.05b
    • /
    • pp.541-544
    • /
    • 2004
  • The knowledge and information society is making our lives more convenient and abundant, but it is also producing considerable side effects that were not anticipated, and finding solutions to them is urgent. One representative dysfunction of the information-oriented society is that teenagers who are not yet mature are openly exposed, through information networks, to a great deal of objectionable material and harmful information such as violent content. To address this problem, we use an AC machine in a word-weighting process in which harmful words that can be obtained from internet sites are identified and weighted. As a result, the blocking rate for harmful sites rose to as much as 90%, which indicates that this process is highly effective.
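
The abstract does not describe the weighting scheme or the blocking rule, so the following is only a rough sketch of how an Aho-Corasick (AC) automaton can be combined with per-word weights. The weight values, the threshold rule, and the function names `build_automaton`, `score_text`, and `is_harmful` are hypothetical.

```python
from collections import deque


def build_automaton(weights: dict[str, float]):
    """Build an Aho-Corasick automaton for the (lowercase) keys of `weights`."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # words recognised when a state is reached
    for word in weights:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)
    # Breadth-first pass: depth-1 states keep the root as failure link.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit words that end at the failure target (proper suffixes).
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def score_text(text: str, weights: dict[str, float], automaton) -> float:
    """Sum the weights of every word occurrence found in one pass over text."""
    goto, fail, out = automaton
    state, score = 0, 0.0
    for ch in text.lower():
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            score += weights[word]
    return score


def is_harmful(text: str, weights: dict[str, float], threshold: float) -> bool:
    """Hypothetical blocking rule: flag a page whose total weight is high."""
    return score_text(text, weights, build_automaton(weights)) >= threshold
```

The automaton finds all listed words, including overlapping ones, in a single scan of the page text, which is why it suits filtering against a large word list.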

A Method for Automatic Detection of Character Encoding of Multi Language Document File (다중 언어로 작성된 문서 파일에 적용된 문자 인코딩 자동 인식 기법)

  • Seo, Min Ji;Kim, Myung Ho
    • KIISE Transactions on Computing Practices
    • /
    • v.22 no.4
    • /
    • pp.170-177
    • /
    • 2016
  • Character encoding is a method for converting a document into a binary document file using a code table so that it can be stored in a computer. To decode a binary document file into readable form, one must know the code table that was applied to the file at the encoding stage. Identifying the code table used to encode the file is thus an essential part of decoding. In this paper, we propose a method for automatically detecting the character code of a given binary document file. The method uses several techniques to increase the detection rate, such as character code range detection, escape character detection, character code characteristic detection, and commonly used word detection. The commonly used word detection method uses multiple word databases, which means that it can achieve a much higher detection rate for multi-language files than other methods. If a language accounts for less than 20% of the document, the conventional method achieves about 50% encoding recognition. The proposed method achieves up to 96% encoding recognition regardless of the proportion of each language.

A Prediction System of User Preferences for Newly Released Items Based on Words (새로 출시되는 품목들을 위한 단어 기반의 사용자 선호도 예측 기법)

  • Choi, Yoon-Seok;Moon, Byung-Ro
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.16 no.2
    • /
    • pp.156-163
    • /
    • 2006
  • Collaborative filtering (CF) systems are widely used in recommendation because they are easy to implement and perform well. However, they have several problems, such as the sparsity problem, the first-rater problem, and the lack of explanations for recommendations. Many studies have been proposed to resolve these problems. The influence of the sparsity problem lessens as users' data accumulate, but the first-rater problem is inherent in CF systems, and a number of studies have tried to overcome the disadvantages of CF systems with content-based methods. CF systems are also black boxes that provide no explanation of how a recommendation was produced. In this paper we present a content-based prediction system based on preference words, which exposes the reasoning behind a recommendation. Our system predicts a user's rating of a new movie, and we suggest a semiotic network-based method to solve the mismatching problem between items. For experimental comparison, we used the EachMovie and IMDb datasets.

A Personalized Retrieval System Based on Classification and User Query (분류와 사용자 질의어 정보에 기반한 개인화 검색 시스템)

  • Kim, Kwang-Young;Shim, Kang-Seop;Kwak, Seung-Jin
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.43 no.3
    • /
    • pp.163-180
    • /
    • 2009
  • In this paper, we describe a system under development that establishes a user's personal information tendency based on user queries. The system classifies each query by category using a kNN classifier. As category information, we used the DDC field that is already assigned to each record in the database. The system accumulates category information over all of a user's queries to form the user's personalized feature for the target database. We then developed a personalized retrieval system that reflects the personalized feature when producing search results. Our system re-ranks the result documents by giving more weight to documents whose categories match the user's personalized feature. By using the user's tendency information, word ambiguity could be resolved. We conducted experiments on personalized search and word sense disambiguation (WSD) with a collection of Korean journal articles in science and technology. Our experimental results and user evaluation show that the personalized search system and WSD are useful for actual field services.

Database metadata standardization processing model using web dictionary crawling (웹 사전 크롤링을 이용한 데이터베이스 메타데이터 표준화 처리 모델)

  • Jeong, Hana;Park, Koo-Rack;Chung, Young-suk
    • Journal of Digital Convergence
    • /
    • v.19 no.9
    • /
    • pp.209-215
    • /
    • 2021
  • Data quality management is an important issue these days, and data quality can be improved by providing consistent metadata. This study presents algorithms that facilitate standard word dictionary management for consistent metadata management. The algorithms automate synonym management for database metadata through web dictionary crawling. They also improve the accuracy of the data by resolving homonym distinction issues that may arise during the web dictionary crawling process. The algorithm proposed in this study increases the reliability of metadata quality compared with the existing manual management. It can also reduce the time spent on registering and managing synonym data. Further research on a partially automated data standardization model is needed, based on a detailed understanding of which tasks in future data standardization activities can be automated.