• Title/Summary/Keyword: 연관어분석 (related-word analysis)

Search Result 219, Processing Time 0.032 seconds

A Big Data Analysis of the News Trends on Wireless Emergency Alert Service (뉴스 빅데이터를 활용한 재난문자 뉴스 게재 경향 분석)

  • Lee, Hyunji;Byun, Yoonkwan;Chang, Sekchin;Choi, Seong Jong;Oh, Seunghee;Lee, Yongtae
    • Journal of Broadcast Engineering
    • /
    • v.24 no.5
    • /
    • pp.726-734
    • /
    • 2019
  • This study investigates the number of news articles and the correlated keywords concerning the Korean Wireless Emergency Alert (KWEA). The news was collected using BIGKinds, a news big data system provided by the Korea Press Foundation. For the news articles published each year, we examined the article frequency grouped by disaster type, the article frequency for earthquake versus non-earthquake disasters, and the frequency of keywords correlated with the disasters. We found that KWEA news totaled 182 articles in 2016 because of the unprecedentedly powerful KyongJu earthquake, a 20-fold increase over the previous year. Since 2016, the volume of KWEA news has remained consistently high. After the peak caused by the KyongJu earthquake in 2016, the proportion of non-earthquake news also increased in 2017 and 2018. The keyword correlation analysis showed that KWEA news gave major coverage to the following entities: the Ministry of the Interior and Safety, which operates the KWEA; the Korea Meteorological Administration; and the general public.
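
    The correlated-keyword counting described in this abstract can be illustrated with a minimal sketch. The abstract does not say how BIGKinds computes keyword correlation, so this example assumes plain document-level co-occurrence counting; the function name `related_keywords` and the toy keyword lists are hypothetical.

```python
from collections import Counter

def related_keywords(docs, target, top_n=3):
    """Count keywords that co-occur with `target` in the same document."""
    counts = Counter()
    for keywords in docs:
        unique = set(keywords)  # count each keyword once per document
        if target in unique:
            counts.update(unique - {target})
    # Sort by count (descending), then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

# Hypothetical toy data: one keyword list per news article.
docs = [
    ["alert", "earthquake", "ministry"],
    ["alert", "earthquake", "public"],
    ["alert", "typhoon", "ministry"],
    ["weather", "typhoon"],
]
top = related_keywords(docs, "alert")
print(top)
```

    Grouping the same counts by publication year would give a yearly trend of the kind the abstract reports.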

Elementary Educational Contents Retrieval System Using Semantic Web Technology (시맨틱 웹 기술을 활용한 초등학교 학습자료 검색 시스템)

  • Lee, Hee-Kyoung;Jun, Woo-Chun
    • 한국정보교육학회:학술대회논문집
    • /
    • 2004.08a
    • /
    • pp.622-630
    • /
    • 2004
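  • As use of the Web has become widespread, searching for materials on the Web is increasing, but it is not easy for learners to find the learning materials they really need among the vast amount of material available. A search engine can find the desired information to some extent, but because search engines depend heavily on the user, the results are sometimes unsatisfactory, and much time is often wasted filtering out irrelevant information before the final content is found. In this study, we therefore used Semantic Web technology, which structures the semantic information of resources to enable efficient retrieval, integration, and reuse of information, to build an ontology suited to elementary school learning materials, and we designed and implemented a system that retrieves elementary school learning materials based on that ontology. The retrieval system has the following features. First, it accepts more detailed user queries related to the learning materials. Second, it queries the ontology based on the user query to obtain search results. Third, it analyzes the meaning of the content being searched for and presents only the materials that fit the requested meaning.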

Analysis of Internet User Features using Multi-dimensional Association Analysis (다차원 연관 분석을 이용한 인터넷 이용자의 특징 분석)

  • Lee, Su-Eun;Jung, Yong-Gyu
    • Journal of Service Research and Studies
    • /
    • v.1 no.1
    • /
    • pp.61-69
    • /
    • 2011
  • Data mining means finding useful, previously unknown knowledge in large databases that cannot be extracted with a simple query. It can be defined as gaining insight from the data. In this paper, we look for useful patterns in Web data, or in data saved for a target Web site, in order to analyze the characteristics of its users. We apply association analysis to general statistical data on Internet users and analyze the user characteristics that affect the amount of time spent on the Internet. In the experiments, association rules were extracted from the data, with data preprocessing and a Web mining algorithm applied to obtain optimal results, and the characteristics of Internet users were analyzed.
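
    As a minimal sketch of the association analysis mentioned in this abstract, the standard support and confidence measures for a rule can be computed as below. The abstract does not name the algorithm or the attributes used, so the user records and attribute labels here are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base

# Hypothetical user records, one attribute set per Internet user.
users = [
    {"age:20s", "usage:high", "purpose:games"},
    {"age:20s", "usage:high", "purpose:sns"},
    {"age:40s", "usage:low", "purpose:news"},
    {"age:20s", "usage:low", "purpose:news"},
]
s = support(users, {"age:20s", "usage:high"})
c = confidence(users, {"age:20s"}, {"usage:high"})
print(s, c)
```

    A rule such as "age:20s implies usage:high" would be kept only if both values exceed thresholds chosen by the analyst.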

A Study on File Search Engine Based on DBMS (DBMS을 활용한 파일 검색엔진 연구)

  • Kim, HyoungSeuk;Yu, Heonchang
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2016.10a
    • /
    • pp.548-551
    • /
    • 2016
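  • Traditional grid-based RDBMSs did not support indexing of unstructured data. These constraints made them unsuitable as search engines for file documents and unstructured data. Search engines have recently been developed and used on top of various open-source search tools (Solr, Lucene), but they have drawbacks: it is not easy to link search results with existing data, the structure is hard to change, and diverse user requirements are hard to accommodate. In this study, we therefore implemented index optimization for fast search, partition-based divide-and-conquer processing for large-scale data, and a duplexed search-term index. We also built a synonym dictionary and a DB that enables association analysis, maintaining the relationship between search terms and their synonyms, with the goal of developing a search engine more advanced than the open-source ones. The experiments used a sample of more than about 4 million file documents in various formats (MS Office, HWP, PDF, text).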
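
    The idea of pairing a search-term index with a synonym dictionary can be sketched as follows. This is an in-memory illustration only, not the DBMS-based design the abstract describes; the file names, the dictionary contents, and the functions `build_index` and `search` are hypothetical.

```python
from collections import defaultdict

def build_index(documents):
    """Map each lowercase token to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def search(index, synonyms, term):
    """Look up `term` and its synonyms, returning sorted document ids."""
    term = term.lower()
    expanded = {term} | set(synonyms.get(term, ()))
    results = set()
    for t in expanded:
        results |= index.get(t, set())
    return sorted(results)

# Hypothetical sample files and synonym dictionary.
documents = {
    "a.txt": "annual budget report",
    "b.hwp": "yearly finance summary",
    "c.pdf": "meeting notes",
}
synonyms = {"annual": ["yearly"], "budget": ["finance"]}
index = build_index(documents)
print(search(index, synonyms, "annual"))
```

    In a relational setting, the index and the synonym dictionary would presumably be tables joined at query time; that mapping is an assumption, since the abstract gives no schema.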

Analysis of related words of drama viewership through SNS unstructured data crawling (SNS 비정형데이터 크롤링을 통한 드라마 시청률의 연관어 분석)

  • Kang, Sun-Kyoung;Lee, Hyun-Chang;Shin, Seong-Yoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2017.05a
    • /
    • pp.169-170
    • /
    • 2017
  • In this paper, we analyze structured and unstructured data to understand which factors affect drama ratings. For the structured data, we collected 19 items from four areas: drama information, person information, broadcasting information, and audience rating information for each broadcasting company. To collect unstructured data, crawling techniques were used to gather bulletin board posts, pre-broadcast blogs, and post-broadcast blogs for each drama. The collected data showed that the differences by broadcasting time, start time, genre, and broadcast day were similar among broadcasting companies.

Multi-level Morphology and Morphological Analysis Model for Korean (다층 형태론과 한국어 형태소 분석 모델)

  • Kang, Seung-Shik
    • Annual Conference on Human and Language Technology
    • /
    • 1994.11a
    • /
    • pp.140-145
    • /
    • 1994
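  • Morphological analysis is the process of separating unit morphemes, restoring the base forms of morphemes that have undergone alternation, and finding, from the separated unit morphemes, sequences of morphemes that conform to word-formation rules. Because these analysis steps are strongly independent yet the modules are closely related to one another, the Two-level model handles not only morphological alternation but also morpheme segmentation with unified rules. However, when the Two-level model is applied to Korean, morpheme segmentation and morphological alternation are intertwined, and inefficiencies appear when analyzing word types tied to the characteristics of an agglutinative language. This paper therefore proposes multi-level morphology and a multi-level model as a methodology suited to solving the problems that arise in the morphological analysis of Korean, an agglutinative language.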

Efficient Blog Retrieval System by Topic-based Weighting (주제어 가중치 기법에 의한 효율적인 블로그 검색 시스템)

  • Shin, Hyeon-Il;Yun, Un-Il;Ryu, Keun-Ho
    • Journal of the Korea Society of Computer and Information
    • /
    • v.15 no.4
    • /
    • pp.1-9
    • /
    • 2010
  • In the new generation of the Web, commonly called "Web 2.0", blogging has made it easier for people to publish information and opinions on the web. Various blog retrieval algorithms have been proposed to search blogs more effectively. However, keyword-based search and link-analysis blog ranking systems cannot satisfy users' requirements. In this paper, we propose a topic-based weighting blog retrieval system that considers the links between blog posts and search terms to improve search results. Our system extracts topics from each blog and weights them much higher than other guide words. In comparison with other systems, the proposed topic-based system achieves a better recall rate.
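
    The topic-weighting idea in this abstract can be illustrated with a small scoring sketch. The abstract does not describe how topics are extracted or what weights are used, so the pre-assigned topic sets, the weight of 3.0, and the functions `score` and `rank` are all assumptions made for illustration.

```python
def score(post_words, topics, query_terms, topic_weight=3.0):
    """Sum query-term matches, counting topic words more heavily."""
    total = 0.0
    for word in post_words:
        if word in query_terms:
            total += topic_weight if word in topics else 1.0
    return total

def rank(posts, query_terms, topic_weight=3.0):
    """Return names of matching posts, best score first."""
    query = set(query_terms)
    scored = [
        (score(p["words"], p["topics"], query, topic_weight), name)
        for name, p in posts.items()
    ]
    matching = [(s, n) for s, n in scored if s > 0]
    return [n for s, n in sorted(matching, key=lambda x: (-x[0], x[1]))]

# Hypothetical posts with topics already assigned.
posts = {
    "post1": {"words": ["python", "tips", "python", "travel"], "topics": {"travel"}},
    "post2": {"words": ["python", "tutorial", "basics"], "topics": {"python"}},
    "post3": {"words": ["cooking", "recipes"], "topics": {"cooking"}},
}
ranking = rank(posts, ["python"])
print(ranking)
```

    Here post2 outranks post1 even though post1 mentions the query term more often, because the term is one of post2's topics.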

Analyzing the Study Trends of 'Sense of Place' Using Text Mining Techniques (텍스트마이닝 기법을 활용한 국내외 장소성 관련 연구동향 분석)

  • Lee, Ina;Kim, Hea-Jin
    • Journal of the Korean BIBLIA Society for library and Information Science
    • /
    • v.30 no.2
    • /
    • pp.189-209
    • /
    • 2019
  • Main Path Analysis (MPA) is a text mining technique that extracts the core literature contributing to knowledge transfer, based on citation information. This study applied various text mining techniques to the abstracts of papers on sense of place published in Korea and abroad from 1990 to 2018, so that the research could be discussed from a macro perspective. The main path analysis showed that, since 1990, overseas research on sense of place has progressed in the order of personal identity, public land management, environmental education, and urban development. Network analysis further showed that sense of place has been discussed at various levels in Korea, including urban development, culture, literature, and history. By contrast, international studies showed few topic changes, with discussions on health, identity, landscape, and urban development continuing steadily since the 1990s. The study's contribution is a new perspective for grasping the overall flow of the relevant research.

Syntactic Analysis and Keyword Expansion for Performance Enhancement of Information Retrieval System (정보 검색 시스템의 성능 향상을 위한 구문 분석과 검색어 확장)

  • 윤성희
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.5 no.4
    • /
    • pp.303-308
    • /
    • 2004
  • Natural language queries are the best user interface for users of information retrieval systems. This paper proposes a retrieval system that expands keywords from the syntactically analyzed structure of a user's natural language query, based on natural language processing techniques. By combining or splitting compound nouns through syntactic tree traversal, and by expanding variant or abbreviated keywords into multiple keywords, the system's performance improved by up to 11.3% in precision and 4.7% in correctness.

Exploring 'Tradition' Terminology Trends based on Keyword Analysis (1920~2017) (키워드 분석 기반 '전통' 용어의 트렌드 분석 (1920~2017))

  • Kim, Min-Jeong;Kim, Chul Joo
    • The Journal of the Korea Contents Association
    • /
    • v.18 no.12
    • /
    • pp.421-431
    • /
    • 2018
  • The purpose of this study is to analyze trends in 'tradition' terminology in Korea. We focus on an empirical investigation of how media reports convey 'tradition' terminology in our society, applying text mining and social network analysis techniques. The analysis covered 2,481,143 news articles related to 'tradition' terminology that have appeared in the media since the 1920s. Frequency analysis, association analysis, and social network analysis were applied, decade by decade, to articles from 1920 to 2017. By applying these data science techniques, we can grasp the meaning of social and cultural phenomena related to 'tradition' from an objective, value-neutral position and understand the social symbolism that tradition carries in each period.