• Title/Summary/Keyword: Wikipedia

Search Result 152, Processing Time 0.026 seconds

A Method to Solve the Entity Linking Ambiguity and NIL Entity Recognition for efficient Entity Linking based on Wikipedia (위키피디아 기반의 효과적인 개체 링킹을 위한 NIL 개체 인식과 개체 연결 중의성 해소 방법)

  • Lee, Hokyung;An, Jaehyun;Yoon, Jeongmin;Bae, Kyoungman;Ko, Youngjoong
    • Journal of KIISE / v.44 no.8 / pp.813-821 / 2017
  • Entity linking finds the meaning of an entity mention in a user's query, which may refer to the entity using different expressions, by linking the mention to the corresponding entity in the knowledge base. This task has four challenges: the difficulty of knowledge base construction, multiple representations of the entity mention, ambiguity of entity linking, and NIL entity recognition. In this paper, we first construct an entity name dictionary based on Wikipedia to build a knowledge base and solve the multiple representation problem. We then propose various methods for NIL entity recognition and resolve the ambiguity of entity linking by training a support vector machine on several features, including context similarity, semantic relevance, clue word score, named entity type similarity of the mention, entity name matching score, and object popularity score. We apply the two proposed methods sequentially on the constructed knowledge base to obtain good entity linking performance. In the experiments, our system achieved F1 scores of 83.66% and 90.81% for NIL entity recognition and for resolving entity linking ambiguity, respectively.

Dynamic ontology construction algorithm from Wikipedia and its application toward real-time nation image analysis (국가이미지 분석을 위한 위키피디아 실시간 동적 온톨로지 구축 알고리즘 및 적용)

  • Lee, Youngwhan
    • Journal of the Korean Data and Information Science Society / v.27 no.4 / pp.979-991 / 2016
  • Measuring nation images was a challenging task when offline surveys were the only option. Such surveys were not only prohibitively expensive but also too time-consuming, and therefore ill-suited to a rapidly changing world. Although demand for monitoring nation images in real time has been ever-increasing, an affordable and reliable way to measure them has not been available to date. The researcher in this study developed a semi-automatic ontology construction algorithm, named "double-crossing double keyword collection" (DCDKC), to measure nation images from Wikipedia in real time. The resulting ontology, WikiOnto, can be used to reflect dynamic image changes. In this study, an instance of WikiOnto was constructed by applying the algorithm to the three largest exporting countries in East Asia: Korea, Japan, and China. The page views for the words in this instance of WikiOnto were then counted, and the counts collected for each country were compared with one another to examine whether they could be used to track dynamic nation images. The results show how the images of the three countries changed over the period of the study, and confirm that DCDKC is well suited to a real-time nation-image monitoring system.

Classification of Speleology in Wikipedia

  • Oh, Jong-Woo
    • Journal of the Speleological Society of Korea / no.82 / pp.17-25 / 2007
  • The use of a low-frequency cave radio can also verify survey accuracy. A receiving unit on the surface can pinpoint the depth and location of a transmitter in a cave passage by measuring the geometry of its radio waves. A survey over the surface from the receiver back to the cave entrance forms an artificial loop with the underground survey, whose loop-closure error can then be determined. In the past, cavers were reluctant to redraw complex cave maps after detecting survey errors. Today, computer cartography can automatically redraw cave maps after the data have been corrected.

Phase-based Model Using Web Documents for Korean Unknown Word Recognition (웹문서를 이용한 단계별 한국어 미등록어 인식 모델)

  • Park, So-Young
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.9 / pp.1898-1904 / 2009
  • Recently, real documents such as newspapers and blogs include newly coined words such as "Wikipedia". However, most previous information processing technologies cannot deal with these newly coined words because their dictionaries are built from materials acquired during system development. In this paper, we propose a model that automatically recognizes Korean unknown words missing from a previously constructed dictionary. The proposed model consists of an unknown noun recognition phase based on full text analysis, an unknown verb recognition phase based on web document frequency, and an unknown noun recognition phase based on web document frequency. Through full text analysis, the model can accurately recognize unknown words that occur repeatedly in a document. By using web documents, it can also broadly recognize unknown words that occur only once in the document. In addition, the model can recognize both a Korean unknown verb, whose syllables can change from the base form through inflection, and a Korean unknown noun, whose syllables do not change in any eojeol. Experimental results show that the proposed model improves precision by 1.01% and recall by 8.50% compared with a previous model.

Politics of Collective Intelligence - Paradigm Shift of Knowledge and its Possibility on Democracy - (집단지성의 정치 - 지식패러다임의 변화와 민주주의의 가능성 -)

  • Jho, Whasun;Cho, Jaedong
    • Informatization Policy / v.17 no.4 / pp.61-79 / 2010
  • This study focuses on the emergence of collective intelligence and its impact on democracy in the information era. Scholars have posed very different views, both optimistic and pessimistic, on the possibility of collective knowledge produced by the public. Focusing on the cases of the free online encyclopedia Wikipedia and the 2008 Candlelight Demonstration against imports of US beef in Korea, this paper analyzes the mechanism of collective intelligence and its political implications for democracy. Specifically, this article approaches changes in the new knowledge paradigm with two variables: the degree of connectivity and the quality of deliberation. Applying these two variables helps us distinguish the possibilities of collective intelligence from those of anti-intelligence, which suggests social and political implications for democracy in a country. This study finds a critical difference in the quality of deliberation, measured by indicators such as diversity, independence, and the integration mechanism for online deliberation.

A Proposition of Incorporating Time and Space in a Virtual World (다차원 가상세계 모델 개발을 위한 연구 -시간축이 부여된 가상세계 모델을 중심으로-)

  • Kihl, Tae-Suk;Chang, Ju-No;Baek, Hyoung-Mok;Rhee, Dae-Woong
    • Journal of Korea Game Society / v.9 no.4 / pp.21-32 / 2009
  • In this paper, we present a model of a virtual world that incorporates different time periods, in contrast to current popular virtual worlds like Second Life, to utilize the digital space fully. We propose constructing a virtual world that includes historical information in the virtual life simulation, using the world map and current spatial information. The reason for incorporating time is that the virtual world varies according to the politics, economics, society, and culture of a particular time period, so users are able to play in a distinct virtual world as residents and form communities of their own free will. Like the online encyclopedia Wikipedia, the model proposed in this paper is a project designed to be maintained and expanded through the interactivity of users, but unlike Wikipedia, users of this virtual world will be able to live and interact in a world of their own creation in addition to contributing real information.

A Semi-automatic Construction method of a Named Entity Dictionary Based on Wikipedia (위키피디아 기반 개체명 사전 반자동 구축 방법)

  • Song, Yeongkil;Jeong, Seokwon;Kim, Harksoo
    • Journal of KIISE / v.42 no.11 / pp.1397-1403 / 2015
  • A named entity (NE) dictionary is an important resource for the performance of NE recognition. However, it is not easy to construct an NE dictionary manually, since human annotation is time-consuming and labor-intensive. To save construction time and reduce human labor, we propose a semi-automatic system for the construction of an NE dictionary. The proposed system constructs a pseudo-document of Wiki-categories for each NE class by using an active learning technique. It then calculates similarities between Wiki entries and the pseudo-documents using the BM25 model, a well-known information retrieval model. Finally, it classifies each Wiki entry into an NE class based on these similarities. In experiments with three different types of NE class sets, the proposed system showed high performance (a macro-average F1-score of 0.9028 and a micro-average F1-score of 0.9554).
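
The BM25 step in the abstract above can be illustrated with a minimal sketch. It assumes that each NE class is represented by a pseudo-document of category tokens and that a Wiki entry's own category tokens act as the query; the abstract does not state the tokenization, the BM25 parameters, or the scoring direction, so `k1`, `b`, the function names, and the toy data `PSEUDO_DOCS` are all hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs, k1=1.5, b=0.75):
    """Return the Okapi BM25 score of the query against each tokenized document."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in set(query_tokens):
            if term not in tf:
                continue
            # Smoothed IDF; the "1 +" inside the log keeps the weight non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def classify_entry(entry_tokens, pseudo_docs):
    """Assign the entry to the NE class whose pseudo-document scores highest."""
    labels = list(pseudo_docs)
    scores = bm25_scores(entry_tokens, [pseudo_docs[label] for label in labels])
    best = max(range(len(labels)), key=scores.__getitem__)
    return labels[best], scores[best]


# Hypothetical toy data: category tokens collected per NE class.
PSEUDO_DOCS = {
    "PERSON": ["births", "people", "politicians", "singers", "people"],
    "LOCATION": ["cities", "countries", "rivers", "capitals"],
    "ORGANIZATION": ["companies", "universities", "clubs"],
}
```

Calling `classify_entry(["capitals", "cities"], PSEUDO_DOCS)` returns the best-scoring class label together with its score. A real system would presumably also need a threshold for entries that match no class well, which this sketch omits.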

System for Neologism Information Support in Real-Time Streaming Service (실시간 스트리밍 서비스에서 신조어 정보 제공 시스템)

  • Seungyong, Lee;Neunghoe, Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.1 / pp.203-207 / 2023
  • Recently, real-time streaming services have been gaining popularity among the MZ generation, and the market is growing continuously. Pre-recorded and edited videos offer only one-way communication, whereas real-time streaming services enable two-way communication and can therefore respond immediately to questions and requests from users. With the transition from face-to-face to non-face-to-face culture due to the COVID-19 pandemic, the number of users communicating over real-time video for activities such as classes, meetings, and leisure has increased dramatically. However, as real-time streaming services become more active and more generations participate, conflicts arise from language differences, including the use of neologisms. To address this issue, this paper proposes a method of collecting the meanings of neologisms through the Wikipedia API and presenting them to users, so that they can understand each other's intentions.
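
As a rough illustration of looking up a term through the Wikipedia API, the sketch below queries the MediaWiki Action API for the plain-text introduction of a page. The abstract does not say which endpoint, language edition, or parameters the authors used, so the English endpoint, the `extracts` query, the function names, and the `User-Agent` string are assumptions for illustration only.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed endpoint; a Korean-language service would likely use ko.wikipedia.org.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_extract_url(term, api_url=API_URL):
    """Build a MediaWiki Action API URL asking for a page's plain-text intro."""
    params = {
        "action": "query",
        "prop": "extracts",   # provided by the TextExtracts extension
        "exintro": 1,         # only the section before the first heading
        "explaintext": 1,     # plain text instead of HTML
        "redirects": 1,       # follow redirects to the target page
        "format": "json",
        "titles": term,
    }
    return f"{api_url}?{urlencode(params)}"


def parse_extract(payload):
    """Return the first non-empty extract in an API response, or None."""
    pages = payload.get("query", {}).get("pages", {})
    for page in pages.values():
        # Pages that do not exist carry a "missing" key and no extract.
        if "missing" not in page and page.get("extract"):
            return page["extract"]
    return None


def lookup_meaning(term, timeout=5.0):
    """Fetch the intro text for a term (performs a network request)."""
    request = Request(
        build_extract_url(term),
        headers={"User-Agent": "neologism-demo/0.1 (illustrative example)"},
    )
    with urlopen(request, timeout=timeout) as response:
        return parse_extract(json.load(response))
```

Separating URL construction and response parsing from the network call keeps both steps checkable offline. A streaming service would presumably cache results per term rather than querying on every chat message.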

Learning Material Bookmarking Service based on Collective Intelligence (집단지성 기반 학습자료 북마킹 서비스 시스템)

  • Jang, Jincheul;Jung, Sukhwan;Lee, Seulki;Jung, Chihoon;Yoon, Wan Chul;Yi, Mun Yong
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.179-192 / 2014
  • Keeping in line with recent changes in the information technology environment, online learning environments that support the participation of multiple users, such as MOOCs (Massive Open Online Courses), have become important. One of the largest professional associations in information technology, the IEEE Computer Society, announced that "Supporting New Learning Styles" is a crucial trend in 2014. Popular MOOC services, Coursera and edX, have continued to build active learning environments with a large number of lectures accessible anywhere using smart devices, and have been used by an increasing number of users. In addition, collaborative web services (e.g., blogs and Wikipedia) also support the creation of various user-uploaded learning materials, resulting in a vast amount of new lectures and learning materials being created every day in the online space. However, it is difficult for an online educational system to sustain a learner's motivation because learning occurs remotely, with limited capability to share knowledge among the learners. Thus, it is essential to understand which materials each learner needs and how to motivate learners to participate actively in an online learning system. To overcome these issues, leveraging constructivism theory and collective intelligence, we have developed a social bookmarking system called WeStudy, which supports learning material sharing among the users and provides personalized learning material recommendations. Constructivism theory argues that knowledge is constructed as learners interact with the world.
Collective intelligence can be separated into two types: (1) collaborative collective intelligence, which is built on direct collaboration among the participants (e.g., Wikipedia), and (2) integrative collective intelligence, which produces new forms of knowledge by combining independent and distributed information through highly advanced technologies and algorithms (e.g., Google PageRank, recommender systems). A recommender system, one example of integrative collective intelligence, uses the online activities of users to recommend what they may be interested in. Our system includes both collaborative and integrative collective intelligence functions. We analyzed well-known Web services based on collective intelligence, such as Wikipedia, Slideshare, and Videolectures, to identify the main design factors that support collective intelligence. Based on this analysis, in addition to sharing online resources through social bookmarking, we selected three essential functions for our system: 1) multimodal visualization of learning materials in two forms (list and graph), 2) personalized recommendation of learning materials, and 3) explicit designation by learners of their interests. After developing the web-based WeStudy system, we conducted usability testing through the heuristic evaluation method with seven heuristic indices: features and functionality, cognitive page, navigation, search and filtering, control and feedback, forms, and context and text. We recruited 10 experts who majored in Human Computer Interaction and worked in the same field, and requested both quantitative and qualitative evaluation of the system. The evaluation results show that, relative to the other functions evaluated, the list/graph page produced higher scores on all indices except context and text. For context and text, the learning material page produced the best score among the functions.
In general, the explicit designation by learners of their interests, one of the distinctive functions, received lower scores on all usability indices because its functionality was unfamiliar to the users. In summary, the evaluation results show that our system achieved high usability and good performance, with some minor issues that need to be addressed before the system is publicly released to large-scale users. The study findings provide practical guidelines for the design and development of various systems that utilize collective intelligence.

Answer Constraints Extraction on User Question for Wikipedia QA (위키피디아 QA를 위한 질의문의 정답제약 추출)

  • Wang, JiHyun;Heo, Jeong;Lee, Hyungjik;Bae, Yongjin;Kim, Hyunki
    • Annual Conference on Human and Language Technology / 2017.10a / pp.248-250 / 2017
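  • We define nine answer constraints for the Wikipedia domain that constrain the answers of a question answering system, and propose a method for extracting constraint expressions from question sentences. To extract multi-eojeol answer constraint expressions, we generate answer constraint candidates from the results of linguistic analysis and present features for learning answer constraint expressions at the candidate level. Using a machine learning method, we classify an answer constraint tag for each candidate and thereby extract the answer constraint expressions. In the performance experiments, we evaluated the F1-score for each answer constraint tag.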
