• Title/Summary/Keyword: Wikipedia


Research on Key Success Factors of Social Authoring system : Focused on Linux and Wikipedia (리눅스와 위키피디아를 중심으로 분석한 소셜 저작 시스템의 성공요소에 대한 연구)

  • Lee, Seo-Young;Lee, Bong-Gyou
    • Journal of Internet Computing and Services / v.13 no.4 / pp.73-82 / 2012
  • The worldwide increase in cognitive surplus has led to successful social authoring projects. In this research, the social authoring mechanisms used in the Linux and Wikipedia projects were analyzed to identify key success factors. In addition, tools used in recently successful social media such as Facebook and Ushahidi were evaluated to extract components that may be applied to social authoring systems. Based on these analyses, improvement factors for the design of social authoring projects were suggested. Social authoring projects are expected to be achieved more successfully by providing the core components proposed in this article.

Construction of Korean Knowledge Base Based on Machine Learning from Wikipedia (위키백과로부터 기계학습 기반 한국어 지식베이스 구축)

  • Jeong, Seok-won;Choi, Maengsik;Kim, Harksoo
    • Journal of KIISE / v.42 no.8 / pp.1065-1070 / 2015
  • The performance of many natural language processing applications depends on the knowledge base as a major resource. WordNet, YAGO, Cyc, and BabelNet have been extensively used as knowledge bases for English. In this paper, we propose a method to automatically construct a YAGO-style knowledge base for Korean (hereafter, K-YAGO) from Wikipedia and YAGO. The proposed system constructs an initial K-YAGO simply by matching YAGO to infoboxes in Wikipedia. Then, the initial K-YAGO is expanded through the use of a machine learning technique. Experiments with the initial K-YAGO show that the proposed system has a precision of 0.9642. In experiments with the expanded part of K-YAGO, an accuracy of 0.9468 was achieved with an average macro F1-measure of 0.7596.
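The infobox-matching step the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: the attribute-to-relation mapping table and the sample infobox are invented for the example.

```python
# Hypothetical mapping from Korean infobox attribute names to
# YAGO-style relations; a real system would derive this from YAGO.
ATTR_TO_RELATION = {
    "출생지": "wasBornIn",    # "birthplace"
    "직업": "hasOccupation",  # "occupation"
}

def seed_triples(entity, infobox):
    """Emit (subject, relation, object) triples for every infobox
    attribute that has a mapped YAGO-style relation; unmapped
    attributes are silently skipped."""
    triples = []
    for attr, value in infobox.items():
        relation = ATTR_TO_RELATION.get(attr)
        if relation is not None:
            triples.append((entity, relation, value))
    return triples
```

An unmapped attribute such as "국적" would simply produce no triple, which matches the idea of seeding only high-precision facts before the machine-learning expansion step.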

Coreference Resolution for Korean using Mention Pair with SVM (SVM 기반의 멘션 페어 모델을 이용한 한국어 상호참조해결)

  • Choi, Kyoung-Ho;Park, Cheon-Eum;Lee, Changki
    • KIISE Transactions on Computing Practices / v.21 no.4 / pp.333-337 / 2015
  • In this paper, we present a coreference resolution system for Korean that uses a mention-pair model with an SVM. The system can also extract mentions from documents that include automatically tagged named-entity information, dependency trees, and POS tags. To train the system and test its performance, we built a corpus of 214 documents with coreference tags, drawn from online news and Wikipedia: 14 documents from online news and 200 question-and-answer documents from Wikipedia. When tested on this corpus, the system achieved an MUC F1 of 55.68%, a B-cubed F1 of 57.19%, and a CEAFe F1 of 61.75%.
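A mention-pair model classifies each candidate pair of mentions as coreferent or not, so the core of such a system is a pair-level feature extractor whose vectors are fed to the SVM. The sketch below is a hedged illustration; the abstract does not list the paper's actual features, so the three features here are common textbook choices, not the authors' set.

```python
def mention_pair_features(m1, m2):
    """Toy feature vector for one mention pair. Each mention is a
    dict with 'text', 'pos' (token index in the document), and
    'ne_type' (named-entity tag), mirroring the tagged input the
    abstract mentions (NE info, dependency trees, POS tags)."""
    return {
        # Surface-string identity of the two mentions
        "exact_match": int(m1["text"] == m2["text"]),
        # Agreement of named-entity types (e.g. PER vs PER)
        "same_ne_type": int(m1["ne_type"] == m2["ne_type"]),
        # Token distance between the mentions
        "token_distance": abs(m1["pos"] - m2["pos"]),
    }
```

In a full pipeline these dictionaries would be vectorized and passed to an SVM classifier; here only the feature-extraction idea is shown.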

Constructing a Large Interlinked Ontology Network for the Web of Data (데이터의 웹을 위한 상호연결된 대규모 온톨로지 네트워크 구축)

  • Kang, Sin-Jae
    • Journal of Korea Society of Industrial Information Systems / v.15 no.1 / pp.15-23 / 2010
  • This paper presents a method of constructing a large interlinked ontology network for the Web of Data by mapping among representative ontologies. When an ontology is open to the public and more easily shared and used, its value increases. By linking CoreOnto, an IT core ontology constructed in Korea, to the worldwide ontology network, CoreOnto can be opened to the international community and its usability enhanced. YAGO is an ontology constructed by combining the category information of Wikipedia with the taxonomy of WordNet, and it is used as the backbone of DBpedia, an ontology constructed by analyzing the structure of Wikipedia. Therefore, a mapping method is suggested that links CoreOnto to YAGO and DBpedia through WordNet synsets.

The Quality Control System on Online Collaboration System (온라인 협업 시스템의 품질 관리 체계에 대한 연구)

  • Ko, Sung-Seok;Cho, Mi-Yeon
    • Journal of Korean Society of Industrial and Systems Engineering / v.33 no.2 / pp.127-132 / 2010
  • Recently, the importance of quality control in online collaboration systems has been increasing, since it has become clear that having many participants no longer guarantees the quality of an online collaboration's output. In this paper, we propose a quality control framework to assure the quality of output from online collaboration systems. The proposed framework provides strategies to overcome the challenges of current collaboration systems. To do this, we first define the basic process of an online collaboration system (create, initiate, discuss, complete) based on wiki-based systems and open source projects; we then identify the challenges in each step of the process and propose effective strategies to overcome them, with reference cases including Wikipedia and OSS projects.

Discovering Semantic Relationships between Words by using Wikipedia (위키피디아에 기반한 단어 사이의 의미적 연결 관계 탐색)

  • Kim, Ju-Hwang;Hong, Min-sung;Lee, O-Joun;Jung, Jason J.
    • Proceedings of the Korean Society of Computer Information Conference / 2015.07a / pp.17-18 / 2015
  • In this paper, we propose a technique that uses Wikipedia to explore the similarity between words and the connecting words implied between them. By searching between two words through the API provided by Wikipedia, the technique is simpler than existing word-similarity calculations and can cover a broader range of semantic groups. It is based on graph properties, and the graph can be constructed in either a dynamic or a static manner.
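The exploration between two words can be illustrated as a breadth-first search over an article-link graph. The toy graph below is a hypothetical stand-in for link data that would actually be fetched through the Wikipedia API:

```python
from collections import deque

# Hypothetical article-link graph; a real system would populate this
# from Wikipedia API link queries rather than hard-coding it.
LINKS = {
    "커피": ["카페인", "에스프레소"],
    "카페인": ["각성제"],
    "에스프레소": ["이탈리아"],
    "각성제": [],
    "이탈리아": [],
}

def link_path(start, goal):
    """Return the shortest chain of articles connecting start to goal,
    or None if no chain exists (breadth-first search)."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in LINKS.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

The intermediate articles on the returned path are exactly the "connecting words" the abstract refers to, and path length can serve as a crude relatedness score.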


Automatic Processing of Predicative Nouns for Korean Semantic Recognition. (한국어 의미역 인식을 위한 서술성 명사의 자동처리 연구)

  • Lee, Sukeui;Im, Su-Jong
    • Korean Linguistics / v.80 / pp.151-175 / 2018
  • This paper proposes a semantic-role recognition method to improve the extraction of correct answers in a Q&A system through machine learning. For this purpose, the recognition method is described based on the distribution of predicative nouns. Predicative noun vocabularies and sentences were collected from Wikipedia documents, and the predicative nouns were typed by analyzing the environments in which they appear in sentences. This paper proposes a semantic recognition method for predicative nouns to which rules can be applied. Chapter 2 reviews previous studies on predicative nouns, and Chapter 3 explains how predicative nouns are distributed. Predicative nouns that cannot be processed by rules are excluded; in particular, noun forms combined with the case marker '의' were excluded. In Chapter 4, we extracted 728 sentences composed of 10,575 words from Wikipedia, used a semantic analysis engine from ETRI, and presented the predicative nouns amenable to semantic-role recognition.

User-based Document Summarization using Non-negative Matrix Factorization and Wikipedia (비음수행렬분해와 위키피디아를 이용한 사용자기반의 문서요약)

  • Park, Sun;Jeong, Min-A;Lee, Seong-Ro
    • Journal of the Institute of Electronics Engineers of Korea SP / v.49 no.2 / pp.53-60 / 2012
  • In this paper, we propose a new document summarization method that uses a query expanded via Wikipedia and semantic features representing the inherent structure of the document set. The proposed method expands the user's initial query using relevance feedback based on Wikipedia in order to reflect the user's requirements. It represents the inherent structure of the documents well through semantic features obtained by non-negative matrix factorization (NMF). In addition, it reduces the semantic gap between the user's requirements and the summarization result by extracting meaningful sentences using the expanded query and the semantic features. The experimental results demonstrate that the proposed method achieves better performance than other methods at summarizing documents.
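The NMF step can be sketched with standard multiplicative updates on a small term-by-sentence matrix. This is a minimal illustration of extracting semantic features and ranking sentences by them, not the paper's full method (which additionally uses Wikipedia-based query expansion):

```python
import numpy as np

def nmf(A, k, iters=200, seed=0):
    """Factor a nonnegative matrix A (terms x sentences) into
    A ≈ W @ H using Lee-Seung multiplicative updates. The columns
    of W are 'semantic features'; H gives each sentence's weights."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k)) + 1e-4
    H = rng.random((k, n)) + 1e-4
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + 1e-9)
        W *= (A @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H

def rank_sentences(A, k=2):
    """Order sentence indices by total semantic-feature weight,
    highest first; top-ranked sentences form the extract."""
    _, H = nmf(A, k)
    return np.argsort(-H.sum(axis=0)).tolist()
```

A real summarizer would additionally weight each column of H by its similarity to the expanded query before selecting sentences.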

An Effect of Semantic Relatedness on Entity Disambiguation: Using Korean Wikipedia (개체중의성해소에서 의미관련도 활용 효과 분석: 한국어 위키피디아를 사용하여)

  • Kang, In-Su
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.2 / pp.111-118 / 2015
  • Entity linking links entity name mentions occurring in text to the corresponding entities in knowledge bases. Since the same entity mention may refer to different entities depending on its context, entity linking needs to deal with entity disambiguation. Most recent work on entity disambiguation focuses on semantic relatedness between entities and attempts to integrate semantic relatedness with entity prior probabilities and term co-occurrence. To the best of my knowledge, however, it is hard to find studies that analyze and present the pure effects of semantic relatedness on entity disambiguation. Through experimentation on a Korean Wikipedia data set, this article empirically evaluates entity disambiguation approaches that use semantic relatedness in terms of the following aspects: (1) the difference among semantic relatedness measures such as NGD, PMI, Jaccard, Dice, and Simpson; (2) the influence of ambiguity in the set of co-occurring entity mentions; and (3) the difference between individual and collective disambiguation approaches.
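The relatedness measures compared in the paper can be written down directly. Here each entity is represented by its set of Wikipedia inlinks, a common choice assumed for illustration (the abstract does not specify the representation):

```python
import math

def jaccard(a, b):
    """Intersection over union of the two inlink sets."""
    return len(a & b) / len(a | b)

def dice(a, b):
    """Twice the intersection over the sum of set sizes."""
    return 2 * len(a & b) / (len(a) + len(b))

def simpson(a, b):
    """Intersection over the smaller set (overlap coefficient)."""
    return len(a & b) / min(len(a), len(b))

def pmi(a, b, n):
    """Pointwise mutual information; n = total number of articles."""
    fa, fb, fab = len(a), len(b), len(a & b)
    if fab == 0:
        return float("-inf")
    return math.log((fab * n) / (fa * fb))

def ngd(a, b, n):
    """Normalized Google Distance over inlink sets (0 = identical)."""
    fa, fb, fab = len(a), len(b), len(a & b)
    if fab == 0:
        return float("inf")
    return (max(math.log(fa), math.log(fb)) - math.log(fab)) / \
           (math.log(n) - min(math.log(fa), math.log(fb)))
```

Note the differing conventions: Jaccard, Dice, Simpson, and PMI grow with relatedness, while NGD is a distance that shrinks toward 0, so a disambiguator must normalize their directions before comparing them.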

Building Concept Networks using a Wikipedia-based 3-dimensional Text Representation Model (위키피디아 기반의 3차원 텍스트 표현모델을 이용한 개념망 구축 기법)

  • Hong, Ki-Joo;Kim, Han-Joon;Lee, Seung-Yeon
    • KIISE Transactions on Computing Practices / v.21 no.9 / pp.596-603 / 2015
  • A concept network is an essential knowledge base for semantic search engines, personalized search systems, recommendation systems, and text mining. Recently, studies on extending concept representations using external ontologies have frequently been conducted. We thus propose a new way of building concept networks based on a 3-dimensional text representation model, using the Wikipedia ontology as world-level knowledge. Since relationships among concepts generally change over time, it is desirable that 'concepts' derived from text documents be defined according to the theoretical framework of formal concept analysis. In this paper, the concept networks hidden in a given document collection are extracted more reasonably by representing a concept as a term-by-document matrix.
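The term-by-document representation of a concept can be illustrated with a minimal binary incidence matrix; the two-document corpus below is invented for the example and much simpler than the paper's 3-dimensional model:

```python
def term_by_document(docs):
    """Build a binary term-by-document incidence matrix: rows are
    sorted vocabulary terms, columns are documents, and cell (t, d)
    is 1 when term t occurs in document d."""
    terms = sorted({t for doc in docs for t in doc.split()})
    matrix = [[int(t in doc.split()) for doc in docs] for t in terms]
    return terms, matrix
```

In formal concept analysis terms, such an incidence matrix is exactly the "formal context" from which concepts (maximal term/document rectangles) are derived.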