• Title/Summary/Keyword: wiki (위키)

Webtoon Search utilizing Genre Similarity with Word2Vec (Word2Vec 기반 장르 유사성을 활용한 웹툰 검색)

  • Lee, ChangMin;Ahn, JeJeong;Kang, DongYeon;Lee, Hyunah
    • Annual Conference on Human and Language Technology / 2019.10a / pp.503-505 / 2019
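  • This paper proposes a keyword-based similar-genre search system to address the shortcomings of existing webtoon genre search systems. We analyze the genres and keywords of existing webtoons to define 44 genres and collect webtoons that fit each genre. Using a Word2Vec model trained on Namuwiki and Wikipedia documents, we compute the similarity between the user's input keyword and each of the 44 genres and select the genre most similar to the input. The webtoons belonging to that genre are returned as results, presenting webtoons in the user's preferred genre. In the experiments, the model trained with Word2Vec on a small document set obtained by searching Namuwiki for '장르' (genre) achieved the highest search performance.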

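
The genre lookup described in the abstract above can be sketched as a cosine-similarity ranking over word vectors. This is only an illustration under stated assumptions: the three-dimensional vectors, the genre names, and the helper names `cosine` and `rank_genres` are hypothetical stand-ins, and the paper's actual Word2Vec model and its 44 genres are not reproduced here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_genres(keyword, genres, vectors):
    """Return (genre, similarity) pairs, most similar genre first."""
    if keyword not in vectors:
        return []  # out-of-vocabulary keyword: nothing to rank
    query = vectors[keyword]
    scored = [(g, cosine(query, vectors[g])) for g in genres if g in vectors]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy embeddings (hypothetical values, for illustration only).
TOY_VECTORS = {
    "ghost": [0.9, 0.1, 0.0],
    "horror": [0.8, 0.2, 0.1],
    "romance": [0.0, 0.9, 0.3],
    "sports": [0.1, 0.2, 0.9],
}

ranking = rank_genres("ghost", ["horror", "romance", "sports"], TOY_VECTORS)
best_genre = ranking[0][0]  # genre whose webtoons would be returned
```

In a real system the dictionary would be replaced by vectors from a trained embedding model, and the webtoons tagged with `best_genre` would form the search result.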

Conversation Dataset Generation and Improve Search Performance via Large Language Model (Large Language Model을 통한 대화 데이터셋 자동 생성 및 검색 성능 향상)

  • Hyeongjun Choi;Beomseok Hong;Wonseok Choi;Youngsub Han;Byoung-Ki Jeon;Seung-Hoon Na
    • Annual Conference on Human and Language Technology / 2023.10a / pp.295-300 / 2023
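  • Data such as conversational data must be written by hand, so building a dataset takes considerable time and cost. Large Language Models (LLMs), which are currently gaining prominence, have the advantage of generating more natural conversations. In this study, we fine-tune an LLM on a small human-written dataset, use it to generate a dataset from Wikipedia documents, and use that dataset to improve the performance of a document retrieval model. As a result, we observed an MRR improvement of 3.7%p on the same document set as the training data and 4.5%p on the whole of Wikipedia.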

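
The abstract above reports retrieval quality as MRR (mean reciprocal rank). As a reminder of how that metric is computed, here is a minimal sketch; the function name, document ids, and rankings are invented for illustration and are not taken from the paper.

```python
def mean_reciprocal_rank(rankings, gold_ids):
    """Average of 1/rank of the gold document over all queries.

    rankings: one ranked list of document ids per query.
    gold_ids: the single relevant document id for each query.
    A query whose gold document is not retrieved contributes 0.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for ranked, gold in zip(rankings, gold_ids):
        for position, doc_id in enumerate(ranked, start=1):
            if doc_id == gold:
                total += 1.0 / position
                break
    return total / len(rankings)

# Four toy queries: gold document at rank 1, rank 2, missing, rank 3.
rankings = [
    ["d1", "d2", "d3"],
    ["d5", "d4", "d6"],
    ["d7", "d8", "d9"],
    ["d2", "d1", "d3"],
]
gold_ids = ["d1", "d4", "d0", "d3"]
mrr = mean_reciprocal_rank(rankings, gold_ids)  # (1 + 1/2 + 0 + 1/3) / 4
```

A gain stated in %p (percentage points) is the plain difference between two such scores expressed as percentages.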

Effectiveness of Adaptive Navigation System for Group Activity at the Wiki-based Collaborative Learning (Wiki 기반 협력학습에서 적응적 내비게이션 시스템이 그룹 활동에 미치는 효과)

  • Han, Hee-Seop;Kim, Hyeoncheol
    • The Journal of Korean Association of Computer Education / v.9 no.1 / pp.41-48 / 2006
  • Several recent studies show that wikis are very efficient tools for collaborative learning in distributed environments. Although a wiki supports efficient knowledge sharing among group members, some problems remain to be solved for collaborative learning. Because the structure of group content becomes more complex and the links between pages change dynamically, each group member has difficulty perceiving changed content and links on group pages. We designed an adaptive navigation system that guides each member's individual browsing path by calculating friendship and the state of pages. We first developed a relation model between members and pages from the historical log that stores page changes and member activity, and then implemented the adaptive navigation system using that model. Experimental results show that the adaptive system is very effective for sharing group knowledge and promoting collaborative learning activities.

An Experimental Study on Feature Selection Using Wikipedia for Text Categorization (위키피디아를 이용한 분류자질 선정에 관한 연구)

  • Kim, Yong-Hwan;Chung, Young-Mee
    • Journal of the Korean Society for Information Management / v.29 no.2 / pp.155-171 / 2012
  • In text categorization, core terms of an input document are rarely selected as classification features if they do not occur in the training document set. In addition, synonymous terms that share a concept are usually treated as different features. This study aims to improve text categorization performance by integrating synonyms into a single feature and by using Wikipedia to replace input terms that are absent from the training document set with the most similar term that occurs in the training documents. For the selection of classification features, experiments were performed in various settings composed of three conditions: the use of category information of non-training terms, the part of Wikipedia used for measuring term-term similarity, and the type of similarity measure. The categorization performance of a kNN classifier improved by 0.35~1.85% in $F_1$ value in all experimental settings when non-training terms were replaced by the training term with the highest similarity above the threshold value. Although the improvement is not as high as expected, several semantic and structural devices of Wikipedia could be used to select more effective classification features.
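
The replacement step in the abstract above (mapping a non-training term to its most similar training term when the similarity clears a threshold) might look roughly like the sketch below. Everything specific here is an assumption made for illustration: the Jaccard overlap of hypothetical Wikipedia link sets stands in for whichever similarity measure the study used, the threshold of 0.3 is arbitrary, and keeping unmatched terms unchanged is a choice of this sketch, not something the abstract states.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def map_to_training_terms(doc_terms, training_vocab, context, threshold=0.3):
    """Replace terms unseen in training with the most similar training term.

    context: hypothetical map from a term to a set of Wikipedia-derived
    features (for example linked article titles) used to compare terms.
    """
    mapped = []
    for term in doc_terms:
        if term in training_vocab:
            mapped.append(term)  # already usable as a feature
            continue
        best_term, best_score = None, 0.0
        for candidate in sorted(training_vocab):  # sorted for determinism
            score = jaccard(context.get(term, set()), context.get(candidate, set()))
            if score > best_score:
                best_term, best_score = candidate, score
        if best_term is not None and best_score >= threshold:
            mapped.append(best_term)  # substitute the closest training term
        else:
            mapped.append(term)  # assumption: keep the term when nothing is close
    return mapped

# Toy data (hypothetical).
context = {
    "automobile": {"Vehicle", "Engine", "Road"},
    "car": {"Vehicle", "Engine", "Wheel"},
    "bank": {"Finance", "Money"},
    "zebra": {"Animal"},
}
training_vocab = {"car", "bank"}
mapped = map_to_training_terms(["automobile", "bank", "zebra"], training_vocab, context)
```

Here "automobile" shares two of four features with "car" (similarity 0.5), so it is replaced, while "zebra" has no close training term and is left as it is.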

Generation United News In Education Using Knowledge Sharing Service (지식공유서비스를 활용한 세대통합형 NIE)

  • Jang, Jae-Kyung;Kim, Ho-Sung
    • Proceedings of the HCI Society of Korea Conference (한국HCI학회:학술대회논문집) / 2007.02b / pp.213-218 / 2007
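  • As new forms of cyberspace that promote information production emerge, the ways in which knowledge is created and acquired are changing. With the introduction of Web 2.0, a computing platform that delivers web applications to end users, web content that lets users easily classify the information they need is being actively provided, and people use this content to find their own gems among vast amounts of information and to accumulate and manage knowledge with ease. With such knowledge sharing services, however, indiscriminate information, duplicated knowledge, and simplistic management make it hard for users to obtain the knowledge they want. Building on the 'knowledge sharing service', which emerged as a space where netizens who once used the Internet to search for online information take part as information producers, this paper proposes an NIE (News In Education) education framework in which seniors serve as creators and managers of knowledge and young children as its recipients. News is good educational material for everyone from young children to adults, and NIE can improve abilities across many areas, including thinking, logic, expression, and creativity. NIE is drawing particular attention because these abilities can be enhanced further when it is aligned with kindergarten or school curricula. This study aims to create SCORM-based content for young children that uses Internet news, a product of media convergence, to enable integrated education that is not separated from daily life, and to apply it to early childhood education. The NIE content for young children also serves as learning material when seniors are trained as NIE instructors. By training seniors as NIE instructors, we aim to create senior NIE content with the goals of creating jobs for seniors, integrating local communities, and bringing together the first generation (female seniors) and the third generation (children). We plan to add wiki functionality as a knowledge solution for creating and managing the knowledge produced in NIE. Using a wiki allows knowledge that existed separately to be shared as common knowledge, and it should also be useful in the process of consolidating opinions. In wiki-based senior NIE content, teaching-learning plans and NIE ideas can be produced efficiently through collaborative work, and because many people refine the knowledge over several stages, high-quality teaching-learning plans and NIE ideas can be created.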

Wiki-based Interactive Electronic Technical Manuals (IETM) for Construction Project Management : Through a Case Study of Urban Regeneration Projects (Wiki기반 건설 사업관리 전자매뉴얼 : 도시환경정비사업 사례연구를 통해)

  • Park, Moon-Seo;Kang, Sung-Hoon;Lee, Hyun-Soo
    • Korean Journal of Construction Engineering and Management / v.11 no.3 / pp.3-12 / 2010
  • Recently, workers on construction sites have found it hard to understand their tasks and task processes because construction projects are becoming larger and more complex. Because of this complexity, workers need a tool that helps them understand their work, and some paper-based manuals exist to support them. However, workers do not actively use the existing paper-based manuals because the information in them has low credibility. In particular, paper-based manuals cannot be updated frequently because experts must update them by hand. The credibility of the information therefore declines, and users come to distrust the paper-based manual. This research analyzed the features of wiki-based systems, proposed a wiki-based IETM (Interactive Electronic Technical Manual) system model that users can modify, and conducted a survey using a prototype based on an urban regeneration project. The survey results indicated that a wiki-based IETM can improve stakeholder communication by reinforcing the process of creating knowledge as well as the knowledge itself.

Development and Application of Reading Discussion System based on Wiki for Improving Interaction (상호작용 증진을 위한 위키 기반 독서토론학습 시스템 개발 및 적용)

  • Park, Jeong-Ae;Park, Sun-Ju
    • Journal of The Korean Association of Information Education / v.13 no.2 / pp.183-192 / 2009
  • A Web-based reading discussion system is a new teaching method that uses the Web as a space for discussion. It enables learners to exchange opinions, which enhances understanding and expands thinking. Web-based discussion learning for interaction needs to foster respect for others' opinions, reasonable thinking, and the ability to express one's thoughts. This study developed a reading discussion system that applies a wiki program to support learner interaction, and tested its usefulness. The system makes possible multidimensional discussion in place of the existing vertical discussion, in which replies are posted on a Web board.

Building a Korean-English Parallel Corpus by Measuring Sentence Similarities Using Sequential Matching of Language Resources and Topic Modeling (언어 자원과 토픽 모델의 순차 매칭을 이용한 유사 문장 계산 기반의 위키피디아 한국어-영어 병렬 말뭉치 구축)

  • Cheon, JuRyong;Ko, YoungJoong
    • Journal of KIISE / v.42 no.7 / pp.901-909 / 2015
  • This paper proposes a method for building a Korean-English parallel corpus from Wikipedia by finding similar sentences based on language resources and topic modeling. We first applied language resources (a Wiki-dictionary, numbers, and the Daum online dictionary) to match words sequentially. We constructed the Wiki-dictionary from Wikipedia titles. To take advantage of Wikipedia, we used translation probabilities in the Wiki-dictionary for word matching. In addition, we improved the accuracy of the sentence similarity measure by using word distributions based on topic modeling. In the experiments, a previous study achieved an F1-score of 48.4% using only language resources with linear combination, and 51.6% when topic modeling over entire word distributions was added. Our proposed method, which adds sequential matching and translation probability to the language resources, achieved 58.3%, 9.9 percentage points higher than the previous study. When the proposed sequential matching of language resources was combined with topic modeling that considers important word distributions, the system achieved 59.1%, 7.5 percentage points higher than the previous study.
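
One way to picture the sequential matching in the abstract above is a loop that tries each language resource in a fixed order and stops at the first one that yields a match. The sketch below is a loose illustration under assumptions: the toy dictionaries, the lookup functions, and the score (fraction of matched source tokens) are hypothetical, and the translation probabilities and topic-model word distributions used in the paper are left out.

```python
def is_number(token):
    """True for plain integers or decimals such as '2015' or '3.5'."""
    return token.replace(".", "", 1).isdigit()

def sequential_match_score(src_tokens, tgt_tokens, resources):
    """Fraction of source tokens matched in the target sentence.

    resources: lookup functions tried in order; each maps a source token
    to a list of candidate target tokens. The first resource that finds
    an unused target token wins, and later resources are skipped.
    """
    if not src_tokens:
        return 0.0
    remaining = list(tgt_tokens)  # each target token may be matched once
    matched = 0
    for token in src_tokens:
        for lookup in resources:
            hit = next((c for c in lookup(token) if c in remaining), None)
            if hit is not None:
                remaining.remove(hit)
                matched += 1
                break
    return matched / len(src_tokens)

# Toy resources (hypothetical), in the order given in the abstract.
WIKI_DICT = {"서울": ["seoul"]}
GENERAL_DICT = {"인구": ["population"]}
resources = [
    lambda t: WIKI_DICT.get(t, []),            # Wiki-dictionary
    lambda t: [t] if is_number(t) else [],     # numbers match themselves
    lambda t: GENERAL_DICT.get(t, []),         # general online dictionary
]

score_full = sequential_match_score(
    ["서울", "인구", "2015"], ["seoul", "population", "in", "2015"], resources)
score_partial = sequential_match_score(
    ["서울", "날씨", "2015"], ["seoul", "2015"], resources)
score_none = sequential_match_score(["서울", "인구"], ["busan", "city"], resources)
```

Sentence pairs whose score exceeds some cutoff would then be candidates for the parallel corpus; how that cutoff is chosen is outside this sketch.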

A Semantic Text Model with Wikipedia-based Concept Space (위키피디어 기반 개념 공간을 가지는 시멘틱 텍스트 모델)

  • Kim, Han-Joon;Chang, Jae-Young
    • The Journal of Society for e-Business Studies / v.19 no.3 / pp.107-123 / 2014
  • Current text mining techniques suffer from the problem that conventional text representation models cannot express the semantic or conceptual information of textual documents written in natural language. Conventional text models, which include the vector space model, Boolean model, statistical model, and tensor space model, represent documents as bags of words. These models express documents only with the term literals used for indexing and frequency-based weights for those terms; that is, they ignore the semantic, sequential-order, and structural information of terms. Most text mining techniques have been developed on the assumption that the given documents are represented with 'bag-of-words' text models. In the big data era, however, a new paradigm of text representation is required that can analyze huge amounts of textual documents more precisely. Our text model regards the 'concept' as an independent space on a par with the 'term' and 'document' spaces used in the vector space model, and it expresses the relatedness among the three spaces. To develop the concept space, we use Wikipedia articles, each of which defines a single concept. Consequently, a document collection is represented as a third-order tensor with semantic information; we call the proposed model the text cuboid model. Experiments on the popular 20NewsGroup document corpus demonstrate the superiority of the proposed text model in document clustering and concept clustering.
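
The document × term × concept representation in the abstract above can be pictured as a sparse third-order tensor. The sketch below is a rough illustration only: the weighting (term frequency times a term-concept relatedness score) and the names `build_cuboid` and `document_concept_matrix` are assumptions made here, not the paper's definitions.

```python
from collections import Counter, defaultdict

def build_cuboid(docs, term_concept):
    """Build a sparse 3rd-order tensor keyed by (doc, term, concept).

    docs: {doc_id: list of tokens}
    term_concept: {term: {concept: relatedness}}, hypothetically derived
    from Wikipedia articles, one article per concept.
    """
    cuboid = {}
    for doc_id, tokens in docs.items():
        for term, tf in Counter(tokens).items():
            for concept, weight in term_concept.get(term, {}).items():
                cuboid[(doc_id, term, concept)] = tf * weight
    return cuboid

def document_concept_matrix(cuboid):
    """Collapse the term axis to get a document-by-concept view."""
    matrix = defaultdict(float)
    for (doc_id, _term, concept), value in cuboid.items():
        matrix[(doc_id, concept)] += value
    return dict(matrix)

# Toy data (hypothetical).
docs = {
    "d1": ["engine", "wheel", "engine"],
    "d2": ["goal", "wheel"],
}
term_concept = {
    "engine": {"Automobile": 1.0},
    "wheel": {"Automobile": 0.5, "Bicycle": 0.5},
    "goal": {"Football": 1.0},
}
cuboid = build_cuboid(docs, term_concept)
doc_concept = document_concept_matrix(cuboid)
```

Collapsing a different axis gives the other pairwise views (term by concept, or the usual document by term), which is what makes the three spaces comparable within one structure.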

Dynamic ontology construction algorithm from Wikipedia and its application toward real-time nation image analysis (국가이미지 분석을 위한 위키피디아 실시간 동적 온톨로지 구축 알고리즘 및 적용)

  • Lee, Youngwhan
    • Journal of the Korean Data and Information Science Society / v.27 no.4 / pp.979-991 / 2016
  • Measuring nation images was a challenging task when offline surveys were the only option. Such surveys were not only prohibitively expensive but also too time-consuming, and therefore ill-suited to a rapidly changing world. Although demand for monitoring nation images in real time has kept increasing, an affordable and reliable way to measure them has not been available to date. This study developed a semi-automatic ontology construction algorithm, named "double-crossing double keyword collection" (DCDKC), to measure nation images from Wikipedia in real time. The resulting ontology, WikiOnto, can be used to reflect dynamic image changes. An instance of WikiOnto was constructed by applying the algorithm to the three major exporting countries in East Asia: Korea, Japan, and China. The page views for the words in the WikiOnto instance were then counted, and the counts for each country were compared to assess whether they can be used to track dynamic nation images. The results show how the images of the three countries changed over the study period, confirming that DCDKC is well suited to a real-time nation-image monitoring system.