• Title/Summary/Keyword: Semantic tagging

CTKOS : Categorized Tag-based Knowledge Organization System (카테고리형 태그 기반의 지식조직체계 구현)

  • Yoo, Dong-Hee;Kim, Gun-Woo;Choi, Keun-Ho;Suh, Yong-Moo
    • Journal of Intelligence and Information Systems / v.17 no.4 / pp.59-74 / 2011
  • As more users willingly participate in creating web content, flat folksonomy based on simple tags has emerged as a powerful instrument for classifying and sharing the huge amount of knowledge on the web. However, flat folksonomy has semantic problems, such as ambiguity and misunderstanding of tags. To alleviate these problems, many studies have built structured folksonomies with a hierarchical structure or with relationships among tags. Structured folksonomy, however, has fundamental problems of its own: new tags are limited to a pre-defined vocabulary, and selecting tags requires time-consuming manual effort. To resolve these problems, we propose a new method of attaching a categorized tag (CT), that is, a tag followed by its category, to web content. CTs are automatically integrated into a collaboratively built structured folksonomy (CSF) in real time, reflecting the tag-and-category relationships chosen by the majority of users. We then developed a CT-based knowledge organization system (CTKOS), which builds the CSF to classify organizational knowledge and allows users to locate the appropriate knowledge.
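
The majority-based integration of categorized tags described in this abstract can be illustrated with a minimal sketch. It is not the CTKOS implementation: the `(user, tag, category)` triple format, the function name `build_csf`, and the one-vote-per-user rule are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def build_csf(taggings):
    """Map each tag to the category that most users paired it with.

    taggings: iterable of (user, tag, category) triples (hypothetical format).
    """
    votes = defaultdict(Counter)
    seen = set()
    for user, tag, category in taggings:
        # Assumption: a user's vote for a (tag, category) pair counts once.
        if (user, tag, category) in seen:
            continue
        seen.add((user, tag, category))
        votes[tag][category] += 1
    # most_common(1) returns the category with the most votes; on a tie,
    # the category that was encountered first wins.
    return {tag: counts.most_common(1)[0][0] for tag, counts in votes.items()}

# Toy data: two users file "java" under programming, one under coffee.
taggings = [
    ("u1", "java", "programming"),
    ("u2", "java", "programming"),
    ("u3", "java", "coffee"),
    ("u1", "python", "programming"),
]
csf = build_csf(taggings)
print(csf)
```

Because the mapping is recomputed from the vote counts, a new categorized tag can shift a tag's category as soon as it changes the majority, which is one simple way to read "in real time".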

A Comparative Study of XML and HTML: Focusing on Their Characteristics and Retrieval Functions (디지털도서관 문서양식으로서의 XML과 HTML의 특성 및 검색 기능 비교 연구)

  • 김현희;장혜원
    • Journal of the Korean Society for Information Management / v.16 no.2 / pp.105-134 / 1999
  • For efficient and precise searches in the Web environment, resources should be encoded in a structured way. HTML cannot express semantic structure because its tag set is fixed. XML, which has emerged as an alternative standard markup language, uses custom tags that allow structural searching. This study therefore compares XML with HTML in terms of their characteristics and retrieval functions. To test the retrieval functions of XML- and HTML-based systems, we constructed an experimental XML-based system. The XML-based system has several advantages over the HTML system, but some improvements are needed to make it more comprehensive and effective. First, XML document search engines with user-friendly interfaces are needed. Second, popular Web browsers such as Explorer and Communicator need to fully support the XML 1.0 specification. Third, an Open DTD format, which would allow information retrieval systems to retrieve documents and compress them into a single format, is needed to control Web documents more efficiently.

Detecting Inconsistent Code Identifiers (코드 비 일관적 식별자 검출 기법)

  • Lee, Sungnam;Kim, Suntae;Park, Sooyoung
    • KIPS Transactions on Software and Data Engineering / v.2 no.5 / pp.319-328 / 2013
  • Software maintainers rely heavily on source code identifiers to comprehend software. Inconsistent identifiers across the source code therefore increase the cost of software maintenance. Although peer reviews can address this problem, reviewing the entire source code may be impossible when the code base is large. This paper introduces an approach for automatically detecting inconsistent identifiers in Java source code. The approach consists of tokenizing and POS tagging all identifiers in the source code, classifying syntactically and semantically similar terms, and finally detecting inconsistent identifiers by applying the proposed rules. In addition, we developed a tool, named CodeAmigo, to support the proposed approach. We applied it to two popular Java-based open source projects and computed precision to show the feasibility of the approach.
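
The first step named in this abstract, tokenizing identifiers, can be sketched as follows. This is only an illustrative tokenizer under assumed splitting rules (camelCase boundaries, underscores, `$`, acronyms, and digits); it is not CodeAmigo's tokenizer, and the POS tagging, similarity classification, and detection rules are not reproduced here. The function name `split_identifier` is hypothetical.

```python
import re

# One token is, in order of preference: an acronym that is followed by a
# capitalized word, a (possibly capitalized) lowercase word, a remaining run
# of capitals, or a run of digits.
_TOKEN = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")

def split_identifier(name):
    """Split a Java-style identifier into lowercase word tokens."""
    tokens = []
    # Underscores and dollar signs separate parts outright.
    for part in re.split(r"[_$]+", name):
        tokens.extend(_TOKEN.findall(part))
    return [t.lower() for t in tokens]

for ident in ["parseHTTPResponse", "MAX_BUFFER_SIZE", "userId2"]:
    print(ident, split_identifier(ident))
```

Once identifiers are reduced to word lists like these, each word can be handed to a POS tagger and compared with similar terms elsewhere in the code base.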

On development of supporting tool for Folksonomy Mining based on Formal Concept Analysis (형식개념분석을 이용한 폭소노미 마이닝 기법과 지원도구의 개발)

  • Kang, Yu-Kyung;Hwang, Suk-Hyung;Yang, Hae-Sool
    • Journal of the Korea Academia-Industrial cooperation Society / v.10 no.8 / pp.1877-1893 / 2009
  • Folksonomy is a user-generated taxonomy for organizing information, in which users assign tags to resources published on the web. In a folksonomy-based system, collaborative tagging by many users creates triadic data that express the relations among users, tags, and resources. Such folksonomy data have been used in the Semantic Web and Web 2.0 fields as metadata about web resources. In this paper, we propose an FCA-based folksonomy data mining approach for extracting useful information from folksonomy data from various points of view. We also developed a tool to support our approach. To verify the usefulness of the proposed approach and FMT, we ran experiments on data from del.icio.us, a popular folksonomy-based bookmarking system, and we report the results of these experiments.
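
For readers unfamiliar with Formal Concept Analysis (FCA), the sketch below derives all formal concepts of a tiny context. It assumes the triadic user-tag-resource data have already been projected onto a two-dimensional resource-by-tag context; that projection, the toy data, and the function name `formal_concepts` are assumptions, and the brute-force enumeration is only suitable for toy inputs, not the method of the paper or its tool.

```python
from itertools import combinations

def formal_concepts(context):
    """Return all (extent, intent) pairs of a formal context.

    context: dict mapping each object (e.g. a resource) to the set of
    attributes (e.g. tags) it has.
    """
    objects = set(context)
    attributes = set().union(*context.values())

    def extent(attrs):
        # Objects that have every attribute in attrs.
        return frozenset(o for o in objects if attrs <= context[o])

    def intent(objs):
        # Attributes shared by every object in objs.
        shared = set(attributes)
        for o in objs:
            shared &= context[o]
        return frozenset(shared)

    concepts = set()
    # Brute force: for every subset X of objects, (X'', X') is a concept.
    for r in range(len(objects) + 1):
        for objs in combinations(sorted(objects), r):
            b = intent(objs)
            concepts.add((extent(b), b))
    return concepts

# Toy resource-by-tag context (hypothetical data).
context = {
    "r1": {"python", "web"},
    "r2": {"python", "data"},
    "r3": {"python", "web", "data"},
}
concepts = formal_concepts(context)
for ext, inten in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(ext), sorted(inten))
```

Each concept pairs a maximal group of resources with the exact set of tags they all share, which is the structure an FCA-based mining tool would then browse or visualize as a lattice.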

An Efficient Technique for Image Tag Ranking using Semantic Relationship between Tags (태그간 의미관계를 이용한 효율적인 이미지 태그 랭킹 기법)

  • Hong, Hyun-Ki;Heu, Jee-Uk;Jeong, Jin-Woo;Lee, Dong-Ho
    • Proceedings of the Korean Information Science Society Conference / 2010.06c / pp.31-36 / 2010
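  • A defining feature of the recently emerging Web 2.0 is that ordinary users actively produce and share information. Folksonomy, recognized as a core element of the participatory architecture of Web 2.0, is a classification system based on social tagging, in which users collaboratively create and manage tags, unlike a taxonomy, which is a classification scheme built by experts. Recently, various attempts have been made to share and search images using folksonomies. However, tag-based image sharing systems such as Flickr do not consider the syntactic and semantic ambiguity of tags, or the importance of and correlation between the tags attached to an image, so they cannot guarantee accuracy and reliability in tag-based search. To solve this problem, studies have actively pursued tag propagation of suitable tags in folksonomy-based image sharing databases, as well as tag ranking based on probability and frequency of occurrence, but these still fail to deliver satisfactory performance. In this paper, we propose an efficient tag ranking technique based on the semantic relationships between tags, using WordNet, in order to assign more suitable tags to an image from similar images in an image sharing database. We also confirm through experimental examples that, for reliable tag-based search, the proposed tag ranking technique can achieve higher accuracy than the ranking results of current image sharing systems.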

Fashion attribute-based mixed reality visualization service (패션 속성기반 혼합현실 시각화 서비스)

  • Yoo, Yongmin;Lee, Kyounguk;Kim, Kyungsun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.2-5 / 2022
  • With the advent of deep learning and the rapid development of ICT (Information and Communication Technology), research using artificial intelligence is being actively conducted in various fields of society, such as politics, economy, and culture. Deep learning-based artificial intelligence technology is subdivided into domains such as natural language processing, image processing, speech processing, and recommendation systems. In particular, as industry advances, there is a growing need for recommendation systems that analyze market trends and individual characteristics and make recommendations to consumers. In line with these technological developments, this paper extracts and classifies attribute information from structured or unstructured text and image big data by developing deep learning-based 'language processing intelligence' and 'image processing intelligence'. We propose an artificial intelligence-based 'customized fashion advisor' service integration system that analyzes trends and new materials, discovers 'market-consumer' insights through consumer taste analysis, and can provide style recommendation, virtual fitting, and design support.

A Study of 'Emotion Trigger' by Text Mining Techniques (텍스트 마이닝을 이용한 감정 유발 요인 'Emotion Trigger'에 관한 연구)

  • An, Juyoung;Bae, Junghwan;Han, Namgi;Song, Min
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.69-92 / 2015
  • The explosion of social media data has led researchers to apply text-mining techniques to analyze large-scale social media data more rigorously. Although algorithms for social media text analysis have improved, previous approaches still have limitations. In sentiment analysis of social media written in Korean, there are two typical approaches. One is the linguistic approach using machine learning, which is the most common; some studies have added grammatical factors to the feature sets used to train classification models. The other applies semantic analysis to sentiment analysis, but it has mainly been applied to English texts. To overcome these limitations, this study applies the Word2Vec algorithm, an extension of neural network algorithms, to handle the more extensive semantic features that existing sentiment analysis has underestimated. The result of the Word2Vec algorithm is compared with the result of co-occurrence analysis to identify the difference between the two approaches. The results show that Word2Vec extracts three times as many related words expressing some emotion about the keyword as co-occurrence analysis does. The difference comes from Word2Vec's vectorization of semantic features. It is therefore possible to say that the Word2Vec algorithm can catch hidden related words that traditional analysis has not found. In addition, part-of-speech (POS) tagging for Korean is used to detect adjectives as "emotional words". The emotional words extracted from the text are then converted into word vectors by the Word2Vec algorithm to find related words. Among these related words, nouns are selected because each of them may have a causal relationship with the "emotional word" in the sentence. The process of extracting these trigger factors of emotional words is named "Emotion Trigger" in this study. As a case study, the datasets were collected by searching with three keywords (professor, prosecutor, and doctor), because these keywords carry rich public emotion and opinion. Preliminary data collection was conducted to select secondary keywords for data gathering. The secondary keywords used to gather the data for the actual analysis are as follows: Professor (sexual assault, misappropriation of research money, recruitment irregularities, polifessor), Doctor (Shin hae-chul sky hospital, drinking and plastic surgery, rebate), Prosecutor (lewd behavior, sponsor). The size of the text data is about 100,000 (Professor: 25720, Doctor: 35110, Prosecutor: 43225), and the data were gathered from news, blogs, and Twitter to reflect various levels of public emotion in the text data analysis. Gephi (http://gephi.github.io) was used for visualization, and all programs used for text processing and analysis were written in Java. The contributions of this study are as follows. First, different approaches to sentiment analysis are integrated to overcome the limitations of existing approaches. Second, Emotion Trigger can detect hidden connections to public emotion that existing methods cannot detect. Finally, the approach used in this study could be generalized regardless of the type of text data. The limitation of this study is that it is hard to say that a word extracted by Emotion Trigger processing has a significant causal relationship with the emotional word in a sentence. Future work will clarify the causal relationship between emotional words and the words extracted by Emotion Trigger by comparing them with manually tagged relationships. Furthermore, the text data used in Emotion Trigger are from Twitter, so the data have a number of distinct features that we did not deal with in this study. These features will be considered in further study.
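
The co-occurrence analysis that this abstract uses as a point of comparison can be sketched in a few lines. This shows only a sentence-level co-occurrence count on already tokenized text; the toy sentences, the function name `cooccurring_words`, and the choice of the sentence as the co-occurrence window are assumptions, and the Word2Vec side of the comparison is not reproduced here.

```python
from collections import Counter

def cooccurring_words(sentences, keyword, top_n=3):
    """Count words that appear in the same sentence as keyword.

    sentences: list of token lists (tokenization is assumed to be done).
    """
    counts = Counter()
    for tokens in sentences:
        if keyword in tokens:
            # set() so that a word repeated within one sentence counts once.
            counts.update(set(tokens) - {keyword})
    return counts.most_common(top_n)

# Toy tokenized sentences (hypothetical data).
sentences = [
    ["professor", "angry", "scandal"],
    ["professor", "angry", "news"],
    ["doctor", "calm"],
]
top = cooccurring_words(sentences, "professor", top_n=1)
print(top)
```

A count like this only finds words that literally share a sentence with the keyword, which is why an embedding-based method can surface related words that never co-occur with it directly.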