• Title/Summary/Keyword: Semantic embedding

Quantitative and Qualitative Considerations to Apply Methods for Identifying Content Relevance between Knowledge Into Managing Knowledge Service (지식 간 내용적 연관성 파악 기법의 지식 서비스 관리 접목을 위한 정량적/정성적 고려사항 검토)

  • Yoo, Keedong
    • The Journal of Society for e-Business Studies, v.26 no.3, pp.119-132, 2021
  • Identifying associated knowledge based on content relevance is a fundamental function in managing the service and security of core knowledge. This study compares two methods for identifying associated knowledge based on content relevance, namely the keyword-based and word-embedding approaches to composing an associated document network, to examine which performs better from quantitative and qualitative perspectives. The keyword-based approach performed better in core document identification and semantic information representation, while the word-embedding approach performed better in F1-score and accuracy, association intensity representation, and large-volume document processing. The results can be used for more realistic management of associated knowledge services that reflects the needs of companies and users.

Improving The Performance of Triple Generation Based on Distant Supervision By Using Semantic Similarity (의미 유사도를 활용한 Distant Supervision 기반의 트리플 생성 성능 향상)

  • Yoon, Hee-Geun;Choi, Su Jeong;Park, Seong-Bae;Park, Se-Young
    • Annual Conference on Human and Language Technology, 2015.10a, pp.23-28, 2015
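  • This paper proposes a distant-supervision-based confidence measure for improving the accuracy of a Korean triple generation system. In many existing pattern-based triple generation systems, the basic assumption of distant supervision leaves considerable room for erroneous patterns. To remove such patterns, previous studies measured confidence indirectly, relying on statistics such as occurrence frequency and co-occurrence counts. This paper proposes a confidence measure that is more accurate than statistics-based methods by measuring the semantic similarity between Korean patterns and English properties. Word embedding, an unsupervised learning method, is used to learn the meanings of words and to measure the similarity between them. Canonical correlation analysis is used to resolve the vocabulary mismatch between Korean patterns and English properties. Experimental results show that the proposed pattern confidence measure produces a triple set whose precision is 9% higher than that of the statistics-based method, confirming that confidence measurement reflecting semantic similarity is better suited to generating high-quality triples than existing statistics-based confidence measurement.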

Development of Multimedia Annotation and Retrieval System using MPEG-7 based Semantic Metadata Model (MPEG-7 기반 의미적 메타데이터 모델을 이용한 멀티미디어 주석 및 검색 시스템의 개발)

  • An, Hyoung-Geun;Koh, Jae-Jin
    • The KIPS Transactions:PartD, v.14D no.6, pp.573-584, 2007
  • As multimedia information grows rapidly, retrieving multimedia data in various ways is becoming an issue of great importance. Efficient multimedia data processing requires semantics-based retrieval techniques that can extract the meaning of multimedia content. Existing retrieval methods for multimedia data are annotation-based retrieval, feature-based retrieval, and retrieval that integrates annotations and features. These systems demand a lot of effort and time from annotators and require complicated calculations for feature extraction. In addition, the created data support only static searches that do not change. Also, user-friendly and semantic search techniques are not supported. This paper proposes S-MARS (Semantic Metadata-based Multimedia Annotation and Retrieval System), which can represent and extract multimedia data efficiently using MPEG-7. The system provides a graphical user interface for annotating, searching, and browsing multimedia data. It is implemented on the basis of a semantic metadata model for representing multimedia information. The semantic metadata about multimedia data is organized on the basis of a multimedia description schema, written in XML Schema, that basically complies with the MPEG-7 standard. In conclusion, the proposed scheme can be easily implemented on any multimedia platform supporting XML technology. It can enable efficient semantic metadata sharing between systems, and it will contribute to improving retrieval correctness and user satisfaction with embedding-based multimedia retrieval algorithms.

A Knowledge-based Model for Semantic Oriented Contextual Advertising

  • Maree, Mohammed;Hodrob, Rami;Belkhatir, Mohammed;Alhashmi, Saadat M.
    • KSII Transactions on Internet and Information Systems (TIIS), v.14 no.5, pp.2122-2140, 2020
  • Proper and precise embedding of commercial ads within Webpages requires ad hoc analysis and understanding of their content. When this step is implemented successfully, publishers and advertisers both benefit: revenues increase on the one hand, and user experience improves on the other. In this research work, we propose a novel multi-level context-based ad serving approach through which ads are served at generic publisher websites based on their contextual relevance. In the proposed approach, knowledge encoded in domain-specific and generic semantic repositories is exploited to analyze and segment Webpages into sets of contextually relevant segments. Semantically enhanced indexes are also constructed to index ads based on the textual descriptions provided by advertisers. A modified cosine similarity matching algorithm is employed to embed each ad from the ads repository into one or more contextually relevant segments. To validate our proposal, we implemented a prototype of an ad serving system with two datasets that consist of 11,429 ads and 93 documents, and 11,000 documents and 15 ads, respectively. To demonstrate the effectiveness of the proposed techniques, we experimentally tested the proposed method and compared the results against five baseline metrics that can be used in the context of ad serving systems. In addition, we compared the results produced by our system with other state-of-the-art models. Findings demonstrate that the accuracy of conventional ad matching techniques improves when the proposed semantically enhanced context-based ad serving model is used.
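
The matching step in the abstract above can be illustrated with a minimal sketch. It uses plain cosine similarity over bag-of-words counts, not the authors' modified algorithm, whose details the abstract does not give. The function names (`bag_of_words`, `cosine`, `best_segment`) and the sample texts are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, whitespace-tokenized term counts (an illustrative simplification)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Plain cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_segment(ad_text, segments):
    """Return (index, score) of the page segment most similar to the ad."""
    ad = bag_of_words(ad_text)
    scores = [cosine(ad, bag_of_words(s)) for s in segments]
    best = max(range(len(segments)), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    # Hypothetical page segments and ad description.
    segments = [
        "cheap flights to paris",
        "running shoes for marathon training",
    ]
    print(best_segment("marathon running shoes sale", segments))
```

A semantically enhanced variant, as the abstract describes, would presumably expand or reweight terms using knowledge repositories before this comparison; that part is not sketched here.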

A Situation Semantic Account of English Embedded Tense (상황의미론에 기초한 영어 내포 시제 연구: 태도문을 중심으로)

  • 조영순
    • Language and Information, v.4 no.2, pp.27-40, 2000
  • The purpose of this paper is to propose a way of analyzing English embedded tense in terms of temporal perspective time. To this end, the notion of temporal perspective time and Cooper and Ginzburg's (1996) attitude account are employed. Temporal perspective time is used to define the tense and to capture the anaphoric property of embedded tense: the embedded temporal perspective time draws the embedding event time by anaphora. The ambiguity in the sequence-of-tense construction is described in terms of the attitude tense constraint, which reflects the anaphoric property, and two definitions of the past tense. The double access property in the present-under-past construction is described in terms of the constraint, the notion of eventuality, and the situation-theoretic existential quantifier.

Question Retrieval using Deep Semantic Matching for Community Question Answering (심층적 의미 매칭을 이용한 cQA 시스템 질문 검색)

  • Kim, Seon-Hoon;Jang, Heon-Seok;Kang, In-Ho
    • 한국어정보학회:학술대회논문집, 2017.10a, pp.116-121, 2017
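  • A cQA (Community-based Question Answering) system lets users post questions and write answers through an online community. When a new question arrives, the system can retrieve the most similar question from the accumulated cQA repository and use its answer as the answer to the new question. However, traditional retrieval based on keyword matching cannot exploit the meaning inherent in sentences. Overcoming this limitation requires training on semantically equivalent sentences, but such data are difficult to obtain in large quantities. In this paper, using a large cQA set in which each question is split into a title and a body, we map the question title and body into a semantic vector space and train the model so that the relative distance between the two vectors becomes small, thereby internalizing the property of pseudo-similar meaning. In addition, to obtain semantic vector representations of question titles and bodies, we propose a deep learning method that uses semi-training word embedding and a CNN (Convolutional Neural Network). In similar-question retrieval experiments, retrieval with the proposed model outperformed keyword-matching-based retrieval.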

Siamese Network for Learning Robust Feature of Hippocampi

  • Ahmed, Samsuddin;Jung, Ho Yub
    • Smart Media Journal, v.9 no.3, pp.9-17, 2020
  • The hippocampus is a complex brain structure embedded deep in the temporal lobe. Studies have shown that this structure is affected by neurological and psychiatric disorders and that it is a significant landmark for diagnosing neurodegenerative diseases. Hippocampus features play a very significant role in region-of-interest-based analysis for disease diagnosis and prognosis. In this study, we have attempted to learn the embeddings of this important biomarker. As conventional metric learning methods for feature embedding are known to be lacking in capturing semantic similarity among the data under study, we have trained a deep Siamese convolutional neural network to learn a metric for the hippocampus. We have used the Gwangju Alzheimer's and Related Dementia cohort data set in our study. The input to the network was pairs of three-view patches (TVPs) of size 32 × 32 × 3. The positive samples were taken from the vicinity of a specified landmark for the hippocampus, and negative samples were taken from random locations of the brain excluding hippocampal regions. We have achieved 98.72% accuracy in verifying hippocampus TVPs.
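
As a rough illustration of how a Siamese network can be trained on positive and negative pairs, the sketch below computes a standard contrastive loss on two embedding vectors. The abstract does not state which loss the authors used, so the contrastive loss, the `margin` value, and the function names are assumptions made for illustration only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a, b, same, margin=1.0):
    """Contrastive loss for one pair of embeddings (assumed loss, not from the source).

    Positive pairs (same=True) are penalized by their squared distance,
    so training pulls them together. Negative pairs are penalized only
    when they are closer than `margin`, so training pushes them apart.
    """
    d = euclidean(a, b)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


if __name__ == "__main__":
    # Hypothetical 2-D embeddings of two patches.
    print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same=True))
    print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same=False, margin=6.0))
```

In a verification setting like the one described, a pair would then be accepted as a match when the distance between its embeddings falls below a chosen threshold.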

Pattern Recognition Using Attributed Grammar (속성문법에 의한 물체인식)

  • Yim, Seung-Cheol;Kim, Tae-Kyun;Kwon, Oh-Suk
    • Proceedings of the KIEE Conference, 1988.07a, pp.675-678, 1988
  • This paper describes a method of syntactic-semantic pattern recognition and description for two-dimensional objects whose size and orientation have been changed. To avoid the complexity and ambiguity that arise when a syntactic or decision-theoretic method is used alone, an attributed grammar is introduced that attaches computed attributes to pattern primitives; the decision-theoretic method is then used for the attributes and the syntactic method for the pattern structure. Parsing with embedded primitive extraction and a global rule for classification are also applied for more effective pattern recognition and description.

Improving a CNN-based Image Annotation System Using Multi-Labeled Images (다중 레이블 이미지를 활용한 CNN기반 이미지 어노테이션 시스템의 개선)

  • Kim, Taeksoo;Kim, Sangbum
    • Annual Conference on Human and Language Technology, 2015.10a, pp.99-103, 2015
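  • Driven by recent advances in deep learning, research on automatically generating related words or sentences from images is under way, but many studies require a well-curated training set in which images and words correspond one-to-one. Meanwhile, as image-based social networks such as Instagram and Pholar have grown rapidly with the spread of smartphones, data in which multiple words (tags) are attached to a single image are exploding on the Internet. This paper proposes an improved CNN architecture and training algorithm that generate tags from images by using not only a small, well-curated training set but also such large-scale multi-label data. Adding a hidden layer to the existing classification-based model and introducing a new training method improved annotation performance by more than 11% over the existing model.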

Fake News Detection Using Deep Learning

  • Lee, Dong-Ho;Kim, Yu-Ri;Kim, Hyeong-Jun;Park, Seung-Myun;Yang, Yu-Jun
    • Journal of Information Processing Systems, v.15 no.5, pp.1119-1130, 2019
  • With the spread of Social Network Services (SNS), fake news, which disguises false information as legitimate media, has become a major social issue. This paper proposes a deep learning architecture for detecting fake news written in Korean. Previous works proposed suitable fake news detection models for English, but two issues prevent existing models from being applied to Korean. First, Korean can express the same meaning in shorter sentences than English, so the resulting scarcity of features makes it difficult to train a deep neural network. Second, morpheme ambiguity makes semantic analysis difficult. We worked to resolve these issues by implementing a system that uses various convolutional neural network-based deep learning architectures and "Fasttext", a word-embedding model trained at the syllable level. After training and testing the implementation, we achieved meaningful accuracy in classifying discrepancies between body and context, but accuracy was low in classifying discrepancies between headline and body.