• Title/Summary/Keyword: Knowledge embedding

Constrained Sparse Concept Coding algorithm with application to image representation

  • Shu, Zhenqiu;Zhao, Chunxia;Huang, Pu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.9 / pp.3211-3230 / 2014
  • Recently, sparse coding has achieved remarkable success in image representation tasks. In practice, clustering performance can be significantly improved if limited label information is incorporated into sparse coding. To this end, this paper proposes a novel semi-supervised algorithm for image representation, called constrained sparse concept coding (CSCC). CSCC incorporates limited label information into graph embedding as additional hard constraints, and hence obtains embedding results that are consistent with both the label information and the manifold structure of the original data. CSCC can therefore provide a sparse representation that explicitly uses prior knowledge of the data to improve discriminative power in clustering. In addition, a kernelized version of CSCC, kernel constrained sparse concept coding (KCSCC), is developed to handle nonlinear data, which leads to more effective clustering. Experimental evaluations on the MNIST, PIE and Yale image sets show the effectiveness of the proposed algorithms.

A Deep Learning Model for Extracting Consumer Sentiments using Recurrent Neural Network Techniques

  • Ranjan, Roop;Daniel, AK
    • International Journal of Computer Science & Network Security / v.21 no.8 / pp.238-246 / 2021
  • The rapid rise of the Internet and social media has resulted in a large number of text-based reviews being posted on sites such as social media platforms. In the age of social media, using machine learning to analyze the emotional context of comments aids in understanding the quality of service (QoS) of any product or service, and classifying and analyzing user reviews helps improve QoS. Unlike traditional categorization models, which are based on a set of rules, machine learning algorithms have evolved into a powerful tool for analyzing user sentiment. In sentiment categorization, Bidirectional Long Short-Term Memory (BiLSTM) has shown significant results, and the Convolutional Neural Network (CNN) has shown promising results. Using convolution and pooling layers, a CNN can successfully extract local information. BiLSTM uses LSTMs running in both directions to increase the amount of context available to the model. The suggested hybrid model combines the benefits of these two deep learning algorithms. The data source for analysis and classification was user reviews of Indian Railway Services on Twitter. The suggested hybrid model uses the Keras Embedding technique as its input source and generates lower-dimensional features that yield a categorization result. Its performance was compared using Keras and Word2Vec embeddings, and the proposed model showed a significant improvement, with an accuracy of 95.19 percent.

Word Sense Disambiguation Using Knowledge Embedding (지식 임베딩 심층학습을 이용한 단어 의미 중의성 해소)

  • Oh, Dongsuk;Yang, Kisu;Kim, Kuekyeng;Whang, Taesun;Lim, Heuiseok
    • Annual Conference on Human and Language Technology / 2019.10a / pp.272-275 / 2019
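  • Word sense disambiguation methods fall into two groups: knowledge-based methods, which solve the problem using knowledge resources, and supervised methods, which solve it using various machine learning models. Supervised methods achieve high performance but require large amounts of curated training data. Conversely, knowledge-based methods do not need large amounts of curated training data but cannot be expected to perform well. Recently, to compensate for these weaknesses, word sense disambiguation has been addressed by training machine learning models on both the information contained in knowledge resources and curated training data. The most widely used knowledge is the gloss information of hypernyms, hyponyms and synonyms. The representation of this information is used together with the existing sentence representation to determine the sense of the ambiguous word. However, an accurate sentence representation requires good word representations, and existing methods all build them from the contextual information within the sentence alone, which limits how accurately meaning is captured. In this paper, to build word representations that carry both semantic and contextual information, we embed syntactic information and semantic-relation graph information using a GCN (Graph Convolutional Network). Incorporating these embeddings into an existing model yields higher performance than word representations that use contextual information only.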
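
To make the GCN step in the abstract above concrete, the following is a minimal sketch of a single graph-convolution layer in plain Python. It is not the authors' model: the function name `gcn_layer`, the toy word graph and the weight values are assumptions chosen for illustration, and the layer uses the common symmetric normalization ReLU(D^-1/2 (A + I) D^-1/2 H W).

```python
import math


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalization of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    in_dim, out_dim = len(weight), len(weight[0])
    # Aggregate neighbor features: norm @ feats.
    agg = [[sum(norm[i][k] * feats[k][d] for k in range(n))
            for d in range(in_dim)] for i in range(n)]
    # Linear transform followed by ReLU: max(0, agg @ weight).
    return [[max(0.0, sum(agg[i][d] * weight[d][o] for d in range(in_dim)))
             for o in range(out_dim)] for i in range(n)]


if __name__ == "__main__":
    # Hypothetical semantic-relation graph: "bank" linked to "river" and "money".
    adj = [[0.0, 1.0, 1.0],
           [1.0, 0.0, 0.0],
           [1.0, 0.0, 0.0]]
    feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # one-hot inputs
    weight = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.6]]  # illustrative values only
    for word, vec in zip(["bank", "river", "money"], gcn_layer(adj, feats, weight)):
        print(word, vec)
```

After one layer, each word vector mixes in the features of its graph neighbors, which is how relation information can reach a word representation that would otherwise depend on sentence context alone.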

Linking Korean Predicates to Knowledge Base Properties (한국어 서술어와 지식베이스 프로퍼티 연결)

  • Won, Yousung;Woo, Jongseong;Kim, Jiseong;Hahm, YoungGyun;Choi, Key-Sun
    • Journal of KIISE / v.42 no.12 / pp.1568-1574 / 2015
  • Relation extraction plays a role in transforming a sentence into a knowledge base representation. In this paper, we focus on the predicates in a sentence and aim to identify the relevant knowledge base properties required to elucidate the relationship between entities, which enables a computer to understand the meaning of a sentence more clearly. Distant supervision is a well-known approach to relation extraction; it performs lexicalization of knowledge base properties by automatically generating a large amount of labeled data. In other words, the predicate in a sentence is linked, or mapped, to the candidate properties defined by ontologies in the knowledge base. This lexical and ontological linking provides a way of generating structured information and a basis for enriching the knowledge base.
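
The distant supervision idea mentioned above can be sketched in a few lines of Python: any sentence that mentions both the subject and the object of a knowledge base triple is automatically labeled with that triple's property. The function name `distant_label`, the sample triples and the property names are hypothetical, and plain substring matching stands in for the entity linking a real system would need.

```python
def distant_label(sentences, triples):
    """Label a sentence with property p if it mentions both s and o of (s, p, o)."""
    labeled = []
    for sentence in sentences:
        for subj, prop, obj in triples:
            # Substring matching is a simplifying assumption of this sketch.
            if subj in sentence and obj in sentence:
                labeled.append((sentence, subj, obj, prop))
    return labeled


if __name__ == "__main__":
    # Hypothetical knowledge base triples (subject, property, object).
    kb = [("Seoul", "capitalOf", "South Korea"),
          ("KAIST", "locatedIn", "Daejeon")]
    corpus = ["Seoul is the capital of South Korea.",
              "KAIST has its main campus in Daejeon.",
              "Busan is a port city."]
    for row in distant_label(corpus, kb):
        print(row)
```

The labeled sentences can then serve as training data for mapping predicates such as "is the capital of" to candidate properties. The labels are noisy by construction, since co-occurrence of two entities does not guarantee that the sentence expresses the property.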

TREATING UNCERTAINTIES IN A NUCLEAR SEISMIC PROBABILISTIC RISK ASSESSMENT BY MEANS OF THE DEMPSTER-SHAFER THEORY OF EVIDENCE

  • Lo, Chung-Kung;Pedroni, N.;Zio, E.
    • Nuclear Engineering and Technology / v.46 no.1 / pp.11-26 / 2014
  • The analyses carried out within the Seismic Probabilistic Risk Assessments (SPRAs) of Nuclear Power Plants (NPPs) are affected by significant aleatory and epistemic uncertainties. These uncertainties have to be represented and quantified consistently with the data, information and knowledge available, to provide reasonable assurance that related decisions can be taken robustly and with confidence. The amount of data, information and knowledge available for seismic risk assessment is typically limited, so the analysis must rely strongly on expert judgment. In this paper, a Dempster-Shafer Theory (DST) framework for handling uncertainties in NPP SPRAs is proposed and applied to an example case study. The main contributions of this paper are twofold: (i) applying the complete DST framework to SPRA models, showing how to build the Dempster-Shafer structures of the uncertain parameters based on industry generic data, and (ii) embedding Bayesian updating based on plant-specific data into the framework. The results of the application to a case study show that the approach is feasible and effective in (i) describing and jointly propagating aleatory and epistemic uncertainties in SPRA models and (ii) providing 'conservative' bounds on the safety quantities of interest (i.e. Core Damage Frequency, CDF) that reflect the (limited) state of knowledge of the experts about the system of interest.
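
As background for the abstract above, the sketch below shows the basic machinery of Dempster-Shafer theory in Python: mass functions over sets of outcomes, belief and plausibility as lower and upper bounds, and Dempster's rule for combining two sources. It does not reproduce the paper's SPRA model or its Bayesian updating step; the function names and the two-expert example with a "low"/"high" frame are assumptions made for illustration.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule: combine two mass functions keyed by frozenset focal elements."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Mass assigned to contradictory evidence is set aside.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize the remaining mass so it sums to 1.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


def belief(m, hypothesis):
    """Lower bound: total mass of focal elements contained in the hypothesis."""
    return sum(w for s, w in m.items() if s <= hypothesis)


def plausibility(m, hypothesis):
    """Upper bound: total mass of focal elements that intersect the hypothesis."""
    return sum(w for s, w in m.items() if s & hypothesis)


if __name__ == "__main__":
    frame = frozenset({"low", "high"})  # hypothetical fragility levels
    expert1 = {frozenset({"high"}): 0.6, frame: 0.4}
    expert2 = {frozenset({"low"}): 0.3, frame: 0.7}
    m = combine(expert1, expert2)
    high = frozenset({"high"})
    print(belief(m, high), plausibility(m, high))
```

The interval between belief and plausibility is what expresses epistemic uncertainty: mass left on the whole frame counts toward plausibility but not toward belief, so limited knowledge widens the bounds instead of being forced into a single probability.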

On Exploiting Permanent Properties of Entities in Temporal Knowledge Graph Embedding (개체들의 영구적인 특성을 고려하는 시간 지식 그래프 임베딩)

  • Lee, JaeHyun;Lee, Yeon-Chang;Kim, Sang-Wook
    • Proceedings of the Korea Information Processing Society Conference / 2022.05a / pp.481-482 / 2022
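  • Temporal knowledge graph embedding methods aim to represent the entities and relations in a given temporal knowledge graph as low-dimensional embedding vectors. However, existing methods focus only on reflecting the time-varying properties of entities in their embedding vectors, and therefore ignore the entities' permanent properties. In this paper, through experiments on real-world datasets, we discuss the importance of considering the permanent properties of entities in temporal knowledge graph embedding.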

A Study on the Importance Classification of Semiconductor Technical Documents Using Knowledge Graphs and Embedding Models (임베딩 모델과 지식맵 분석을 활용한 반도체 기술문서 중요도 분류에 관한 연구)

  • Hong, Giwan;Chang, Hangbae
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.288-289 / 2021
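  • With the Fourth Industrial Revolution, existing industrial structures are changing rapidly and technological hegemony is intensifying, so a nation's global competitiveness depends heavily on the outcome of the contest for technological supremacy. Major countries are competing in technological innovation and technology alliances to secure technological competitiveness, and Korea is likewise striving to secure competitiveness in future industries through active R&D investment and policy support. Technology theft and talent outflow by China are currently occurring, and this can lead to a loss of industrial competitiveness and cause enormous economic damage. To avoid losing technological competitiveness, means of protecting our industrial technology must also be established. Proactively identifying important industrial technologies and applying protective measures according to their importance is the starting point of industrial technology protection. Accordingly, this paper studies a method that uses knowledge graphs and embedding models to classify technical documents vertically by importance in the semiconductor field, one of Korea's core industries.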

Pairwise Neural Networks for Predicting Compound-Protein Interaction (약물-표적 단백질 연관관계 예측모델을 위한 쌍 기반 뉴럴네트워크)

  • Lee, Munhwan;Kim, Eunghee;Kim, Hong-Gee
    • Korean Journal of Cognitive Science / v.28 no.4 / pp.299-314 / 2017
  • Predicting compound-protein interactions in silico is important for drug discovery. In this paper, we propose a scalable machine learning model to predict compound-protein interaction. The key ideas of the model are a pairwise neural network architecture and a method for embedding features from raw data, especially for proteins. The method extracts features automatically, without additional knowledge of the compound or protein. The pairwise architecture also improves the expressiveness and compactness of the features by preventing the biased learning that can arise from differences in feature dimension and type. Five-fold cross-validation results on a large-scale database show that the pairwise neural network improves compound-protein interaction prediction compared to previous models.

A Novel RGB Image Steganography Using Simulated Annealing and LCG via LSB

  • Bawaneh, Mohammed J.;Al-Shalabi, Emad Fawzi;Al-Hazaimeh, Obaida M.
    • International Journal of Computer Science & Network Security / v.21 no.1 / pp.143-151 / 2021
  • The widespread transfer of official confidential digital documents over the Internet creates an urgent need to deliver confidential messages to the recipient without letting any unauthorized person learn the contents of the secret messages or detect their existence. Several steganography techniques, such as Least Significant Bit (LSB), Secure Cover Selection (SCS), Discrete Cosine Transform (DCT) and Palette Based (PB), have been applied to prevent intruders from analyzing and obtaining the transferred secret message. The steganography methods used should withstand steganalysis techniques in terms of analysis and detection. This paper presents a novel and robust framework for color image steganography that combines a Linear Congruential Generator (LCG), simulated annealing (SA), the Caesar cipher and LSB substitution in one system in order to resist steganalysis and deliver data securely to its destination. SA, supported by the LCG, finds the optimal minimum sniffing path inside a cover color (RGB) image; the confidential message is then encrypted with the Caesar cipher and embedded along that path in the RGB host image using LSB substitution. Embedding and extracting the secret message require knowledge shared between sender and receiver: the SA initialization parameters, the LCG seed, the agreed Caesar key and the secret message length. A steganalysis intruder cannot understand or detect the secret message inside the host image without correct knowledge of the manipulation process. The constructed system satisfies the main requirements of image steganography in terms of robustness against confidential message extraction, high visual quality, low mean square error (MSE) and high peak signal-to-noise ratio (PSNR).
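
The pipeline described above can be illustrated with a simplified Python sketch that keeps three of its four ingredients: a Caesar-style byte shift, an LCG that picks the embedding positions, and LSB substitution. The simulated annealing path search is deliberately left out, so this is not the paper's method. The function names, the LCG constants (a = 5, c = 3, power-of-two modulus) and the flat list of channel values standing in for an RGB image are all assumptions of the sketch.

```python
def lcg_positions(seed, count, size):
    """Return `count` distinct indices in range(size) from a full-period LCG."""
    if count > size:
        raise ValueError("message does not fit in the cover")
    m = 1
    while m < size:
        m *= 2  # smallest power of two >= size
    a, c = 5, 3  # a % 4 == 1 and c odd give a full period for power-of-two m
    x = seed % m
    positions = []
    while len(positions) < count:
        x = (a * x + c) % m
        if x < size:  # skip values that fall outside the cover
            positions.append(x)
    return positions


def caesar(data, key):
    """Shift every byte by `key` modulo 256 (a byte-level Caesar cipher)."""
    return bytes((b + key) % 256 for b in data)


def embed(cover, message, seed, key):
    """Encrypt `message`, then write its bits into LSBs at LCG-chosen positions."""
    secret = caesar(message, key)
    bits = [(byte >> i) & 1 for byte in secret for i in range(7, -1, -1)]
    stego = list(cover)
    for pos, bit in zip(lcg_positions(seed, len(bits), len(stego)), bits):
        stego[pos] = (stego[pos] & 0xFE) | bit
    return stego


def extract(stego, length, seed, key):
    """Read `length` bytes back from the same positions and decrypt them."""
    bits = [stego[p] & 1 for p in lcg_positions(seed, length * 8, len(stego))]
    secret = bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, len(bits), 8)
    )
    return caesar(secret, -key)


if __name__ == "__main__":
    cover = [(17 * i) % 256 for i in range(300)]  # stand-in for RGB channel values
    stego = embed(cover, b"secret", seed=42, key=3)
    print(extract(stego, 6, seed=42, key=3))
```

As in the abstract, the receiver needs the same seed, key and message length to recover the message. Each touched channel value changes by at most 1, which is why LSB substitution keeps MSE low and PSNR high.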

A Study on Utilization of Vision Transformer for CTR Prediction (CTR 예측을 위한 비전 트랜스포머 활용에 관한 연구)

  • Kim, Tae-Suk;Kim, Seokhun;Im, Kwang Hyuk
    • Knowledge Management Research / v.22 no.4 / pp.27-40 / 2021
  • Click-through rate (CTR) prediction is a key function that determines the ranking of candidate items in a recommendation system and recommends high-ranking items, reducing customer information overload and maximizing profit through sales promotion. The fields of natural language processing and image classification are achieving remarkable growth through the use of deep neural networks. Recently, transformer models based on an attention mechanism, distinct from the previously mainstream models in these fields, have been proposed and have achieved state-of-the-art results. In this study, we present a method for improving the performance of a transformer model for CTR prediction. To analyze how the discrete and categorical characteristics of CTR data, which differ from natural language and image data, affect performance, we run experiments on embedding regularization and transformer normalization. The experimental results confirm that the transformer's prediction performance improves significantly when L2 regularization is applied in the embedding process for CTR input data and when batch normalization is applied to the transformer in place of layer normalization, its default normalization method.
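
The two changes named in the abstract above can be contrasted with a small Python sketch: layer normalization rescales each sample across its own features, batch normalization rescales each feature across the samples in a batch, and an L2 penalty adds the squared magnitude of the embedding weights to the loss. The function names and toy numbers are assumptions; the sketch omits the learnable scale and shift parameters and the running statistics that framework implementations use, and it is not the authors' model.

```python
import math


def _standardize(values, eps=1e-5):
    """Shift to zero mean and scale to (approximately) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def layer_norm(batch):
    """Normalize each sample across its own features (row-wise)."""
    return [_standardize(row) for row in batch]


def batch_norm(batch):
    """Normalize each feature across the samples in the batch (column-wise)."""
    columns = [_standardize(list(col)) for col in zip(*batch)]
    return [list(row) for row in zip(*columns)]


def l2_penalty(embedding_rows, weight_decay):
    """L2 regularization term: weight_decay times the sum of squared weights."""
    return weight_decay * sum(v * v for row in embedding_rows for v in row)


if __name__ == "__main__":
    # Two samples with two features on very different scales.
    batch = [[1.0, 10.0], [3.0, 30.0]]
    print("layer norm:", layer_norm(batch))
    print("batch norm:", batch_norm(batch))
    print("L2 penalty:", l2_penalty([[1.0, 2.0]], weight_decay=0.1))
```

On this toy batch, layer normalization makes both samples look identical because it discards per-sample scale, while batch normalization keeps the samples distinct and instead puts the two features on a common scale.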