• Title/Summary/Keyword: Dictionary Learning

Super Resolution using Dictionary Data Mapping Method based on Loss Area Analysis (손실 영역 분석 기반의 학습데이터 매핑 기법을 이용한 초해상도 연구)

  • Han, Hyun-Ho;Lee, Sang-Hun
    • Journal of the Korea Convergence Society / v.11 no.3 / pp.19-26 / 2020
  • In this paper, we propose a method that analyzes the loss region of a learned dictionary-based super-resolution result and maps the training data according to the analyzed loss region in order to improve image quality. In conventional learned dictionary-based methods, the result may have a feature composition that differs from that of the input image, depending on the training images, and unintended artifacts may occur. The proposed method estimates the loss information of low-resolution images by analyzing the reconstructed content, which reduces inconsistent feature composition and unintended artifacts in the example-based super-resolution process. The estimated loss information is refined with a Gaussian-based kernel to reduce noise and pixel imbalance, yielding a final interpolation feature map; by mapping the training data according to this map, the method generates super-resolution images with less noise, fewer artifacts, and less staircasing than existing super-resolution methods. For the evaluation, the results of existing super-resolution algorithms and of the proposed method were compared against the high-definition image; the proposed method was 4% better in PSNR (Peak Signal-to-Noise Ratio) and 3% better in SSIM (Structural Similarity Index).
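
The PSNR figure quoted in the evaluation above can be made concrete with a short sketch. This is only an illustration of the standard PSNR definition, assuming 8-bit images stored as nested lists; the function name `psnr` and the toy pixel values are hypothetical and do not come from the paper.

```python
import math

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are nested lists of pixel intensities (an assumption made for
    this sketch; the paper does not specify a data layout).
    """
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if len(ref) != len(est):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Toy 2x2 example: a reference patch and a slightly different reconstruction.
reference = [[52, 55], [61, 59]]
estimate = [[54, 55], [60, 59]]
score = psnr(reference, estimate)
```

A higher PSNR means the reconstruction is closer to the reference, so a relative improvement such as the 4% reported above corresponds to a lower mean squared error against the high-definition image.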

Developing the Automated Sentiment Learning Algorithm to Build the Korean Sentiment Lexicon for Finance (재무분야 감성사전 구축을 위한 자동화된 감성학습 알고리즘 개발)

  • Su-Ji Cho;Ki-Kwang Lee;Cheol-Won Yang
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.1 / pp.32-41 / 2023
  • With the recent development of big data analysis technology, many studies have sought to extract sentiment from text and verify its informational value in the field of finance. A number of prior studies use pre-defined sentiment dictionaries or machine learning methods to extract sentiment from financial documents. However, both approaches are labor-intensive and subjective because they require a manual sentiment learning process. In this study, we developed a financial sentiment dictionary that automatically extracts sentiment from the body text of analyst reports using a modified Bayes rule, and verified its performance with a binary classification model that predicts actual stock price movements. The proposed dictionary showed about 4% better predictive power for actual stock price movements than the representative financial dictionary of Loughran and McDonald (2011). The proposed sentiment extraction method enables efficient and objective judgment because it automatically learns the sentiment of words using both the change in target price and the cumulative abnormal returns. In addition, the dictionary can be easily updated by recalculating the conditional probabilities. The results of this study are expected to be readily extensible not only to analyst reports but also to other financial texts such as performance reports, IR reports, press articles, and social media.
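
The idea of learning word sentiment from conditional probabilities can be sketched as follows. The abstract does not describe the authors' modified Bayes rule, so this sketch uses plain Bayes' rule with additive smoothing instead. It further assumes that each report has already been labeled "pos" or "neg" (for example, from the sign of the target price change), and the function name `learn_word_sentiment` and the toy reports are hypothetical.

```python
from collections import Counter

def learn_word_sentiment(reports, smoothing=1.0):
    """Return {word: P(pos | word)} from labeled reports.

    reports: list of (tokens, label) pairs with label in {"pos", "neg"}.
    Plain Bayes' rule with additive smoothing; NOT the paper's modified rule.
    """
    doc_counts = Counter(label for _, label in reports)
    word_counts = {"pos": Counter(), "neg": Counter()}
    for tokens, label in reports:
        # Count each word once per report (document frequency).
        word_counts[label].update(set(tokens))
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    total = sum(doc_counts.values())
    scores = {}
    for word in vocab:
        joint = {}
        for label in ("pos", "neg"):
            prior = doc_counts[label] / total
            likelihood = (word_counts[label][word] + smoothing) / (
                doc_counts[label] + 2 * smoothing
            )
            joint[label] = prior * likelihood  # P(label) * P(word | label)
        scores[word] = joint["pos"] / (joint["pos"] + joint["neg"])
    return scores

# Toy corpus of tokenized analyst reports with assumed labels.
reports = [
    (["earnings", "growth", "upgrade"], "pos"),
    (["growth", "strong"], "pos"),
    (["loss", "downgrade"], "neg"),
    (["loss", "growth"], "neg"),
]
scores = learn_word_sentiment(reports)
```

Because the scores depend only on counts, adding new reports and recomputing the probabilities is enough to update such a dictionary, which matches the update property described above.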

A Semi-automatic Construction method of a Named Entity Dictionary Based on Wikipedia (위키피디아 기반 개체명 사전 반자동 구축 방법)

  • Song, Yeongkil;Jeong, Seokwon;Kim, Harksoo
    • Journal of KIISE / v.42 no.11 / pp.1397-1403 / 2015
  • A named entity (NE) dictionary is an important resource for NE recognition performance. However, it is not easy to construct an NE dictionary manually because human annotation is time-consuming and labor-intensive. To save construction time and reduce human labor, we propose a semi-automatic system for constructing an NE dictionary. The proposed system uses an active learning technique to construct a pseudo-document of Wiki-categories for each NE class. It then calculates similarities between Wiki entries and pseudo-documents using the BM25 model, a well-known information retrieval model. Finally, it classifies each Wiki entry into an NE class based on these similarities. In experiments with three different types of NE class sets, the proposed system showed high performance (a macro-average F1-score of 0.9028 and a micro-average F1-score of 0.9554).
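
The similarity and classification steps above can be sketched with a small BM25 scorer. This is an illustrative sketch only: it uses the common Okapi BM25 formula with a non-negative IDF variant and default parameters `k1=1.2`, `b=0.75`, all of which are assumptions, since the abstract does not give the exact variant. The class names, category tokens, and the function name `bm25_scores` are hypothetical.

```python
import math
from collections import Counter

def bm25_scores(query_terms, documents, k1=1.2, b=0.75):
    """Okapi BM25 score of a query against each document (lists of tokens)."""
    n_docs = len(documents)
    avg_len = sum(len(d) for d in documents) / n_docs
    # Number of documents that contain each term.
    doc_freq = Counter(term for d in documents for term in set(d))
    scores = []
    for doc in documents:
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0
            )
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1.0 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores

# Hypothetical pseudo-documents: one bag of Wiki-category tokens per NE class.
pseudo_docs = {
    "PERSON": ["births", "people", "actors", "politicians"],
    "LOCATION": ["cities", "countries", "rivers", "capitals"],
    "ORGANIZATION": ["companies", "universities", "clubs", "agencies"],
}
entry_categories = ["capitals", "cities"]  # categories of one Wiki entry
classes = list(pseudo_docs)
scores = bm25_scores(entry_categories, [pseudo_docs[c] for c in classes])
predicted = classes[scores.index(max(scores))]  # class with highest similarity
```

Here the Wiki entry is treated as the query and each class pseudo-document as a retrieval target, so classification reduces to picking the highest-scoring pseudo-document.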

Comparison of the Explanation Texts for Science Terminology in Portal Dictionary, Pyojun Korean Dictionary and Science Textbooks (과학용어에 대한 '포털 사전', '표준국어대사전', '과학교과서' 설명의 비교 분석)

  • Yun, Eunjeong;Park, Yunebae
    • Journal of The Korean Association For Science Education / v.37 no.1 / pp.1-8 / 2017
  • Understanding science terminology is very important both for students' science learning and for the general public's scientific literacy. Aside from the science language education delivered in schools, we assumed that students and the general public need supplementary materials to look up and study the meaning of science terms. This study investigates whether the explanation texts of a portal dictionary, the Pyojun Korean Dictionary, and science textbooks are easy for students to read and understand, and how students perceive the explanations they present. The results show that science textbooks are easier for students to read and understand than the portal dictionary or the Pyojun Korean Dictionary. However, all three materials have a very low level of readability relative to students' average level. There is clearly room for improvement in their readability.

Learning-based Super-resolution for Text Images (글자 영상을 위한 학습기반 초고해상도 기법)

  • Heo, Bo-Young;Song, Byung Cheol
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.4 / pp.175-183 / 2015
  • The proposed algorithm consists of two stages: a learning stage and a synthesis stage. At the learning stage, we first collect various high-resolution (HR) and low-resolution (LR) text image pairs, quantize the LR images, and extract HR-LR block pairs. Based on the quantized LR blocks, the LR-HR block pairs are clustered into a pre-determined number of classes. For each class, an optimal 2D-FIR filter is computed and stored in a dictionary together with the corresponding LR block, which serves as the index. At the synthesis stage, each quantized LR block in an input LR image is compared with every LR block in the dictionary, and the FIR filter of the best-matched LR block is selected. Finally, an HR block is synthesized with the chosen filter, and a final HR image is produced. In addition, to cope with noisy environments, we generate multiple dictionaries according to noise level at the learning stage. The dictionary corresponding to the noise level of the input image is then chosen, and the final HR image is produced using the selected dictionary. Experimental results show that the proposed algorithm outperforms previous methods for noisy images as well as noise-free images.
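
The synthesis stage described above (quantize, find the best-matched LR block, apply its filter) can be sketched as below. This is a toy illustration under several assumptions that are not in the abstract: blocks are flattened lists, quantization is uniform with 4 levels, matching uses squared distance, and each stored "filter" is a list of weight vectors, one per output HR pixel. All names (`quantize`, `best_match`, `synthesize`) and the dictionary contents are hypothetical.

```python
def quantize(block, levels=4, max_value=255):
    """Uniformly quantize pixel values into `levels` bins (assumed scheme)."""
    step = (max_value + 1) / levels
    return tuple(int(p // step) for p in block)

def best_match(q_block, dictionary):
    """Compare against every dictionary key and return the closest entry's filter."""
    def distance(key):
        return sum((a - b) ** 2 for a, b in zip(q_block, key))
    return dictionary[min(dictionary, key=distance)]

def synthesize(block, dictionary):
    """Synthesize HR pixels as weighted sums of the unquantized LR pixels."""
    filters = best_match(quantize(block), dictionary)
    return [sum(w * p for w, p in zip(weights, block)) for weights in filters]

# Hypothetical learned dictionary: quantized LR block -> per-HR-pixel weights.
dictionary = {
    # Edge-like block: interpolate with weights that keep the transition steep.
    (0, 3): [[1.0, 0.0], [0.75, 0.25], [0.25, 0.75], [0.0, 1.0]],
    # Flat block: plain averaging for every HR pixel.
    (1, 1): [[0.5, 0.5] for _ in range(4)],
}

hr_edge = synthesize([10, 250], dictionary)   # 2 LR pixels -> 4 HR pixels
hr_flat = synthesize([100, 100], dictionary)
```

Handling noise as the abstract describes would amount to keeping one such dictionary per noise level and selecting among them before the lookup; that selection step is omitted here.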

The Expository Dictionary using the Sign Language about Information Communication for Deaf (청각장애인을 위한 정보통신용어 수화해설 사전)

  • Kim Ho-Yong;Seo Yeong-Geon
    • Journal of Digital Contents Society / v.6 no.4 / pp.217-222 / 2005
  • The purpose of this study is to design and implement a sign language dictionary that helps deaf people understand information and communication terminology. When deaf people who have difficulty communicating use the internet, they can get help from this dictionary in accessing various types of information and expressing their intentions. For deaf people to use the internet as efficiently as hearing people, they must first understand information and communication terminology. To implement the dictionary, we defined what is meant by deafness and examined the characteristics of deaf people. In addition, we established principles for designing the dictionary and selected a set of terms. When explaining the terms, we tried to use expressions common among deaf people, but sometimes modified them to preserve the original meanings of the terms when producing the sign language videos. The results were applied as a learning aid in information education for deaf people, and their understanding of ICT was measured through two tests.

KNU Korean Sentiment Lexicon: Bi-LSTM-based Method for Building a Korean Sentiment Lexicon (Bi-LSTM 기반의 한국어 감성사전 구축 방안)

  • Park, Sang-Min;Na, Chul-Won;Choi, Min-Seong;Lee, Da-Hee;On, Byung-Won
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.219-240 / 2018
  • Sentiment analysis, one of the text mining techniques, is a method for extracting subjective content embedded in text documents. Recently, sentiment analysis methods have been widely used in many fields. For example, data-driven surveys analyze the subjectivity of text data posted by users, and market research analyzes users' review posts to quantify the reputation of a target product. The basic method of sentiment analysis is to use a sentiment dictionary (or lexicon), a list of sentiment vocabulary with positive, neutral, or negative semantics. In general, the meaning of many sentiment words is likely to differ across domains. For example, the sentiment word 'sad' has a negative meaning in many fields but not necessarily for a movie. To perform accurate sentiment analysis, we need to build a sentiment dictionary for the given domain. However, building such a sentiment lexicon is time-consuming, and many sentiment words are missed unless a general-purpose sentiment lexicon is used. To address this problem, several studies have constructed sentiment lexicons suited to a specific domain based on 'OPEN HANGUL' and 'SentiWordNet', which are general-purpose sentiment lexicons. However, OPEN HANGUL is no longer in service, and SentiWordNet does not work well because of language differences that arise when converting Korean words into English words. These limitations restrict the use of such general-purpose sentiment lexicons as seed data for building a domain-specific sentiment lexicon. In this article, we construct the 'KNU Korean Sentiment Lexicon (KNU-KSL)', a new general-purpose Korean sentiment dictionary that is more advanced than existing general-purpose lexicons.
The proposed dictionary, a list of domain-independent sentiment words such as 'thank you', 'worthy', and 'impressed', is built so that a sentiment dictionary for a target domain can be constructed quickly. In particular, it constructs sentiment vocabulary by analyzing the glosses contained in the Standard Korean Language Dictionary (SKLD) through the following procedure: First, we propose a sentiment classification model based on Bidirectional Long Short-Term Memory (Bi-LSTM). Second, the proposed deep learning model automatically classifies each gloss as having either positive or negative meaning. Third, positive words and phrases are extracted from the glosses classified as positive, while negative words and phrases are extracted from the glosses classified as negative. Our experimental results show that the average accuracy of the proposed sentiment classification model is up to 89.45%. In addition, the sentiment dictionary is further extended using various external sources, including SentiWordNet, SenticNet, Emotional Verbs, and Sentiment Lexicon 0603. Furthermore, we add sentiment information about frequently used coined words and emoticons that are used mainly on the Web. The KNU-KSL contains a total of 14,843 sentiment vocabulary entries, each of which is a 1-gram, a 2-gram, a phrase, or a sentence pattern. Unlike existing sentiment dictionaries, it is composed of words that are not affected by particular domains. The recent trend in sentiment analysis is to use deep learning techniques without sentiment dictionaries, so the importance of developing sentiment dictionaries has gradually declined. However, a recent study shows that the words in a sentiment dictionary can be used as features of deep learning models, resulting in sentiment analysis with higher accuracy (Teng, Z., 2016). This result indicates that a sentiment dictionary is useful not only for sentiment analysis itself but also as a source of features that improve the accuracy of deep learning models.
The proposed dictionary can be used as basic data for constructing the sentiment lexicon of a particular domain and as features of deep learning models. It is also useful for automatically and quickly building large training sets for deep learning models.

Domain Adaptation Image Classification Based on Multi-sparse Representation

  • Zhang, Xu;Wang, Xiaofeng;Du, Yue;Qin, Xiaoyan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.5 / pp.2590-2606 / 2017
  • Research on classical image classification algorithms generally assumes that training data and testing data are drawn from the same domain with the same distribution. Unfortunately, in practical applications, this assumption is rarely met. To address this problem, this paper proposes a domain adaptation image classification approach based on multi-sparse representation. Intermediate domains are hypothesized to exist between the source and target domains, and each intermediate subspace is modeled through online dictionary learning with target data updating. On the one hand, the reconstruction error of the target data is guaranteed; on the other, the transition from the source domain to the target domain is kept as smooth as possible. An augmented feature representation produced by invariant sparse codes across the source, intermediate, and target domain dictionaries is employed for cross-domain recognition. Experimental results verify the effectiveness of the proposed algorithm.

Efficient and Secure Authenticated Key Exchange

  • Park Jong-Min
    • Journal of information and communication convergence engineering / v.3 no.3 / pp.163-166 / 2005
  • Key exchange protocols are crucial tools for providing secure communication in broadband satellite access networks. They are required to satisfy various requirements such as security, key confirmation, and key freshness. In this paper, two authenticated key exchange protocols, TPEKE-E (Two Pass Encrypted Key Exchange-Exchange-Efficient) and TPEKE-S (Two Pass Encrypted Key Exchange-Secure), are introduced. The basic idea of the protocols is that a password can be represented by modular addition N, and the number of possible modular additions N representing the password is $2^N$. TPEKE-E is secure against attacks including the man-in-the-middle attack and the off-line dictionary attack, and its performance is excellent compared with other authenticated key exchange protocols. TPEKE-S is a slight modification of TPEKE-E. TPEKE-S makes it computationally infeasible to learn the password without performing an off-line dictionary attack, while preserving the performance of TPEKE-E.

Improvement of Sparse Representation based Classifier using Fisher Discrimination Dictionary Learning for Malignant Mass Detection (피셔 분별 사전학습을 이용해 개선된 Sparse 표현 기반 악성 종괴 검출)

  • Kim, Seong Tae;Lee, Seung Hyun;Min, Hyun-Seok;Ro, Yong Man
    • Journal of Korea Multimedia Society / v.16 no.5 / pp.558-565 / 2013
  • Mammography, the process of using X-rays to examine the female breast, is one of the effective tools for detecting breast cancer at an early stage. In screening mammography, a computer-aided detection (CAD) system helps radiologists diagnose cases by detecting malignant masses. A mass is an important lesion in the breast that can indicate cancer. Because masses vary in shape and have unclear boundaries, detecting breast masses is considered a challenging task. As a result, a CAD system detects many regions of interest, including normal tissue, so it is important to develop a well-designed classifier. In this paper, we propose an enhanced sparse representation (SR) based classifier using Fisher discrimination dictionary learning. Experimental results show that the proposed method outperforms the existing support vector machine (SVM) classifier.