• Title/Summary/Keyword: contextual words

Extracting Multiword Sentiment Expressions by Using a Domain-Specific Corpus and a Seed Lexicon

  • Lee, Kong-Joo; Kim, Jee-Eun; Yun, Bo-Hyun
    • ETRI Journal / v.35 no.5 / pp.838-848 / 2013
  • This paper presents a novel approach to automatically generate Korean multiword sentiment expressions by using a seed sentiment lexicon and a large-scale domain-specific corpus. A multiword sentiment expression consists of a seed sentiment word and its contextual words occurring adjacent to the seed word. The multiword sentiment expressions that are the focus of our study have a different polarity from that of the seed sentiment word. The automatically extracted multiword sentiment expressions show that 1) the contextual words should be defined as a part of a multiword sentiment expression in addition to their corresponding seed sentiment word, 2) the identified multiword sentiment expressions contain various indicators for polarity shift that have rarely been recognized before, and 3) the newly recognized shifters contribute to assigning a more accurate polarity value. The empirical result shows that the proposed approach achieves improved performance of the sentiment analysis system that uses an automatically generated lexicon.
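The idea of pairing a seed sentiment word with an adjacent contextual word, and keeping only the pairs whose polarity differs from the seed's, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name `extract_shifted_expressions`, the majority-vote rule, the `min_count` threshold, and the English toy data are all hypothetical (the paper works on a Korean domain-specific corpus).

```python
from collections import Counter, defaultdict


def extract_shifted_expressions(sentences, seed_lexicon, min_count=2):
    """Collect seed+neighbor bigrams whose observed polarity differs from the seed's.

    sentences: list of (tokens, label) pairs, label being +1 or -1 (assumed labeled corpus).
    seed_lexicon: dict mapping seed word -> polarity (+1 or -1).
    Returns a dict mapping (word, word) bigram -> majority polarity.
    """
    # Votes are keyed by (bigram, seed) so we always know which seed a bigram came from.
    votes = defaultdict(Counter)
    for tokens, label in sentences:
        for i, tok in enumerate(tokens):
            if tok not in seed_lexicon:
                continue
            if i > 0:  # contextual word on the left of the seed
                votes[((tokens[i - 1], tok), tok)][label] += 1
            if i + 1 < len(tokens):  # contextual word on the right of the seed
                votes[((tok, tokens[i + 1]), tok)][label] += 1

    shifted = {}
    for (bigram, seed), counts in votes.items():
        if sum(counts.values()) < min_count:
            continue  # too rare to trust
        majority = counts.most_common(1)[0][0]
        if majority != seed_lexicon[seed]:  # polarity shifted relative to the seed
            shifted[bigram] = majority
    return shifted


# Toy corpus (hypothetical data)
corpus = [
    (["not", "good"], -1),
    (["not", "good", "at", "all"], -1),
    (["very", "good"], +1),
    (["very", "good"], +1),
]
print(extract_shifted_expressions(corpus, {"good": +1}))
```

In this toy run, only the bigram ("not", "good") is kept, because it occurs often enough and its majority label is the opposite of the seed polarity for "good".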

Analyzing Contextual Polarity of Unstructured Data for Measuring Subjective Well-Being (주관적 웰빙 상태 측정을 위한 비정형 데이터의 상황기반 긍부정성 분석 방법)

  • Choi, Sukjae; Song, Yeongeun; Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.83-105 / 2016
  • Measuring an individual's subjective wellbeing in an accurate, unobtrusive, and cost-effective manner is a core success factor of the wellbeing support system, which is a type of medical IT service. However, measurements with a self-report questionnaire and wearable sensors, despite being very accurate, are cost-intensive and obtrusive when the wellbeing support system must run in real time. Recently, inferring the state of subjective wellbeing from unstructured data with conventional sentiment analysis has been proposed as an alternative that resolves the drawbacks of the self-report questionnaire and wearable sensors. However, this approach does not consider contextual polarity, which results in lower measurement accuracy. Moreover, there is no sentiment word net or ontology for the subjective wellbeing area. Hence, this paper proposes a method to extract keywords and their contextual polarity representing the subjective wellbeing state from unstructured text on online websites in order to improve the reasoning accuracy of the sentiment analysis. The proposed method is as follows. First, a set of general sentiment words is proposed. SentiWordNet was adopted; this is the most widely used dictionary and contains about 100,000 words such as nouns, verbs, adjectives, and adverbs with polarities from -1.0 (extremely negative) to 1.0 (extremely positive). Second, corpora on subjective wellbeing (SWB corpora) were obtained by crawling online text. A survey was conducted to prepare a learning dataset that includes an individual's opinion and the level of self-reported wellness, such as stress and depression. The participants were asked to respond with their feelings about online news on two topics. Next, three data sources were extracted from the SWB corpora: demographic information, psychographic information, and the structural characteristics of the text (e.g., the number of words used in the text, simple statistics on the special characters used). These were taken into account to adjust the level of a specific SWB. Finally, a set of reasoning rules was generated for each wellbeing factor to estimate the SWB of an individual based on the text written by the individual. The experimental results suggested that using contextual polarity for each SWB factor (e.g., stress, depression) significantly improved the estimation accuracy compared to conventional sentiment analysis methods incorporating SentiWordNet. Even though literature is available on Korean sentiment analysis, such studies used only a limited set of sentiment words. Because of the small number of words, many sentences are overlooked and ignored when estimating the level of sentiment. However, the proposed method can identify multiple sentiment-neutral words as sentiment words in the context of a specific SWB factor. The results also suggest that a specific type of senti-word dictionary containing contextual polarity needs to be constructed along with a dictionary based on common sense such as SenticNet. These efforts will enrich and enlarge the application area of sentic computing. The study is helpful to practitioners and managers of wellness services in that a couple of characteristics of unstructured text have been identified for improving SWB. Consistent with the literature, the results showed that gender and age affect the SWB state when individuals are exposed to an identical cue from the online text. In addition, the length of the textual response and the usage pattern of special characters were found to indicate the individual's SWB. These findings imply that better SWB measurement should involve collecting the textual structure and the individual's demographic conditions. In the future, the proposed method should be improved by automated identification of the contextual polarity in order to enlarge the vocabulary in a cost-effective manner.
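The difference between a general lexicon and factor-specific contextual polarity can be illustrated with a small sketch. Everything here is assumed for illustration: the dictionaries `GENERAL` and `CONTEXTUAL`, their scores, and the averaging rule in `score_text` are hypothetical and are not taken from SentiWordNet or from the paper's reasoning rules.

```python
# Hypothetical general-purpose lexicon with polarities in [-1.0, 1.0]
GENERAL = {"happy": 0.8, "sad": -0.7}

# Hypothetical contextual polarity per SWB factor: words that are neutral in
# general but carry sentiment in the context of that factor.
CONTEXTUAL = {"stress": {"deadline": -0.6, "overtime": -0.5}}


def score_text(tokens, factor=None, general=GENERAL, contextual=CONTEXTUAL):
    """Average polarity of the known words; contextual entries override general ones."""
    context = contextual.get(factor, {})
    scores = []
    for tok in tokens:
        if tok in context:
            scores.append(context[tok])
        elif tok in general:
            scores.append(general[tok])
    # Text with no known sentiment word is treated as neutral.
    return sum(scores) / len(scores) if scores else 0.0


text = ["deadline", "overtime", "again"]
print(score_text(text))            # general lexicon only: no word is recognized
print(score_text(text, "stress"))  # contextual polarity for the stress factor
```

With the general lexicon alone the sentence is scored as neutral because none of its words are listed, which mirrors the abstract's point that sentences are overlooked when the vocabulary is small; with the stress-specific entries the same sentence receives a negative score.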

On Flexibility in Architecture Focused on the Contradiction in Designing Flexible Space and Its Design Proposition

  • Kim, Young-Ju
    • Architectural research / v.15 no.4 / pp.191-200 / 2013
  • Since the Modern Movement, flexibility has been one of the most attractive words in architecture. However, "overprovision first, division later" has been the prevailing design method for spatial flexibility, and many buildings designed for flexible use are in practice quite inflexible because of insufficient building systems and/or irresponsible planning. There have been two dominant strategies for achieving architectural flexibility: multi-functionality and polyvalence. These two approaches, which point in contradictory directions, reflect the difficulty of providing a proper form of architectural flexibility. Multi-functionality can afford changeable environments with satisfying spatial conditions; however, it lacks the tolerance to accommodate uses other than the functions intended by the architect. Meanwhile, flexibility through a polyvalent form relies on a vague anticipation of users' various interpretations. This study examines these two standpoints and their historical precedents, scrutinizes flexibility in architecture with a focus on this contradiction, and proposes contextual relations as an alternative approach to architectural flexibility. Unlike multi-functionality and polyvalence, which produce flexibility by changing a space's own properties, manipulating contextual relations infuses flexibility into space by changing the properties of the building, not of its individual rooms. Using this contextual relations method, a community-centered school in Manhattan, NY, which was in danger of being closed because of its academic failure, is represented as a flexible space.

Contextual Advertisement System based on Document Clustering (문서 클러스터링을 이용한 문맥 광고 시스템)

  • Lee, Dong-Kwang; Kang, In-Ho; An, Dong-Un
    • The KIPS Transactions:PartB / v.15B no.1 / pp.73-80 / 2008
  • In this paper, an advertisement-keyword finding method using document clustering is proposed to solve the problems caused by ambiguous words and incorrect identification of main keywords. News articles that have similar contents and the same advertisement-keywords are clustered to construct the contextual information of advertisement-keywords. In addition to news articles, the web page and summary of a product are also used to construct the contextual information. A given document is classified into one of the news article clusters, and the cluster-relevant advertisement-keywords are then used to identify keywords in the document. The proposed method achieved a 21% improvement in precision.
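The classify-then-match step described above can be sketched as follows. This is an illustration under assumptions rather than the paper's system: clusters are represented here as bag-of-words centroids, the nearest cluster is chosen by cosine similarity, and the cluster contents and the names `cosine` and `pick_ad_keywords` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pick_ad_keywords(doc_tokens, clusters):
    """Assign the document to its most similar cluster, then keep that
    cluster's advertisement-keywords that actually occur in the document."""
    doc_vec = Counter(doc_tokens)
    best = max(clusters, key=lambda c: cosine(doc_vec, c["centroid"]))
    return [kw for kw in best["ad_keywords"] if kw in doc_vec]


# Hypothetical clusters: the ambiguous word "apple" appears in both contexts.
clusters = [
    {"centroid": Counter({"apple": 3, "fruit": 4, "juice": 2}),
     "ad_keywords": ["apple", "juice"]},
    {"centroid": Counter({"apple": 3, "iphone": 4, "phone": 3}),
     "ad_keywords": ["iphone", "phone"]},
]
print(pick_ad_keywords(["apple", "released", "new", "iphone"], clusters))
```

Because the document is closer to the phone cluster, the ambiguous word "apple" is not selected as a fruit-related advertisement keyword; only keywords tied to the matched cluster are considered.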

A Music Recommendation Method Using Emotional States by Contextual Information

  • Kim, Dong-Joo; Lim, Kwon-Mook
    • Journal of the Korea Society of Computer and Information / v.20 no.10 / pp.69-76 / 2015
  • A user's selection of music is largely influenced by personal taste as well as emotional state, and it is an unconscious projection of the user's emotion. We therefore treat the music a user selects as a representation of the user's emotional state. In this paper, we try to infer users' emotional states from the music they select in a specific context, and we analyze the correlation between that context and the user's emotional state. To derive emotional states from music, the proposed method extracts emotional words, as representatives of the music, from the lyrics of user-selected music through morphological analysis, and learns the weights of a linear classifier for each emotional feature of the extracted words. The regularities learned by the classifier are used to calculate predictive weights for virtual music from the weights of music chosen by other users in contexts similar to the active user's context. Finally, we propose a method to recommend pieces of music relevant to the user's context and emotional state. Experimental results show that the proposed method is more accurate than the traditional collaborative filtering method.

The Effect of Prosodic Position and Word Type on the Production of Korean Plosives

  • Jang, Mi
    • Phonetics and Speech Sciences / v.3 no.4 / pp.71-81 / 2011
  • This paper investigated how prosodic position and word type affect the phonetic structure of Korean coronal stops. Initial segments of prosodic domains were known to be more strongly articulated and longer relative to prosodic domain-medial segments. However, there are few studies examining whether the properties of prosodic domain-initial segments are affected by the information content of words (real vs. nonsense words). In addition, since the scope of domain-initial effect was known to be local to the initial consonant and the effects on the following vowel have been found to be limited, it is thus worth examining whether the prosodic domain-initial effect extends into the vowel after the initial consonant in a systematic way across different prosodic domains. The acoustic properties of Korean coronal stops (lenis /t/, aspirated /$t^h$/, and tense /t'/) were compared across Intonational Phrase, Phonological Phrase and Word-initial positions both in real and nonsense words. The durational intervals such as VOT and CV duration were cumulatively lengthened for /t/ and /$t^h$/ in the higher prosodic domain-initial positions. However, tense stop /t'/ did not show any variation as a function of prosodic position and word type. The domain-initial lenis stop showed significantly longer duration in nonsense words than in real words. But the prosodic domain-initial effect was not found in the properties of F0 and [H1-H2] of the vowel after initial stops. The present study provided evidence that speakers tend to enhance speech clarity when there is less contextual information as in prosodic domain-initial position and in nonsense words.

PC-SAN: Pretraining-Based Contextual Self-Attention Model for Topic Essay Generation

  • Lin, Fuqiang; Ma, Xingkong; Chen, Yaofeng; Zhou, Jiajun; Liu, Bo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.8 / pp.3168-3186 / 2020
  • Automatic topic essay generation (TEG) is a controllable text generation task that aims to generate informative, diverse, and topic-consistent essays based on multiple topics. To produce high-quality essays, a reasonable method should consider both diversity and topic-consistency. Another essential issue is the intrinsic link among the topics, which helps keep the essays close to the semantics of the provided topics. However, it remains challenging for TEG to bridge the semantic gap between source topic words and target output, and a more powerful model is needed to capture the semantics of the given topics. To this end, we propose a pretraining-based contextual self-attention (PC-SAN) model that is built upon the seq2seq framework. For the encoder of our model, we employ a dynamic weighted sum of layers from BERT to fully utilize the semantics of the topics, which helps bridge the gap and improve the quality of the generated essays. In the decoding phase, we also transform the target-side contextual history information into the query layers to alleviate the lack of context in typical self-attention networks (SANs). Experimental results on large-scale paragraph-level Chinese corpora verify that our model is capable of generating diverse, topic-consistent text and improves on strong baselines. Furthermore, extensive analysis validates the effectiveness of contextual embeddings from BERT and contextual history information in SANs.
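A weighted sum over encoder layers can be sketched in plain Python as below. The abstract does not give the exact formulation, so this is an assumption: one scalar logit per layer, normalized with a softmax, then used to mix the per-token vectors. The names `softmax` and `mix_layers` and the tiny inputs are hypothetical; in a real model the logits would be trainable parameters and the layers would come from BERT.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def mix_layers(layer_outputs, layer_logits):
    """Combine per-layer token representations into one representation.

    layer_outputs: list of layers, each a list of token vectors (lists of floats)
                   with identical shapes.
    layer_logits:  one scalar per layer (assumed to be learned during training).
    """
    weights = softmax(layer_logits)
    n_tokens = len(layer_outputs[0])
    dim = len(layer_outputs[0][0])
    mixed = [[0.0] * dim for _ in range(n_tokens)]
    for w, layer in zip(weights, layer_outputs):
        for t in range(n_tokens):
            for d in range(dim):
                mixed[t][d] += w * layer[t][d]
    return mixed


# Two toy "layers", one token, two dimensions; equal logits give equal weights.
layers = [[[1.0, 2.0]], [[3.0, 4.0]]]
print(mix_layers(layers, [0.0, 0.0]))  # [[2.0, 3.0]]
```

Raising one layer's logit shifts the mixture toward that layer, which is what lets training decide how much each layer contributes.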

Proper Noun Embedding Model for the Korean Dependency Parsing

  • Nam, Gyu-Hyeon; Lee, Hyun-Young; Kang, Seung-Shik
    • Journal of Multimedia Information System / v.9 no.2 / pp.93-102 / 2022
  • Dependency parsing is the problem of deciding the syntactic relations between words in a sentence. Recently, deep learning models have been used for dependency parsing based on word representations in a continuous vector space. However, this causes mislabeling of proper nouns that rarely appear in the training corpus, because it is difficult to express out-of-vocabulary (OOV) words in a continuous vector space. To solve the OOV problem in dependency parsing, we explored proper noun embedding methods that differ in the embedding unit. Before representing words in a continuous vector space, we replace the proper nouns with a special token and train them on contextual features by using a multi-layer bidirectional LSTM. Two models, one syllable-based and one morpheme-based, are proposed for proper noun embedding, and an ensemble of the two improves dependency parsing performance more than either the syllable or the morpheme embedding model alone. The experimental results showed that our ensemble model improved UAS by 1.69%p and LAS by 2.17%p over the Malt parser, which uses the same arc-eager approach.
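The preprocessing step of replacing proper nouns with a special token can be sketched as follows. The tag name `NNP`, the mask string `<PN>`, and the function `mask_proper_nouns` are assumptions for illustration; the abstract does not state which tagset or token the authors used.

```python
def mask_proper_nouns(tagged_tokens, pn_tag="NNP", mask="<PN>"):
    """Replace every token tagged as a proper noun with a single special token.

    tagged_tokens: list of (word, pos_tag) pairs from any POS tagger.
    Rare names then share one embedding instead of being out-of-vocabulary.
    """
    return [mask if tag == pn_tag else word for word, tag in tagged_tokens]


# Toy example with hypothetical tags
sentence = [("Seoul", "NNP"), ("is", "VBZ"), ("big", "JJ")]
print(mask_proper_nouns(sentence))  # ['<PN>', 'is', 'big']
```

After masking, the parser sees the same token for every proper noun, so its representation is learned from the surrounding context rather than from the name itself.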

An Analysis of Vocabulary Rating and Types in Elementary Mathematics Textbooks for Grade 1-2 (초등학교 1~2학년 수학 교과서 어휘의 등급 및 유형별 분석)

  • Park, Mimi; Lee, Eunjung
    • Education of Primary School Mathematics / v.25 no.4 / pp.361-375 / 2022
  • In this study, the vocabulary in elementary mathematics textbooks for grades 1-2 was analyzed according to a 9-degree semantic system. The types of vocabulary were also analyzed in terms of general academic words, mathematics-specific concept words, and mathematics general concept words. As a result, 1-degree and 2-degree vocabulary accounted for the largest share in both the grade 1 and grade 2 mathematics textbooks. The analysis also shows that some general academic words were 3-degree vocabulary and that some mathematics-specific concept words were either unregistered or 1-degree vocabulary. In particular, general academic words that are 3-degree vocabulary may be unfamiliar to 1st and 2nd grade students. Therefore, students should be given the opportunity to guess and understand the contextual meaning of general academic words from the contexts given in textbooks. The frequency of mathematics general concept words increased significantly in the grade 2 textbook compared to the grade 1 textbook. Since mathematics general concept words are academic and technical vocabulary, they should be taught explicitly. Based on the results of this study, implications for vocabulary instruction in mathematics textbooks were discussed.

Creative Analysis of Brand Placement in Game Contents

  • Lee, Yong-Jae
    • International Journal of Contents / v.7 no.1 / pp.37-44 / 2011
  • This research attempts to analyze brand placement in games. Brand placement, acclaimed as a new revenue model in the game industry, is emerging as an important means of advertising. For the development of the game industry, interdisciplinary study between games and advertising is indispensable. Therefore, the purpose of this study is to identify creative types of brand placement in games in order to illuminate how advertising works in game contents. The results showed three types of brand placement in games: contextual, prominent, and independent. The contextual type is one where the brand is present within the game contents without being formally expressed: it plays a passive role. The prominent type is one where the brand is present within the game contents and is formally expressed: it plays an active role. The independent type is one where the brand is present within the game contents and is formally expressed but is not related to the program: it plays an additional role. The research showed that, among these three types, the prominent type is becoming the mainstream of brand placement in games. In other words, the prominent type of brand placement is the most effective revenue alternative in the game industry.