• Title/Summary/Keyword: contextual polarity (상황적 긍부정성)


Analyzing Contextual Polarity of Unstructured Data for Measuring Subjective Well-Being (주관적 웰빙 상태 측정을 위한 비정형 데이터의 상황기반 긍부정성 분석 방법)

  • Choi, Sukjae;Song, Yeongeun;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.83-105 / 2016
  • Measuring an individual's subjective wellbeing in an accurate, unobtrusive, and cost-effective manner is a core success factor of a wellbeing support system, which is a type of medical IT service. However, although measurements with self-report questionnaires and wearable sensors are very accurate, they are cost-intensive and obtrusive when the wellbeing support system must run in real time. Recently, inferring the state of subjective wellbeing from unstructured data with conventional sentiment analysis has been proposed as an alternative that resolves the drawbacks of self-report questionnaires and wearable sensors. However, this approach does not consider contextual polarity, which results in lower measurement accuracy. Moreover, there is no sentiment word net or ontology for the subjective wellbeing area. Hence, this paper proposes a method to extract keywords and their contextual polarity representing the subjective wellbeing state from unstructured text on online websites in order to improve the reasoning accuracy of sentiment analysis. The proposed method is as follows. First, a set of general sentiment words is established. SentiWordNet was adopted; this is the most widely used dictionary and contains about 100,000 words, including nouns, verbs, adjectives, and adverbs, with polarities from -1.0 (extremely negative) to 1.0 (extremely positive). Second, corpora on subjective wellbeing (SWB corpora) were obtained by crawling online text. A survey was conducted to prepare a learning dataset that includes an individual's opinion and self-reported wellness levels, such as stress and depression. The participants were asked to respond with their feelings about online news on two topics. Next, three data sources were extracted from the SWB corpora: demographic information, psychographic information, and the structural characteristics of the text (e.g., the number of words used in the text, simple statistics on the special characters used). These were used to adjust the level of a specific SWB. Finally, a set of reasoning rules was generated for each wellbeing factor to estimate the SWB of an individual based on the text written by the individual. The experimental results suggested that using contextual polarity for each SWB factor (e.g., stress, depression) significantly improved the estimation accuracy compared to conventional sentiment analysis methods incorporating SentiWordNet. Even though literature is available on Korean sentiment analysis, such studies used only a limited set of sentiment words. Due to the small number of words, many sentences are overlooked and ignored when estimating the level of sentiment. However, the proposed method can identify multiple sentiment-neutral words as sentiment words in the context of a specific SWB factor. The results also suggest that a specific type of senti-word dictionary containing contextual polarity needs to be constructed along with a dictionary based on common sense such as SenticNet. These efforts will enrich and enlarge the application area of sentic computing. The study is helpful to practitioners and managers of wellness services in that a couple of characteristics of unstructured text have been identified for improving SWB. Consistent with the literature, the results showed that gender and age affect the SWB state when individuals are exposed to an identical cue from the online text. In addition, the length of the textual response and the usage pattern of special characters were found to indicate the individual's SWB. These findings imply that better SWB measurement should involve collecting the textual structure and the individual's demographic conditions. In the future, the proposed method should be improved by automated identification of contextual polarity in order to enlarge the vocabulary in a cost-effective manner.
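
The idea of overlaying factor-specific contextual polarity on a general sentiment lexicon, together with simple structural features of the text, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the lexicon entries and their scores, the averaging rule, and the names `GENERAL_LEXICON`, `CONTEXTUAL_LEXICON`, `polarity_score`, and `structural_features` are all hypothetical.

```python
import re

# Hypothetical general-purpose lexicon: word -> polarity in [-1.0, 1.0].
# The scores are invented for illustration; they are not SentiWordNet values.
GENERAL_LEXICON = {"happy": 0.8, "terrible": -0.9, "good": 0.6}

# Hypothetical contextual lexicon: words that are sentiment-neutral in general
# but carry polarity in the context of a specific SWB factor.
CONTEXTUAL_LEXICON = {
    "stress": {"deadline": -0.6, "overtime": -0.5},
    "depression": {"alone": -0.7},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def polarity_score(text, factor=None):
    """Average the polarity of known words; 0.0 if no word is known.

    When `factor` is given, the contextual entries for that factor are
    layered on top of the general lexicon (an assumed combination rule).
    """
    lexicon = dict(GENERAL_LEXICON)
    if factor is not None:
        lexicon.update(CONTEXTUAL_LEXICON.get(factor, {}))
    scores = [lexicon[t] for t in tokenize(text) if t in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


def structural_features(text):
    """Return simple structural characteristics of the text."""
    special = sum(1 for ch in text if not ch.isalnum() and not ch.isspace())
    return {"word_count": len(tokenize(text)), "special_char_count": special}


sample = "Another deadline and more overtime, but the team is good."
general = polarity_score(sample)             # only "good" is recognized
stress = polarity_score(sample, "stress")    # "deadline" and "overtime" now count
features = structural_features(sample)
```

In this toy example the general lexicon recognizes only one word of the sample sentence, whereas the stress-specific overlay also scores two otherwise neutral words, which pulls the estimate in the negative direction.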

WellnessWordNet: A Word Net for Unconstrained Subjective Well-Being Monitoring Based on Unstructured Data and Contextual Polarity (웰니스워드넷: 비정형데이터와 상황적 긍부정성에 기반하여 주관적 웰빙 상태를 무구속적으로 모니터링하기 위한 워드넷 개발)

  • Song, Yeongeun;Nam, Suhyun;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.1-21 / 2016
  • IT-based subjective well-being (SWB) services, a main part of wellness IT, should measure the SWB state of individuals in an unrestrained, cost-effective manner. The sentiment analysis dictionaries available on the market may be useful for this purpose, but obtaining proper sentiment values using only words from the sentiment lexicon is impossible; therefore, a new dictionary including wellness vocabulary is needed. The existing sentiment dictionaries link only a single sentiment value to a single sentiment word, although sentiment values may vary depending on personal traits. In this study, we develop an extended version of the SenticNet sentiment dictionary dubbed WellnessWordNet. SenticNet is considered the best and most expressive among the existing sentiment dictionaries. Using the information provided by SenticNet, we created a database including the wellness states (estimated values) of stress, depression, and anger to develop the WellnessWordNet system. The accuracy of the system was validated through tests with human subjects. This study is unique and unprecedented in that i) an extended sentiment dictionary, WellnessWordNet, is developed; ii) values for wellness state language are offered; and iii) different sentiment values, namely contextual polarity, for people of the same gender or age group are suggested.
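
One way to picture a dictionary that stores more than a single sentiment value per word is the following minimal Python sketch. It is only an illustration under stated assumptions and does not reproduce the actual WellnessWordNet schema: the `WellnessEntry` class, its fallback order, the group labels, and all numeric values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WellnessEntry:
    """A word with a base polarity plus optional contextual values.

    `contextual` maps (state, gender, age_group) to a value; `None` in the
    gender or age_group position means "any". This layout is an assumption.
    """
    word: str
    base_polarity: float
    contextual: dict = field(default_factory=dict)

    def value(self, state, gender=None, age_group=None):
        """Look up the most specific value available, else the base polarity."""
        candidates = [
            (state, gender, age_group),
            (state, gender, None),
            (state, None, age_group),
            (state, None, None),
        ]
        for key in candidates:
            if key in self.contextual:
                return self.contextual[key]
        return self.base_polarity


# Hypothetical entry: the numbers are invented for illustration only.
exam = WellnessEntry(
    word="exam",
    base_polarity=0.0,
    contextual={
        ("stress", None, None): -0.4,
        ("stress", "female", "20s"): -0.7,
    },
)

specific = exam.value("stress", gender="female", age_group="20s")
fallback = exam.value("stress", gender="male", age_group="40s")
default = exam.value("anger")
```

The lookup falls back from the most specific demographic group to the state-wide value and finally to the base polarity, so a word can be neutral in general while still carrying a value for a particular wellness state or group.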