• Title/Summary/Keyword: Information Usage Pattern

Search Result 275, Processing Time 0.023 seconds

Analyzing Contextual Polarity of Unstructured Data for Measuring Subjective Well-Being (주관적 웰빙 상태 측정을 위한 비정형 데이터의 상황기반 긍부정성 분석 방법)

  • Choi, Sukjae;Song, Yeongeun;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.1
    • /
    • pp.83-105
    • /
    • 2016
  • Measuring an individual's subjective wellbeing in an accurate, unobtrusive, and cost-effective manner is a core success factor of the wellbeing support system, which is a type of medical IT service. However, measurements with a self-report questionnaire and wearable sensors are cost-intensive and obtrusive when the wellbeing support system should be running in real-time, despite being very accurate. Recently, reasoning the state of subjective wellbeing with conventional sentiment analysis and unstructured data has been proposed as an alternative to resolve the drawbacks of the self-report questionnaire and wearable sensors. However, this approach does not consider contextual polarity, which results in lower measurement accuracy. Moreover, there is no sentimental word net or ontology for the subjective wellbeing area. Hence, this paper proposes a method to extract keywords and their contextual polarity representing the subjective wellbeing state from the unstructured text in online websites in order to improve the reasoning accuracy of the sentiment analysis. The proposed method is as follows. First, a set of general sentimental words is proposed. SentiWordNet was adopted; this is the most widely used dictionary and contains about 100,000 words such as nouns, verbs, adjectives, and adverbs with polarities from -1.0 (extremely negative) to 1.0 (extremely positive). Second, corpora on subjective wellbeing (SWB corpora) were obtained by crawling online text. A survey was conducted to prepare a learning dataset that includes an individual's opinion and the level of self-report wellness, such as stress and depression. The participants were asked to respond with their feelings about online news on two topics. Next, three data sources were extracted from the SWB corpora: demographic information, psychographic information, and the structural characteristics of the text (e.g., the number of words used in the text, simple statistics on the special characters used). These were considered to adjust the level of a specific SWB. Finally, a set of reasoning rules was generated for each wellbeing factor to estimate the SWB of an individual based on the text written by the individual. The experimental results suggested that using contextual polarity for each SWB factor (e.g., stress, depression) significantly improved the estimation accuracy compared to conventional sentiment analysis methods incorporating SentiWordNet. Even though literature is available on Korean sentiment analysis, such studies only used only a limited set of sentimental words. Due to the small number of words, many sentences are overlooked and ignored when estimating the level of sentiment. However, the proposed method can identify multiple sentiment-neutral words as sentiment words in the context of a specific SWB factor. The results also suggest that a specific type of senti-word dictionary containing contextual polarity needs to be constructed along with a dictionary based on common sense such as SenticNet. These efforts will enrich and enlarge the application area of sentic computing. The study is helpful to practitioners and managers of wellness services in that a couple of characteristics of unstructured text have been identified for improving SWB. Consistent with the literature, the results showed that the gender and age affect the SWB state when the individual is exposed to an identical queue from the online text. In addition, the length of the textual response and usage pattern of special characters were found to indicate the individual's SWB. These imply that better SWB measurement should involve collecting the textual structure and the individual's demographic conditions. In the future, the proposed method should be improved by automated identification of the contextual polarity in order to enlarge the vocabulary in a cost-effective manner.

Recommending Core and Connecting Keywords of Research Area Using Social Network and Data Mining Techniques (소셜 네트워크와 데이터 마이닝 기법을 활용한 학문 분야 중심 및 융합 키워드 추천 서비스)

  • Cho, In-Dong;Kim, Nam-Gyu
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.1
    • /
    • pp.127-138
    • /
    • 2011
  • The core service of most research portal sites is providing relevant research papers to various researchers that match their research interests. This kind of service may only be effective and easy to use when a user can provide correct and concrete information about a paper such as the title, authors, and keywords. However, unfortunately, most users of this service are not acquainted with concrete bibliographic information. It implies that most users inevitably experience repeated trial and error attempts of keyword-based search. Especially, retrieving a relevant research paper is more difficult when a user is novice in the research domain and does not know appropriate keywords. In this case, a user should perform iterative searches as follows : i) perform an initial search with an arbitrary keyword, ii) acquire related keywords from the retrieved papers, and iii) perform another search again with the acquired keywords. This usage pattern implies that the level of service quality and user satisfaction of a portal site are strongly affected by the level of keyword management and searching mechanism. To overcome this kind of inefficiency, some leading research portal sites adopt the association rule mining-based keyword recommendation service that is similar to the product recommendation of online shopping malls. However, keyword recommendation only based on association analysis has limitation that it can show only a simple and direct relationship between two keywords. In other words, the association analysis itself is unable to present the complex relationships among many keywords in some adjacent research areas. To overcome this limitation, we propose the hybrid approach for establishing association network among keywords used in research papers. The keyword association network can be established by the following phases : i) a set of keywords specified in a certain paper are regarded as co-purchased items, ii) perform association analysis for the keywords and extract frequent patterns of keywords that satisfy predefined thresholds of confidence, support, and lift, and iii) schematize the frequent keyword patterns as a network to show the core keywords of each research area and connecting keywords among two or more research areas. To estimate the practical application of our approach, we performed a simple experiment with 600 keywords. The keywords are extracted from 131 research papers published in five prominent Korean journals in 2009. In the experiment, we used the SAS Enterprise Miner for association analysis and the R software for social network analysis. As the final outcome, we presented a network diagram and a cluster dendrogram for the keyword association network. We summarized the results in Section 4 of this paper. The main contribution of our proposed approach can be found in the following aspects : i) the keyword network can provide an initial roadmap of a research area to researchers who are novice in the domain, ii) a researcher can grasp the distribution of many keywords neighboring to a certain keyword, and iii) researchers can get some idea for converging different research areas by observing connecting keywords in the keyword association network. Further studies should include the following. First, the current version of our approach does not implement a standard meta-dictionary. For practical use, homonyms, synonyms, and multilingual problems should be resolved with a standard meta-dictionary. Additionally, more clear guidelines for clustering research areas and defining core and connecting keywords should be provided. Finally, intensive experiments not only on Korean research papers but also on international papers should be performed in further studies.

SANET-CC : Zone IP Allocation Protocol for Offshore Networks (SANET-CC : 해상 네트워크를 위한 구역 IP 할당 프로토콜)

  • Bae, Kyoung Yul;Cho, Moon Ki
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.4
    • /
    • pp.87-109
    • /
    • 2020
  • Currently, thanks to the major stride made in developing wired and wireless communication technology, a variety of IT services are available on land. This trend is leading to an increasing demand for IT services to vessels on the water as well. And it is expected that the request for various IT services such as two-way digital data transmission, Web, APP, etc. is on the rise to the extent that they are available on land. However, while a high-speed information communication network is easily accessible on land because it is based upon a fixed infrastructure like an AP and a base station, it is not the case on the water. As a result, a radio communication network-based voice communication service is usually used at sea. To solve this problem, an additional frequency for digital data exchange was allocated, and a ship ad-hoc network (SANET) was proposed that can be utilized by using this frequency. Instead of satellite communication that costs a lot in installation and usage, SANET was developed to provide various IT services to ships based on IP in the sea. Connectivity between land base stations and ships is important in the SANET. To have this connection, a ship must be a member of the network with its IP address assigned. This paper proposes a SANET-CC protocol that allows ships to be assigned their own IP address. SANET-CC propagates several non-overlapping IP addresses through the entire network from land base stations to ships in the form of the tree. Ships allocate their own IP addresses through the exchange of simple requests and response messages with land base stations or M-ships that can allocate IP addresses. Therefore, SANET-CC can eliminate the IP collision prevention (Duplicate Address Detection) process and the process of network separation or integration caused by the movement of the ship. Various simulations were performed to verify the applicability of this protocol to SANET. The outcome of such simulations shows us the following. First, using SANET-CC, about 91% of the ships in the network were able to receive IP addresses under any circumstances. It is 6% higher than the existing studies. And it suggests that if variables are adjusted to each port's environment, it may show further improved results. Second, this work shows us that it takes all vessels an average of 10 seconds to receive IP addresses regardless of conditions. It represents a 50% decrease in time compared to the average of 20 seconds in the previous study. Also Besides, taking it into account that when existing studies were on 50 to 200 vessels, this study on 100 to 400 vessels, the efficiency can be much higher. Third, existing studies have not been able to derive optimal values according to variables. This is because it does not have a consistent pattern depending on the variable. This means that optimal variables values cannot be set for each port under diverse environments. This paper, however, shows us that the result values from the variables exhibit a consistent pattern. This is significant in that it can be applied to each port by adjusting the variable values. It was also confirmed that regardless of the number of ships, the IP allocation ratio was the most efficient at about 96 percent if the waiting time after the IP request was 75ms, and that the tree structure could maintain a stable network configuration when the number of IPs was over 30000. Fourth, this study can be used to design a network for supporting intelligent maritime control systems and services offshore, instead of satellite communication. And if LTE-M is set up, it is possible to use it for various intelligent services.

The Analysis on the Relationship between Firms' Exposures to SNS and Stock Prices in Korea (기업의 SNS 노출과 주식 수익률간의 관계 분석)

  • Kim, Taehwan;Jung, Woo-Jin;Lee, Sang-Yong Tom
    • Asia pacific journal of information systems
    • /
    • v.24 no.2
    • /
    • pp.233-253
    • /
    • 2014
  • Can the stock market really be predicted? Stock market prediction has attracted much attention from many fields including business, economics, statistics, and mathematics. Early research on stock market prediction was based on random walk theory (RWT) and the efficient market hypothesis (EMH). According to the EMH, stock market are largely driven by new information rather than present and past prices. Since it is unpredictable, stock market will follow a random walk. Even though these theories, Schumaker [2010] asserted that people keep trying to predict the stock market by using artificial intelligence, statistical estimates, and mathematical models. Mathematical approaches include Percolation Methods, Log-Periodic Oscillations and Wavelet Transforms to model future prices. Examples of artificial intelligence approaches that deals with optimization and machine learning are Genetic Algorithms, Support Vector Machines (SVM) and Neural Networks. Statistical approaches typically predicts the future by using past stock market data. Recently, financial engineers have started to predict the stock prices movement pattern by using the SNS data. SNS is the place where peoples opinions and ideas are freely flow and affect others' beliefs on certain things. Through word-of-mouth in SNS, people share product usage experiences, subjective feelings, and commonly accompanying sentiment or mood with others. An increasing number of empirical analyses of sentiment and mood are based on textual collections of public user generated data on the web. The Opinion mining is one domain of the data mining fields extracting public opinions exposed in SNS by utilizing data mining. There have been many studies on the issues of opinion mining from Web sources such as product reviews, forum posts and blogs. In relation to this literatures, we are trying to understand the effects of SNS exposures of firms on stock prices in Korea. Similarly to Bollen et al. [2011], we empirically analyze the impact of SNS exposures on stock return rates. We use Social Metrics by Daum Soft, an SNS big data analysis company in Korea. Social Metrics provides trends and public opinions in Twitter and blogs by using natural language process and analysis tools. It collects the sentences circulated in the Twitter in real time, and breaks down these sentences into the word units and then extracts keywords. In this study, we classify firms' exposures in SNS into two groups: positive and negative. To test the correlation and causation relationship between SNS exposures and stock price returns, we first collect 252 firms' stock prices and KRX100 index in the Korea Stock Exchange (KRX) from May 25, 2012 to September 1, 2012. We also gather the public attitudes (positive, negative) about these firms from Social Metrics over the same period of time. We conduct regression analysis between stock prices and the number of SNS exposures. Having checked the correlation between the two variables, we perform Granger causality test to see the causation direction between the two variables. The research result is that the number of total SNS exposures is positively related with stock market returns. The number of positive mentions of has also positive relationship with stock market returns. Contrarily, the number of negative mentions has negative relationship with stock market returns, but this relationship is statistically not significant. This means that the impact of positive mentions is statistically bigger than the impact of negative mentions. We also investigate whether the impacts are moderated by industry type and firm's size. We find that the SNS exposures impacts are bigger for IT firms than for non-IT firms, and bigger for small sized firms than for large sized firms. The results of Granger causality test shows change of stock price return is caused by SNS exposures, while the causation of the other way round is not significant. Therefore the correlation relationship between SNS exposures and stock prices has uni-direction causality. The more a firm is exposed in SNS, the more is the stock price likely to increase, while stock price changes may not cause more SNS mentions.

A Study on Requirement and Degree of the Satisfaction about Cosmeceuticals of Women (우리나라 여성들의 기능성화장품에 대한 요구 및 만족도 연구)

  • Kim Kang-Mi;Kim Ju-Duck
    • Journal of the Society of Cosmetic Scientists of Korea
    • /
    • v.30 no.4 s.48
    • /
    • pp.571-582
    • /
    • 2004
  • Recently the well-being, which is regarded as the new cultural code, has brought a new change in the cosmetic industry. The application of the functional products is getting mere and lots of functional cosmetics are now diversifying from the skin-care into the make-up as well as the herbal products. So the future in the market of functional cosmetic products is prospected to be positive. Therefore, cosmetic companies need an approach bases on the concept of the well-being. So to speak, they need to understand the needs of customers accurately from the customers point of view. Also it is a crucial issue that how the unique characteristics of functional cosmetic products as well as the development of products base on the concept of well-being make in balance. In this study, we attempt to inspect the advanced domestic market of the functional products due to the well-being trend and try to propose an option of making an advance it through the customers survey (for example, their need and their satisfaction on the functional products, etc) on the functional cosmetic products. For this purpose, it has been surveyed on adult female customers aged 19 to 60 located in Seoul and Gyeonggi province. 379 questionnaires among 510 were used in the final analysis. Collected data was analyzed using the statistical package for the social science (SPSS) program that can give the information about the general characteristics of the subjects like the frequency and percentage. And we used Cronbach's u reliability test, $x^2\;(chi-square)$ frequency analysis, t-test, and one-wat ANOVA to investigate the customers need, their degree of the satisfaction on the functional products of their own, factors of their perception on the quality on them. We think that the results of our study can act not only as the fundamental data on the customers need, their usage pattern, and their degree of the satisfaction, but also as the important tips of planning the marketing strategies.