• Title/Summary/Keyword: social media big data

Community residents' knowledge level and related factor on electronic wave (전자파에 대한 지역사회 주민의 지식수준과 관련요인)

  • 이규수;남철현;김성우;김귀희
    • Korean Journal of Health Education and Promotion / v.19 no.3 / pp.73-85 / 2002
  • This study examined community residents' knowledge of electromagnetic waves and related factors in order to provide basic data for developing education and publicity programs. A total of 2,000 people living in five big cities and five small and medium cities were selected as subjects. The data were collected from May 1, 2001 to August 31, 2001. The results are as follows. For average knowledge of the harmful effects of electromagnetic waves on health by general characteristics, females scored higher (37.40 ± 5.24 points) than males; those in their forties scored highest (37.77 ± 5.69 points); those who were married with a spouse scored high (36.84 ± 5.59 points); and those living in small-to-medium cities scored high (36.84 ± 5.32 points). By education level, university graduates scored highest (37.41 ± 5.32 points); by economic status, the middle class scored high (36.61 ± 4.96 points); and by occupation, professional technicians scored higher (36.68 ± 6.55 points) than other occupations. By self-rated health condition, those in good health scored highest (36.77 ± 4.99 points). Among respondents grouped by visits to medical institutions in the past year, those who had never visited scored highest (37.19 ± 5.02 points). By type of medical institution, those who visited a general hospital scored highest (36.58 ± 5.63 points). By the channel through which knowledge of electromagnetic waves was obtained (education and publicity media), school education scored highest (37.55 ± 5.19 points). For awareness of disease incidence related to electromagnetic waves, allergy and erethism scored highest (57.8 points out of 100), followed in order by leukemia, skin disease or skin cancer, dementia, various cancers, cataract, and brain tumor. 
The variables that significantly influenced knowledge of the harm of electromagnetic waves were the channel of knowledge acquisition, age, economic status, daily TV watching time, sex, daily cellular phone use time, time spent working with a computer, and daily VTR watching time. Community residents need knowledge of the harmful effects of electromagnetic waves on health because opportunities for exposure are increasing. In particular, information on each type of equipment that generates electromagnetic waves is now in demand. The government, product manufacturers, related social organizations, and educational institutions must make efforts to develop education programs that give people correct knowledge and attitudes.

Occupational Therapy in Long-Term Care Insurance For the Elderly Using Text Mining (텍스트 마이닝을 활용한 노인장기요양보험에서의 작업치료: 2007-2018년)

  • Cho, Min Seok;Baek, Soon Hyung;Park, Eom-Ji;Park, Soo Hee
    • Journal of Society of Occupational Therapy for the Aged and Dementia / v.12 no.2 / pp.67-74 / 2018
  • Objective: The purpose of this study is to quantitatively analyze the role of occupational therapy in long-term care insurance for the elderly using text mining, one of the big data analysis techniques. Method: Newspaper articles matching the search "Long-Term Care Insurance for the Elderly + Occupational Therapy for the Elderly" were collected for the period from 2007 to 2018. The articles were gathered from the Naver News database (Naver has a high share of the domestic search engine market) using Textom, a web crawling tool. After collecting the titles and full text of 510 news articles returned by this search, we analyzed article frequency and keywords by year. Result: By year, the number of published articles was highest in 2015 and 2017, with 70 articles (13.7%) each. In the keyword analysis of the top 10 terms, 'dementia' had the highest frequency (344). The related keywords were dementia, treatment, hospital, health, service, rehabilitation, facilities, institution, grade, elderly, professional, salary, industrial complex, and people. Conclusion: This study is meaningful in that text mining was used to confirm more objectively, through the related keywords, the social needs and the role of the occupational therapist in dementia and rehabilitation, based on 11 years of media reporting trends on long-term care insurance for the elderly. Based on these results, future research should expand the research field and period and supplement the methodology with various analysis methods by year.
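
The per-year article counts and keyword frequencies described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' Textom workflow: the `articles` list, the stop-word set, and the whitespace tokenizer are all assumptions (real Korean text would need a morphological analyzer).

```python
from collections import Counter

# Hypothetical stand-in for crawled news data: (year, article text) pairs.
articles = [
    (2015, "dementia care hospital rehabilitation service"),
    (2017, "dementia treatment hospital elderly"),
    (2017, "rehabilitation facilities dementia grade"),
]

STOP_WORDS = {"care"}  # illustrative stop-word list, not from the study


def articles_per_year(docs):
    """Count how many articles were published in each year."""
    return Counter(year for year, _ in docs)


def keyword_frequency(docs, stop_words=STOP_WORDS):
    """Count keyword occurrences across all articles (whitespace tokens)."""
    counts = Counter()
    for _, text in docs:
        counts.update(w for w in text.lower().split() if w not in stop_words)
    return counts


per_year = articles_per_year(articles)
top_terms = keyword_frequency(articles).most_common(3)
print(per_year)
print(top_terms)
```

Dividing each yearly count by the total number of articles gives the share reported per year (for example, 70 of 510 articles is 13.7%).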

The Effect of Health and Environmental Message Framing on Consumer Attitude and WoM: Focused on Vegan Product (건강과 환경 메시지 프레이밍에 따른 소비자 태도와 구전에 미치는 영향: 비건 제품을 중심으로)

  • Park, Seoyoung;Lim, Boram
    • Journal of Service Research and Studies / v.13 no.3 / pp.127-146 / 2023
  • Recently, digital advertising has shifted towards delivering messages through short ads of less than 15 seconds, and on social media, ads need to convey the message within 5 seconds before consumers skip them. Although the length of advertisements has decreased, advancements in artificial intelligence algorithms and big data analysis have made it possible to deliver personalized messages that cater to consumers' interests. In this changing landscape, the importance of delivering tailored messages through short and efficient ads is increasing. In this study, we examined the effects of message framing as part of effective message delivery. Specifically, we examined the differences in the effects of two framings, "health" and "environment," for vegan products. The growing consumer interest in health and the environment has elevated the interest in vegan products, and the vegan market is expanding rapidly. Consumers purchase vegan products not only for personal health benefits but also due to their ethical responsibility towards the environment, which can be considered ethical consumption. Previous research has not shown the differences in the effects between health and environment message framings, and the research has been limited to vegan food products. This study investigates the differences in the effects of health and environment message framings using a dish soap product category. By identifying which advertising messages, either health or environment, are more effective in promoting vegan products, this study provides insights for companies to enhance their message framing strategies effectively.

Investigating Dynamic Mutation Process of Issues Using Unstructured Text Analysis (비정형 텍스트 분석을 활용한 이슈의 동적 변이과정 고찰)

  • Lim, Myungsu;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.1-18 / 2016
  • Owing to the extensive use of Web media and the development of the IT industry, a large amount of data has been generated, shared, and stored. Nowadays, various types of unstructured data such as image, sound, video, and text are distributed through Web media. Therefore, many attempts have been made in recent years to discover new value through an analysis of these unstructured data. Among these types of unstructured data, text is recognized as the most representative method for users to express and share their opinions on the Web. In this sense, demand for obtaining new insights through text analysis is steadily increasing. Accordingly, text mining is increasingly being used for different purposes in various fields. In particular, issue tracking is being widely studied not only in the academic world but also in industry because it can be used to extract various issues from text such as news and SNS (social network services) posts and to analyze the trends of these issues. Conventionally, issue tracking is used to identify major issues sustained over a long period of time through topic modeling and to analyze the detailed distribution of documents involved in each issue. However, because conventional issue tracking assumes that the content composing each issue does not change throughout the entire tracking period, it cannot represent the dynamic mutation process of detailed issues that can be created, merged, divided, and deleted between these periods. Moreover, because only keywords that appear consistently throughout the entire period can be derived as issue keywords, concrete issue keywords such as "nuclear test" and "separated families" may be concealed by more general issue keywords such as "North Korea" in an analysis over a long period of time. This implies that many meaningful but short-lived issues cannot be discovered by conventional issue tracking. 
Note that detailed keywords are preferable to general keywords because the former can be clues for providing actionable strategies. To overcome these limitations, we performed an independent analysis on the documents of each detailed period. We generated an issue flow diagram based on the similarity of each issue between two consecutive periods. The issue transition pattern among categories was analyzed by using the category information of each document. In this study, we then applied the proposed methodology to a real case of 53,739 news articles. We derived an issue flow diagram from the articles. We then proposed the following useful application scenarios for the issue flow diagram presented in the experiment section. First, we can identify an issue that actively appears during a certain period and promptly disappears in the next period. Second, the preceding and following issues of a particular issue can be easily discovered from the issue flow diagram. This implies that our methodology can be used to discover the association between inter-period issues. Finally, an interesting pattern of one-way and two-way transitions was discovered by analyzing the transition patterns of issues through category analysis. Thus, we discovered that a pair of mutually similar categories induces two-way transitions. In contrast, one-way transitions can be recognized as an indicator that issues in a certain category tend to be influenced by other issues in another category. For practical application of the proposed methodology, high-quality word and stop word dictionaries need to be constructed. In addition, not only the number of documents but also additional meta-information such as the read counts, written time, and comments of documents should be analyzed. A rigorous performance evaluation or validation of the proposed methodology should be performed in future works.
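
The idea of linking issues across two consecutive periods by their similarity can be sketched as below. This is only an illustration under stated assumptions: the abstract does not name the similarity measure, so cosine similarity over keyword weights is assumed, and the issue names, weights, and `threshold` value are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse keyword-weight dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def issue_flow(prev_issues, next_issues, threshold=0.3):
    """Link issues in consecutive periods whose similarity meets the threshold."""
    edges = []
    for p_name, p_vec in prev_issues.items():
        for n_name, n_vec in next_issues.items():
            sim = cosine(p_vec, n_vec)
            if sim >= threshold:
                edges.append((p_name, n_name, round(sim, 3)))
    return edges


# Hypothetical issues (topic keyword weights) extracted per period.
period_1 = {
    "A": {"north": 0.5, "korea": 0.5, "nuclear": 0.8, "test": 0.7},
    "B": {"election": 0.9, "vote": 0.6},
}
period_2 = {
    "C": {"north": 0.4, "korea": 0.4, "separated": 0.8, "families": 0.8},
    "D": {"nuclear": 0.9, "test": 0.6, "sanction": 0.5},
}

edges = issue_flow(period_1, period_2)
print(edges)
```

In this toy data, issue A flows into issue D, while issue B has no successor, which corresponds to an issue that appears in one period and disappears in the next.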

A Study on the Landscape Cognition of Wind Power Plant in Social Media (소셜미디어에 나타난 풍력발전시설의 경관 인식 연구)

  • Woo, Kyung-Sook;Suh, Joo-Hwan
    • Journal of the Korean Institute of Landscape Architecture / v.50 no.5 / pp.69-79 / 2022
  • This study aims to assess how the landscape of wind power facilities is currently perceived, given that these renewable energy sources also provide sightseeing, tourism, and other opportunities. To this end, social media data on the landscape of wind power facilities as experienced by visitors in different regions was analyzed. The results showed that the common characteristics of the landscape of wind power facilities are determined by the scale of the facilities, the distance between the facilities and the overlook points, the visual openness of the facilities from the overlook points, and the terrain where the facilities are located. In addition, the preference for wind power facilities is higher in places where the shape of the facilities and the surrounding landscape can be clearly seen; flat ground and the sea are considered better landscapes. Negative keywords about the landscape appear for Gade Mountain in Taibai, Meifeng Mountain in Taibai, Taiqi Mountain, and Gyeongju Wind Power Generation Facilities on Gyeongshang Road in Gangwon. Negative keywords occur when the facilities are viewed at close range. Because of the high viewing angle, viewers can feel overwhelmed by seeing the size of the facility and the ridge simultaneously, and feel psychological pressure. In contrast, positive landscape adjectives are associated with wind power facilities on flat ground or at sea. Visitors think that the visual volume of the landscape is fully ensured on flat ground or at sea, and that the facility is a symbolic element that can represent the site. This study analyzes landscape awareness based on the opinions of visitors who have experienced wind power facilities. However, wind power facilities are built in different areas, so landscape characteristics differ, and there are many variables, such as viewpoints and observers. The research results are therefore difficult to generalize and have limitations. 
In recent years, landscape damage due to the construction of wind power facilities has become a hot issue, and the domestic methods of landscape evaluation of wind power facilities are unsatisfactory. Therefore, when evaluating the landscape of wind power facilities, the scale of wind power facilities, the inherent natural characteristics of the area where wind power facilities are set up, and the distance between wind power facilities and overlook points are important elements to consider. In addition, wind power facilities are set in the natural environment, which needs to be protected. Therefore, from the landscape perspective, it is necessary to study the landscape of wind power facilities and the surrounding environment.

Clickstream Big Data Mining for Demographics based Digital Marketing (인구통계특성 기반 디지털 마케팅을 위한 클릭스트림 빅데이터 마이닝)

  • Park, Jiae;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.143-163 / 2016
  • The demographics of Internet users are the most basic and important sources for target marketing or personalized advertisements on digital marketing channels, which include email, mobile, and social media. However, it has gradually become difficult to collect the demographics of Internet users because their activities are anonymous in many cases. Although a marketing department can obtain demographics through online or offline surveys, these approaches are expensive, slow, and likely to include false statements. Clickstream data is the record an Internet user leaves behind while visiting websites. As the user clicks anywhere on a webpage, the activity is logged in semi-structured website log files. Such data allows us to see what pages users visited, how long they stayed there, how often they visited, when they usually visited, which sites they prefer, what keywords they used to find the site, whether they purchased anything, and so forth. For this reason, some researchers have tried to infer the demographics of Internet users from their clickstream data. They derived various independent variables likely to be correlated with the demographics. The variables include search keywords; frequency and intensity by time, day, and month; variety of websites visited; text information of web pages visited; and so on. The demographic attributes to predict also vary from paper to paper and cover gender, age, job, location, income, education, marital status, and presence of children. A variety of data mining methods, such as LSA, SVM, decision tree, neural network, logistic regression, and k-nearest neighbors, have been used to build prediction models. However, this research has not yet identified which data mining method is appropriate for predicting each demographic variable. Moreover, the independent variables studied so far need to be reviewed, combined as needed, and evaluated in order to build the best prediction model. 
The objective of this study is to choose the clickstream attributes most likely to be correlated with the demographics from the results of previous research, and then to identify which data mining method is suited to predicting each demographic attribute. Among the demographic attributes, this paper focuses on predicting gender, age, marital status, residence, and job. From the results of previous research, 64 clickstream attributes are applied to predict the demographic attributes. The overall process of predictive model building is composed of 4 steps. In the first step, we create user profiles which include 64 clickstream attributes and 5 demographic attributes. The second step performs dimension reduction of the clickstream variables to address the curse of dimensionality and the overfitting problem. We utilize three approaches, based on decision tree, PCA, and cluster analysis. In the third step, we build alternative predictive models for each demographic variable. SVM, neural network, and logistic regression are used for modeling. The last step evaluates the alternative models in terms of model accuracy and selects the best model. For the experiments, we used clickstream data representing 5 demographics and 16,962,705 online activities for 5,000 Internet users. IBM SPSS Modeler 17.0 was used for our prediction process, and 5-fold cross validation was conducted to enhance the reliability of our experiments. The experimental results verify that there is a specific data mining method well suited to each demographic variable. For example, age prediction performs best with decision tree based dimension reduction and a neural network, whereas the prediction of gender and marital status is most accurate when SVM is applied without dimension reduction. 
We conclude that the online behaviors of Internet users, captured through clickstream data analysis, can be used to predict their demographics and can thereby be utilized for digital marketing.

Availability of Mobile Art in Smartphone Environment of Augmented Reality Content Industrial Technology (증강현실 콘텐츠 산업기술의 스마트폰 환경 모바일 아트 활용 가능성)

  • Kim, Hee-Young;Shin, Chang-Ok
    • The Journal of the Korea Contents Association / v.13 no.5 / pp.48-57 / 2013
  • Smartphones provide users with an environment for communicating and sharing information and, at the same time, play an important role in the development of mobile technology and mobile art. Smartphone-related research is accelerating, especially with the advent of the mobile Augmented Reality (AR) age, but studies on user participation, which is essential for the AR content industry, have been insufficient. In that regard, assistance from the mobile art field, which has already developed these characteristics, is essential. Thus, this article classifies mobile art, which has not been studied much domestically, into feature phone usage and smartphone usage and analyzes example cases of the three most used methods for each. The usage of feature phones, which use the sound and images of mobile devices, can be divided into three methods: installation and performance, single channel video art, and five senses communication. On the other hand, the usage of smartphones, which use sensors, cameras, GPS, and AR, can be divided into location-based AR, marker-based AR, and markerless AR. In addition, an examination of mobile AR content technology by industry shows that combined methods are utilized: tourism and game-related industries use location-based AR, education and medicine-related industries use marker-based AR, and shopping-related industries use markerless AR. The development of the AR content industry is expected to accelerate with mobile art that makes use of combined technologies and constant communication through the active participation of users. The mobile AR industry is predicted to develop toward miniaturized HMDs, the integration of hologram technology and artificial intelligence, and full use of big data and social networks, so that the technological limitations of AR can be overcome.

A Comparison Study of RNN, CNN, and GAN Models in Sequential Recommendation (순차적 추천에서의 RNN, CNN 및 GAN 모델 비교 연구)

  • Yoon, Ji Hyung;Chung, Jaewon;Jang, Beakcheol
    • Journal of Internet Computing and Services / v.23 no.4 / pp.21-33 / 2022
  • Recently, recommender systems have been widely used in various fields such as movies, music, online shopping, and social media. Meanwhile, recommender models have developed from correlation analysis through the Apriori model, which can be regarded as the first-generation model in the recommender system field. Since 2005, many models have been proposed, including deep learning-based models, which are receiving a lot of attention among recommender models. Recommender models can be classified into collaborative filtering methods, content-based methods, and hybrid methods that combine the two. However, these basic methods are gradually losing their status as methodologies in the field because they fail to adapt to internal and external changes such as rapidly changing user-item interactions and the development of big data. On the other hand, the importance of deep learning methodologies in recommender systems is increasing because of their advantages, such as nonlinear transformation, representation learning, sequence modeling, and flexibility. In this paper, among deep learning methodologies, RNN, CNN, and GAN-based models suitable for sequential modeling, which can accurately and flexibly analyze user-item interactions, are classified, compared, and analyzed.

A Folksonomy Ranking Framework: A Semantic Graph-based Approach (폭소노미 사이트를 위한 랭킹 프레임워크 설계: 시맨틱 그래프기반 접근)

  • Park, Hyun-Jung;Rho, Sang-Kyu
    • Asia pacific journal of information systems / v.21 no.2 / pp.89-116 / 2011
  • In collaborative tagging systems such as Delicious.com and Flickr.com, users assign keywords or tags to their uploaded resources, such as bookmarks and pictures, for their future use or sharing purposes. The collection of resources and tags generated by a user is called a personomy, and the collection of all personomies constitutes the folksonomy. The most significant need of folksonomy users is to efficiently find useful resources or experts on specific topics. An excellent ranking algorithm would assign higher rankings to more useful resources or experts. What resources are considered useful in a folksonomic system? Does a standard superior to frequency or freshness exist? A resource recommended by more users with more expertise should be worthy of attention. This ranking paradigm can be implemented through a graph-based ranking algorithm. Two well-known representatives of such a paradigm are PageRank by Google and HITS (Hypertext Induced Topic Selection) by Kleinberg. Both PageRank and HITS assign a higher evaluation score to pages linked to more higher-scored pages. HITS differs from PageRank in that it utilizes two kinds of scores: authority and hub scores. The ranking objects of these algorithms are limited to Web pages, whereas the ranking objects of a folksonomic system are somewhat heterogeneous (i.e., users, resources, and tags). Therefore, uniform application of the voting notion of PageRank and HITS based on the links to a folksonomy would be unreasonable. In a folksonomic system, each link corresponding to a property can have an opposite direction, depending on whether the property is expressed in the active or the passive voice. The current research stems from the idea that a graph-based ranking algorithm could be applied to the folksonomic system using the concept of mutual interactions between entities, rather than the voting notion of PageRank or HITS. 
The concept of mutual interactions, proposed for ranking Semantic Web resources, enables the calculation of importance scores of various resources unaffected by link directions. The weights of a property representing the mutual interaction between classes are assigned depending on the relative significance of the property to the resource importance of each class. This class-oriented approach is based on the fact that, in the Semantic Web, there are many heterogeneous classes; thus, applying a different appraisal standard for each class is more reasonable. This is similar to the evaluation method of humans, where different items are assigned specific weights, which are then summed up to determine the weighted average. We can check for missing properties more easily with this approach than with other predicate-oriented approaches. A user of a tagging system usually assigns more than one tag to the same resource, and there can be more than one tag with the same subjectivity and objectivity. When many users assign similar tags to the same resource, grading the users differently depending on the assignment order becomes necessary. This idea comes from studies in psychology in which expertise involves the ability to select the most relevant information for achieving a goal. An expert should be someone who not only has a large collection of documents annotated with a particular tag, but also tends to add documents of high quality to his/her collections. Such documents are identified by the number, as well as the expertise, of users who have the same documents in their collections. In other words, there is a relationship of mutual reinforcement between the expertise of a user and the quality of a document. In addition, there is a need to rank entities related more closely to a certain entity. Considering that the popularity of a topic on social media is temporary, recent data should have more weight than old data. 
We propose a comprehensive folksonomy ranking framework in which all these considerations are dealt with and that can be easily customized to each folksonomy site for ranking purposes. To examine the validity of our ranking algorithm and show the mechanism of adjusting property, time, and expertise weights, we first use a dataset designed for analyzing the effect of each ranking factor independently. We then show the ranking results of a real folksonomy site, with the ranking factors combined. Because the ground truth of a given dataset is not known when it comes to ranking, we inject simulated data whose ranking results can be predicted into the real dataset and compare the ranking results of our algorithm with those of a previous HITS-based algorithm. Our semantic ranking algorithm based on the concept of mutual interaction seems to be preferable to the HITS-based algorithm as a flexible folksonomy ranking framework. Some concrete points of difference are as follows. First, with the time concept applied to the property weights, our algorithm shows superior performance in lowering the scores of older data and raising the scores of newer data. Second, applying the time concept to the expertise weights, as well as to the property weights, our algorithm controls the conflicting influence of expertise weights and enhances overall consistency of time-valued ranking. The expertise weights of the previous study can act as an obstacle to the time-valued ranking because the number of followers increases as time goes on. Third, many new properties and classes can be included in our framework. The previous HITS-based algorithm, based on the voting notion, loses ground in the situation where the domain consists of more than two classes, or where other important properties, such as "sent through twitter" or "registered as a friend," are added to the domain. Fourth, there is a big difference in the calculation time and memory use between the two kinds of algorithms. 
While the multiplication of two matrices has to be executed twice for the previous HITS-based algorithm, this is unnecessary with our algorithm. In our ranking framework, various folksonomy ranking policies can be expressed with the ranking factors combined, and our approach can work even if the folksonomy site is not implemented with Semantic Web languages. Above all, the time weight proposed in this paper will be applicable to various domains, including social media, where time value is considered important.
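
For orientation, the HITS baseline that this abstract compares against can be sketched as a simple power iteration over hub and authority scores. This is plain HITS, not the authors' mutual-interaction framework, and the `time_weight` function is a hypothetical exponential-decay form chosen only to illustrate "recent data should have more weight than old data"; the abstract does not give the actual weighting formula.

```python
import math


def hits(links, iterations=50):
    """Plain HITS. `links` maps each node to the list of nodes it points to."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of hub scores of the nodes pointing at n.
        auth = {n: 0.0 for n in nodes}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # Hub score: sum of authority scores of the nodes n points to.
        hub = {n: sum(auth[t] for t in links.get(n, ())) for n in nodes}
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth


def time_weight(age_days, half_life_days=30.0):
    """Hypothetical recency weight: the weight halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


# Toy graph: users (u*) bookmark resources (r*); names are illustrative.
links = {"u1": ["r1", "r2"], "u2": ["r1"], "u3": ["r1", "r3"]}
hub, auth = hits(links)
print(max(auth, key=auth.get))
print(time_weight(60))
```

Each HITS iteration performs two link traversals (one for authority, one for hub), which corresponds to the two matrix multiplications mentioned in the abstract.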

Korean Word Sense Disambiguation using Dictionary and Corpus (사전과 말뭉치를 이용한 한국어 단어 중의성 해소)

  • Jeong, Hanjo;Park, Byeonghwa
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.1-13 / 2015
  • As opinion mining in big data applications has been highlighted, a lot of research on unstructured data has been conducted. Social media on the Internet generate unstructured or semi-structured data every second, often written in the natural languages we use in daily life. Many words in human languages have multiple meanings or senses. As a result, it is very difficult for computers to extract useful information from these datasets. Traditional web search engines are usually based on keyword search, resulting in incorrect search results which are far from users' intentions. Even though a lot of progress has been made over recent years in enhancing the performance of search engines to provide users with appropriate results, there is still much room for improvement. Word sense disambiguation can play a very important role in natural language processing and is considered one of the most difficult problems in this area. Major approaches to word sense disambiguation can be classified as knowledge-based, supervised corpus-based, and unsupervised corpus-based approaches. This paper presents a method which automatically generates a corpus for word sense disambiguation by taking advantage of examples in existing dictionaries, avoiding expensive sense tagging processes. It tests the effectiveness of the method with the Naïve Bayes model, a supervised learning algorithm, using the Korean standard unabridged dictionary and the Sejong Corpus. The Korean standard unabridged dictionary has approximately 57,000 sentences. The Sejong Corpus has about 790,000 sentences tagged with both part-of-speech and senses. For the experiment of this study, the Korean standard unabridged dictionary and the Sejong Corpus were evaluated both in combination and as separate entities using cross validation. Only nouns, the target subjects in word sense disambiguation, were selected. 
93,522 word senses among 265,655 nouns and 56,914 sentences from related proverbs and examples were additionally combined in the corpus. The Sejong Corpus was easily merged with the Korean standard unabridged dictionary because the Sejong Corpus was tagged based on sense indices defined by the dictionary. Sense vectors were formed after the merged corpus was created. Terms used in creating sense vectors were added to the named entity dictionary of a Korean morphological analyzer. Using the extended named entity dictionary, term vectors were extracted from the input sentences. Given the extracted term vector and the sense vector model made during the pre-processing stage, the sense-tagged terms were determined by vector space model based word sense disambiguation. In addition, this study shows the effectiveness of the corpus merged from examples in the Korean standard unabridged dictionary and the Sejong Corpus. The experiment shows that better precision and recall are obtained with the merged corpus. This study suggests that the method can practically enhance the performance of internet search engines and help us to understand the meaning of a sentence more accurately in natural language processing pertinent to search engines, opinion mining, and text mining. The Naïve Bayes classifier used in this study is a supervised learning algorithm based on Bayes' theorem. The Naïve Bayes classifier assumes that all senses are independent. Even though this assumption is not realistic and ignores the correlation between attributes, the Naïve Bayes classifier is widely used because of its simplicity, and in practice it is known to be very effective in many applications such as text classification and medical diagnosis. However, further research needs to be carried out to consider all possible combinations and/or partial combinations of all senses in a sentence. 
Also, the effectiveness of word sense disambiguation may be improved if rhetorical structures or morphological dependencies between words are analyzed through syntactic analysis.
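
A Naïve Bayes sense classifier of the kind mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the tiny training set stands in for sense-tagged dictionary examples, the sense labels (`bae_pear`, `bae_ship`) and English context words are hypothetical, and add-one (Laplace) smoothing is an assumption.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """examples: list of (sense, context_words). Returns the model counts."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab


def disambiguate(model, context):
    """Pick the sense maximizing log P(sense) + sum of log P(word | sense)."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best, best_score = None, float("-inf")
    for sense, n in sense_counts.items():
        # Add-one smoothing so unseen words do not zero out the probability.
        denom = sum(word_counts[sense].values()) + len(vocab)
        score = math.log(n / total)
        for w in context:
            score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best, best_score = sense, score
    return best


# Hypothetical sense-tagged examples for one ambiguous noun.
examples = [
    ("bae_pear", ["eat", "sweet", "fruit"]),
    ("bae_pear", ["fruit", "market", "buy"]),
    ("bae_ship", ["sail", "sea", "port"]),
    ("bae_ship", ["port", "cargo", "sea"]),
]
model = train(examples)
print(disambiguate(model, ["sweet", "fruit"]))
print(disambiguate(model, ["cargo", "sea"]))
```

The per-word probabilities are multiplied (added in log space) as if context words were independent given the sense, which is the simplifying assumption the abstract describes as unrealistic but effective in practice.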