• Title/Abstract/Keywords: Word Cloud Method

60 search results (processing time: 0.025 seconds)

An expanded Matrix Factorization model for real-time Web service QoS prediction

  • Hao, Jinsheng;Su, Guoping;Han, Xiaofeng;Nie, Wei
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 15, No. 11
    • /
    • pp.3913-3934
    • /
    • 2021
  • Real-time prediction of Web quality of service (QoS) makes web services in cloud environments more convenient, but it faces severe challenges, especially in the cold-start situation. The existing literature on real-time QoS prediction ignores the fact that the QoS of a user/service is related to the QoS of other users/services. For example, users/services belonging to the same group or category will have similar QoS values. Existing methods ignore this group relationship because of the complexity it adds to the model. Based on this, we propose a real-time Matrix Factorization based Clustering model (MFC), which uses category information as a new regularization term of the loss function. Specifically, to meet the real-time requirement of the prediction model and to minimize model complexity, we first map the QoS values of a large number of users/services to a lower-dimensional space with the PCA method, then use the K-means algorithm to calculate user/service category information, and average the results to obtain a stable final clustering. Extensive experiments on real-world datasets demonstrate that MFC outperforms other state-of-the-art prediction algorithms.
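
The abstract above says only that category information enters the loss function as an extra regularization term; it does not give the formula. As a minimal sketch of what such a loss could look like, the function below adds a penalty that pulls each user's latent vector toward the centroid of its cluster. The penalty form, the names `lam` and `beta`, and the assumption that cluster labels are already available (the PCA and K-means steps are not shown) are all illustrative assumptions, not the authors' definition.

```python
def clustered_mf_loss(observed, U, S, user_cluster, lam=0.1, beta=0.1):
    """Squared error on observed QoS entries plus L2 and cluster penalties.

    observed     : dict mapping (user_index, service_index) -> QoS value
    U, S         : lists of latent vectors for users and services
    user_cluster : cluster label for each user (assumed precomputed)
    lam, beta    : hypothetical regularization weights
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def sq_norm(a):
        return sum(x * x for x in a)

    # Reconstruction error on the observed entries only.
    err = sum((q - dot(U[u], S[s])) ** 2 for (u, s), q in observed.items())

    # Ordinary L2 regularization on all latent vectors.
    l2 = lam * (sum(sq_norm(v) for v in U) + sum(sq_norm(v) for v in S))

    # Centroid of the latent vectors in each user cluster.
    groups = {}
    for u, c in enumerate(user_cluster):
        groups.setdefault(c, []).append(U[u])
    centroids = {
        c: [sum(col) / len(vecs) for col in zip(*vecs)]
        for c, vecs in groups.items()
    }

    # Assumed category term: distance of each user from its cluster centroid.
    group_penalty = beta * sum(
        sq_norm([a - b for a, b in zip(U[u], centroids[c])])
        for u, c in enumerate(user_cluster)
    )
    return err + l2 + group_penalty
```

A symmetric term for service clusters could be added in the same way; it is left out here to keep the sketch short.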

Systematic Literature Review on Zakat Distribution Studies as Islamic Social Fund

  • Azhar ALAM;Ririn Tri RATNASARI;Ari PRASETYO;Muhamad Nafik Hadi RYANDONO;Umniyati SHOLIHAH
    • 유통과학연구
    • /
    • Vol. 22, No. 2
    • /
    • pp.21-30
    • /
    • 2024
  • Purpose: This study explores the development of zakat distribution studies and the integration of existing studies. It is expected to complement systematic literature reviews in the field of zakat distribution and to inspire further research directions. Research design, data, and methodology: This research uses a systematic literature review assisted by the NVivo application and the PRISMA system, narrowing 427 articles down to 53 articles that are analyzed by publication and by the theme of their findings. This study describes publications, authors, themes, cited articles, and research themes. Results: This study shows the dominance of Malaysian writers and significant developments in 2020. In addition, the study identifies the most popular articles by citation count and through word cloud analysis. The primary topics of zakat distribution publications are management strategy, development, the zakat institution, and the recipient. Conclusions: The study advises that future research could focus on the asnaf characteristics of zakat distribution. Next, a study on administration expenses and scalability concerns in zakat collection and distribution planning can avoid wasting cash. This topic hinders zakat institutions' distribution services.

텍스트 마이닝으로 OTT 인터랙티브 콘텐츠 다시보기 (Analyzing OTT Interactive Content Using Text Mining Method)

  • 이석창
    • 문화기술의 융합
    • /
    • Vol. 9, No. 5
    • /
    • pp.859-865
    • /
    • 2023
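  • With the OTT market overheating and service providers concentrating on content development, interactive content that encourages active viewer participation is attracting attention. Research on interactive content is accordingly active as well. This study aims to analyze interactive content through text mining, focusing on unstructured online data. Using three techniques, 'word cloud', 'relationship diagram analysis', and 'keyword trend', it derives results and implications on keyword characteristics by weight, the relationship between OTT and interactive content, and changes in interactive content trends, all grounded in objective data.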

HF-IFF: TF-IDF를 응용한 병증-본초 연관성(relevancy) 측정과 본초 특성의 시각화 -청강의감 방제를 대상으로- (HF-IFF: Applying TF-IDF to Measure Symptom-Medicinal Herb Relevancy and Visualize Medicinal Herb Characteristics - Studying Formulations in Cheongkangeuigam -)

  • 오준호
    • 대한본초학회지
    • /
    • Vol. 30, No. 3
    • /
    • pp.63-68
    • /
    • 2015
  • Objectives: We applied the term weighting method used in the field of data search to quantify relevancy between symptoms and medicinal herbs, and, based on this, we aim to introduce a method of visualizing the characteristics of medicinal herbs. Methods: We proposed HF-IFF, an adaptation of TF-IDF, which is a term weighting method used in the field of data search. Using this method, we deduced relevancy between symptoms and medicinal herbs in Cheongkangeuigam, which was published in 1984 to organize the medical theory of Cheongkang, Kim Younghoon, and visualized this as a graph in order to compare the characteristics of medicinal herbs used for different symptoms. Results: HF-IFF is the product of HF and IFF, where HF is the frequency of the relevant medicinal herb for a set of symptoms, and IFF is the inverse of the number of formulations (FF) containing that herb. A total of 251 types of medicinal herb are used in Cheongkangeuigam, and 1538 formulations are classified according to 67 types of symptom. The overall mean for HF-IFF was 0.491, with a maximum of 4.566 and a minimum of 0.013. Conclusions: In spite of several limitations, we were able to use HF-IFF to measure relevancy between symptoms and medicinal herbs, with formulations as an intermediate. We were able to use the quantified results to visually express the characteristics of the herbs used for symptoms with a bubble chart and a word cloud built from HF-IFF.
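
The HF-IFF definition in the abstract above (HF times IFF, by analogy with TF-IDF) can be sketched in a few lines. The abstract does not state the exact formula, so two points here are assumptions: HF is taken as the raw count of formulations for a symptom that contain the herb, and IFF is taken as the log-scaled inverse `log(N / FF)` that is conventional for TF-IDF. The symptom and herb names in the usage note are made up for illustration.

```python
import math
from collections import Counter

def hf_iff(formulations):
    """Score each (symptom, herb) pair from a list of formulations.

    formulations : list of (symptom, iterable_of_herbs) pairs
    Returns a dict mapping (symptom, herb) -> HF * IFF, where
      HF  = number of formulations for that symptom containing the herb
      IFF = log(N / FF), with FF the number of formulations (out of N)
            containing the herb  (assumed form, borrowed from TF-IDF)
    """
    n = len(formulations)
    ff = Counter()  # herb -> formulations containing it
    hf = Counter()  # (symptom, herb) -> formulations for symptom with herb
    for symptom, herbs in formulations:
        for herb in set(herbs):
            ff[herb] += 1
            hf[(symptom, herb)] += 1
    return {
        (symptom, herb): count * math.log(n / ff[herb])
        for (symptom, herb), count in hf.items()
    }
```

For example, `hf_iff([("cough", {"ginger", "licorice"}), ("cough", {"ginger"}), ("fever", {"licorice", "mint"})])` scores a herb that appears in only one formulation higher per occurrence than one that appears in most of them, which is the property that makes the score usable as a word cloud weight.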

Strategies for the Development of Watermelon Industry Using Unstructured Big Data Analysis

  • LEE, Seung-In;SON, Chansoo;SHIM, Joonyong;LEE, Hyerim;LEE, Hye-Jin;CHO, Yongbeen
    • 산경연구논집
    • /
    • Vol. 12, No. 1
    • /
    • pp.47-62
    • /
    • 2021
  • Purpose: This study examines strategies for developing the watermelon industry using unstructured big data analysis. Specifically, it uses big data and social network analysis to trace changes in issues and consumer perceptions regarding watermelon, and on that basis investigates ways to strengthen the competitiveness of the watermelon industry. Methodology: Data were collected from Naver (blog, news) and Daum (blog, news) with TEXTOM 4.5, and the analysis period was divided into 2015-2016, 2017-2018, and 2019-2020 in order to understand changes in issues and consumer perceptions regarding watermelon and the watermelon industry. TEXTOM 4.5 was used for keyword frequency analysis, word cloud analysis, and extraction of metrics data. UCINET 6.0 and its NetDraw function were used to find the connection structure of words, visualize the network relations, and cluster the words. Results: The top keywords related to watermelon were 'the stalk end of a watermelon', 'E-mart', 'Haman', 'Gochang', and 'Lotte Mart' (news: 2015-2016); 'apple watermelon', 'Haman', 'E-mart', 'Gochang', and 'Mudeungsan watermelon' (news: 2017-2018); 'E-mart', 'apple watermelon', 'household', 'chobok', and 'donation' (news: 2019-2020); 'watermelon salad', 'taste', 'the heat', 'baby', and 'effect' (blog: 2015-2016); 'taste', 'watermelon juice', 'method', 'watermelon salad', and 'baby' (blog: 2017-2018); and 'taste', 'effect', 'watermelon juice', 'method', and 'apple watermelon' (blog: 2019-2020). These are presented together with the results of the frequency and TF-IDF analysis. In the CONCOR analysis, the keywords appeared as four types in each case. Conclusions: Based on the results, the authors discuss strategies and policies for boosting the watermelon industry, the limitations of this study, and future research directions. The results of this study will help prioritize strategies and policies for boosting watermelon consumption and contribute to improving the competitiveness of the watermelon industry in Korea. It is also expected that this study will serve as an important basis for future agricultural big data studies and will offer watermelon producers and policy-makers practical points for crafting tailor-made marketing strategies.
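
The keyword frequency and word cloud steps described in the abstract above can be illustrated with a small sketch. It does not reproduce TEXTOM's processing; it only shows one plausible way to turn keyword counts into word cloud font sizes by linear scaling. The function name and the `top_n`, `min_pt`, and `max_pt` parameters are hypothetical.

```python
from collections import Counter

def word_cloud_sizes(tokens, top_n=5, min_pt=12, max_pt=48):
    """Map the top_n most frequent tokens to font sizes in points.

    The most frequent keyword gets max_pt, the least frequent of the
    selected keywords gets min_pt, and the rest are scaled linearly.
    """
    counts = Counter(tokens).most_common(top_n)
    if not counts:
        return {}
    highest = counts[0][1]
    lowest = counts[-1][1]
    span = (highest - lowest) or 1  # avoid dividing by zero when all counts tie
    return {
        word: min_pt + (count - lowest) * (max_pt - min_pt) / span
        for word, count in counts
    }
```

Calling `word_cloud_sizes(["taste"] * 4 + ["juice"] * 2 + ["salad"], top_n=3)` gives the largest size to `taste` and the smallest to `salad`. Swapping the raw counts for TF-IDF weights would down-weight keywords that appear in nearly every document.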

의미가 다양한 본초 효능 표현에 대한 고찰 - 본초의 殺蟲 효능을 중심으로 - (A Study on Ambiguous Expression for Efficacy of Medicinal Material - Focusing on 'Salchung[殺蟲]' -)

  • 김상현;김상균;남보령;이명구;이승호;장현철
    • 대한본초학회지
    • /
    • Vol. 30, No. 4
    • /
    • pp.45-49
    • /
    • 2015
  • Objectives: This study aims to confirm that a specific expression for the efficacy of a medicinal material can have multiple meanings, and, through a methodology for identifying such multiple meanings, to help clarify ambiguous expressions for the efficacy of medicinal materials. Methods: The premise is that the efficacy and treatment target data are related to each other. Word clouds were used to analyze the efficacy and treatment target data for medicinal materials. Classic and modern documents were then searched and analyzed. Results: Even after searching all related references and comparing the efficacy and treatment target data, some medicinal materials with 'Salchung[殺蟲]' as an efficacy are not expected to treat diseases associated with parasites. The analysis of classic and modern documents showed that 'Salchung[殺蟲]' is not used only to mean anthelmintic efficacy. By contrast, the same analysis showed that medicinal materials with 'Guchung[驅蟲]' as an efficacy are expected to treat diseases associated with parasites, and 'Guchung[驅蟲]' seems to be used almost exclusively to mean anthelmintic efficacy. Conclusions: If an expression for the efficacy of a medicinal material is meant to carry a single meaning, ambiguous wording needs to be made clear. If an expression seems to have multiple meanings, additional information should be supplied to make the wording exact.

탐색적 자료 분석(EDA) 기법을 활용한 국내 11개 대표 온라인 쇼핑몰 BEST 100 비교 (Comparison of Online Shopping Mall BEST 100 using Exploratory Data Analysis)

  • 강지천;강주영
    • 한국빅데이터학회지
    • /
    • Vol. 3, No. 1
    • /
    • pp.1-12
    • /
    • 2018
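  • Since the first online shopping malls appeared, the BEST 100 list has been provided as a core feature of every shopping mall website. Because BEST 100 lets consumers see popular products at a glance, it has a large effect on a mall's sales, yet prior research on online shopping has paid almost no attention to it. This study therefore selected 11 current online shopping malls and analyzed the sales characteristics of each. As the research method, the components of each mall's BEST 100 (sales copy, price, and whether free shipping is offered) were crawled and examined with exploratory data analysis (EDA). The overall average price across the 11 malls was 72,891.41 won, and cheaper products were found to have a lower free-shipping rate. Beyond price, text mining of the sales copy yielded eight categories. The largest category was fashion; the significance lies in deriving categories from marketing copy rather than from product attributes. By applying EDA, this study offers implications for understanding the current online market and suggesting future directions.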

지역화폐 앱 사용자 리뷰 분석을 통한 마케팅 전략 수립 - '동백전'과 '인천e음'을 중심으로 (Establish Marketing Strategy Using Analysis of Local Currency App User Reviews -Focused on 'Dongbackjeon' and 'Incheoneum')

  • 이새미;이태원
    • 한국콘텐츠학회논문지
    • /
    • Vol. 21, No. 4
    • /
    • pp.111-122
    • /
    • 2021
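  • This study analyzed user reviews of the apps for 'Dongbackjeon' and 'Incheoneum', two representative local currencies in Korea, to identify the positive and negative factors for local currency users, and established a marketing strategy on that basis. App user reviews were classified as positive or negative by star rating, and word cloud, topic modeling, and social network analysis were performed on each group. Negative reviews of both apps mainly shared complaints about app usage and card issuance. Positive reviews showed satisfaction with 'cashback', and the appearance of keywords such as 'local economy' and 'small business owners' indicates that users perceive their spending as helping to revitalize the local economy and feel satisfied using local currency for that reason. Based on the satisfaction and dissatisfaction factors identified, the study determined what should be improved and what should be reinforced, and derived an appropriate marketing strategy. The text mining methods and results of this study can provide meaningful, practical information about local currency to the public officials and marketers in charge of it.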
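
The first step described in this abstract, splitting reviews into positive and negative groups by star rating before running word cloud analysis on each, can be sketched as follows. The abstract does not say where the cut-off lies, so the `threshold` of 4 stars is an assumption, as are the function name and the pre-tokenized input format.

```python
from collections import Counter

def keyword_counts_by_rating(reviews, threshold=4):
    """Split tokenized reviews by star rating and count keywords per group.

    reviews   : list of (stars, list_of_tokens) pairs
    threshold : hypothetical cut-off; ratings at or above it count as positive
    Returns {"positive": Counter, "negative": Counter}.
    """
    counts = {"positive": Counter(), "negative": Counter()}
    for stars, tokens in reviews:
        label = "positive" if stars >= threshold else "negative"
        counts[label].update(tokens)
    return counts
```

Each resulting `Counter` can then feed a separate word cloud, so that terms like `cashback` and terms like `error` are not mixed into a single picture.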

빅데이터 분석을 이용한 패션 플랫폼과 패션 스마트 팩토리에 대한 인식 연구 (A Study on the Perception of Fashion Platforms and Fashion Smart Factories using Big Data Analysis)

  • 송은영
    • 한국의류산업학회지
    • /
    • Vol. 23, No. 6
    • /
    • pp.799-809
    • /
    • 2021
  • This study aimed to grasp perceptions of and trends in fashion platforms and fashion smart factories using big data analysis. As a research method, big data analysis, fashion platforms, and smart factories were reviewed through the literature and prior studies, and text mining and network analysis were performed on text collected from the web between April 2019 and April 2021. After data refinement with Textom, the words for fashion platform (10,591) and fashion smart factory (9,750) were used for analysis. Keywords were derived, their frequency of appearance was calculated, and the results were visualized as a word cloud and N-grams. The top 70 words by frequency of appearance were used to generate a matrix, structural equivalence analysis was performed, and the results were displayed using network visualization and dendrograms. The collected data revealed that smart factories were a prominent social issue but that consumer interest and academic research were insufficient, whereas both the amount and the frequency of related words for fashion platforms were high. The structural equivalence analysis showed that fashion platforms, with strong connectivity between clusters, are creating new competitiveness through service platforms that add sharing, manufacturing, and curation functions, and that fashion smart factories can expect their future value to grow together with digital technology innovation and platforms. This study can serve as a foundation for future research on fashion platforms and smart factories.
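
The N-gram and matrix steps mentioned in the abstract above can be sketched with the standard library. The abstract does not say what kind of matrix was built from the top 70 words; the version below assumes a word-by-word co-occurrence matrix counted per document, which is a common input for structural equivalence analysis. Function names and the input format (lists of tokens) are illustrative.

```python
from collections import Counter

def bigram_counts(docs):
    """Count adjacent word pairs (2-grams) across tokenized documents."""
    counts = Counter()
    for tokens in docs:
        counts.update(zip(tokens, tokens[1:]))
    return counts

def cooccurrence_matrix(docs, top_k=70):
    """Build a symmetric co-occurrence matrix over the top_k frequent words.

    Cell [i][j] is the number of documents in which word i and word j
    both appear (assumed definition; the diagonal is left at zero).
    """
    freq = Counter(token for tokens in docs for token in tokens)
    vocab = [word for word, _ in freq.most_common(top_k)]
    index = {word: i for i, word in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for tokens in docs:
        present = sorted({index[t] for t in tokens if t in index})
        for a in present:
            for b in present:
                if a != b:
                    matrix[a][b] += 1
    return vocab, matrix
```

The returned matrix can be exported to a network analysis tool; rows with similar patterns correspond to words that play structurally similar roles.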

An Ensemble Classification of Mental Health in Malaysia related to the Covid-19 Pandemic using Social Media Sentiment Analysis

  • Nur 'Aisyah Binti Zakaria Adli;Muneer Ahmad;Norjihan Abdul Ghani;Sri Devi Ravana;Azah Anir Norman
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 18, No. 2
    • /
    • pp.370-396
    • /
    • 2024
  • COVID-19 was declared a pandemic by the World Health Organization (WHO) on 11 March 2020. The lifestyle of people all over the world has changed since. In many cases, the pandemic appears to have caused severe mental disorders, anxiety, and depression. Researchers have mostly conducted surveys to identify the impact of the pandemic on people's mental health. Although surveys can generate better-quality, tailored, and more specific data, social media offers great insight into the impact of the pandemic on mental health. Because people feel connected on social media, this study aims to capture people's sentiments about the pandemic in relation to mental health issues. Word Cloud was used to visualize and identify the most frequent keywords related to COVID-19 and mental health disorders. This study employs Majority Voting Ensemble (MVE) classification and individual classifiers such as Naïve Bayes (NB), Support Vector Machine (SVM), and Logistic Regression (LR) to classify the sentiment of tweets. The tweets were classified as positive, neutral, or negative using the Valence Aware Dictionary and sEntiment Reasoner (VADER). Confusion matrices and classification reports provide the precision, recall, and F1-score used to identify the best algorithm for classifying the sentiments.
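
The Majority Voting Ensemble named in the abstract above combines the labels predicted by several classifiers. A minimal sketch of hard voting is shown below; it assumes the per-classifier predictions already exist as lists of labels and does not reproduce the authors' training pipeline. The function name and the label strings are illustrative.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists by hard majority voting.

    predictions : list of label lists, one per classifier, all the same
                  length (one label per tweet)
    Returns one label per tweet: the label most classifiers agreed on.
    """
    voted = []
    for labels in zip(*predictions):
        # most_common(1) returns the label with the highest vote count
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted
```

With three classifiers and three classes, a three-way disagreement is possible; this sketch does not define a tie-breaking rule, so a real pipeline would need one (for example, deferring to the classifier with the best validation score).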