• Title/Summary/Keyword: Big Data Visualization

Search Result 242

Conditions and potentials of Korean history research based on 'big data' analysis: the beginning of 'digital history' ('빅데이터' 분석 기반 한국사 연구의 현황과 가능성: 디지털 역사학의 시작)

  • Lee, Sangkuk
    • The Korean Journal of Applied Statistics, v.29 no.6, pp.1007-1023, 2016
  • This paper explores the conditions and potential of a newly designed methodology of big data analysis applied to Korean history subject matter. To advance the field, more attention must be paid to quantitative analysis methodologies alongside pre-existing qualitative analysis. To meet this new challenge, I propose 'digital history' methods drawing on associated disciplines such as linguistics and computer science, data science and statistics, and visualization techniques. As one example, I apply an interdisciplinary convergence approach to the principle and mechanism of elite reproduction during the Korean medieval age. I propose how to compensate for a lack of historical material by applying a semi-supervised learning method, how to create a database using text-mining techniques, how to analyze quantitative data with statistical methods, and how to present analytical outcomes with intuitive visualization.

Big Data Analysis on the Perception of Home Training According to the Implementation of COVID-19 Social Distancing

  • Hyun-Chang Keum;Kyung-Won Byun
    • International Journal of Internet, Broadcasting and Communication, v.15 no.3, pp.211-218, 2023
  • Due to the implementation of COVID-19 social distancing, interest in 'home training' and its user base are rapidly increasing. The purpose of this study is therefore to identify the perception of 'home training' through big data analysis of social media channels and to provide basic data to the related business sector. Big data were collected from news and social content provided on the Naver and Google sites, covering the three years from March 22, 2020, when COVID-19 distancing was implemented in Korea. The collected data comprised 4,000 Naver blog posts, 2,673 news articles, 4,000 cafe posts, 3,989 Knowledge-iN posts, and 953 Google news articles. These data were analyzed for TF and TF-IDF through text mining, and semantic network analysis was then conducted on 70 keywords; big data analysis programs such as Textom and Ucinet were used for the analysis, and NetDraw was used for visualization. In the text mining results, 'home training' appeared most frequently in terms of TF, with 4,045 occurrences, followed by 'exercise', 'Homt', 'house', 'apparatus', 'recommendation', and 'diet'. For TF-IDF, the main keywords were 'exercise', 'apparatus', 'home', 'house', 'diet', 'recommendation', and 'mat'. Based on these results, the 70 most frequent keywords were extracted, and semantic indicator and centrality analyses were conducted. Finally, through CONCOR analysis, the keywords were clustered into a 'purchase cluster', 'equipment cluster', 'diet cluster', and 'exercise method cluster'. Based on these four clusters, consumers' main perceptions of 'home training', and the semantic network analysis, basic data for the 'home training' business sector were presented.

Storm-Based Dynamic Tag Cloud for Real-Time SNS Data (실시간 SNS 데이터를 위한 Storm 기반 동적 태그 클라우드)

  • Son, Siwoon;Kim, Dasol;Lee, Sujeong;Gil, Myeong-Seon;Moon, Yang-Sae
    • KIPS Transactions on Software and Data Engineering, v.6 no.6, pp.309-314, 2017
  • In general, there are many difficulties in collecting, storing, and analyzing SNS (social network service) data, since such data have big data characteristics: they arrive very fast in a mixture of structured and unstructured forms. In this paper, we propose a new data visualization framework that works on Apache Storm and is useful for real-time, dynamic analysis of SNS data. Apache Storm is a representative big data software platform that processes and analyzes real-time streaming data in a distributed environment. Using Storm, we collect and aggregate real-time Twitter data and dynamically visualize the aggregated results as a tag cloud. In addition to the Storm-based collection and aggregation functionality, we design and implement a Web interface through which a user enters keywords of interest and views the resulting tag cloud visualization. Finally, we show empirically that users can intuitively grasp changes in topics of interest in SNS data, and that the visualized results can be applied to many other services such as thematic trend analysis, product recommendation, and customer needs identification.
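The rolling aggregation behind a dynamic tag cloud can be illustrated independently of Storm. The sketch below keeps word counts over a sliding window of tweet batches in plain Python; the paper's actual topology runs on Apache Storm, and the class and parameter names here are hypothetical:

```python
from collections import Counter, deque

class RollingTagCounts:
    """Rolling word counts over the last `window` batches of tweets,
    a simplified, single-process sketch of the aggregation step."""

    def __init__(self, window=3):
        self.window = window
        self.batches = deque()
        self.totals = Counter()

    def add_batch(self, tweets):
        # Count words in the new batch and fold them into the running totals.
        batch = Counter(w for t in tweets for w in t.lower().split())
        self.batches.append(batch)
        self.totals += batch
        # Drop the oldest batch once the window is full.
        if len(self.batches) > self.window:
            self.totals -= self.batches.popleft()

    def top(self, n=10):
        # Tag-cloud font sizes are typically proportional to these counts.
        return self.totals.most_common(n)

cloud = RollingTagCounts(window=2)
cloud.add_batch(["big data storm", "storm tag cloud"])
print(cloud.top(3))
```

As new batches arrive and old ones expire, the top-n list changes, which is what makes the rendered tag cloud "dynamic".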

Machine Learning based Prediction of The Value of Buildings

  • Lee, Woosik;Kim, Namgi;Choi, Yoon-Ho;Kim, Yong Soo;Lee, Byoung-Dai
    • KSII Transactions on Internet and Information Systems (TIIS), v.12 no.8, pp.3966-3991, 2018
  • Due to the lack of visualization services and of organic combination between public and private building data, the usability of the basic map has remained low. To address this issue, this paper reports a solution that organically combines public and private data while providing visualization services to general users. For this purpose, factors that can affect building prices were first examined in order to define the related data attributes. To extract these attributes, the paper presents a method of acquiring public information data and real-estate-related information provided by private real-estate portal sites, and describes the preprocessing required for machine learning. It then proposes a machine learning algorithm that predicts buildings' value pricing and future value using big data on buildings' spatial information, acquired from a database of building value attributes. The algorithm's applicability was tested with a prototype targeting pilot areas including Suwon, Anyang, and Gunpo in South Korea. Finally, a prototype visualization solution was developed so that general users can effectively use the building value rankings and value pricing predicted by machine learning.
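As an illustration of the prediction step, the sketch below trains a gradient-boosting regressor on synthetic building attributes. The feature names, coefficients, and model choice are all invented for the example; the paper's real attributes, data sources, and algorithm are not reproduced here:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for merged public/private building attributes:
# floor area (m^2), building age (years), distance to station (km).
n = 500
area = rng.uniform(30, 200, n)
age = rng.uniform(0, 40, n)
dist = rng.uniform(0.1, 3.0, n)
# Made-up price signal plus noise, just to give the model something to learn.
price = 5.0 * area - 2.0 * age - 10.0 * dist + rng.normal(0, 20, n)

X = np.column_stack([area, age, dist])
X_tr, X_te, y_tr, y_te = train_test_split(X, price, random_state=0)

model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
print("R^2 on held-out buildings:", round(model.score(X_te, y_te), 3))
```

In the paper's setting, the predicted values would then be ranked and fed to the visualization front end.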

Design of a Disaster Big Data Platform for Collecting and Analyzing Social Media (소셜미디어 수집과 분석을 위한 재난 빅 데이터 플랫폼의 설계)

  • Nguyen, Van-Quyet;Nguyen, Sinh-Ngoc;Nguyen, Giang-Truong;Kim, Kyungbaek
    • Proceedings of the Korea Information Processing Society Conference, 2017.04a, pp.661-664, 2017
  • Recently, during disasters, emergencies have been handled well thanks to the early transmission of disaster-related notifications on social media networks (e.g., Twitter or Facebook). With their characteristics (e.g., real-time operation, mobility) and large communities whose users can act as volunteers, social networks have proved to play a crucial role in disaster response. However, the amount of data transmitted during disasters is an obstacle to filtering informative messages, because the messages are diverse, numerous, and very noisy. This large volume of data can be seen as Social Big Data (SBD). In this paper, we propose a big data platform for collecting and analyzing disaster data from SBD. First, we designed a collecting module that can rapidly extract disaster information from Twitter using big data frameworks that support streaming data on distributed systems, such as Kafka and Spark. Second, we developed an analyzing module that learns from SBD to distinguish useful information from irrelevant information. Finally, we designed a real-time Web visualization for displaying the results of the analysis phase. To show the viability of our platform, we ran the collecting and analyzing phases for 10 days on both real-time and historical tweets about disasters that happened in South Korea. The results show that our big data platform can be applied to disaster information systems by providing a large amount of relevant data, from 21,000 collected tweets, which can be used to infer affected regions and victims in disaster situations.
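The analyzing module's task of separating useful from irrelevant messages can be approximated with a simple supervised text classifier. The sketch below uses TF-IDF features with naive Bayes on a tiny made-up sample; the paper's actual model, features, and training data are not public, so everything here is a stand-in:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny labeled sample standing in for tweets collected via Kafka/Spark
# (invented examples; real training data would be far larger and in Korean).
tweets = [
    "earthquake felt in Gyeongju buildings shaking",
    "flood warning issued for the Han river area",
    "typhoon approaching the southern coast stay indoors",
    "great deal on concert tickets today",
    "my cat is so cute this morning",
    "new phone released check out the specs",
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = disaster-relevant, 0 = irrelevant

# TF-IDF vectorization followed by a multinomial naive Bayes classifier.
clf = make_pipeline(TfidfVectorizer(), MultinomialNB()).fit(tweets, labels)
print(clf.predict(["flood in the river area tonight"]))
```

A filter like this, applied to the incoming stream, is what keeps only the messages worth visualizing.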

Big data comparison between Chinese and Korean Libraries (중한 도서관 빅데이터의 비교)

  • Dong, Jingwen
    • Proceedings of the Korea Contents Association Conference, 2019.05a, pp.413-414, 2019
  • Big data was initially approached conceptually and defined simply as large-volume data, but the concept has since expanded to cover the collection, storage, processing, and analysis of data through to value creation; recently, the concepts of veracity, variability, and visualization have been added, so that big data is sometimes characterized by 7Vs.


Development and Implementation of Smart Manufacturing Big-Data Platform Using Opensource for Failure Prognostics and Diagnosis Technology of Industrial Robot (제조로봇 고장예지진단을 위한 오픈소스기반 스마트 제조 빅데이터 플랫폼 구현)

  • Chun, Seung-Man;Suk, Soo-Young
    • IEMEK Journal of Embedded Systems and Applications, v.14 no.4, pp.187-195, 2019
  • In the era of the fourth industrial revolution, various commercial smart platforms for implementing smart systems are being developed and offered as services. However, since most of these platforms have been developed for general purposes, they are difficult to apply and utilize because they cannot satisfy the real-time data management, data visualization, and data storage requirements of smart factory systems. In this paper, we implemented an open-source-based smart manufacturing big data platform that provides highly efficient and reliable data integration for the failure prognostics and diagnosis system of manufacturing robots.

Analysis of Meta Fashion Meaning Structure using Big Data: Focusing on the keywords 'Metaverse' + 'Fashion design' (빅데이터를 활용한 메타패션 의미구조 분석에 관한 연구: '메타버스' + '패션디자인' 키워드를 중심으로)

  • Ji-Yeon Kim;Shin-Young Lee
    • Fashion & Textile Research Journal, v.25 no.5, pp.549-559, 2023
  • Along with the transition to the fourth industrial revolution, the possibility of metaverse-based innovation in the fashion field has been confirmed, and various applications are being sought. This study therefore performs a meaning structure analysis and discusses the prospects of meta fashion using big data. From 2020 to 2022, data including the keywords "metaverse" + "fashion design" were collected from portal sites (Naver, Daum, and Google), and keyword frequency, N-gram, and TF-IDF results were derived using text mining. Furthermore, network visualization and CONCOR analysis were performed using Ucinet 6 to understand the interconnected structure between keywords and their essential meanings. The results were as follows. The main keywords appeared in the order fashion, metaverse, design, 3D, platform, apparel, and virtual. In the N-gram analysis, the connection between the words fashion and metaverse was strong, and the TF-IDF results confirmed the importance of content- and technology-related words such as 3D, apparel, platform, NFT, education, AI, avatar, MCM, and meta-fashion. Through network visualization and CONCOR analysis, three clusters were derived from the top emerging words: "metaverse fashion design and industry," "metaverse fashion design and education," and "metaverse fashion design platform." CONCOR analysis was also used to derive differentiated results for middle- and lower-ranked words. The results of this study provide useful information for strengthening competitiveness in the field of metaverse fashion design.
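The keyword network underlying steps like the CONCOR analysis can be illustrated with a small co-occurrence count. The sketch below builds weighted edges and a simple degree-centrality proxy from toy keyword lists; the study itself used Ucinet 6, and the data here are invented:

```python
from collections import Counter
from itertools import combinations

# Toy keyword lists, one per collected document (invented examples).
docs = [
    ["fashion", "metaverse", "design", "3d"],
    ["metaverse", "platform", "nft", "fashion"],
    ["fashion", "design", "avatar", "3d"],
]

# Count how often each keyword pair co-occurs in the same document;
# these weights form the edges of the semantic network.
edges = Counter()
for keywords in docs:
    for a, b in combinations(sorted(set(keywords)), 2):
        edges[(a, b)] += 1

# Degree-centrality proxy: total edge weight touching each keyword.
centrality = Counter()
for (a, b), w in edges.items():
    centrality[a] += w
    centrality[b] += w

print(centrality.most_common(3))
```

Clustering algorithms such as CONCOR then partition this weighted network into the kinds of keyword groups the study reports.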

Data Processing and Visualization Method for Retrospective Data Analysis and Research Using Patient Vital Signs (환자의 활력 징후를 이용한 후향적 데이터의 분석과 연구를 위한 데이터 가공 및 시각화 방법)

  • Kim, Su Min;Yoon, Ji Young
    • Journal of Biomedical Engineering Research, v.42 no.4, pp.175-185, 2021
  • Purpose: Vital signs are used to help assess a person's general physical health, give clues to possible diseases, and show progress toward recovery. Researchers are using vital sign data and AI (artificial intelligence) to manage a variety of diseases and predict mortality. To analyze vital sign data using AI, it is important to select and extract vital sign data suitable for the research purpose. Methods: We developed a method to visualize vital signs and early warning scores by processing retrospective vital sign data collected from EMRs (electronic medical records) and patient monitoring devices. The vital sign data used for development were obtained from the open EMR big data set MIMIC-III and a wearable patient monitoring device (CareTaker). Data processing and visualization were implemented in Python. We used the results together with machine learning to predict mortality in ICU patients. Results: We calculated the NEWS (National Early Warning Score) to assess each patient's condition. Vital sign data with different measurement times and frequencies were resampled at equal time intervals, and missing data were interpolated to reconstruct the series. Normal and abnormal vital sign states were visualized as color-coded graphs. Mortality prediction with the processed data and machine learning achieved an AUC of 0.892. Conclusion: This visualization method will help researchers easily understand a patient's vital sign status over time and extract the necessary data.
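The resampling, interpolation, and scoring steps described in the Results can be sketched with pandas. The heart-rate bands below follow the standard NEWS pulse sub-score; the timestamps, values, and 5-minute grid are invented for the example and are not the paper's actual configuration:

```python
import pandas as pd

def news_heart_rate(hr):
    """NEWS sub-score for pulse rate (standard NEWS bands)."""
    if hr <= 40:
        return 3
    if hr <= 50:
        return 1
    if hr <= 90:
        return 0
    if hr <= 110:
        return 1
    if hr <= 130:
        return 2
    return 3

# Irregularly sampled heart-rate measurements (hypothetical values).
ts = pd.Series(
    [72, 88, 118],
    index=pd.to_datetime(
        ["2021-01-01 00:00", "2021-01-01 00:07", "2021-01-01 00:20"]
    ),
)

# Resample onto a fixed 5-minute grid and interpolate the gaps,
# mirroring the equal-interval reconstruction step.
even = ts.resample("5min").mean().interpolate(method="time")
scores = even.apply(news_heart_rate)
print(scores)
```

The per-interval scores (and their color-coded normal/abnormal states) are what the visualization then plots over time; a full NEWS total would sum such sub-scores across all monitored vital signs.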

A Study on the Meaning of The First Slam Dunk Based on Text Mining and Semantic Network Analysis

  • Kyung-Won Byun
    • International journal of advanced smart convergence, v.12 no.1, pp.164-172, 2023
  • In this study, we identify the perception of 'The First Slam Dunk', which gained popularity as a sports-based cartoon, through big data analysis of social media channels, and provide basic data for the development of various contents in the sports industry. Social big data were collected from news provided on the Naver and Google sites. Data were collected from January 1, 2023 to February 15, 2023, a period spanning the release date of 'The First Slam Dunk' in Korea. The collected data comprised 2,106 Naver news articles and 1,019 Google news articles. TF and TF-IDF were computed for these data through text mining, and semantic network analysis was then conducted on 60 keywords. Big data analysis programs such as Textom and UCINET were used for the analysis, and NetDraw was used for visualization. Considering both TF and TF-IDF, 'The First Slam Dunk' was the most frequent keyword, appearing 4,079 times, followed by 'Slam Dunk', 'Movie', 'Premiere', 'Animation', 'Audience', and 'Box-Office'. Based on these results, the 60 most frequent keywords were extracted, and semantic indicator and centrality analyses were conducted. Finally, six clusters (competing movies, cartoon, passion, premiere, attention, box office) were formed through CONCOR analysis. Based on this analysis of the semantic network around 'The First Slam Dunk', basic data for the development of sports content were provided.