• Title/Summary/Keyword: Big Data Cluster


Management of Distributed Nodes for Big Data Analysis in Small-and-Medium Sized Hospital (중소병원에서의 빅데이터 분석을 위한 분산 노드 관리 방안)

  • Ryu, Wooseok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2016.05a / pp.376-377 / 2016
  • The performance of Hadoop, a distributed data processing framework for big data analysis, is affected by characteristics of each node in the distributed cluster, such as processing power and network bandwidth. This paper analyzes previous approaches to heterogeneous Hadoop clusters and presents several requirements for distributed node clustering in small-and-medium sized hospitals, taking the computing environments of those hospitals into account.


Aircraft Recognition from Remote Sensing Images Based on Machine Vision

  • Chen, Lu;Zhou, Liming;Liu, Jinming
    • Journal of Information Processing Systems / v.16 no.4 / pp.795-808 / 2020
  • Because the YOLOv3 network scores poorly on evaluation indexes such as detection accuracy and recall rate when detecting aircraft in remote sensing images, this paper proposes a machine-vision-based method for aircraft detection in remote sensing images. To improve detection, the Inception module was introduced into the YOLOv3 network structure, and the data set was then cluster-analyzed using the k-means algorithm. To obtain the best aircraft detection model, we adjusted the network parameters of the pre-trained model and increased the resolution of the input image. Finally, our method adopted a multi-scale training model. Experiments on the remote sensing aircraft data in RSOD-Dataset show that our method improved some evaluation indicators. The experiments also show that the method detects and recognizes other ground objects well.
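The abstract does not say which features were clustered. A common use of k-means alongside YOLOv3 is to group ground-truth box sizes into anchor boxes; the minimal sketch below assumes that use, applies plain Euclidean distance rather than an IoU-based distance, and runs on made-up (width, height) pairs. The names `kmeans`, `boxes`, and `anchors` are illustrative and do not come from the paper.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on 2-D points; returns the sorted cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # pick k distinct points as initial centers
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = []
        for i, g in enumerate(groups):
            if g:
                new_centers.append((sum(x for x, _ in g) / len(g),
                                    sum(y for _, y in g) / len(g)))
            else:
                new_centers.append(centers[i])
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return sorted(centers)

# Hypothetical (width, height) box sizes in pixels: three small, three large.
boxes = [(20, 22), (24, 20), (22, 25), (80, 85), (90, 78), (85, 82)]
anchors = kmeans(boxes, k=2)
print(anchors)
```

With two well-separated groups like these, the two returned centers are the mean sizes of the small and the large boxes.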

National Awareness of the 2019 World Swimming Championships using Big Data from Social Network Analysis (소셜네트워크 분석의 빅데이터를 활용한 2019세계수영선수권 대회의 국내 인식조사)

  • Kim, Gi-Tak
    • Journal of Korea Entertainment Industry Association / v.13 no.4 / pp.173-184 / 2019
  • This study collected and refined word data from social media in a web environment using Textom and carried out big data analysis on three topics: the 2019 Gwangju World Swimming Championships, the 2019 Gwangju World Swimming Masters Competition, and problems of the 2019 World Swimming Championships. We loaded the collected words into UCINET6, visualized them, and conducted a CONCOR analysis to grasp the similarity relationships among words and to identify clusters of common factors. The analysis showed that the clusters related to the 2019 Gwangju World Swimming Championships consisted of four major areas of recognition and perception, with searches mainly concerning operational aspects of the championships. The clusters related to the 2019 Gwangju World Swimming Masters Competition were divided into two areas, major recognition and peripheral recognition, with searches mainly concerning the promotion of the Masters Competition and aspects of the competition itself. The clusters related to the problems of the 2019 Gwangju World Swimming Championships were divided into five areas, with searches mainly concerning the place, operation, institution, events, and similar aspects of those problems.

An Efficient Implementation of Mobile Raspberry Pi Hadoop Clusters for Robust and Augmented Computing Performance

  • Srinivasan, Kathiravan;Chang, Chuan-Yu;Huang, Chao-Hsi;Chang, Min-Hao;Sharma, Anant;Ankur, Avinash
    • Journal of Information Processing Systems / v.14 no.4 / pp.989-1009 / 2018
  • Rapid advances in science and technology, along with the exponential development of smart mobile devices, workstations, supercomputers, smart gadgets, and network servers, have been witnessed over the past few years. The sudden increase in the Internet population and the manifold growth in Internet speeds have generated an enormous amount of data, now termed 'big data'. Given this scenario, storing data on local servers or a personal computer is an issue, which can be resolved by utilizing cloud computing. At present, several cloud computing service providers are available to address big data issues. This paper establishes a framework that builds Hadoop clusters on the new single-board computer (SBC) Mobile Raspberry Pi. These clusters offer facilities for storage as well as computing. Regular data centers require large amounts of energy for operation, need cooling equipment, and occupy prime real estate. These energy and physical space constraints can be addressed by employing a Mobile Raspberry Pi with Hadoop clusters, which provides a cost-effective, low-power, high-speed solution along with micro-data center support for big data. Hadoop provides the required modules for the distributed processing of big data by deploying map-reduce programming approaches. In this work, the performance of SBC clusters and a single computer was compared. The experimental data show that the SBC clusters outperform a single computer by around 20%. Furthermore, the cluster processing speed for large volumes of data can be enhanced by increasing the number of SBC nodes. Data storage is accomplished by using the Hadoop Distributed File System (HDFS), which offers more flexibility and greater scalability than a single computer system.
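The map-reduce approach mentioned in the abstract can be illustrated with a single-process word count. This is only a sketch of the programming model, not Hadoop's API or the paper's benchmark; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` and the input lines are assumptions made for the example.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

# Hypothetical input; in a real cluster each line could live on a different node.
counts = word_count(["big data on small boards", "small boards big clusters"])
print(counts)
```

Because each call to `map_phase` and `reduce_phase` depends only on its own input, a framework is free to run those calls on different nodes, which is what lets throughput grow with the number of nodes.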

Distributed Processing of Big Data Analysis based on R using SparkR (SparkR을 이용한 R 기반 빅데이터 분석의 분산 처리)

  • Ryu, Woo-Seok
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.17 no.1 / pp.161-166 / 2022
  • In this paper, we analyze the problems that occur when performing big data analysis with R as the data analysis tool, and show the usefulness of data analysis with SparkR, which connects R and Spark to support distributed processing of big data effectively. First, we study the memory allocation problem that occurs in R when loading large amounts of data and performing operations, as well as the characteristics and programming environment of SparkR. We then compare the execution performance of linear regression analysis in each environment. The analysis shows that R can be used for data analysis through SparkR without learning an additional language, and that code written in R can be processed effectively in a distributed manner as the number of nodes in the cluster increases.
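One way to see why linear regression distributes well is that a simple least-squares fit needs only a few sums, which each partition can compute independently before they are merged. The sketch below illustrates that idea in plain Python under that assumption; it is not SparkR code and does not describe how SparkR or Spark implements regression. The names `partial_sums`, `fit`, and `partitions` are hypothetical.

```python
def partial_sums(rows):
    """Per-partition sufficient statistics for the model y = intercept + slope * x."""
    n = len(rows)
    sx = sum(x for x, _ in rows)
    sy = sum(y for _, y in rows)
    sxx = sum(x * x for x, _ in rows)
    sxy = sum(x * y for x, y in rows)
    return n, sx, sy, sxx, sxy

def fit(partitions):
    """Merge the per-partition sums, then solve the least-squares normal equations."""
    n, sx, sy, sxx, sxy = (sum(col) for col in zip(*map(partial_sums, partitions)))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope

# Hypothetical data lying exactly on y = 1 + 2x, split across two "nodes".
partitions = [[(0, 1), (1, 3)], [(2, 5), (3, 7)]]
intercept, slope = fit(partitions)
print(intercept, slope)
```

Only five numbers per partition cross the network in this scheme, regardless of how many rows each partition holds.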

A Semantic Diagnosis and Tracking System to Prevent the Spread of COVID-19 (COVID-19 확산 방지를 위한 시맨틱 진단 및 추적시스템)

  • Xiang, Sun Yu;Lee, Yong-Ju
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.15 no.3 / pp.611-616 / 2020
  • To prevent the further spread of the COVID-19 virus in big cities, this paper proposes a semantic diagnosis and tracking system based on Linked Data, built on a cluster analysis of the infection situation in Seoul, South Korea. The paper has three main parts: information on infected people in Seoul is collected for the cluster analysis; important attributes of infected patients are extracted to establish a diagnostic model based on random forest; and a tracking system based on Linked Data is designed and implemented. Experimental results show that the accuracy of our diagnostic model is more than 80%. Moreover, our tracking system is more flexible and open than existing systems and supports semantic queries.

A study on the Domestic Consumer's Perception of "Hansik" with Big Data Analysis : Using Text Mining and Semantic Network Analysis (빅데이터를 통한 내국인의 '한식' 인식 연구 : 텍스트마이닝과 의미연결망 중심으로)

  • Park, Kyeong-Won;Yun, Hee-Kyoung
    • Journal of the Korea Convergence Society / v.11 no.6 / pp.145-151 / 2020
  • 'Hansik', or Korean cuisine, is one of Korea's national brands. To understand domestic consumer awareness of Korean cuisine, data were gathered using the search keyword 'Hansik.' Textom 3.5 was used to gather data from blogs and news media on Naver from November 1, 2018, to October 31, 2019. The results of frequency and TF-IDF analysis indicate that 'buffet' accounted for the largest proportion of consumer awareness of Hansik. Broadcast content featuring star chefs also had a great influence. Awareness of Hansik was not confined to its traditional aspects but extended to areas such as fusion and gourmet cuisine. UCINET6 and NetDraw were used to conduct CONCOR analysis. Four clusters were found: a diverse food culture cluster, a high-end restaurant cluster referring to restaurants featured in the media, a Hansik brand cluster, and a Hansik buffet cluster. This study proposes offering varied Hansik menus that use multiple ingredients. In addition, promotion that introduces fine Hansik, together with the development of marketing perspectives and media content about convenient HMRs, would strengthen the imagery associated with Hansik.
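For readers unfamiliar with TF-IDF, the sketch below computes one common variant: term frequency normalized by document length, multiplied by the natural log of (number of documents / documents containing the term). The abstract does not state which variant Textom applies, so the formula is an assumption, and the tokenized documents are invented for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per document, using tf * ln(N / df)."""
    n = len(docs)
    # Document frequency: in how many documents each term appears at least once.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({t: (c / len(doc)) * math.log(n / df[t])
                       for t, c in tf.items()})
    return scores

# Hypothetical tokenized posts; every post mentions the search keyword itself.
docs = [["hansik", "buffet", "buffet"],
        ["hansik", "chef"],
        ["hansik", "fusion", "gourmet"]]
scores = tf_idf(docs)
print(scores[0])
```

A term that occurs in every document, such as the search keyword here, scores zero under this variant, which is why TF-IDF highlights distinguishing words like 'buffet' rather than the keyword itself.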

Personal Recommendation Service Design Through Big Data Analysis on Science Technology Information Service Platform (과학기술정보 서비스 플랫폼에서의 빅데이터 분석을 통한 개인화 추천서비스 설계)

  • Kim, Dou-Gyun
    • Journal of the Korean BIBLIA Society for library and Information Science / v.28 no.4 / pp.501-518 / 2017
  • Reducing the time it takes researchers to acquire knowledge and apply it in their research activities can be regarded as an indispensable factor in improving research productivity. The purpose of this research is to cluster the information usage patterns of KOSEN users and to suggest a method for optimizing the personalized recommendation service algorithm for the grouped users. After identifying appropriate services and content based on users' research activities and usage information, we applied Spark-based big data analysis technology to derive a personal recommendation algorithm. Individual recommendation algorithms can save the time spent searching for user information and can help users find appropriate information.

An Analysis of Causes of Marine Incidents at sea Using Big Data Technique (빅데이터 기법을 활용한 항해 중 준해양사고 발생원인 분석에 관한 연구)

  • Kang, Suk-Young;Kim, Ki-Sun;Kim, Hong-Beom;Rho, Beom-Seok
    • Journal of the Korean Society of Marine Environment & Safety / v.24 no.4 / pp.408-414 / 2018
  • Various studies have been conducted to reduce marine accidents. However, research on marine incidents remains limited. There are many reports of marine incidents, but the main content of existing studies has been qualitative, which makes quantitative analysis difficult. Quantitative analysis of marine incidents is nevertheless necessary to reduce them. The purpose of this paper is to analyze marine incident data quantitatively by applying big data techniques, in order to predict marine incident trends and reduce marine accidents. To accomplish this, about 10,000 marine incident reports were converted into a unified format through pre-processing. Using this pre-processed data, we first derived major keywords for marine incidents at sea using text mining techniques. Second, time series and cluster analysis were applied to the major keywords, and trends for possible marine incidents were predicted. The results confirmed that quantified data and statistical analysis can be used to address this topic. We also confirmed that big data techniques can provide information on preventive measures by capturing objective tendencies of marine incidents that may occur in the future.

Classifying and Characterizing the Types of Gentrified Commercial Districts Based on Sense of Place Using Big Data: Focusing on 14 Districts in Seoul (빅데이터를 활용한 젠트리피케이션 상권의 장소성 분류와 특성 분석 -서울시 14개 주요상권을 중심으로-)

  • Kim, Young-Jae;Park, In Kwon
    • Journal of the Korean Regional Science Association / v.39 no.1 / pp.3-20 / 2023
  • This study aims to categorize the 14 major gentrified commercial areas of Seoul and analyze their characteristics based on their sense of place. To achieve this, we conducted hierarchical cluster analysis using text data collected from Naver Blog. We divided the districts along two dimensions, "experience" and "feature," and analyzed their characteristics using LDA (Latent Dirichlet Allocation) on the text data together with statistical data collected from Seoul Open Data Square. As a result, we classified the commercial districts of Seoul into 5 categories: 'theater district,' 'traditional cultural district,' 'female-beauty district,' 'exclusive restaurant and medical district,' and 'trend-leading district.' The findings of this study are expected to provide valuable insights for policy-makers developing more efficient and suitable commercial policies.