• Title/Summary/Keyword: BIG DATA

Applications and Issues of Medical Big Data (의료 빅데이터의 활용과 해결과제)

  • Woo, SungHee
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2016.05a / pp.545-548 / 2016
  • Big data refers to all data generated in the digital environment; it is large in volume, varied in type, and short in life cycle. As smartphones and the internet have become widespread, data are being produced in ever greater amounts and varieties, and the emphasis has shifted to extracting only the valuable and useful data from what is generated. Big data can also be applied to the medical industry and the health sector, where it creates synergy when combined with ICT such as IoT and smart healthcare. However, challenges such as data security must be addressed before vast amounts of meaningful and useful data can be used safely. In this study, we analyze the future prospects of healthcare, the applications and issues of medical big data, and the expected challenges.

Efficient K-Anonymization Implementation with Apache Spark

  • Kim, Tae-Su;Kim, Jong Wook
    • Journal of the Korea Society of Computer and Information / v.23 no.11 / pp.17-24 / 2018
  • Today, we are living in the era of data and information. With the advent of the Internet of Things (IoT), the popularity of social networking sites, and the development of mobile devices, a large amount of data is being produced in diverse areas. The collection of such data generated in various areas is called big data. As the importance of big data grows, there has been a growing need to share big data containing information regarding an individual entity. Because big data contains sensitive information about individuals, directly releasing it for public use may violate existing privacy requirements. Thus, privacy-preserving data publishing (PPDP) has been actively studied as a way to share big data containing personal information for public use while preserving the privacy of the individual. K-anonymity, the most popular method in the area of PPDP, transforms each record in a table such that at least k records have the same values for the given quasi-identifier attributes, so that each record is indistinguishable from the other records in the same class. As big data continues to grow, there is an increasing demand for methods that can efficiently anonymize vast amounts of data. Thus, in this paper, we develop an efficient k-anonymity method using the Spark distributed framework. Experimental results show that the developed method achieves significant gains in processing time.
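
The k-anonymity property described in this abstract can be illustrated with a small single-machine sketch. This is not the Spark-based method of the paper; it only checks the definition (every combination of quasi-identifier values occurs in at least k records) on a tiny table. The column names, the sample records, and the helper `generalize_age` are hypothetical and chosen for illustration only.

```python
from collections import Counter
from typing import Dict, List, Sequence

Record = Dict[str, str]


def equivalence_class_sizes(records: List[Record],
                            quasi_identifiers: Sequence[str]) -> Counter:
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[a] for a in quasi_identifiers) for r in records)


def is_k_anonymous(records: List[Record],
                   quasi_identifiers: Sequence[str], k: int) -> bool:
    """True if every equivalence class contains at least k records."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return all(n >= k for n in sizes.values())


def generalize_age(record: Record, width: int = 10) -> Record:
    """Hypothetical generalization step: replace an exact age by a range."""
    low = (int(record["age"]) // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Hypothetical sample table; "age" and "zip" are treated as quasi-identifiers.
QUASI_IDENTIFIERS = ["age", "zip"]
RAW = [
    {"age": "23", "zip": "130", "disease": "flu"},
    {"age": "27", "zip": "130", "disease": "asthma"},
    {"age": "31", "zip": "148", "disease": "flu"},
    {"age": "38", "zip": "148", "disease": "diabetes"},
]
GENERALIZED = [generalize_age(r) for r in RAW]

print(is_k_anonymous(RAW, QUASI_IDENTIFIERS, 2))          # exact ages are all distinct
print(is_k_anonymous(GENERALIZED, QUASI_IDENTIFIERS, 2))  # ages grouped into decades
```

In a distributed setting the same counting step would typically become a group-by over the quasi-identifier columns, but how the paper partitions and generalizes the data is not stated in the abstract.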

Design and Implementation of Hadoop-based Big-data processing Platform for IoT Environment (사물인터넷 환경을 위한 하둡 기반 빅데이터 처리 플랫폼 설계 및 구현)

  • Heo, Seok-Yeol;Lee, Ho-Young;Lee, Wan-Jik
    • Journal of Korea Multimedia Society / v.22 no.2 / pp.194-202 / 2019
  • In the information society represented by the Fourth Industrial Revolution, various types of data and information that are not readily visible are produced, processed, and circulated to enhance the value of existing goods. The IoT (Internet of Things) paradigm will change the face of individual life, industry, disaster, safety, and public service fields. Implementing the IoT paradigm requires several technology elements, and these elements must be connected efficiently to form a single system. An IoT platform must also collect, provide, transmit, store, and analyze IoT data. We designed and implemented a big data processing platform for IoT services. The proposed platform consists of IoT sensing/control devices, an IoT message protocol, an unstructured data server, and big data analysis components. For platform testing, the fixed IoT devices were implemented as solar power generation modules and the mobile IoT devices as modules for measuring table tennis stroke data. The transmission part uses the Internet-based HTTP and CoAP protocols. The data server is built on Hadoop, and the big data is analyzed using R. Through empirical tests with the fixed and mobile IoT devices, we confirmed that the proposed IoT platform processes big data and operates normally.

A Context-Awareness Modeling User Profile Construction Method for Personalized Information Retrieval System

  • Kim, Jee Hyun;Gao, Qian;Cho, Young Im
    • International Journal of Fuzzy Logic and Intelligent Systems / v.14 no.2 / pp.122-129 / 2014
  • Effective information gathering and retrieval of the most relevant web documents on the topic of interest is difficult due to the large amount of information that exists in various formats. Current information gathering and retrieval techniques are unable to exploit semantic knowledge within documents in the "big data" environment; therefore, they cannot provide precise answers to specific questions. Existing commercial big data analytic platforms are restricted to a single data type; moreover, different big data analytic platforms are effective at processing different data types. Therefore, the development of a common big data platform that is suitable for efficiently processing various data types is needed. Furthermore, users often possess more than one intelligent device. It is therefore important to find an efficient preference profile construction approach to record the user context and personalized applications. In this way, user needs can be tailored according to the user's dynamic interests by tracking all devices owned by the user.

Can Big Data Help Predict Financial Market Dynamics?: Evidence from the Korean Stock Market

  • Pyo, Dong-Jin
    • East Asian Economic Review / v.21 no.2 / pp.147-165 / 2017
  • This study quantifies the dynamic interrelationship between the KOSPI index return and search query data derived from the Naver DataLab. The empirical estimation using a bivariate GARCH model reveals that negative contemporaneous correlations between the stock return and the search frequency prevail during the sample period. Meanwhile, the search frequency has a negative association with the one-week-ahead stock return but not vice versa. In addition to identifying dynamic correlations, the paper also aims to serve as a test bed in which the existence of profitable trading strategies based on big data is explored. Specifically, the strategy interpreting the heightened investor attention as a negative signal for future returns appears to have been superior to the benchmark strategy in terms of the expected utility over wealth. This paper also demonstrates that the big data-based option trading strategy might be able to beat the market under certain conditions. These results highlight the possibility of big data as a potential source, so far left largely untapped, for establishing profitable trading strategies as well as developing insights on stock market dynamics.

Modeling of Policy Making for Big Data (빅데이터를 위한 정책결정 설계)

  • Lee, Sangwon;Park, Sungbum;Kim, Sunghyun;Chae, Seong Wook
    • Proceedings of the Korean Society of Computer Information Conference / 2015.01a / pp.281-282 / 2015
  • Data, by itself, will not reveal the optimal policy choice. Nor will data alone tell us what problems to focus on or how to direct resources. It should be recognized upfront that data-driven policy making cannot provide all the answers to the challenges of good governance. Policy decisions always depend on a combination of facts, analysis, judgment, and values. In this paper, we examine the factors involved in designing organizational policy making for Big Data.

Development of the design methodology for large-scale database based on MongoDB

  • Lee, Jun-Ho;Joo, Kyung-Soo
    • Journal of the Korea Society of Computer and Information / v.22 no.11 / pp.57-63 / 2017
  • Big data, which has increased sharply in recent years, is characterized by continuous generation, large volume, and unstructured formats. Existing relational database technologies are inadequate for handling such big data because of their limited processing speed and significant storage expansion cost. Thus, big data processing technologies, which are normally based on distributed file systems, distributed database management, and parallel processing, have emerged as core technologies for implementing big data repositories. In this paper, we propose a design methodology for large-scale databases based on MongoDB by extending the information engineering methodology based on the E-R data model.

A Study on the Policy Trends for the Revitalization of Medical Big Data Industry (의료 빅데이터 산업 활성화를 위한 정책 동향 고찰)

  • Kim, Hyejin;Yi, Myongho
    • Journal of Digital Convergence / v.18 no.4 / pp.325-340 / 2020
  • Today's rapidly developing health technology is accumulating vast amounts of data through medical devices based on the Internet of Things, in addition to data generated in hospitals. The collected data is a raw material that can create a variety of value, but our society lacks the legal and institutional mechanisms to support medical Big Data. In this study, we therefore examined four major factors that hinder the use of medical Big Data in order to find ways to promote the Big Data-based healthcare industry, and derived implications for expanding domestic medical Big Data by identifying foreign policies and technological trends. The study concluded that it is necessary to improve the regulatory system so that it satisfies both the security and the usability of healthcare Big Data, and to establish Big Data governance. To this end, we propose referring to the Big Data De-identification Guidelines adopted by the United States and the United Kingdom when reorganizing the regulatory system. Further research is expected to be needed to develop concrete measures from the conclusions and implications of this study and to address the institutional needs, so that it can play a positive role in the use of medical Big Data.

Design of RBF Neural Networks Based on Recursive Weighted Least Square Estimation for Processing Massive Meteorological Radar Data and Its Application (방대한 기상 레이더 데이터의 원할한 처리를 위한 순환 가중최소자승법 기반 RBF 뉴럴 네트워크 설계 및 응용)

  • Kang, Jeon-Seong;Oh, Sung-Kwun
    • The Transactions of The Korean Institute of Electrical Engineers / v.64 no.1 / pp.99-106 / 2015
  • In this study, we propose a Radial Basis Function Neural Network (RBFNN) using Recursive Weighted Least Square Estimation (RWLSE) to deal effectively with big-data-scale meteorological radar data. In the condition part of the RBFNN, Fuzzy C-Means (FCM) clustering is used to obtain fitness values that take into account the characteristics of the input data, and in the conclusion part the connection weights are defined as linear polynomial functions. The coefficients of the polynomial functions are estimated using RWLSE in order to cope with big data. As a recursive learning technique based on WLSE, RWLSE is used to process big data efficiently. The proposed classifier is evaluated on both widely used machine learning (ML) datasets and big data obtained from meteorological radar. The meteorological radar data consist of precipitation echoes and non-precipitation echoes, and the proposed classifier is used to classify these echoes efficiently.
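
The recursive estimation idea mentioned in this abstract can be sketched with the generic textbook recursion for weighted least squares, in which each sample updates the coefficient estimate without revisiting earlier data. This is an assumption-laden illustration, not the authors' RBFNN: in the paper the weights would come from the FCM-based condition part, whereas here the weights, the data, the function names, and the initial scale `delta` are all hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def rwls_update(theta: Vector, P: Matrix, x: Sequence[float],
                y: float, w: float) -> Tuple[Vector, Matrix]:
    """One recursive weighted least-squares step for sample (x, y) with weight w > 0.

    P is kept symmetric and tracks the inverse of the weighted information
    matrix, so no past samples need to be stored.
    """
    n = len(x)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = 1.0 / w + sum(x[i] * Px[i] for i in range(n))
    gain = [v / denom for v in Px]
    error = y - sum(x[i] * theta[i] for i in range(n))  # prediction error
    new_theta = [theta[i] + gain[i] * error for i in range(n)]
    new_P = [[P[i][j] - gain[i] * Px[j] for j in range(n)] for i in range(n)]
    return new_theta, new_P


def fit_stream(samples: Iterable[Tuple[Sequence[float], float, float]],
               n_features: int, delta: float = 1e6) -> Vector:
    """Process (x, y, w) samples one at a time, starting from theta = 0, P = delta * I."""
    theta = [0.0] * n_features
    P = [[delta if i == j else 0.0 for j in range(n_features)]
         for i in range(n_features)]
    for x, y, w in samples:
        theta, P = rwls_update(theta, P, x, y, w)
    return theta


# Hypothetical noiseless data from y = 2 + 3t with arbitrary positive weights.
SAMPLES = [([1.0, float(t)], 2.0 + 3.0 * t, 0.5 + 0.1 * t) for t in range(5)]
THETA = fit_stream(SAMPLES, n_features=2)
print(THETA)  # close to [2.0, 3.0]
```

Because each update touches only the current sample and an n-by-n matrix, memory use does not grow with the number of samples, which is the property that makes recursive schemes attractive for large data streams.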

Study on Big Data Linkage Method for Managing Port Infrastructure Disasters and Aging (항만 인프라 재해 및 노후화 관리를 위한 빅데이터 연계 방안 연구)

  • Choi, Woo-geun;Park, Sun-ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.10a / pp.134-137 / 2021
  • This study aims to develop a port infrastructure control system, based on a digital twin and big data, that reflects smart maintenance technology. The technology evaluates aging and disaster risk by converting heterogeneous data, such as sensing data and image data acquired from port infrastructure, into big data that is visualized in a digital twin-based control system and comprehensively analyzed. We aim to define the meaning of big data for expressing the physical world and its processes by combining data, which are the core components of the virtual world; the matters to be reflected in each stage of securing, processing, storing, analyzing, and utilizing the necessary big data; and methods for linking with IT resources.