• Title/Summary/Keyword: Big Data Structure

Healthcare service analysis using big data

  • Park, Arum;Song, Jaemin;Lee, Sae Bom
    • Journal of the Korea Society of Computer and Information / v.25 no.4 / pp.149-156 / 2020
  • In the Fourth Industrial Revolution, successful cases of using big data in various industries have been reported. This paper examines cases that successfully use big data to develop services in the medical industry, and draws implications about the value that big data creates. The related work introduces big data technology in the medical field, and eight innovative big data services are explained. The introduction describes the background and direction of this study and outlines its overall structure. The literature study explains the definition and concept of big data and the use of big data in the medical industry. Next, this study describes several cases, such as technologies using national health information and personal genetic information for the study of diseases, personal health services using personal biometric information, the use of medical data for more efficient business processes, and medical big data for the development of new medicines. The conclusion provides the academic and business implications of this study, as well as how its results can help the domestic medical industry.

Big Data Platform Based on Hadoop and Application to Weight Estimation of FPSO Topside

  • Kim, Seong-Hoon;Roh, Myung-Il;Kim, Ki-Su;Oh, Min-Jae
    • Journal of Advanced Research in Ocean Engineering / v.3 no.1 / pp.32-40 / 2017
  • Recently, the amount of data to be processed and its complexity have been increasing due to the development of information and communication technology, and industry's interest in such big data is growing day by day. In the shipbuilding and offshore industry as well, there is growing interest in the effective utilization of data, since various and vast amounts of data are generated in the processes of design, production, and operation. In order to effectively utilize big data in the shipbuilding and offshore industry, it is necessary to store and process large amounts of data. In this study, it was considered efficient to apply Hadoop and R, which are widely used in big data research. Hadoop is a framework for storing and processing big data; it provides the Hadoop Distributed File System (HDFS) for storage and the MapReduce function for processing. Meanwhile, R provides various data analysis techniques through its language and environment for statistical computing and graphics. While Hadoop makes it easy to handle big data, it is difficult to process data finely with it; and although R has advanced analysis capability, it is difficult to use for processing large data. This study proposes a big data platform based on Hadoop for applications in the shipbuilding and offshore industry. The proposed platform incorporates the existing data of the shipyard and makes it possible to manage and process the data. To check the applicability of the platform, it is applied to estimate the weights of offshore structure topsides. In this study, we store data on existing FPSOs in the Hadoop-based Hortonworks Data Platform (HDP) and perform regression analysis using RHadoop. We evaluate the effectiveness of large-data processing with RHadoop by comparing the results of the regression analysis and the processing time with those of a conventional weight estimation program.
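The regression-based weight estimation described above can be sketched in miniature: a pure-Python least-squares fit over hypothetical FPSO records. The study itself runs the regression with RHadoop over data stored in HDFS; the capacity/weight figures below are invented purely for illustration.

```python
# Minimal sketch of the weight-estimation regression: a one-variable
# ordinary least-squares model, topside weight as a function of
# production capacity. Data points are hypothetical.

def fit_least_squares(xs, ys):
    """Ordinary least-squares fit y = a*x + b over paired samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

# Hypothetical FPSO records: (production capacity in kbpd, topside weight in tonnes)
capacity = [80, 100, 120, 150, 180]
weight = [16000, 20000, 24000, 30000, 36000]

a, b = fit_least_squares(capacity, weight)
estimate = a * 130 + b  # predicted topside weight for a 130 kbpd design
```

In the paper this computation is distributed over the Hadoop cluster via RHadoop, which matters once the number of records far exceeds what fits on one machine; the fitting logic itself is the same.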

Research of Knowledge Management and Reusability in Streaming Big Data with Privacy Policy through Actionable Analytics (스트리밍 빅데이터의 프라이버시 보호 동반 실용적 분석을 통한 지식 활용과 재사용 연구)

  • Paik, Juryon;Lee, Youngsook
    • Journal of Korea Society of Digital Industry and Information Management / v.12 no.3 / pp.1-9 / 2016
  • The current meaning of "Big Data" refers to all the techniques for value extraction and actionable analytics, as well as the management tools. In particular, with the advances of wireless sensor networks, diverse patterns of digital records are generated. The records are mostly semi-structured and unstructured data, which are usually beyond the capabilities of conventional management tools. Such data are rapidly growing due to their complex data structures. Complex types effectively support data exchangeability and heterogeneity, which is the main reason their volume keeps growing in sensor networks. However, many errors and problems occur in applications because management solutions for such complex data models are rarely provided in current big data environments. To solve these problems and show our differentiation, we aim to provide a solution for actionable analytics and semantic reusability over sensor-web-based streaming big data with a new data structure, and to strengthen competitiveness.
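One way to picture privacy protection over a sensor stream, as this abstract discusses, is a filter that masks identifying fields before records reach the analytics stage. The field names and the hash-based masking rule below are assumptions for illustration, not the paper's actual policy.

```python
# Sketch of privacy-preserving streaming analytics: each record from a
# (simulated) sensor stream is masked before it is used downstream.
import hashlib

SENSITIVE_FIELDS = {"user_id", "device_id"}

def anonymize(record):
    """Replace sensitive fields with a truncated one-way hash, so records
    stay linkable across the stream without exposing raw identifiers."""
    masked = {}
    for key, value in record.items():
        if key in SENSITIVE_FIELDS:
            masked[key] = hashlib.sha256(str(value).encode()).hexdigest()[:12]
        else:
            masked[key] = value
    return masked

def stream_analytics(stream):
    """Consume a stream lazily, yielding only masked records."""
    for record in stream:
        yield anonymize(record)

stream = [{"user_id": "alice", "temp": 36.5}, {"user_id": "bob", "temp": 37.1}]
safe = list(stream_analytics(stream))
```

Because the hash is deterministic, repeated readings from the same sensor remain correlatable for analytics, which is one common way to reconcile reusability with privacy.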

Capturing Data from Untapped Sources using Apache Spark for Big Data Analytics (빅데이터 분석을 위해 아파치 스파크를 이용한 원시 데이터 소스에서 데이터 추출)

  • Nichie, Aaron;Koo, Heung-Seo
    • The Transactions of The Korean Institute of Electrical Engineers / v.65 no.7 / pp.1277-1282 / 2016
  • The term "Big Data" has been defined to encapsulate a broad spectrum of data sources and data formats. It is often described as unstructured data due to the variety of its data formats. Even though the traditional method of structuring data in rows and columns has been reinvented as column families and key-value stores, or completely replaced with JSON documents in document-based databases, the fact remains that data have to be reshaped to conform to a certain structure in order to be persistently stored on disk. ETL processes are key to restructuring data. However, ETL processes incur additional processing overhead and also require that data sources be maintained in predefined formats. Consequently, data in certain formats are completely ignored, because designing ETL processes to cater for all possible data formats is almost impossible. Potentially, these unconsidered data sources can provide useful insights when incorporated into big data analytics. In this project, using the big data solution Apache Spark, we tapped into other sources of data stored in their raw formats, such as various text files, compressed files, etc., and combined them with enterprise data persistently stored in MongoDB for overall data analytics using the MongoDB Aggregation Framework and MapReduce. This significantly differs from traditional ETL systems in that it is compatible with data regardless of its format at the source.
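The format-agnostic capture idea can be sketched without Spark itself: a pure-Python map/filter/reduce pipeline over raw, possibly malformed text lines, mirroring the RDD transformations the project relies on. The file contents here are hypothetical.

```python
# Sketch of tapping a raw text source with no rigid ETL schema: parse
# what fits, skip what does not, then aggregate by key.
from collections import defaultdict

raw_lines = [
    "2016-01-04,salesA,120",
    "2016-01-04,salesB,80",
    "2016-01-05,salesA,95",
    "malformed line without fields",  # raw sources often contain junk
]

def parse(line):
    """Best-effort parse; return None for lines that do not fit."""
    parts = line.split(",")
    if len(parts) != 3:
        return None
    _date, key, value = parts
    try:
        return key, int(value)
    except ValueError:
        return None

# map -> filter -> reduce-by-key, mirroring Spark's RDD transformations
totals = defaultdict(int)
for key, value in filter(None, map(parse, raw_lines)):
    totals[key] += value
```

The key difference from a classic ETL pipeline is that nothing fails on unexpected input: malformed lines are simply dropped at the `filter` step instead of breaking a predefined loader.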

Applying Service Quality to Big Data Quality (빅데이터 품질 확장을 위한 서비스 품질 연구)

  • Park, Jooseok;Kim, Seunghyun;Ryu, Hocheol;Lee, Zoonky;Lee, Jangho;Lee, Junyong
    • The Journal of Bigdata / v.2 no.2 / pp.87-93 / 2017
  • Research on data quality has been performed for a long time. However, that research focused on structured data. With the recent digital revolution, or Fourth Industrial Revolution, quality control of big data is becoming more important. In this paper, we analyze and classify big data quality types through a review of previous research. The types of big data quality can be classified into value, data structure, process, value chain, and maturity model. Based on these comparative studies, this paper proposes a new standard: the service quality of big data.

Design and Implementation of Incremental Learning Technology for Big Data Mining

  • Min, Byung-Won;Oh, Yong-Sun
    • International Journal of Contents / v.15 no.3 / pp.32-38 / 2019
  • We usually suffer difficulties in treating or managing Big Data generated from various digital media and/or sensors using traditional mining techniques. There are also many problems, such as memory shortages and a growing learning burden, when handling ever-increasing volumes of text: because new data are continuously accumulated, the total data, including data previously analyzed and collected, is ineffectively re-analyzed. In this paper, we propose a general-purpose classifier and its structure to solve these problems. We depart from current feature-reduction methods and introduce a new scheme that adopts only the changed elements when new features are partially accumulated in this free-style learning environment. The incremental learning module, built through gradually progressive formation, learns only the changed parts of the data without any re-processing of the current accumulation, whereas traditional methods re-learn the total data for every addition or change of data. Additionally, users can freely merge new data with previous data through the resource management procedure whenever re-learning is needed. At the end of this paper, we confirm the good performance of this method in Big Data processing through an analysis of its learning efficiency. Comparing this algorithm with NB and SVM, we achieve an accuracy of approximately 95% in all three models. We expect our method to be a viable, high-performance and accurate substitute for large computing systems in Big Data analysis, using a PC cluster environment.
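The incremental idea can be illustrated with a count-based Naive Bayes sketch that absorbs new labelled documents by updating counts only, never re-processing the accumulated corpus. This is a simplification of the paper's method (the paper's feature handling is more elaborate), and the toy documents are invented.

```python
# Sketch of incremental learning: each partial_fit call updates counts
# for the new document only; earlier data is never touched again.
import math
from collections import defaultdict

class IncrementalNB:
    def __init__(self):
        self.class_counts = defaultdict(int)
        self.word_counts = defaultdict(lambda: defaultdict(int))
        self.vocab = set()

    def partial_fit(self, words, label):
        """Absorb one new labelled document; nothing is re-processed."""
        self.class_counts[label] += 1
        for w in words:
            self.word_counts[label][w] += 1
            self.vocab.add(w)

    def predict(self, words):
        total = sum(self.class_counts.values())
        best, best_lp = None, float("-inf")
        for label, c in self.class_counts.items():
            lp = math.log(c / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in words:
                # Laplace smoothing so unseen words do not zero out the class
                lp += math.log((self.word_counts[label][w] + 1) / denom)
            if lp > best_lp:
                best, best_lp = label, lp
        return best

clf = IncrementalNB()
clf.partial_fit(["cheap", "pills", "offer"], "spam")
clf.partial_fit(["meeting", "agenda", "notes"], "ham")
clf.partial_fit(["cheap", "offer", "now"], "spam")  # later batch: counts only
```

The contrast with batch learners is visible in `partial_fit`: cost is proportional to the size of the new document, not the total accumulation, which is the property the abstract claims for its incremental module.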

The Effect of AI and Big Data on an Entry Firm: Game Theoretic Approach (인공지능과 빅데이터가 시장진입 기업에 미치는 영향관계 분석, 게임이론 적용을 중심으로)

  • Jeong, Jikhan
    • Journal of Digital Convergence / v.19 no.7 / pp.95-111 / 2021
  • Despite the innovation of AI and Big Data, theoretical research about the effect of AI and Big Data on market competition is still in its early stages; therefore, this paper analyzes the effect of AI, Big Data, and data sharing on an entry firm by using game theory. In detail, firms' business environments are divided into internal and external ones. AI algorithms are then divided into algorithms for (1) customer marketing, (2) cost reduction without automation, and (3) cost reduction with automation. Big Data is also divided into external and internal data. This study shows that the sharing of external data does not affect the incumbent firm's algorithms for consumer marketing, while lessening the entry firm's entry barrier. Improving the incumbent firm's algorithms for cost reduction (with and without automation) and its external data can be an entry barrier for the entry firm. These findings can be helpful (1) for analyzing the effect of AI, Big Data, and data sharing on market structure, market competition, and firm behavior, and (2) for designing policy for AI and Big Data.
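The entry-barrier effect of the incumbent's cost-reduction algorithms can be illustrated numerically with a standard Cournot duopoly sketch. The linear demand form and all cost figures are illustrative assumptions, not taken from the paper's model.

```python
# Sketch: Cournot duopoly with linear demand p = a - Q. Lowering the
# incumbent's marginal cost (better cost-reduction algorithms) shrinks
# the entrant's equilibrium profit, i.e. raises the entry barrier.

def entrant_profit(market_size, incumbent_cost, entrant_cost, entry_cost):
    """Entrant's Cournot equilibrium profit minus its fixed entry cost."""
    a = market_size
    q_e = (a - 2 * entrant_cost + incumbent_cost) / 3   # entrant's quantity
    price = a - (2 * a - entrant_cost - incumbent_cost) / 3
    return q_e * (price - entrant_cost) - entry_cost

# Symmetric costs vs. an incumbent whose algorithms cut its marginal cost
profit_before = entrant_profit(100, incumbent_cost=40, entrant_cost=40, entry_cost=100)
profit_after = entrant_profit(100, incumbent_cost=20, entrant_cost=40, entry_cost=100)
enters = profit_after > 0  # entry still occurs iff net profit is positive
```

Under these hypothetical numbers entry remains profitable, but the entrant's profit falls sharply; pushing the incumbent's cost low enough would make `enters` false, which is the barrier mechanism the abstract analyzes qualitatively.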

Modeling and Implementation of Public Open Data in NoSQL Database

  • Min, Meekyung
    • International Journal of Internet, Broadcasting and Communication / v.10 no.3 / pp.51-58 / 2018
  • In order to utilize the various data provided by the Korea public open data portal, the data should be systematically managed using a database. Since the range of open data is enormous and the amount of data continues to increase, it is preferable to use a database capable of processing big data in order to analyze and utilize the data. This paper proposes a data modeling and implementation method suitable for public data. The target data is subway-related data provided by the public open data portal. The schema of the public data related to Seoul metro stations is analyzed, and its problems are presented. To solve these problems, this paper proposes a method to normalize and structure the subway data and model it in a NoSQL database. In addition, the implementation result is shown using MongoDB, a document-based database capable of processing big data.
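A document model of the kind described here might look roughly as follows: a nested station document with an embedded ridership array, aggregated the way a MongoDB pipeline would. The field names and figures are hypothetical stand-ins for the portal's actual schema.

```python
# Sketch of a document-oriented subway record, as one might store it in
# MongoDB: related data is embedded in one document instead of being
# spread across normalized relational tables.
station_doc = {
    "station_name": "Gangnam",
    "lines": ["2"],
    "transfers": [],
    "daily_ridership": [
        {"date": "2018-03-01", "boardings": 98000, "alightings": 97000},
        {"date": "2018-03-02", "boardings": 101000, "alightings": 99500},
    ],
}

def average_boardings(doc):
    """Aggregate over the embedded array, in the spirit of a MongoDB
    aggregation pipeline ($unwind + $group + $avg)."""
    records = doc["daily_ridership"]
    return sum(r["boardings"] for r in records) / len(records)

avg = average_boardings(station_doc)
```

Embedding the ridership array keeps each station's data in one document, so per-station queries need no joins; the trade-off is document growth as daily records accumulate.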

Developing a Big Data Analytics Platform Architecture for Smart Factory (스마트공장을 위한 빅데이터 애널리틱스 플랫폼 아키텍쳐 개발)

  • Shin, Seung-Jun;Woo, Jungyub;Seo, Wonchul
    • Journal of Korea Multimedia Society / v.19 no.8 / pp.1516-1529 / 2016
  • While global manufacturing is becoming more competitive due to the variety of customer demands, increases in production costs, and uncertainty in resource availability, the future viability of manufacturing industries depends upon the implementation of Smart Factory. With the convergence of new information and communication technology, Smart Factory enables manufacturers to respond quickly to customer demand and minimize resource usage while maximizing productivity. This paper presents the development of a big data analytics platform architecture for Smart Factory. As this platform represents a conceptual software structure needed to implement data-driven decision-making mechanisms on shop floors, it enables the creation and use of diagnosis, prediction, and optimization models through the use of data analytics and big data. Implementing the platform will help manufacturers: 1) acquire an advanced technology for manufacturing intelligence, 2) implement a cost-effective analytics environment through the use of standardized data interfaces and open-source solutions, 3) obtain a technical reference for time-efficiently implementing an analytics modeling environment, and 4) eventually improve productivity in manufacturing systems. This paper also presents a technical architecture for the big data infrastructure, which we are implementing, and a case study demonstrating energy-predictive analytics in a machine tool system.
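The data-driven flow such a platform implements (ingest shop-floor records, run a predictive model, act on the result) can be sketched minimally. The class names, the linear energy model, and the alert threshold below are illustrative assumptions, not the paper's actual architecture.

```python
# Sketch of an energy-predictive analytics step for a machine tool:
# sensor records flow through a predictive model and a decision rule.

class EnergyPredictor:
    """Toy predictive model: energy use grows linearly with spindle load."""
    def __init__(self, base_kw=2.0, kw_per_load=0.05):
        self.base_kw = base_kw
        self.kw_per_load = kw_per_load

    def predict(self, spindle_load):
        return self.base_kw + self.kw_per_load * spindle_load

def analytics_pipeline(sensor_records, predictor, limit_kw=6.0):
    """Ingest records, predict energy use, flag operations over the limit."""
    alerts = []
    for rec in sensor_records:
        kw = predictor.predict(rec["spindle_load"])
        if kw > limit_kw:
            alerts.append({"machine": rec["machine"], "predicted_kw": kw})
    return alerts

records = [
    {"machine": "MT-01", "spindle_load": 40},
    {"machine": "MT-02", "spindle_load": 95},
]
alerts = analytics_pipeline(records, EnergyPredictor())
```

In a real deployment the predictor would be a model trained on historical shop-floor data and the records would arrive through standardized data interfaces, but the ingest-predict-decide shape of the pipeline is the same.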

Developing Graphic Interface for Efficient Online Searching and Analysis of Graph-Structured Bibliographic Big Data (그래프 구조를 갖는 서지 빅데이터의 효율적인 온라인 탐색 및 분석을 지원하는 그래픽 인터페이스 개발)

  • You, Youngseok;Park, Beomjun;Jo, Sunhwa;Lee, Suan;Kim, Jinho
    • The Journal of Bigdata / v.5 no.1 / pp.77-88 / 2020
  • Recently, much research has been done to organize and analyze the various complex relationships in the real world that are represented in the form of graphs. In particular, computer science bibliographic data systems, such as DBLP, are representative graph data, composed of papers, their authors, and citations among papers. Because graph data is very complex in storage structure and expression, it is a very difficult task to search, analyze, and visualize large bibliographic big data. In this paper, we develop a graphical user interface tool, called EEUM, which visualizes bibliographic big data in the form of graphs. EEUM provides features to browse bibliographic big data according to the connected graph structure by visually displaying the graph data, and implements search, management, and analysis of the bibliographic big data. We also show that EEUM can be conveniently used to search, explore, and analyze, by applying it to the bibliographic graph big data provided by DBLP. Through EEUM, one can easily find influential authors or papers in every research field, and conveniently use it as a search and analysis tool for complex bibliographic big data, for example to get a view of all the relationships among several authors and papers.
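The graph structure underlying such bibliographic data can be sketched as an edge list of authorship and citation relations, with a simple degree-based influence measure. The entries below are hypothetical; EEUM itself operates on DBLP data.

```python
# Sketch of a bibliographic graph: papers and authors as nodes,
# authorship and citation as typed edges.
from collections import defaultdict

edges = [
    ("authored", "author:Kim", "paper:P1"),
    ("authored", "author:Kim", "paper:P2"),
    ("authored", "author:Lee", "paper:P2"),
    ("cites", "paper:P2", "paper:P1"),
    ("cites", "paper:P3", "paper:P1"),
    ("cites", "paper:P3", "paper:P2"),
]

# Adjacency list for browsing, plus incoming-citation counts
graph = defaultdict(list)
citation_count = defaultdict(int)
for kind, src, dst in edges:
    graph[src].append((kind, dst))
    if kind == "cites":
        citation_count[dst] += 1

def most_cited(counts):
    """Degree-based 'influence': the paper with the most incoming citations."""
    return max(counts, key=counts.get)

kim_papers = [dst for kind, dst in graph["author:Kim"] if kind == "authored"]
top = most_cited(citation_count)
```

Browsing in a tool like EEUM amounts to following the adjacency list interactively (author to papers to citing papers); the citation count here stands in for the richer influence analyses such a tool can offer.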