• Title/Summary/Keyword: big data analysis technology

Study of In-Memory based Hybrid Big Data Processing Scheme for Improving the Big Data Processing Rate (빅데이터 처리율 향상을 위한 인-메모리 기반 하이브리드 빅데이터 처리 기법 연구)

  • Lee, Hyeopgeon;Kim, Young-Woon;Kim, Ki-Young
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.12 no.2 / pp.127-134 / 2019
  • With the advancement of IT technology, the amount of data generated has grown exponentially every year. In response, research on distributed systems and in-memory based big data processing schemes has been actively underway. Traditional big data processing schemes process big data faster as the number of nodes and the memory capacity increase. However, increasing the number of nodes inevitably raises the frequency of failures in a big data infrastructure environment, and the number of infrastructure management points and the infrastructure operating costs rise accordingly. Likewise, increasing memory capacity raises the infrastructure cost of each node configuration. Therefore, this paper proposes an in-memory-based hybrid big data processing scheme for improving the big data processing rate. The proposed scheme reduces the number of nodes compared to traditional distributed-system-based big data processing schemes by adding a combiner step to the distributed processing scheme and applying an in-memory processing technology at that step. It decreases big data processing time by approximately 22%. In the future, a realistic performance evaluation in a big data infrastructure environment consisting of more nodes will be required for practical verification of the proposed scheme.
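
The combiner idea in the abstract can be illustrated with a toy word count: each node aggregates its map output in memory before the reduce step, shrinking the intermediate data that must cross the network. Below is a minimal pure-Python sketch of that pattern, not the paper's implementation.

```python
from collections import Counter

def map_phase(lines):
    """Emit (word, 1) pairs, as in a classic word-count map step."""
    for line in lines:
        for word in line.split():
            yield (word, 1)

def combine_phase(pairs):
    """In-memory local aggregation (the added combiner step)."""
    local = Counter()
    for word, count in pairs:
        local[word] += count
    return local.items()

def reduce_phase(partitions):
    """Merge the pre-aggregated partial counts from every node."""
    total = Counter()
    for partition in partitions:
        for word, count in partition:
            total[word] += count
    return dict(total)

# Two "nodes", each combining locally before the reduce step.
node1 = combine_phase(map_phase(["big data big"]))
node2 = combine_phase(map_phase(["data big data"]))
result = reduce_phase([node1, node2])
print(result)  # {'big': 3, 'data': 3}
```

Without the combiner, six `(word, 1)` records would be shuffled to the reducer; with it, only four pre-summed records are, which is the same saving the paper exploits at scale.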

Study of Mental Disorder Schizophrenia, based on Big Data

  • Hye-Sun Lee
    • International Journal of Advanced Culture Technology / v.11 no.4 / pp.279-285 / 2023
  • This study provides academic implications by examining trends in domestic research on therapy for the mental disorder schizophrenia and psychosocial intervention. For the analysis, text mining with the R program and a social network analysis method were used, and 65 papers were collected. The results are as follows. First, the collected data were visualized through keyword analysis using the word cloud method. Second, keyword frequency analysis showed that keywords such as intervention, schizophrenia, research, patients, program, effect, society, mind, ability, and function appeared with the highest frequency. Third, LDA (latent Dirichlet allocation) topic modeling classified the papers into three topics: patients and subjects, psychosocial intervention, and efficacy of interventions. Fourth, the social network analysis derived connectivity, closeness centrality, and betweenness centrality. In conclusion, this study presents significant results: by identifying research trends in schizophrenia and psychosocial therapy through text mining and social network analysis, and presenting the results through visualization, it provides basic rehabilitation data for schizophrenia and psychosocial therapy using a big data method.

Big Data Platform Based on Hadoop and Application to Weight Estimation of FPSO Topside

  • Kim, Seong-Hoon;Roh, Myung-Il;Kim, Ki-Su;Oh, Min-Jae
    • Journal of Advanced Research in Ocean Engineering / v.3 no.1 / pp.32-40 / 2017
  • Recently, the amount of data to be processed and its complexity have been increasing due to the development of information and communication technology, and industry's interest in such big data is growing day by day. In the shipbuilding and offshore industry as well, there is growing interest in the effective utilization of data, since various and vast amounts of data are generated in the process of design, production, and operation. To utilize big data effectively in the shipbuilding and offshore industry, it is necessary to store and process large amounts of data. In this study, it was considered efficient to apply Hadoop and R, which are widely used in big data research. Hadoop is a framework for storing and processing big data: it provides the Hadoop Distributed File System (HDFS) for storage and the MapReduce model for processing. Meanwhile, R provides various data analysis techniques through its language and environment for statistical computing and graphics. While Hadoop makes it easy to handle big data, it is difficult to process data finely with it; and although R has advanced analysis capability, it is difficult to use for processing large data. This study proposes a Hadoop-based big data platform for applications in the shipbuilding and offshore industry. The proposed platform incorporates the existing data of the shipyard and makes it possible to manage and process the data. To check its applicability, the platform is applied to estimating the weights of offshore structure topsides. We store data on existing FPSOs in the Hadoop-based Hortonworks Data Platform (HDP) and perform regression analysis using RHadoop. We evaluate the effectiveness of large data processing with RHadoop by comparing the regression results and processing time with those of a conventional weight estimation program.
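
The regression step can be sketched with ordinary least squares on a single predictor; the paper used RHadoop on a Hortonworks cluster, and the (deck area, topside weight) pairs below are invented for illustration only.

```python
# Hypothetical (deck_area_m2, topside_weight_t) pairs for existing FPSOs.
data = [(1000.0, 5200.0), (1500.0, 7700.0), (2000.0, 10300.0), (2500.0, 12800.0)]

n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n

# Slope and intercept of y = a*x + b by least squares.
a = sum((x - mean_x) * (y - mean_y) for x, y in data) / \
    sum((x - mean_x) ** 2 for x, _ in data)
b = mean_y - a * mean_x

# Predict the weight of a new topside with an 1800 m^2 deck area.
print(round(a * 1800 + b, 1))  # predicted weight, roughly 9254 t here
```

On real shipyard data the same fit would run inside RHadoop's map-reduce jobs so the raw records never leave the cluster.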

The Preliminary Feasibility on Big Data Analytic Application in Construction

  • Ko, Yongho;Han, Seungwoo
    • International conference on construction engineering and project management / 2015.10a / pp.276-279 / 2015
  • Along with the increase in the quantity of data in various industries, the construction industry has also developed various systems for collecting data on construction performance, such as productivity and costs achieved at construction job sites. Numerous researchers worldwide have focused on developing efficient methodologies to analyze such data. However, applications of such methodologies have shown serious limitations in practice due to lack of data and difficulty in finding analytic methodologies capable of yielding significant insights. With the development of information technology, a new trend in analytic methodologies has been introduced and has developed rapidly under the name of "big data analysis" in various fields of academia and industry. The new concept of big data can be applied to meaningful analysis of construction data in various formats, whether structured, semi-structured, or unstructured. This study investigates preliminary application methods based on data collected from an actual construction site. This preliminary investigation is expected to assess the fundamental feasibility of big data analytic applications in construction.

An Analysis of Big Data Structure Based on the Ecological Perspective (생태계 관점에서의 빅데이터 활성화를 위한 구조 연구)

  • Cho, Jiyeon;Kim, Taisiya;Park, Keon Chul;Lee, Bong Gyou
    • Journal of Information Technology Services / v.11 no.4 / pp.277-294 / 2012
  • The purpose of this research is to analyze the structure of big data and the various objects in the big data industry from an ecological perspective. Big data is rapidly emerging as a highly valuable resource for securing the competitiveness of enterprises and government. Accordingly, the main issues in big data are finding ways to create economic value and to solve various problems. However, big data is not systematically organized and is hard to utilize, as it constantly expands into related industries such as telecommunications, finance, and manufacturing. Under these circumstances, it is crucial to understand the range of the big data industry and which stakeholders are involved. An ecological approach is useful for understanding the overall industry structure. Therefore, this study aims at confirming the structure of big data and identifying issues arising from interactions among its objects. The results show the main framework of the big data ecosystem, including the relationships among the object elements composing it. This study is significant as an initial study of the big data ecosystem, and its results can serve as useful guidelines for the government in building a systematized big data ecosystem and for entrepreneurs considering launching a big data business.

A Meta Analysis of the Edible Insects (식용곤충 연구 메타 분석)

  • Yu, Ok-Kyeong;Jin, Chan-Yong;Nam, Soo-Tai;Lee, Hyun-Chang
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.10a / pp.182-183 / 2018
  • Big data analysis is the process of discovering meaningful correlations, patterns, and trends in large data sets stored in existing data warehouse management tools, and of creating new value from them. It also refers to technology that extracts new value from large volumes of structured and unstructured data and analyzes the results. Most big data analysis methods, such as data mining, machine learning, natural language processing, and pattern recognition, come from existing statistics and computer science. Global research institutes have identified big data as one of the most notable new technologies since 2011.

Process and Quality Data Integrated Analysis Platform for Manufacturing SMEs (중소중견 제조기업을 위한 공정 및 품질데이터 통합형 분석 플랫폼)

  • Choe, Hye-Min;Ahn, Se-Hwan;Lee, Dong-Hyung;Cho, Yong-Ju
    • Journal of Korean Society of Industrial and Systems Engineering / v.41 no.3 / pp.176-185 / 2018
  • With the recent development of manufacturing technology and the diversification of consumer needs, not only have the process and quality control of production become more complicated, but the kinds of information that manufacturing facilities provide users about the process have also diversified. The importance of big data analysis has therefore grown. However, most small and medium enterprises (SMEs) lack a systematic infrastructure for big data management and analysis. In particular, because domestic manufacturers rely on foreign suppliers for most of their manufacturing facilities, the need for in-house data analysis and manufacturing support applications is increasing, and related research has been conducted in Korea. This study proposes an integrated analysis platform for process and quality analysis that takes into account the manufacturing big data database (DB) and data characteristics. The platform is implemented in two versions, Web and C/S, to enhance accessibility, and performs template-based quality analysis and real-time monitoring. Since the platform is not optimized for a particular manufacturing process, the user can upload data from a local PC or DB and run analyses by combining single analysis modules into templates as desired. Java and R are used as the development languages for ease of system supplementation. The platform is expected to be available at a low price and to improve quality analysis capability in SMEs.
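
The template-based analysis described above, where a user chains single analysis modules, can be sketched as a simple pipeline; the module names and readings below are invented, and the actual platform is implemented in Java and R.

```python
def remove_outliers(values):
    """Drop points more than 2 standard deviations from the mean."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [v for v in values if abs(v - mean) <= 2 * std]

def moving_average(values, window=3):
    """Smooth the series, e.g. for a real-time monitoring view."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def run_template(data, modules):
    """Apply each selected module in order, like a user-built template."""
    for module in modules:
        data = module(data)
    return data

# Hypothetical quality readings with one obvious outlier.
quality_readings = [9.8, 10.1, 10.0, 25.0, 9.9, 10.2, 10.0]
result = run_template(quality_readings, [remove_outliers, moving_average])
print(result)
```

The point of the design is that `run_template` knows nothing about any particular manufacturing process: the user picks and orders the modules.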

Basic research to analyze construction policy and industrial issues based on Big Data (빅데이터 기반의 건설기술용역분야 정책 및 산업이슈 분석 기초연구)

  • Han, Jae-Goo;Lee, Kyo-Sun
    • Proceedings of the Korean Institute of Building Construction Conference / 2018.05a / pp.290-291 / 2018
  • The purpose of this study is to analyze trends and changes in the environment of construction technology and industry through big data analysis and to draw out their implications. This study will serve as basic research for establishing a vision of industrial competitiveness in the construction engineering and technology service field and the associated policy tasks.

Big Data Analytics Case Study from the Marketing Perspective : Emphasis on Banking Industry (마케팅 관점으로 본 빅 데이터 분석 사례연구 : 은행업을 중심으로)

  • Park, Sung Soo;Lee, Kun Chang
    • Journal of Information Technology Services / v.17 no.2 / pp.207-218 / 2018
  • Recently, it has become a major trend in the banking industry to apply big data analytics techniques to extract essential knowledge from customer databases. This trend rests on the capability to analyze big data with powerful analytics software and on recognition of the value of big data analysis results. However, there is still a need for a more systematic theory of, and mechanism for, adopting a big data analytics approach in the banking industry. In particular, no study has proposed a practical case in which big data analytics is successfully accomplished from the marketing perspective. Therefore, this study analyzes a target marketing case in the banking industry from the viewpoint of big data analytics. The target database is big data in which about 3.5 million customers and their transaction records have been stored for three years. Practical implications are derived from the marketing perspective, and detailed processes and related field test results are addressed. It proved critical for big data analysts to consider Veracity and Value, in addition to the traditional 3Vs of big data (Volume, Velocity, and Variety), so that more significant business meaning can be extracted from big data results.

Design of Client-Server Model For Effective Processing and Utilization of Bigdata (빅데이터의 효과적인 처리 및 활용을 위한 클라이언트-서버 모델 설계)

  • Park, Dae Seo;Kim, Hwa Jong
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.109-122 / 2016
  • Recently, big data analysis has developed into a field of interest not only to companies and professionals but also to individuals and non-experts. Accordingly, it is used for marketing and for solving social problems by analyzing data that is currently open or collected directly. In Korea, various companies and individuals are attempting big data analysis, but they struggle from the initial stage due to limited big data disclosure and collection difficulties. Nowadays, system improvements for big data activation and big data disclosure services are being carried out in Korea and abroad, mainly as services for opening public data such as the domestic Government 3.0 portal (data.go.kr). In addition to the government's efforts, services that share data held by corporations or individuals are running, but it is difficult to find useful data because of the lack of shared data. Moreover, big traffic problems can occur because the entire data set must be downloaded and examined just to grasp the attributes of, and simple information about, the shared data. Therefore, a new system for big data processing and utilization is needed. First, big data pre-analysis technology is needed as a way to solve the big data sharing problem. Pre-analysis is a concept proposed in this paper: it means providing users with results generated by analyzing the data in advance. Through pre-analysis, the usability of big data can be improved by providing information that reveals the properties and characteristics of a data set when a data user searches for it. In addition, by sharing the summary data or sample data generated through pre-analysis, the security problems that may occur when the original data is disclosed can be avoided, enabling big data sharing between the data provider and the data user.
Second, appropriate preprocessing results must be generated quickly according to the disclosure level or network status of the raw data, and delivered to users through distributed big data processing using Spark. Third, to solve the big traffic problem, the system monitors network traffic in real time; when preprocessing the data requested by the user, it preprocesses the data to a size transferable on the current network so that no big traffic occurs. This paper presents various data sizes according to the level of disclosure through pre-analysis; this method is expected to produce low traffic volume compared with the conventional approach of sharing only raw data across many systems. We describe how to solve the problems that occur when big data is released and used, and how to facilitate sharing and analysis. The client-server model uses Spark for fast analysis and processing of user requests, with a Server Agent and a Client Agent deployed on the server and client sides, respectively. The Server Agent, required by the data provider, performs pre-analysis of big data to generate a Data Descriptor containing information on sample data, summary data, and raw data. It also performs fast and efficient big data preprocessing through distributed processing and continuously monitors network traffic. The Client Agent, placed on the data user side, can search big data quickly through the Data Descriptor produced by pre-analysis; the desired data can then be requested from the server and downloaded according to the level of disclosure. The Server Agent and Client Agent are separated so that data published by a provider can be used by the user.
In particular, we focus on big data sharing, distributed big data processing, and the big traffic problem, construct the detailed modules of the client-server model, and present the design of each module. In a system built on the proposed model, a user who acquires data analyzes it in a desired direction or preprocesses it into new data; by disclosing the newly processed data through the Server Agent, the data user takes on the role of data provider. The data provider can likewise obtain useful statistical information from the Data Descriptor of the data it discloses and become a data user performing new analysis on the sample data. In this way, raw data is processed and the processed big data is utilized, forming a natural sharing environment in which the roles of data provider and data user are not distinguished: everyone can be both a provider and a user. The client-server model thus solves the big data sharing problem, provides a free and secure sharing environment for big data disclosure, and makes big data easy to find.
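
The pre-analysis step can be sketched as follows: the Server Agent condenses a raw data set into a Data Descriptor carrying summary data and sample data, which a Client Agent can inspect without downloading the raw data. The field names here are assumptions for illustration; the paper's system performs this with Spark.

```python
import json

def build_data_descriptor(name, records, sample_size=2):
    """Pre-analyze records into a shareable Data Descriptor."""
    values = [r["value"] for r in records]
    return {
        "name": name,
        "row_count": len(records),
        "summary": {                      # summary data
            "min": min(values),
            "max": max(values),
            "mean": sum(values) / len(values),
        },
        "sample": records[:sample_size],  # sample data, not the raw data
    }

# Hypothetical raw data held by the provider.
raw = [{"id": i, "value": v} for i, v in enumerate([3.0, 7.0, 5.0, 9.0])]
descriptor = build_data_descriptor("sensor_readings", raw)
print(json.dumps(descriptor, indent=2))
```

Only the small descriptor crosses the network during search; the raw data moves only after the user decides, per the disclosure level, that a download is worthwhile.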