• Title/Summary/Keyword: Engineering Big Data

A Study on the Big Data Analysis System for Searching of the Flooded Road Areas (도로 침수영역의 탐색을 위한 빅데이터 분석 시스템 연구)

  • Song, Youngmi;Kim, Chang Soo
    • Journal of Korea Multimedia Society / v.18 no.8 / pp.925-934 / 2015
  • As global warming makes natural disasters more frequent, the risk of flooding from typhoons and torrential rain has also increased. Sudden torrential rain floods roads, causing vehicle damage and personal injury. Because a road can become inundated within moments, rapid data collection and a quick response system need to be studied. This research proposes a variety of methods for collecting information and a big data analysis system, based on the collected information, for searching for road areas flooded by torrential rain. The data on flooded roads include SNS data, meteorological data, and road link data. The big data analysis system is implemented as a distributed processing system based on the Hadoop platform.

An Efficient Implementation of Mobile Raspberry Pi Hadoop Clusters for Robust and Augmented Computing Performance

  • Srinivasan, Kathiravan;Chang, Chuan-Yu;Huang, Chao-Hsi;Chang, Min-Hao;Sharma, Anant;Ankur, Avinash
    • Journal of Information Processing Systems / v.14 no.4 / pp.989-1009 / 2018
  • Rapid advances in science and technology, together with the exponential development of smart mobile devices, workstations, supercomputers, smart gadgets and network servers, have been witnessed over the past few years. The sudden increase in the Internet population and the manifold growth in internet speeds have led to the generation of an enormous amount of data, now termed 'big data'. Given this scenario, storing data on local servers or a personal computer is an issue, which can be resolved by utilizing cloud computing. At present, several cloud computing service providers are available to address big data issues. This paper establishes a framework that builds Hadoop clusters on the new single-board computer (SBC) Mobile Raspberry Pi. These clusters offer facilities for storage as well as computing. Regular data centers require large amounts of energy for operation, need cooling equipment, and occupy prime real estate. These energy consumption and physical space constraints can be addressed by employing Mobile Raspberry Pi Hadoop clusters, which provide a cost-effective, low-power, high-speed solution along with micro-data center support for big data. Hadoop provides the required modules for the distributed processing of big data using the map-reduce programming approach. In this work, the performance of SBC clusters and a single computer was compared. The experimental data show that the SBC clusters outperform a single computer by around 20%. Furthermore, the cluster processing speed for large volumes of data can be enhanced by increasing the number of SBC nodes. Data storage is accomplished using the Hadoop Distributed File System (HDFS), which offers more flexibility and greater scalability than a single computer system.
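
The map-reduce programming approach mentioned in this abstract can be illustrated with a minimal single-process sketch. It is not the paper's code and does not use Hadoop's API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the word-count task are assumptions chosen for illustration. On a real cluster, the map and reduce steps would run in parallel on separate nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence (hypothetical driver)."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data on small boards", "Big clusters of small boards"]))
```

Because each `map_phase` call depends only on its own line and each `reduce_phase` call only on its own key, the work can be split across nodes, which is the property the abstract relies on when it scales by adding SBC nodes.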

Subnet Selection Scheme based on probability to enhance process speed of Big Data (빅 데이터의 처리속도 향상을 위한 확률기반 서브넷 선택 기법)

  • Jeong, Yoon-Su;Kim, Yong-Tae;Park, Gil-Cheol
    • Journal of Digital Convergence / v.13 no.9 / pp.201-208 / 2015
  • With services such as SNS and Facebook, the use of small-sized big data such as microblogs is increasing. However, the problems of search accuracy and computational cost for such small-sized data remain unresolved. In this paper, we propose a probability-based subnet selection technique to improve the browsing speed of small-sized text information, such as microblogs, in big data environments. The proposed method configures subnets by assigning probabilities to the attribute information of the data, which increases data search speed. In addition, the proposed method improves data accessibility by processing, as pairs, the probabilities and connection information of the data constituting each subnet, so that distributed data can be accessed easily. In experiments, the proposed method achieved a 6.8% higher detection rate than the CELF algorithm and reduced the average processing time by 8.2%.

Attention-based word correlation analysis system for big data analysis (빅데이터 분석을 위한 어텐션 기반의 단어 연관관계 분석 시스템)

  • Chi-Gon, Hwang;Chang-Pyo, Yoon;Soo-Wook, Lee
    • Journal of the Korea Institute of Information and Communication Engineering / v.27 no.1 / pp.41-46 / 2023
  • With the development of machine learning, big data analysis can now draw on a variety of techniques. However, big data collected in practice lacks an automated technique for refining identical or similar terms based on semantic analysis of the relationships between words. Since most big data is written in ordinary sentences, it is difficult to understand the meaning of the sentences and their terms. Solving these problems requires morphological analysis and an understanding of sentence meaning. NLP, a set of techniques for analyzing natural language, can capture the relationships between words and sentences. Among NLP techniques, the transformer has been proposed to address the disadvantages of RNNs by using self-attention within the encoder-decoder structure of seq2seq. In this paper, transformers are used to form associations between words in order to understand the words and phrases of sentences extracted from big data.
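
The self-attention mechanism referred to in this abstract can be sketched in plain Python to show how attention weights express associations between words. It is not the paper's system. It assumes identity projections (queries, keys and values are all the input vectors), whereas a real transformer learns separate projection matrices. The toy word vectors and the names `softmax` and `self_attention` are assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Convert raw scores into weights that are positive and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with Q = K = V = vectors.

    Returns (weights, outputs): weights[i][j] is how strongly word i
    attends to word j; outputs[i] is the weighted mix of all word vectors.
    """
    dim = len(vectors[0])
    weights = []
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * vec[j] for w, vec in zip(row, vectors)) for j in range(dim)]
        for row in weights
    ]
    return weights, outputs


if __name__ == "__main__":
    # Hypothetical 2-dimensional embeddings for three words.
    words = ["flood", "rain", "invoice"]
    embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    weights, _ = self_attention(embeddings)
    for word, row in zip(words, weights):
        print(word, [round(w, 3) for w in row])
```

In this toy input, "flood" gives more weight to "rain" than to "invoice" because their vectors point in similar directions, which is the sense in which attention weights can be read as word associations.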

Evaluation of Predictive Models for Early Identification of Dropout Students

  • Lee, JongHyuk;Kim, Mihye;Kim, Daehak;Gil, Joon-Min
    • Journal of Information Processing Systems / v.17 no.3 / pp.630-644 / 2021
  • Educational data analysis is attracting increasing attention with the rise of the big data industry. The amounts and types of learning data available are increasing steadily, and the information technology required to analyze these data continues to develop. The early identification of potential dropout students is very important; education is important in terms of social movement and social achievement. Here, we analyze educational data and generate predictive models for student dropout using logistic regression, a decision tree, a naïve Bayes method, and a multilayer perceptron. The multilayer perceptron model using independent variables selected via the variance analysis showed better performance than the other models. In addition, we experimentally found that not only grades but also extracurricular activities were important in terms of preventing student dropout.
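
Of the models named in this abstract, logistic regression is the simplest to show end to end. The sketch below trains one with batch gradient descent in plain Python. It is not the paper's model or data. The two features (a normalized grade score and a normalized extracurricular activity score), the six sample students and the hyperparameters are all hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Estimated probability that a student drops out (label 1)."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_logistic(samples, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(samples, labels):
            error = predict_proba(weights, bias, features) - label
            for j in range(n_features):
                grad_w[j] += error * features[j]
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


if __name__ == "__main__":
    # Hypothetical data: [grade score, activity score], both scaled to 0..1.
    samples = [[0.9, 0.8], [0.8, 0.6], [0.7, 0.9],
               [0.3, 0.1], [0.2, 0.3], [0.4, 0.0]]
    labels = [0, 0, 0, 1, 1, 1]  # 1 = dropped out
    weights, bias = train_logistic(samples, labels)
    print(round(predict_proba(weights, bias, [0.85, 0.7]), 3))
    print(round(predict_proba(weights, bias, [0.25, 0.1]), 3))
```

On this toy data both learned weights come out negative, so higher grades and more activity each lower the estimated dropout probability. This only mirrors the direction of the abstract's finding and does not reproduce its results.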

An Exploratory Study on Application Plan of Big Data to Manufacturing Execution System (제조실행시스템에의 빅데이터 적용방안에 대한 탐색적 연구)

  • Noh, Kyoo-Sung;Park, Sanghwi
    • Journal of Digital Convergence / v.12 no.1 / pp.305-311 / 2014
  • The manufacturing industry was early to introduce automation and information systems for engineering and production processes in order to gain competitive advantage. One typical such information system is the MES (Manufacturing Execution System), which keeps evolving. With the recent emergence of Big Data, ways of applying Big Data to MES are also being sought. This study first reviews prior research and case studies on the application of Big Data in the manufacturing industry. It then suggests a plan for applying Big Data to MES.

Efficient Complex Event Processing Scheme through Similar Operation Processing in Duplicate Events (중복 이벤트 유사 연산 처리를 통한 효율적인 복합 이벤트 처리 기법)

  • Kim, Daeyun;Kim, Byounghoon;Ko, Geonsik;Noh, Yeonwoo;Choi, Dojin;Lim, Jongtae;Bok, Kyoungsoo;Yoo, Jaesoo
    • Proceedings of the Korea Contents Association Conference / 2016.05a / pp.59-60 / 2016
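  • With the advancement of machine-to-machine (M2M) communication devices, real-time complex event processing of large-volume stream data is becoming increasingly important in a variety of applications. This paper proposes a scheme for processing multiple complex events that reduces the cost of processing similar operations. The proposed scheme represents the operators for processing multiple complex events as a graph and reduces redundant operations.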
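
The idea of avoiding redundant operator evaluation across several complex event queries can be sketched as below. This is not the paper's graph-based scheme. It swaps in a simpler technique, a per-event cache keyed by operator name, so that an operator shared by several queries is evaluated only once. The operator names, query names and event fields are hypothetical.

```python
def evaluate_queries(event, queries, operators):
    """Evaluate every query against one event, sharing operator results.

    queries:   query name -> list of operator names (all must hold)
    operators: operator name -> predicate taking the event dict
    Returns (results, evaluations), where evaluations counts how many
    operator predicates were actually executed.
    """
    cache = {}
    evaluations = 0
    results = {}
    for query_name, operator_names in queries.items():
        matched = True
        for name in operator_names:
            if name not in cache:  # evaluate each distinct operator once
                cache[name] = operators[name](event)
                evaluations += 1
            if not cache[name]:
                matched = False
                break
        results[query_name] = matched
    return results, evaluations


if __name__ == "__main__":
    # Hypothetical sensor operators and queries.
    operators = {
        "hot": lambda e: e["temperature"] > 30,
        "humid": lambda e: e["humidity"] > 70,
        "smoke": lambda e: e["smoke"],
    }
    queries = {
        "heat_stress": ["hot", "humid"],
        "fire_risk": ["hot", "smoke"],
    }
    event = {"temperature": 35, "humidity": 80, "smoke": False}
    print(evaluate_queries(event, queries, operators))
```

Here the two queries share the `hot` operator, so three predicates are executed instead of four. A graph representation such as the one the abstract describes could additionally share partial results between operators, which this flat cache does not attempt.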

A Model of Vital Signs Analysis based on Big Data using OCL (OCL을 이용한 빅데이터 기반의 생체신호 분석 모델)

  • Kim, Tae-Woong
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.12 / pp.1485-1491 / 2019
  • As the types and volume of vital signs have grown, research on defining vital signs as big data and analyzing them is actively progressing. Vital signs are generally treated as big data using methods similar to those for processing social network big data. Vital sign big data should be extracted as feature data, stored separately, and analyzed with various analytical tools. In other words, the interoperability and compatibility of the data should be ensured, and the index expressions used in analytical tools should be concise. To this end, this paper defines vital signs on the basis of the HL7 standard meta-model and proposes a model for analyzing vital signs using OCL, the OMG's standard mathematical specification language. In addition, the applicability of the proposed model is confirmed by calculating calorie consumption from ECG data.

A Study on Satisfaction Survey Based on Regression Analysis to Improve Curriculum for Big Data Education (빅데이터 양성 교육 교과과정 개선을 위한 회귀분석 기반의 만족도 조사에 관한 연구)

  • Choi, Hyun
    • Journal of the Korean Society of Industry Convergence / v.22 no.6 / pp.749-756 / 2019
  • Big data refers to structured and unstructured data so large that it is difficult to collect, store, and otherwise handle. Many institutions, including universities, are building student convergence systems to foster talent in data science and AI convergence, but research on what kind of education students need is severely lacking. Therefore, to improve the curriculum by identifying the satisfaction and demands of participants in the "2019 Big Data Youth Talent Training Course" held at K University, this paper conducts a correlation analysis based on questionnaires on basic information and courses, followed by a regression analysis. The results show that the higher the overall satisfaction, the satisfaction with class or job connection, and the self-development, the more positive the evaluation of program efficiency.

On Implementing a Learning Environment for Big Data Processing using Raspberry Pi (라즈베리파이를 이용한 빅 데이터 처리 학습 환경 구축)

  • Hwang, Boram;Kim, Seonggyu
    • Journal of Digital Convergence / v.14 no.4 / pp.251-258 / 2016
  • Big data processing is a broad term for processing data sets so large or complex that traditional data processing applications are inadequate. Widespread use of smart devices results in a huge impact on the way we process data. Many organizations are contemplating how to incorporate or integrate those devices into their enterprise data systems. We have proposed a way to process big data by way of integrating Raspberry Pi into a Hadoop cluster as a computational grid. We have then shown the efficiency through several experiments and the ease of scaling of the proposed system.